Gaze estimation is a crucial component of many applications, from virtual reality to user interaction. This guide walks you through setting up and running an unofficial PyTorch implementation of gaze estimation models trained on the MPIIGaze and MPIIFaceGaze datasets.
Requirements
- Operating System: Linux (Tested on Ubuntu only)
- Python Version: 3.7
To install the required packages, execute the following command:
pip install -r requirements.txt
Downloading the Datasets and Preprocessing
To kick off your gazing adventure, you need to download and preprocess the datasets. Here’s how you can do it:
MPIIGaze
bash scripts/download_mpiigaze_dataset.sh
python tools/preprocess_mpiigaze.py --dataset datasets/MPIIGaze -o datasets
MPIIFaceGaze
bash scripts/download_mpiifacegaze_dataset.sh
python tools/preprocess_mpiifacegaze.py --dataset datasets/MPIIFaceGaze_normalized -o datasets
Usage
This repository uses YACS for configuration management. The default parameters are specified in gaze_estimation/config/defaults.py and should not be modified directly. Instead, override them with a YAML file such as the following:
configs/mpiigaze_lenet_train.yaml
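Conceptually, YACS merges the values from your YAML file on top of the defaults, leaving every unspecified parameter untouched. A minimal sketch of that merge behavior in plain Python (a simplified illustration with hypothetical keys, not the actual YACS implementation):

```python
def merge_overrides(defaults, overrides):
    """Recursively apply override values on top of defaults, the way a YAML
    experiment file overrides defaults.py (simplified sketch, not YACS itself)."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # leaf value: override wins
    return merged

# Hypothetical config keys, for illustration only.
defaults = {"model": {"name": "lenet"}, "train": {"batch_size": 32, "test_id": 0}}
overrides = {"train": {"batch_size": 64}}  # what a YAML file might specify
config = merge_overrides(defaults, overrides)
```

Only `train.batch_size` changes; `model.name` and `train.test_id` keep their default values.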
Training and Evaluation
To train a model on data from every participant except the person with ID 0, and then evaluate it on that held-out person, run the following commands:
python train.py --config configs/mpiigaze_lenet_train.yaml
python evaluate.py --config configs/mpiigaze_lenet_eval.yaml
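This is the leave-one-person-out protocol: each participant is held out once as the test subject while the remaining participants form the training set. A minimal sketch of how such splits can be generated (the helper is illustrative, not part of the repository):

```python
def leave_one_out_splits(person_ids):
    """Yield (train_ids, test_id) pairs: each person is held out once
    while everyone else forms the training set (illustrative sketch)."""
    for test_id in person_ids:
        train_ids = [p for p in person_ids if p != test_id]
        yield train_ids, test_id

# MPIIGaze has 15 participants, conventionally labeled p00..p14.
splits = list(leave_one_out_splits([f"p{i:02d}" for i in range(15)]))
```

The reported test error is then typically averaged over all such splits.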
For comprehensive training and evaluation, use the scripts provided:
bash scripts/run_all_mpiigaze_lenet.sh
bash scripts/run_all_mpiigaze_resnet_preact.sh
Results
Here’s a quick look at the results of the different models:
MPIIGaze Model
| Model | Mean Test Angle Error (degree) | Training Time |
|---|---|---|
| LeNet | 6.52 | 3.5 s/epoch |
| ResNet-preact-8 | 5.73 | 7 s/epoch |
MPIIFaceGaze Model
| Model | Mean Test Angle Error (degree) | Training Time |
|---|---|---|
| AlexNet | 5.06 | 135 s/epoch |
| ResNet-14 | 4.83 | 62 s/epoch |
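The mean angle error reported above is the average angle between the predicted and ground-truth 3D gaze directions. A sketch of how such an error can be computed for a single pair of vectors (an illustrative helper, not the repository's own code):

```python
import math

def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze vectors (illustrative helper)."""
    dot = sum(p * t for p, t in zip(pred, true))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in true))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))
```

For identical directions the error is 0 degrees; for orthogonal directions it is 90 degrees.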
Demo Program
The demo program enables gaze estimation through a webcam. Follow these steps:
- Download the dlib pretrained model for landmark detection:
bash scripts/download_dlib_model.sh
- Calibrate the camera and save the results in the same format as the sample file data/calib/sample_params.yaml.
- Specify the model path and the camera calibration results in the configuration file configs/demo_mpiigaze_resnet.yaml, then run the demo:
python demo.py --config configs/demo_mpiigaze_resnet.yaml
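To draw the estimated gaze on the webcam feed, a demo like this typically converts the predicted pitch and yaw angles into a 3D direction vector. One common convention is sketched below (pitch and yaw in radians, camera looking along -z; this is an assumption for illustration, not necessarily the exact convention this repository uses):

```python
import math

def pitchyaw_to_vector(pitch, yaw):
    """Convert pitch (vertical) and yaw (horizontal) angles in radians
    into a unit 3D gaze direction (one common convention; illustrative)."""
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)
```

With pitch = yaw = 0 this yields (0, 0, -1), i.e. looking straight at the camera, and the result is always a unit vector.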
Troubleshooting
In case you encounter issues or have questions, consider these troubleshooting tips:
- Ensure you have fulfilled all the requirements, including the specified version of Python.
- Double-check that the datasets and models have been correctly downloaded and preprocessed.
- If the code isn’t executing as expected, verify your configuration file for any discrepancies.
- Consult the GitHub issues page for common problems faced by other users.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
An Analogy for Understanding the Code
Imagine you’re a chef preparing a multi-course meal. Each course represents a distinct phase of your gaze estimation project:
- The ingredients (data) need to be sourced properly (downloaded) from the market (repositories).
- You must prep the ingredients (preprocess the data) so they are ready for cooking (model training).
- Each cooking step (function calls) requires the right technique and timing (parameters and evaluation) to achieve a delicious outcome (accurate predictions).
- Finally, you must plate your meal (demo) just right, ensuring it looks as good as it tastes (the output from your webcam).
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

