Welcome, fellow AI enthusiasts! Today, we’re diving into an exciting project that explores how to predict depth maps and ego-motion from video using unsupervised learning. This codebase, known as SfMLearner, implements the system described in the paper "Unsupervised Learning of Depth and Ego-Motion from Video" by Tinghui Zhou and colleagues, presented at CVPR 2017. Let’s unravel the intricacies and get you started!
Prerequisites
Before we dive into the code, make sure you have the following setup:
- TensorFlow 1.0
- CUDA 8.0
- Ubuntu 16.04
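If you’re starting from a fresh machine, a pip-based install along these lines should work; the package name and version pinning are assumptions to verify against your environment:

```bash
# Assumes CUDA 8.0 and cuDNN are already installed system-wide on Ubuntu 16.04.
# The GPU build of TensorFlow 1.0 was distributed as tensorflow-gpu on PyPI.
pip install tensorflow-gpu==1.0.0
```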
Running the Single-View Depth Demo
To run the demo of the single-view depth prediction model, follow these steps:
- Download the pre-trained model from this Google Drive and place the model files under the `models/` directory.
- Use the provided Jupyter notebook `demo.ipynb` to run the demo.
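If you prefer a script over the notebook, the demo boils down to something like the sketch below. The checkpoint filename and sample image path are illustrative placeholders; the `SfMLearner` class and its `setup_inference`/`inference` methods come from the repo itself, so verify the details against `demo.ipynb`:

```python
import numpy as np
import PIL.Image as pil
import tensorflow as tf
from SfMLearner import SfMLearner  # provided by the SfMLearner repo

img_height, img_width = 128, 416
ckpt_file = 'models/model-190532'  # illustrative checkpoint name; use your downloaded files

# Load a test image and resize it to the network's input resolution
img = pil.open('misc/sample.png').resize((img_width, img_height), pil.ANTIALIAS)
img = np.array(img)

# Build the single-view depth inference graph and run one prediction
sfm = SfMLearner()
sfm.setup_inference(img_height, img_width, mode='depth')
saver = tf.train.Saver([var for var in tf.model_variables()])
with tf.Session() as sess:
    saver.restore(sess, ckpt_file)
    pred = sfm.inference(img[None, :, :, :], sess, mode='depth')
print(pred['depth'].shape)  # one depth map per input image
```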
Preparing Training Data
To prepare your training data correctly, you need to format it before feeding it into the model. Here’s how:
KITTI Dataset
Start by downloading the KITTI dataset using this script. Once downloaded, run the following command:
```bash
python data/prepare_train_data.py --dataset_dir=path_to_raw_kitti_dataset --dataset_name=kitti_raw_eigen --dump_root=path_to_resulting_formatted_data --seq_length=3 --img_width=416 --img_height=128 --num_threads=4
```
For pose experiments, download the KITTI odometry split from here and change the `--dataset_name` option to `kitti_odom` when preparing the data.
Cityscapes Dataset
For Cityscapes, download the following packages:
- `leftImg8bit_sequence_trainvaltest.zip`
- `camera_trainvaltest.zip`
After downloading, run this command:
```bash
python data/prepare_train_data.py --dataset_dir=path_to_cityscapes_dataset --dataset_name=cityscapes --dump_root=path_to_resulting_formatted_data --seq_length=3 --img_width=416 --img_height=171 --num_threads=4
```
Keep in mind that the height for Cityscapes is set to 171 because we crop the bottom part of the image containing the car logo.
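To see where 171 comes from: Cityscapes frames are 2048×1024, so resizing to width 416 scales everything by 416/2048, and keeping 171 output rows corresponds to cropping the frame to roughly its top 842 rows before resizing. A hypothetical standalone version of that preprocessing might look like this (the repo’s own data pipeline handles it internally, possibly with a different crop fraction):

```python
import numpy as np
from PIL import Image

def crop_and_resize(frame, out_w=416, out_h=171):
    """Crop the car-logo region off the bottom, then resize (illustrative sketch)."""
    h, w = frame.shape[:2]                # e.g. 1024, 2048 for Cityscapes
    keep = int(round(w * out_h / out_w))  # rows that map onto out_h after resizing (~842)
    cropped = frame[:keep]                # drop the bottom of the image
    return np.array(Image.fromarray(cropped).resize((out_w, out_h), Image.BILINEAR))
```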
Training the Model
Once your data is formatted, you can commence training the model. Execute the following command:
```bash
python train.py --dataset_dir=path_to_the_formatted_data --checkpoint_dir=where_to_store_checkpoints --img_width=416 --img_height=128 --batch_size=4
```
To visualize your training progress, start a TensorBoard session:
```bash
tensorboard --logdir=path_to_tensorflow_log_files --port=8888
```
Open your browser and navigate to `localhost:8888`. After approximately 100K iterations, you should start seeing promising depth predictions when training on the KITTI dataset.
Evaluation
To evaluate your trained model, you’ll need to follow a particular process detailed below.
Depth Evaluation on KITTI
Download our predictions from this Google Drive and place them in the `kitti_eval/` directory. Then run:
```bash
python kitti_eval/eval_depth.py --kitti_dir=path_to_raw_kitti_dataset --pred_file=kitti_eval/kitti_eigen_depth_predictions.npy
```
If executed correctly, you should obtain the evaluation metrics reported in Table 1 of the paper.
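As a quick sanity check, you can load the predictions file and inspect it before running the script; the layout suggested in the comment is an assumption to verify against your copy:

```python
import numpy as np

preds = np.load('kitti_eval/kitti_eigen_depth_predictions.npy')
print(preds.shape, preds.dtype)  # expect one predicted depth map per Eigen test image
```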
Pose Estimation Evaluation
Download the predictions and ground-truth pose data from this Google Drive. Run the following command:
```bash
python kitti_eval/eval_pose.py --gtruth_dir=directory_of_ground_truth_trajectory_files --pred_dir=directory_of_predicted_trajectory_files
```
For instance, to evaluate results for Sequence 10, you’d run:
```bash
python kitti_eval/eval_pose.py --gtruth_dir=kitti_eval/pose_data/ground_truth/10 --pred_dir=kitti_eval/pose_data/ours_results/10
```
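Under the hood, the pose evaluation measures Absolute Trajectory Error (ATE) over short trajectory snippets; because monocular predictions are scale-ambiguous, the predicted trajectory is first aligned to ground truth with an optimized scale factor. A minimal sketch of the idea (not the repo’s exact implementation):

```python
import numpy as np

def ate_rmse(gt_xyz, pred_xyz):
    """RMSE of Absolute Trajectory Error after translation + scale alignment.

    gt_xyz, pred_xyz: (N, 3) arrays of camera positions for one snippet.
    """
    pred = pred_xyz + (gt_xyz[0] - pred_xyz[0])        # pin the first frames together
    scale = np.sum(gt_xyz * pred) / np.sum(pred ** 2)  # least-squares scale alignment
    err = gt_xyz - scale * pred
    return np.sqrt((err ** 2).sum(axis=1).mean())
```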
Testing on KITTI
Once the model is trained, obtain depth predictions for the KITTI Eigen test split by executing:
```bash
python test_kitti_depth.py --dataset_dir path_to_raw_kitti_dataset --output_dir path_to_output_directory --ckpt_file path_to_pre-trained_model_file
```
For pose predictions, run:
```bash
python test_kitti_pose.py --test_seq [sequence_id] --dataset_dir path_to_KITTI_odometry_set --output_dir path_to_output_directory --ckpt_file path_to_pre-trained_model_file
```
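For instance, predicting poses for odometry sequence 9 might look like the following (the paths and checkpoint name are hypothetical; substitute your own):

```bash
# Hypothetical paths shown; substitute your dataset, output, and checkpoint locations
python test_kitti_pose.py --test_seq 9 --dataset_dir kitti_odom/ --output_dir pose_predictions/ --ckpt_file models/your_pose_model
```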
Troubleshooting
While working with the SfMLearner codebase, you might encounter a few common challenges:
- Data Formatting Errors: Ensure that the paths specified for data directories are accurate and accessible.
- Dependency Issues: Double-check that all prerequisites, such as TensorFlow and CUDA, are properly installed. A version mismatch can cause runtime errors.
- TensorBoard Not Displaying: Verify that TensorBoard is pointed at the correct log directory and that you’re using the right port.
- Pre-trained Model Issues: Make certain that you’ve placed the pre-trained model files in the correct `models/` folder.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.