How to Implement Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

Jul 29, 2021 | Data Science

Are you ready to dive into the world of stereo matching with cutting-edge techniques? In this guide, we’ll explore how to set up and use the implementation of the research paper **Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation** presented at CVPR 2022. Whether you’re looking to train your models or use pre-trained ones, we’ll walk through everything step-by-step!

Getting Started: Cloning the Repository

First things first! Clone the repository to your local machine.

```bash
git clone https://github.com/megvii-research/CREStereo.git
cd CREStereo
```

Datasets: Downloading and Preparing

The proposed dataset is approximately 400 GB, and there are two ways to download it:

- **Using the shell script:**

  ```bash
  bash dataset_download.sh
  ```

  This will automatically download and extract the dataset into `stereo_trainset`.

- **Downloading from BaiduCloud** (extraction code: `aa3g`) and extracting the tar files manually.
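If you take the manual route, the per-archive extraction can be scripted. The sketch below is not from the repository; the directory names are assumptions, and it simply unpacks every `.tar` file it finds into the training folder:

```python
import tarfile
from pathlib import Path

def extract_all(archive_dir: str, out_dir: str = "stereo_trainset") -> list:
    """Extract every .tar archive in archive_dir into out_dir.

    Returns the list of archive names that were extracted.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    for tar_path in sorted(Path(archive_dir).glob("*.tar")):
        with tarfile.open(tar_path) as tar:
            # Only extract archives from a trusted source.
            tar.extractall(out)
        extracted.append(tar_path.name)
    return extracted
```

Running it once over the download directory saves clicking through dozens of archives by hand.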

Understanding Disparity Format

In this research, the disparity data is stored in .png format. You can load this data using OpenCV with the following code:

```python
import cv2
import numpy as np

def get_disp(disp_path):
    # Load the 16-bit disparity PNG without any dtype conversion.
    disp = cv2.imread(disp_path, cv2.IMREAD_UNCHANGED)
    # Disparities are stored scaled by 32, so divide to recover pixel units.
    return disp.astype(np.float32) / 32
```

Think of this function like a key that unlocks the treasure chest, where your treasure is all the rich data you need for processing!
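Once a disparity map is loaded, it is commonly converted to metric depth via the standard stereo relation depth = f · B / d. The following numpy sketch is illustrative only; the focal length and baseline values in the usage note are made up, and non-positive disparities are treated as invalid:

```python
import numpy as np

def disparity_to_depth(disp: np.ndarray, focal_px: float, baseline_m: float) -> np.ndarray:
    """Convert a disparity map (pixels) to depth (meters): depth = f * B / d.

    Pixels with non-positive disparity are marked invalid (NaN).
    """
    depth = np.full(disp.shape, np.nan, dtype=np.float32)
    valid = disp > 0
    depth[valid] = focal_px * baseline_m / disp[valid]
    return depth
```

For example, with a hypothetical 700 px focal length and 0.1 m baseline, a 10 px disparity corresponds to a depth of 7 m, while a 0 px disparity comes back as NaN.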

Using Public Datasets

In addition to the proposed dataset, several public stereo datasets are recommended for training; see the repository README for the complete list and download links.

Dependencies: Setting Up Your Environment

Make sure to have the following dependencies installed:

- CUDA 10.1
- Python 3.6.9
- MegEngine v1.8.2
- opencv-python v3.4.0
- numpy v1.18.1
- Pillow v8.4.0
- tensorboardX v2.1
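Pinning these versions in a `requirements.txt` keeps the environment reproducible. The sketch below simply mirrors the list above; the exact package names and pins in the repository's own `requirements.txt` may differ, so treat it as a template:

```
megengine==1.8.2
opencv-python==3.4.0
numpy==1.18.1
Pillow==8.4.0
tensorboardX==2.1
```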

To install these dependencies seamlessly, use:

```bash
python3 -m pip install -r requirements.txt
```

You can also run the code in Docker; substitute your own image name for the placeholder below:

```bash
docker run --gpus all -it -v /tmp:/tmp <crestereo-image> /tmp/disparity.png
```

Inference: Running the Model

To perform inference, you need a pretrained model. Download the pretrained MegEngine model (linked in the repository README) and run:

```bash
python3 test.py --model_path path_to_mge_model --left img/test/left.png --right img/test/right.png --size 1024x1536 --output disparity.png
```
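The `--size` flag takes the inference resolution as a single `HxW` string (here `1024x1536`). The helper below is a sketch, not code from the repository, showing one way to validate and parse that string before resizing the input pair; the height-before-width order is an assumption:

```python
def parse_size(size_str: str) -> tuple:
    """Parse an inference size given as 'HEIGHTxWIDTH', e.g. '1024x1536'."""
    parts = size_str.lower().split("x")
    if len(parts) != 2:
        raise ValueError(f"expected 'HxW', got {size_str!r}")
    h, w = (int(p) for p in parts)
    if h <= 0 or w <= 0:
        raise ValueError("size dimensions must be positive")
    return h, w
```

For instance, `parse_size("1024x1536")` yields `(1024, 1536)`, and a malformed value such as `"1024"` raises a `ValueError` instead of failing later in the pipeline.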

Training: Kickstarting Your Model

To train your model, modify the configuration settings in cfgs/train.yaml and execute:

```bash
python3 train.py
```

For monitoring the training progress, launch TensorBoard:

```bash
tensorboard --logdir ./train_log
```

Then open localhost:6006 in your browser to visualize the training curves.

Troubleshooting: Common Issues

If you encounter issues, here are some troubleshooting tips:

  • Ensure the dataset is correctly downloaded and extracted.
  • Check that all dependencies are properly installed without version conflicts.
  • If TensorBoard fails to launch, verify that your log directory is set correctly.
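To rule out missing packages quickly, a short stdlib-only check can be run before training. The package list below just mirrors the dependency section (note that `opencv-python` imports as `cv2` and `Pillow` as `PIL`); adjust it to your setup:

```python
import importlib.util

def check_packages(names):
    """Return a dict mapping each package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names for the packages from the dependency section above.
status = check_packages(["megengine", "cv2", "numpy", "PIL", "tensorboardX"])
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any `MISSING` entry points straight at the package to reinstall before digging into deeper errors.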

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Acknowledgements

This guide builds on the contributions of several previous works; see the acknowledgements section of the original repository for the full list.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
