How to Implement Self-Similarity Grouping for Person Re-identification

Jul 9, 2022 | Data Science

If you’re interested in cutting-edge techniques in artificial intelligence, specifically in person re-identification, then the Self-Similarity Grouping (SSG) approach is worth exploring. This simple yet effective unsupervised cross-domain adaptation method has delivered strong results across several benchmark datasets. In this blog, we’ll walk through the implementation of SSG step by step, so that even newcomers can follow along.

Setup: Getting Started

Before you can start implementing the SSG approach, you’ll need to ensure you have the right environment and data in place:

  • Datasets: You will need both a source dataset and a target dataset.
  • Pre-trained Model: A model pre-trained on the chosen source dataset is essential for this implementation.

Requirements

Make sure you have the following framework installed:

  • PyTorch

Running the Experiments

Let’s break down the steps to run the SSG approach in your development environment:

Step 1: Train on the Source Dataset

Start by training the model on your designated source dataset. Use the following command:

```shell
python source_train.py --dataset name_of_source_dataset --resume dir_of_source_trained_model --data_dir dir_of_source_data --logs_dir dir_to_save_source_trained_model
```

You can download pre-trained models for datasets like Market1501, DukeMTMC, and MSMT17 from Google Drive. If you encounter issues with source_train.py, check [DomainAdaptiveReID](https://github.com/LcDog/DomainAdaptiveReID) for a workaround, or use the provided pre-trained model instead.
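Before launching training, it's worth verifying that a downloaded checkpoint actually loads. A minimal sanity check might look like the following; the file name is hypothetical, and a tiny stand-in checkpoint is created so the snippet runs on its own:

```python
import torch
import torch.nn as nn

# Hypothetical path; point this at the checkpoint you downloaded.
path = "checkpoint_demo.pth.tar"

# Create a tiny stand-in checkpoint so the snippet is self-contained.
torch.save({"state_dict": nn.Linear(4, 2).state_dict(), "epoch": 10}, path)

# map_location="cpu" lets the check run on machines without a GPU.
ckpt = torch.load(path, map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # unwrap if the weights are nested
print(sorted(state.keys()), ckpt.get("epoch"))
```

If the printed keys look like the layers of the expected backbone, the file is intact and you can pass its directory to `--resume`.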

Step 2: Run Self-Similarity Grouping

The next step involves executing the self-similarity grouping algorithm.

```shell
python selftraining.py --src_dataset name_of_source_dataset --tgt_dataset name_of_target_dataset --resume dir_of_source_trained_model --iteration number_of_iteration --data_dir dir_of_source_target_data --logs_dir dir_to_save_model_after_adaptation --gpu-devices gpu_ids --num-split number_of_split
```

Alternatively, you can use the shell script to run this command:

```shell
sh run.sh
```
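Under the hood, this step clusters the features of the unlabeled target images to produce pseudo-identity labels; SSG additionally splits each embedding into global and part-based pieces (hence the `--num-split` flag) and clusters each piece separately. Here is a simplified, runnable sketch of that idea, using random features in place of real CNN embeddings. It is only an illustration of the grouping concept, not the repository’s actual selftraining.py logic:

```python
# Sketch of self-similarity grouping: cluster unlabeled target features
# into pseudo-identities. Features are random here; in practice they come
# from the source-trained CNN.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Pretend we extracted a 2048-d embedding for 200 target images, then
# split it into parts (e.g. whole body / upper / lower).
features = rng.standard_normal((200, 2048)).astype(np.float32)
parts = np.split(features, 2, axis=1)  # two parts for brevity

pseudo_labels = []
for part in parts:
    # L2-normalize so Euclidean distance behaves like cosine distance.
    part = part / np.linalg.norm(part, axis=1, keepdims=True)
    labels = DBSCAN(eps=1.2, min_samples=4).fit_predict(part)
    pseudo_labels.append(labels)  # -1 marks images left unclustered

# Each image now carries one pseudo-label per part; adaptation then
# trains with a triplet-style loss on each label set independently.
print(len(pseudo_labels), pseudo_labels[0].shape)
```

The `eps` and `min_samples` values here are arbitrary; in a real run they would be tuned (the SSG paper derives the clustering threshold from pairwise distances).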

Step 3: Run Clustering-guided Semi-Supervised Training

Finally, execute the semi-supervised training:

```shell
python semitraining.py --src_dataset name_of_source_dataset --tgt_dataset name_of_target_dataset --resume dir_of_source_trained_model --iteration number_of_iteration --data_dir dir_of_source_target_data --logs_dir dir_to_save_model_after_adaptation --gpu-devices gpu_ids --num-split number_of_split --sample sample_method
```
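The `--sample` flag selects how pseudo-labeled target images are drawn for this semi-supervised stage. As a hedged illustration (not the actual semitraining.py logic), one plausible strategy is to keep only the samples nearest their cluster centroid, on the assumption that their pseudo-labels are the most trustworthy:

```python
# Hypothetical "confident sample" selection: keep the images closest to
# their pseudo-label centroid and discard the rest.
import numpy as np

rng = np.random.default_rng(1)
features = rng.standard_normal((100, 64))
pseudo_labels = rng.integers(0, 5, size=100)  # 5 pseudo-identities

def confident_indices(features, labels, keep_ratio=0.5):
    """Return indices of samples nearest their cluster centroid."""
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        n_keep = max(1, int(len(idx) * keep_ratio))
        keep.extend(idx[np.argsort(dists)[:n_keep]])
    return np.sort(np.array(keep))

chosen = confident_indices(features, pseudo_labels)
print(chosen.size)  # roughly half the dataset survives
```

The selected subset would then be treated as labeled data alongside the source set during fine-tuning.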

Understanding the Code with an Analogy

Think of the SSG approach like a detective solving a case in two different neighborhoods (source and target datasets). The detective (your model) first studies the first neighborhood (source dataset) to learn about the typical suspects (features) operating there. After gathering data, the detective tries to identify these suspects in a similar environment (target dataset) without getting explicit instructions (labels). By utilizing clues from past cases (pre-trained models), the detective enhances their chances of successfully identifying the culprits (recognizing features) in the second neighborhood.

Results

Here are some performance metrics after training and adaptation:

Step 1: After Training on Source Dataset

| Source Dataset | Rank-1 | mAP  |
|----------------|--------|------|
| DukeMTMC       | 82.6   | 70.5 |
| Market1501     | 92.5   | 80.8 |
| MSMT17         | 73.6   | 48.6 |

Step 2: After Adaptation

The three column groups correspond to direct transfer of the source model (no adaptation), results after SSG adaptation, and results after the clustering-guided semi-supervised step (SSG++):

| SRC → TGT | Direct Transfer (Rank-1 / mAP) | SSG (Rank-1 / mAP) | SSG++ (Rank-1 / mAP) |
|-----------|-------------------------------|--------------------|----------------------|
| Market1501 → DukeMTMC | 30.5 / 16.1 | 73.0 / 53.4 | 76.0 / 60.3 |
| DukeMTMC → Market1501 | 54.6 / 26.6 | 80.0 / 58.3 | 86.2 / 68.7 |
| Market1501 → MSMT17   | 8.6 / 2.7   | 31.6 / 13.2 | 37.6 / 16.6 |
| DukeMTMC → MSMT17     | 12.38 / 3.82 | 32.2 / 13.3 | 41.6 / 18.3 |

Troubleshooting

If you run into some hiccups during implementation, here are a few issues and how to troubleshoot them:

  • The pre-trained model was saved with PyTorch 0.4.1, so loading it with a newer PyTorch version may raise compatibility errors.
  • Be aware that source_train.py might contain some bugs; using the provided pre-trained baseline model is recommended.
  • For best results, use two GPUs with a batch size of 32. Note that experimental results may vary slightly (±1%) from those reported in the paper.
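For the version-mismatch issue above, loading the checkpoint on the CPU and passing `strict=False` to `load_state_dict` often helps, since key mismatches between PyTorch versions are then reported instead of raising an error. A self-contained illustration (the `legacy.buffer` key is invented for the demo):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Pretend this state_dict came from an old PyTorch 0.4.1 checkpoint that
# carries one extra, now-unknown key.
old_state = model.state_dict()
old_state["legacy.buffer"] = torch.zeros(1)

# strict=False loads all matching weights and returns the mismatches.
result = model.load_state_dict(old_state, strict=False)
print(result.unexpected_keys)  # ['legacy.buffer']
```

Inspect `result.missing_keys` and `result.unexpected_keys` afterwards to confirm that nothing essential was skipped.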

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
