Semantic segmentation has transformed how we interpret images by assigning a class label to every pixel in a scene. In this blog, we’ll guide you through the steps needed to implement semantic segmentation using two powerful models: FCNResNet101 for accurate results and BiSeNetV2 for real-time processing. Let’s dive in!
Overview of the Project
This project serves as a modern alternative to the previous Skin Detection project. We utilize advanced deep learning models—FCNResNet101 and BiSeNetV2—to perform multi-label semantic segmentation based on comprehensive labelme annotations. These models are pre-trained and available in the pretrained directory.
Getting Started
Here’s how you can set up your environment and models:

- Clone the repository, making sure to pull the pretrained models with git-lfs:

```bash
git lfs pull
```

- Once cloned, create and activate the conda environment:

```bash
conda env create -f environment.yml
conda activate semantic_segmentation
```

- If you are on Windows, also run the following so conda works in PowerShell:

```powershell
conda init powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
Pre-Trained Segmentation Models
This project includes several pretrained models stored in the `pretrained` directory. Here’s how to use them:

- To infer segmentation masks on your images, run:

```bash
python evaluate_images.py --images ~/Pictures --model pretrained/model_segmentation_skin_30.pth --model-type FCNResNet101 --display
```

- To save the output instead of displaying it, replace `--display` with `--save`.
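Under the hood, a script like this typically just walks a directory for images before running the model on each one. Here is a minimal, hypothetical sketch of that file-gathering step (the function name is ours, not the script's):

```python
import tempfile
from pathlib import Path

def find_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Collect image paths under `root`, matched by file extension."""
    return sorted(p for p in Path(root).rglob("*") if p.suffix.lower() in extensions)

# Quick demo on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("a.jpg", "b.png", "notes.txt"):
        (Path(d) / name).touch()
    print([p.name for p in find_images(d)])  # ['a.jpg', 'b.png']
```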
Skin Segmentation
We trained our skin segmentation model on a custom dataset of 150 images drawn from COCO, covering a wide range of skin tones and lighting conditions.
Pizza Topping Segmentation
This model was trained on a dataset of 89 images featuring various pizza toppings. With so little data its performance is limited, and it struggles with some toppings. It detects the following:
- Chilli
- Ham
- Jalapenos
- Mozzarella
- Mushrooms
- Olive
- Pepperoni
- Pineapple
- Salad
- Tomato
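When overlaying several topping masks on one image, it helps to give each label a distinct colour. A small sketch using evenly spaced hues (the palette scheme here is our own choice, not the project's):

```python
import colorsys

TOPPINGS = ["Chilli", "Ham", "Jalapenos", "Mozzarella", "Mushrooms",
            "Olive", "Pepperoni", "Pineapple", "Salad", "Tomato"]

def label_colours(labels):
    """Assign each label an evenly spaced hue so overlays are distinguishable."""
    n = len(labels)
    colours = {}
    for i, label in enumerate(labels):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.8, 1.0)
        colours[label] = (int(r * 255), int(g * 255), int(b * 255))
    return colours

palette = label_colours(TOPPINGS)
print(palette["Pepperoni"])
```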
Training New Projects
To train your own model, you have two options for building a dataset:

- Create new annotations with the labelme tool:

```bash
labelme
```

- Extract an existing category from the COCO annotations:

```bash
python extract_from_coco.py --images ~/datasets/coco/val2017 --annotations ~/datasets/coco/annotations/instances_val2017.json --output ~/datasets/my_cat_images_val --categories cat
```

To visualize and confirm the dataset, with and without training augmentations, run:

```bash
python check_dataset.py --dataset ~/datasets/my_cat_images_val
python check_dataset.py --dataset ~/datasets/my_cat_images_val --use-augmentation
```

If you are satisfied with the dataset, you can start training your model:

```bash
python train.py --train ~/datasets/my_cat_images_train --val ~/datasets/my_cat_images_val --model-tag segmentation_cat --model-type FCNResNet101
```
This phase may take a while depending on how many images you have. Keep an eye on the TensorBoard logs to track progress.
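Multi-label segmentation is typically trained with per-class binary cross-entropy rather than softmax cross-entropy, since a pixel may carry several labels. A minimal sketch of one training step, with a 1×1 convolution standing in for the real FCNResNet101 (all names and shapes are illustrative):

```python
import torch
from torch import nn

num_classes = 3
model = nn.Conv2d(3, num_classes, kernel_size=1)  # stand-in for FCNResNet101
criterion = nn.BCEWithLogitsLoss()                # one sigmoid per class and pixel
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.randn(2, 3, 64, 64)                           # dummy batch
targets = torch.randint(0, 2, (2, num_classes, 64, 64)).float()  # binary masks

optimizer.zero_grad()
logits = model(images)            # (2, num_classes, 64, 64)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```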
Troubleshooting Ideas
If you encounter issues while following along, consider the following troubleshooting steps:
- Ensure that git-lfs is correctly set up to handle your model files.
- Verify your conda environment is activated properly.
- Check that your Python version is compatible with the packages used in this project.
- For errors related to missing files, ensure you have pulled the latest version of the repo and its large files.
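A short Python snippet can automate the checks above; which tools to probe for is our assumption:

```python
import importlib.util
import shutil
import sys

# Quick sanity checks mirroring the troubleshooting list above.
checks = {
    "python >= 3.8": sys.version_info >= (3, 8),
    "git on PATH": shutil.which("git") is not None,
    "git-lfs on PATH": shutil.which("git-lfs") is not None,
    "torch importable": importlib.util.find_spec("torch") is not None,
}
for name, ok in checks.items():
    print(f"{'OK ' if ok else 'FAIL'} {name}")
```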
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Semantic segmentation is a powerful tool in the realm of computer vision. By leveraging pre-trained models like FCNResNet101 and BiSeNetV2, we can balance accuracy against real-time efficiency. Understanding these models is akin to becoming a master chef in the kitchen: know your ingredients (data) well, and you can create an exquisite dish (segmentation) every time!
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.