Semantic segmentation has transformed how we interpret images by assigning a class label to every pixel in a scene. In this blog, we’ll guide you through the steps needed to implement semantic segmentation using two powerful models: FCNResNet101 for accurate results and BiSeNetV2 for real-time processing. Let’s dive in!
Overview of the Project
This project serves as a modern alternative to the previous Skin Detection project. We utilize advanced deep learning models—FCNResNet101 and BiSeNetV2—to perform multi-label semantic segmentation based on comprehensive labelme annotations. These models are pre-trained and available in the pretrained directory.
Getting Started
Here’s how you can set up your environment and models:

- Clone the repository, making sure to pull the pretrained models with git-lfs:

```bash
git lfs pull
```

- Once cloned, create and activate the conda environment:

```bash
conda env create -f environment.yml
conda activate semantic_segmentation
```

- If you are on Windows, also run the following so conda works in PowerShell:

```powershell
conda init powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
Pre-Trained Segmentation Models
This project includes several pretrained models stored in the `pretrained` directory. Here’s how to use them:

- To infer segmentation masks on your images, run:

```bash
python evaluate_images.py --images ~/Pictures --model pretrained/model_segmentation_skin_30.pth --model-type FCNResNet101 --display
```

- To save the output instead of displaying it, replace `--display` with `--save`.
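Under the hood, a script like this typically just walks a directory for images before running the model on each one. Here is a minimal, hypothetical sketch of that file-gathering step (the function name is ours, not the script's):

```python
import tempfile
from pathlib import Path

def find_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Collect image paths under `root`, matched by file extension."""
    return sorted(p for p in Path(root).rglob("*") if p.suffix.lower() in extensions)

# Quick demo on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("a.jpg", "b.png", "notes.txt"):
        (Path(d) / name).touch()
    print([p.name for p in find_images(d)])  # ['a.jpg', 'b.png']
```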
Skin Segmentation
We trained our skin segmentation model on a custom dataset of 150 images drawn from COCO, covering a wide range of skin tones and lighting conditions.
Pizza Topping Segmentation
This model was trained on a dataset of 89 images featuring various pizza toppings. With so little data its performance is limited, and it struggles with some toppings. It detects the following:
- Chilli
- Ham
- Jalapenos
- Mozzarella
- Mushrooms
- Olive
- Pepperoni
- Pineapple
- Salad
- Tomato
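When overlaying several topping masks on one image, it helps to give each label a distinct colour. A small sketch using evenly spaced hues (the palette scheme here is our own choice, not the project's):

```python
import colorsys

TOPPINGS = ["Chilli", "Ham", "Jalapenos", "Mozzarella", "Mushrooms",
            "Olive", "Pepperoni", "Pineapple", "Salad", "Tomato"]

def label_colours(labels):
    """Assign each label an evenly spaced hue so overlays are distinguishable."""
    n = len(labels)
    colours = {}
    for i, label in enumerate(labels):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.8, 1.0)
        colours[label] = (int(r * 255), int(g * 255), int(b * 255))
    return colours

palette = label_colours(TOPPINGS)
print(palette["Pepperoni"])
```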
Training New Projects
To train your own model, you have two options for building a dataset:

- Create new annotations with the labelme tool:

```bash
labelme
```

- Extract an existing category from the COCO annotations:

```bash
python extract_from_coco.py --images ~/datasets/coco/val2017 --annotations ~/datasets/coco/annotations/instances_val2017.json --output ~/datasets/my_cat_images_val --categories cat
```

To visualize and confirm the dataset, with and without training augmentations, run:

```bash
python check_dataset.py --dataset ~/datasets/my_cat_images_val
python check_dataset.py --dataset ~/datasets/my_cat_images_val --use-augmentation
```

If you are satisfied with the dataset, you can start training your model:

```bash
python train.py --train ~/datasets/my_cat_images_train --val ~/datasets/my_cat_images_val --model-tag segmentation_cat --model-type FCNResNet101
```
This phase may take a while depending on how many images you have. Keep an eye on the TensorBoard logs to track progress.
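Multi-label segmentation is typically trained with per-class binary cross-entropy rather than softmax cross-entropy, since a pixel may carry several labels. A minimal sketch of one training step, with a 1×1 convolution standing in for the real FCNResNet101 (all names and shapes are illustrative):

```python
import torch
from torch import nn

num_classes = 3
model = nn.Conv2d(3, num_classes, kernel_size=1)  # stand-in for FCNResNet101
criterion = nn.BCEWithLogitsLoss()                # one sigmoid per class and pixel
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.randn(2, 3, 64, 64)                           # dummy batch
targets = torch.randint(0, 2, (2, num_classes, 64, 64)).float()  # binary masks

optimizer.zero_grad()
logits = model(images)            # (2, num_classes, 64, 64)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```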
Troubleshooting Ideas
If you encounter issues while following along, consider the following troubleshooting steps:
- Ensure that git-lfs is correctly set up to handle your model files.
- Verify your conda environment is activated properly.
- Check that your Python version is compatible with the packages used in this project.
- For errors related to missing files, ensure you have pulled the latest version of the repo and its large files.
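A short Python snippet can automate the checks above; which tools to probe for is our assumption:

```python
import importlib.util
import shutil
import sys

# Quick sanity checks mirroring the troubleshooting list above.
checks = {
    "python >= 3.8": sys.version_info >= (3, 8),
    "git on PATH": shutil.which("git") is not None,
    "git-lfs on PATH": shutil.which("git-lfs") is not None,
    "torch importable": importlib.util.find_spec("torch") is not None,
}
for name, ok in checks.items():
    print(f"{'OK ' if ok else 'FAIL'} {name}")
```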
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Semantic segmentation is a powerful tool in the realm of computer vision. By leveraging pre-trained models like FCNResNet101 and BiSeNetV2, we can balance accuracy against real-time efficiency. Understanding these models is akin to becoming a master chef in the kitchen: know your ingredients (data) well, and you can create an exquisite dish (segmentation) every time!
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.