Mobile applications are becoming increasingly sophisticated, offering features that were once possible only on high-powered computing systems. One such advanced capability is semantic segmentation: classifying every pixel in an image into a category. In this article, we will walk through a project that implements semantic segmentation designed for real-time use on mobile devices.
Project Overview
This project is a stellar example of implementing semantic segmentation for mobile apps, focusing on hair segmentation with a commendable Intersection over Union (IoU) of 0.89. By combining MobileNetV2 with a U-Net architecture, the model strikes an effective balance between accuracy and inference speed on mobile devices.
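The IoU metric quoted above measures how well a predicted mask overlaps the ground truth: the area of their intersection divided by the area of their union. A minimal NumPy sketch (the masks here are toy data, not from the project):

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union for binary masks (1 = hair, 0 = background)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union > 0 else 1.0

# Toy 4x4 masks: 3 pixels of overlap out of 4 pixels in the union.
pred = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
print(iou(pred, target))  # 3 / 4 = 0.75
```

An IoU of 0.89 therefore means the predicted hair region and the labeled hair region agree on the large majority of their combined area.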
Dataset
The dataset used in this project is Labeled Faces in the Wild (LFW). This facial dataset provides the necessary images to train our model effectively.
Example Application
- iOS Application
- Android Application (TODO)
Requirements
- Python 3.8
- Install dependencies:
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
- CoreML for the iOS application.
About the Model
The primary model in this repository is named MobileNetV2_unet. It follows a typical U-Net design, with encoder and decoder stages built from the depthwise separable convolution blocks introduced by MobileNets. The encoder downsamples an input image to 1/32 of its original resolution, and the decoder then upsamples the scored feature map back to the original dimensions.
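The depthwise convolutional blocks mentioned above factor a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution, which is what keeps MobileNet-style encoders cheap. Below is a minimal sketch written from the general MobileNet recipe, not the repository's actual code; chaining five stride-2 blocks also shows how a 224-pixel input shrinks by a factor of 32:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Sketch of a depthwise separable block (hypothetical, not the repo's code)."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            # Depthwise: one 3x3 filter per input channel (groups=in_ch).
            nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                      groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU6(inplace=True),
            # Pointwise: 1x1 convolution mixes information across channels.
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU6(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

# Five stride-2 blocks halve the spatial size five times: 224 -> 7 (factor of 32).
x = torch.randn(1, 3, 224, 224)
for in_ch, out_ch in [(3, 16), (16, 32), (32, 64), (64, 128), (128, 256)]:
    x = DepthwiseSeparableConv(in_ch, out_ch, stride=2)(x)
print(x.shape)  # torch.Size([1, 256, 7, 7])
```

The channel widths here are illustrative; the actual MobileNetV2 encoder uses inverted residual blocks with its own width schedule.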
Steps to Training
Data Preparation
Data is sourced from the LFW dataset. For creating mask images necessary for training, refer to issue #11. Once you have both images and masks, organize them as follows:
data/
  lfw/
    raw/
      images/
        0001.jpg
        0002.jpg
      masks/
        0001.ppm
        0002.ppm
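With the layout in place, a quick check can catch images that are missing masks before training starts. This helper is a hypothetical convenience, not part of the repository; the demo builds a throwaway copy of the layout, but in practice you would point it at data/lfw/raw:

```python
import tempfile
from pathlib import Path

def missing_masks(root):
    """Return image stems under root/images that lack a .ppm mask under root/masks."""
    images = {p.stem for p in (root / "images").glob("*.jpg")}
    masks = {p.stem for p in (root / "masks").glob("*.ppm")}
    return sorted(images - masks)

# Demo on a temporary copy of the layout: two images, only one mask.
root = Path(tempfile.mkdtemp())
(root / "images").mkdir()
(root / "masks").mkdir()
for name in ("0001.jpg", "0002.jpg"):
    (root / "images" / name).touch()
(root / "masks" / "0001.ppm").touch()
print(missing_masks(root))  # ['0002']
```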
Training
For training with an input size of 224 x 224 pixels, you can utilize the pre-trained weights of MobileNetV2, which will automatically download during the training process. Use the following command to initiate the training procedure:
cd src
python run_train.py params002.yaml
The training process uses a loss based on the Dice coefficient, which directly optimizes the overlap between predicted and ground-truth masks and tends to cope better with the class imbalance between hair and background pixels than plain pixel-wise cross-entropy.
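The Dice coefficient rewards overlap between the predicted (possibly soft) mask and the ground truth, and the loss is simply one minus it. A small NumPy sketch with made-up values:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|); pred may hold soft probabilities in [0, 1]."""
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def dice_loss(pred, target):
    # Minimizing 1 - Dice pushes the predicted mask toward the ground truth.
    return 1.0 - dice_coefficient(pred, target)

pred = np.array([[0.9, 0.8],
                 [0.1, 0.0]])
target = np.array([[1.0, 1.0],
                   [0.0, 0.0]])
print(round(dice_loss(pred, target), 4))  # ≈ 0.1053
```

The small eps term keeps the ratio defined when both masks are empty.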
Pre-trained Model
Here are the details about the pre-trained model:
- Input Size: 224
- IoU: 0.89
- Download Pre-trained Model
Converting the Model
This project includes scripts to adapt models for mobile applications. The notable script for iOS conversion is:
run_convert_coreml.py, which converts the trained PyTorch model into a CoreML model for iOS applications.
TBD (To Be Done)
- [x] Report speed vs accuracy on mobile devices.
- [ ] Convert PyTorch model to Android using TensorFlow Lite.
Troubleshooting
If you encounter any issues during the implementation, consider the following troubleshooting ideas:
- Ensure all dependencies are correctly installed as per the requirements.
- Double-check the data paths for images and masks.
- Look for any error messages during training in the console and search for solutions online.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.