Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Aug 10, 2024 | Educational

This article is a step-by-step guide to using BiRefNet for high-resolution dichotomous image segmentation, background removal, camouflaged object detection, and more. We’ll cover how to install the dependencies, load the model, and run inference with it. By the end of this guide, you’ll be able to use BiRefNet in your own image processing tasks.

Installation
Loading BiRefNet
Using BiRefNet for Inference
Troubleshooting

1. Installation

To get started, you’ll need to install the necessary packages. Simply run the following command in your terminal:

pip install -qr https://raw.githubusercontent.com/ZhengPeng7/BiRefNet/main/requirements.txt
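Before moving on, it can help to confirm that the packages imported later in this guide are actually available. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative assumptions based on the imports in this article, not part of BiRefNet.

```python
import importlib.util

def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Assumption: these are the modules imported by the snippets in this guide.
REQUIRED = ["torch", "torchvision", "transformers", "PIL", "matplotlib"]

missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, rerun the install command in the same Python environment you use to run the code.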

2. Loading BiRefNet

Now that you’ve installed the required packages, it’s time to load the BiRefNet model. You have multiple options to do this:

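Option 1: Load Code + Weights from Hugging Face

This method is the simplest, but the code bundled with the Hugging Face weights may lag behind the latest GitHub updates. Note that trust_remote_code=True runs the model code downloaded from the model repository: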

from transformers import AutoModelForImageSegmentation
birefnet = AutoModelForImageSegmentation.from_pretrained("zhengpeng7/BiRefNet-portrait", trust_remote_code=True)

Option 2: Use Code from GitHub + Weights from Hugging Face

To stay on the latest code, clone the GitHub repository and import the model class from it:

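# Notebook commands: clone the repository and change into it
!git clone https://github.com/ZhengPeng7/BiRefNet.git
%cd BiRefNet
# Import the model class from the cloned code and fetch the weights
from models.birefnet import BiRefNet
birefnet = BiRefNet.from_pretrained("zhengpeng7/BiRefNet-portrait")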

Option 3: Use Both Locally

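If you have downloaded both the code and the weights, run the following from the root of the cloned repository:

import torch
from models.birefnet import BiRefNet
from utils import check_state_dict

# Build the model without downloading pretrained backbone weights
birefnet = BiRefNet(bb_pretrained=False)
# Load the checkpoint on the CPU first; replace PATH_TO_WEIGHT with your file
state_dict = torch.load("PATH_TO_WEIGHT", map_location='cpu')
state_dict = check_state_dict(state_dict)
birefnet.load_state_dict(state_dict)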
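Whichever option you choose, the model still needs to be placed on a device and switched to evaluation mode before inference. Here is a minimal sketch of that step; the helper name `prepare_for_inference` is hypothetical, and the CPU fallback is an assumption for machines without a GPU.

```python
import torch

def prepare_for_inference(model, device=None):
    """Move `model` to a device and switch it to eval mode.

    If `device` is None, use CUDA when available, otherwise fall back to CPU.
    """
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    model.eval()  # disables dropout and uses running batch-norm statistics
    return model, torch.device(device)
```

With a loaded model, this would be called as `birefnet, device = prepare_for_inference(birefnet)`. Any input tensor passed to the model must be on that same device.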

3. Using BiRefNet for Inference

With BiRefNet loaded, you’re ready to perform inference. Here’s how you can extract objects from your images:

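import torch
from PIL import Image
import matplotlib.pyplot as plt
from torchvision import transforms

def extract_object(birefnet, imagepath):
    # Resize to the 1024x1024 input size used here and apply ImageNet normalization
    image_size = (1024, 1024)
    transform_image = transforms.Compose([
        transforms.Resize(image_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
    # Force 3 channels so grayscale or RGBA inputs match the normalization
    image = Image.open(imagepath).convert('RGB')
    # Put the input on the same device as the model's weights
    device = next(birefnet.parameters()).device
    input_images = transform_image(image).unsqueeze(0).to(device)

    # Take the last output and map its logits to [0, 1] with a sigmoid
    birefnet.eval()
    with torch.no_grad():
        preds = birefnet(input_images)[-1].sigmoid().cpu()
    pred = preds[0].squeeze()
    # Convert the mask to an image, resize it back, and use it as the alpha channel
    pred_pil = transforms.ToPILImage()(pred)
    mask = pred_pil.resize(image.size)
    image.putalpha(mask)
    return image, mask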

# Visualization
plt.axis('off')
plt.imshow(extract_object(birefnet, imagepath="PATH-TO-YOUR_IMAGE.jpg")[0])
plt.show()
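The image returned by extract_object carries the mask as an alpha channel, so how you save it matters: JPEG cannot store transparency, while PNG can. The sketch below shows two options using Pillow; the helper names `save_cutout` and `composite_on_background` are hypothetical and not part of BiRefNet.

```python
from PIL import Image

def save_cutout(image, out_path):
    """Save an image as PNG so its alpha channel (transparency) is preserved."""
    if image.mode != "RGBA":
        image = image.convert("RGBA")
    image.save(out_path, format="PNG")

def composite_on_background(image, color=(255, 255, 255)):
    """Flatten an RGBA cutout onto a solid background and return an RGB image."""
    rgba = image.convert("RGBA")
    background = Image.new("RGBA", rgba.size, color + (255,))
    return Image.alpha_composite(background, rgba).convert("RGB")
```

For example, `save_cutout(result, "cutout.png")` keeps the transparent background, while `composite_on_background(result).save("cutout.jpg")` produces a JPEG with a white background, where `result` is the first value returned by extract_object.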

Analogy for Understanding the Code

Imagine you are a painter working on a large canvas. Before you start painting, you need to prepare your workspace, arranging all the tools (like colors and brushes) neatly and setting the canvas in place. This preparation represents installing the packages and loading BiRefNet.

Once your workspace is ready, you begin applying layers of paint systematically, just like the function ‘extract_object’ prepares your images and applies the ‘mask’ for segmentation. You may need to check your progress by stepping back to see how the painting looks. This is akin to the visualization step at the end of the inference process. In the world of programming, every preparation is crucial. Forgetting to set one brush aside can lead to a messy paint job, just like an uninitialized variable can lead to an error!

4. Troubleshooting

Here are some common issues you might face while using BiRefNet and ways to resolve them:

Issue: Unable to install packages due to permission errors.
Solution: Install into a virtual environment, or add the `--user` flag to the pip command. Running your terminal or command prompt as an administrator also works but is usually unnecessary.
Issue: The model fails to load weights.
Solution: Check that the path to the weights is correct and that the file exists at that location. When fetching weights from Hugging Face, also check your internet connection.
Issue: Runtime errors during inference.
Solution: Verify that the image is in a format PIL can open and that it has three channels (RGB). Make sure the model and the input tensor are on the same device, and put the model in eval mode with `birefnet.eval()`.
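
For the weight-loading issue, a quick check of the file before calling torch.load can turn a confusing error into a clear one. This is a minimal sketch using only the standard library; the helper name `resolve_weights_path` is hypothetical.

```python
from pathlib import Path

def resolve_weights_path(path):
    """Return the weight file as an absolute Path, or raise if it is unusable."""
    p = Path(path).expanduser().resolve()
    if not p.is_file():
        raise FileNotFoundError(f"No weight file found at {p}")
    if p.stat().st_size == 0:
        raise ValueError(f"Weight file at {p} is empty; the download may have failed")
    return p
```

The returned path can then be passed to torch.load in place of the raw string.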

