How to Use Transformers.js for Mask Generation with ONNX Weights

Mar 18, 2024 | Educational


Mask generation is a core computer vision task: it isolates objects within an image. Transformers.js brings this capability to JavaScript applications by running ONNX models directly in the browser or in Node.js. In this article, we walk through using Transformers.js with the ONNX weights of the Xenova/slimsam-77-uniform model.

Step-by-Step Guide to Get Started

1. Install Transformers.js

If you haven’t already done so, install the @xenova/transformers library from NPM by running the following command in your terminal:

bash
npm i @xenova/transformers

2. Import Required Libraries

Once installed, you can import necessary components from the library to prepare your project:

javascript
import { SamModel, AutoProcessor, RawImage } from "@xenova/transformers";

3. Load the Model and Processor

Next, load the SamModel and AutoProcessor using the following code:

javascript
const model = await SamModel.from_pretrained("Xenova/slimsam-77-uniform");
const processor = await AutoProcessor.from_pretrained("Xenova/slimsam-77-uniform");

4. Prepare the Image and Input Points

To perform masking, you’ll need to prepare your image and specify input points that locate the subject within the image:

javascript
const img_url = "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg";
const raw_image = await RawImage.read(img_url);
const input_points = [[[340, 250]]]; // 2D localization of a window
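In a web page, the point usually comes from a click on a scaled-down copy of the image. The sketch below shows one way to map such a click back to image coordinates, assuming the input points are expressed in pixels of the original image. Both `toImagePoint` and `toInputPoints` are hypothetical helpers written for this article, not part of Transformers.js, and the displayed size of 307 x 205 is made up.

```javascript
// Hypothetical helper (not part of Transformers.js): map a click on a
// scaled copy of the image back to pixel coordinates of the original.
function toImagePoint(clickX, clickY, displayed, original) {
  const scaleX = original.width / displayed.width;
  const scaleY = original.height / displayed.height;
  // Clamp so rounding never produces a pixel outside the image.
  const x = Math.min(original.width - 1, Math.max(0, Math.round(clickX * scaleX)));
  const y = Math.min(original.height - 1, Math.max(0, Math.round(clickY * scaleY)));
  return [x, y];
}

// Wrap a single [x, y] point in the nested layout used in this article.
function toInputPoints(point) {
  return [[point]];
}

// Assumed sizes: the image is shown at half of its 614 x 410 resolution.
const point = toImagePoint(170, 125, { width: 307, height: 205 }, { width: 614, height: 410 });
const input_points = toInputPoints(point);
console.log(JSON.stringify(input_points)); // [[[340,250]]]
```

The clamping step is a defensive choice: a click on the very edge of the displayed image could otherwise round to a coordinate one pixel outside the original.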

5. Process Inputs and Generate Masks

Once your image and input points are ready, process them to generate masks:

javascript
const inputs = await processor(raw_image, input_points);
const outputs = await model(inputs);
const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.log(masks);
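To get a feel for what the logged tensor contains, it can help to measure how much of the image each candidate mask covers. The sketch below assumes the mask tensor exposes a flat `data` array in row-major order together with `dims` of the form `[1, numMasks, height, width]`. `maskCoverage` is a hypothetical helper, and the demo data is a made-up stand-in so the snippet runs without the model.

```javascript
// Hypothetical helper: fraction of pixels that are "on" in one candidate mask.
// `data` is a flat, row-major array for dims [1, numMasks, height, width].
function maskCoverage(data, dims, maskIndex) {
  const [, numMasks, height, width] = dims;
  if (maskIndex < 0 || maskIndex >= numMasks) {
    throw new RangeError(`maskIndex must be between 0 and ${numMasks - 1}`);
  }
  const pixels = height * width;
  const start = maskIndex * pixels; // offset of the requested mask
  let on = 0;
  for (let i = start; i < start + pixels; i++) {
    if (data[i]) on++;
  }
  return on / pixels;
}

// Stand-in data: two 2 x 3 masks instead of real model output.
const demoDims = [1, 2, 2, 3];
const demoData = Uint8Array.from([
  1, 1, 0, 0, 0, 0, // mask 0: 2 of 6 pixels set
  1, 1, 1, 1, 0, 0, // mask 1: 4 of 6 pixels set
]);
console.log(maskCoverage(demoData, demoDims, 0));
console.log(maskCoverage(demoData, demoDims, 1));
```

With real output, the equivalent call would be something like `maskCoverage(masks[0].data, masks[0].dims, 0)`, under the same layout assumption.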

6. Analyze IoU Scores

Assess the quality of the masks by analyzing Intersection over Union (IoU) scores:

javascript
const scores = outputs.iou_scores;
console.log(scores);
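The model returns one predicted IoU score per candidate mask, so a common next step is to pick the index of the highest score. The sketch below assumes a single image and a single input point, so that the scores form one flat array (for example `outputs.iou_scores.data`). `bestMaskIndex` is a hypothetical helper, and the demo scores are invented for illustration.

```javascript
// Hypothetical helper: index of the highest predicted IoU score.
function bestMaskIndex(scores) {
  if (scores.length === 0) {
    throw new Error("scores must contain at least one value");
  }
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}

// Made-up scores standing in for outputs.iou_scores.data.
const demoScores = Float32Array.from([0.82, 0.97, 0.91]);
console.log(bestMaskIndex(demoScores)); // 1
```

Keep in mind that these scores are the model's own estimate of mask quality, not a comparison against a ground-truth mask.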

7. Visualize the Generated Mask

To visualize your generated mask, use the following code:

javascript
const image = RawImage.fromTensor(masks[0][0].mul(255));
image.save("mask.png");
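Saving the mask as a grayscale file is handy for inspection, but in a browser you may prefer to draw it as a translucent overlay on top of the photo. The sketch below converts one binary mask into RGBA pixels. `maskToRGBA`, its default color, and its default alpha are hypothetical choices for illustration, and the 2 x 2 demo mask is a stand-in for real output.

```javascript
// Hypothetical helper: convert one binary mask (one value per pixel) into
// RGBA pixels, colored where the mask is set and transparent elsewhere.
function maskToRGBA(mask, width, height, color = [0, 114, 189], alpha = 128) {
  if (mask.length !== width * height) {
    throw new Error("mask length does not match width * height");
  }
  const rgba = new Uint8ClampedArray(width * height * 4); // starts as all zeros
  for (let i = 0; i < mask.length; i++) {
    if (!mask[i]) continue; // background stays fully transparent
    rgba[i * 4] = color[0];
    rgba[i * 4 + 1] = color[1];
    rgba[i * 4 + 2] = color[2];
    rgba[i * 4 + 3] = alpha;
  }
  return rgba;
}

// Stand-in 2 x 2 mask: top-left and bottom-right pixels are set.
const demoMask = Uint8Array.from([1, 0, 0, 1]);
const overlay = maskToRGBA(demoMask, 2, 2);
console.log(Array.from(overlay.slice(0, 8))); // [ 0, 114, 189, 128, 0, 0, 0, 0 ]
```

In a browser, the resulting array can be wrapped with `new ImageData(overlay, width, height)` and drawn on a canvas placed over the image with `putImageData`.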

Understanding the Code with an Analogy

Think of this process as using a special pair of glasses to pick out shapes in a painting (the original image). First, you get the glasses (installing Transformers.js) and put them on (importing the library). Next, you adjust the lenses (loading the model and processor), then decide which part of the painting to look at (preparing the image and input points). The glasses reveal the shapes in that area (processing inputs and generating masks). Finally, you judge how sharp each shape is (analyzing IoU scores) and capture it in a snapshot (visualizing the generated mask).

Troubleshooting

If the model fails to load with a syntax error around await, check that the code runs as an ES module (for example, a .mjs file or "type": "module" in package.json) or wrap it in an async function, since top-level await is only available in modules.
If the generated masks do not meet expectations, try different input points or a higher-resolution image.
For performance issues, make sure your system has enough memory to load the model and process the image.

Additional Resources

For further reading or experimentation, you can also check out the online demo. If you want to convert your own models to ONNX to make them web-ready, Optimum provides tools for the task.
