How to Use Transformers.js for Zero-Shot Image Classification

April 27, 2024

Welcome to our user-friendly guide on using the Transformers.js library to perform zero-shot image classification, which means classifying an image against labels you choose at run time rather than a fixed set the model was trained on. The library makes it easy to run machine learning models directly in JavaScript applications. Let’s dive in!

What is Transformers.js?

Transformers.js is a JavaScript library designed to bring the capabilities of state-of-the-art machine learning models to the web. With its ease of use and direct integration into JavaScript, developers can implement machine learning solutions without extensive background knowledge in data science.

Installation

Before you can start using transformers.js, you’ll need to install it from NPM. Follow these steps:

Open your terminal.
Run the following command:

npm i @xenova/transformers

Performing Zero-Shot Image Classification

Once you have transformers.js installed, you can start performing zero-shot image classification. Let’s break it down with a hands-on example!

Example Code

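Here’s a simple code snippet that shows how to use the pipeline API for zero-shot image classification. Because it uses top-level await, run it as an ES module (for example, from a .mjs file):

import { pipeline } from '@xenova/transformers';

// Load the zero-shot image classification pipeline with a CLIP model.
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch32');
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
const output = await classifier(url, ['tiger', 'horse', 'dog']);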

Understanding the Code

Think of using Transformers.js like going to a restaurant: the model is the chef, the image is the dish, and the list of candidate labels is the menu the chef must pick from. In the code above:

const classifier: This is like choosing your chef. The pipeline function loads the model (downloading it on first use) and returns a function you can call to classify images.
const url: This is the dish you want identified, in this case an image of a tiger.
const output: This is the chef’s verdict. The classifier compares the image against each label on the menu ('tiger', 'horse', 'dog') and returns a score for each one that represents the model’s confidence in that label.
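To make the idea of confidence scores more concrete, here is a minimal sketch of how a CLIP-style model could turn per-label similarity values into probabilities. It assumes the scores come from a softmax over image-text similarity logits; the softmax helper and the logit values are made up for illustration and are not part of the Transformers.js API.

```javascript
// Illustrative only: `softmax` is a hypothetical helper, not a Transformers.js export.
function softmax(logits) {
  // Subtract the maximum first so Math.exp never overflows.
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Made-up similarity logits, one per candidate label (assumed values).
const labels = ['tiger', 'horse', 'dog'];
const logits = [25.1, 17.2, 16.9];

// Pair each probability with its label, mirroring the shape of the pipeline output.
const scores = softmax(logits).map((score, i) => ({ score, label: labels[i] }));
console.log(scores);
```

Because the softmax normalizes across the labels you pass in, each score is relative to the other candidates: adding or removing a label changes every score.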

Interpreting the Output

The output of your classification will look something like this:

[
    { score: 0.9993917942047119, label: 'tiger' },
    { score: 0.0003519294841680676, label: 'horse' },
    { score: 0.0002562698791734874, label: 'dog' }
]

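In this example, the classifier is almost certain (about 99.94%) that the image shows a tiger. The scores are relative to the labels you supplied: here they sum to roughly 1 across the three candidates, so a high score means 'tiger' fits the image far better than 'horse' or 'dog'.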
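In practice you will usually want just the best label, and perhaps only when the model is reasonably confident. The following is a minimal sketch of that step; topLabel and its minScore threshold of 0.5 are hypothetical names and values chosen for illustration, and the output array is copied from the example above.

```javascript
// Hypothetical helper: returns the highest-scoring entry,
// or null if the list is empty or the best score is below minScore.
function topLabel(output, minScore = 0.5) {
  if (output.length === 0) return null;
  const best = output.reduce((a, b) => (b.score > a.score ? b : a));
  return best.score >= minScore ? best : null;
}

// Output copied from the example above.
const output = [
  { score: 0.9993917942047119, label: 'tiger' },
  { score: 0.0003519294841680676, label: 'horse' },
  { score: 0.0002562698791734874, label: 'dog' },
];

const best = topLabel(output);
if (best) {
  console.log(`${best.label} (${(best.score * 100).toFixed(2)}%)`);
}
// prints "tiger (99.94%)"
```

A threshold is a design choice rather than a requirement: it is useful when none of your labels may apply, since the scores only rank the candidates you provided.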

Troubleshooting

If you run into issues while using transformers.js, here are some troubleshooting tips:

Ensure that you have a stable internet connection, as the model might need to download weights.
Double-check the syntax in your code for any typos that could lead to errors.
If the classifier fails to load, check the installation with the command npm list @xenova/transformers to confirm that the package installed correctly.
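If flaky network access is the culprit, one option is to retry the model load a few times before giving up. The sketch below is a generic pattern, not something Transformers.js provides: withRetry, its attempts count, and its delayMs value are all hypothetical and chosen for illustration.

```javascript
// Hypothetical helper: retries an async loader a few times before giving up.
// `load` is any function that returns a promise.
async function withRetry(load, attempts = 3, delayMs = 1000) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await load();
    } catch (err) {
      lastError = err;
      console.warn(`Attempt ${i + 1} of ${attempts} failed: ${err.message}`);
      // Wait before the next attempt, but not after the final one.
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With this in place, the loading step could be written as const classifier = await withRetry(() => pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch32')); so that a single dropped connection does not stop your script.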


Final Notes

Note that having a separate repository for ONNX weights is a temporary solution until WebML gains momentum. If you’re looking to make your models web-ready, consider converting them to ONNX using 🤗 Optimum and structuring your repository accordingly.
