Welcome to an engaging journey where we explore how to use the compact Multilingual MiniLMv2-L6 model for zero-shot classification, with a particular focus on natural language inference (NLI) across many languages. The underlying MiniLMv2 model was distilled by Microsoft from a larger multilingual transformer and covers more than 100 languages, making it a strong choice for multilingual applications.
What is Zero-Shot Classification?
Zero-shot classification is like trying on a pair of shoes you have never worn before. Imagine being asked to sort statements into categories the model was never explicitly trained on, such as deciding whether a statement about Angela Merkel belongs to politics or entertainment. The model infers the correct category from what it has learned about the relationship between language and meaning.
Getting Started: Installation
To get started, you’ll need the `transformers` library from Hugging Face; the direct NLI example further below also uses `torch`. You can install both with pip:
pip install transformers torch
Simple Zero-Shot Classification Pipeline
The following code snippet demonstrates how to create a simple zero-shot classification pipeline using this multilingual model:
from transformers import pipeline
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/multilingual-MiniLMv2-L6-mnli-xnli")
sequence_to_classify = "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
Understanding the Code
In our code snippet:
- Pipeline Creation: The pipeline function is akin to setting up a factory line where raw materials (text) are processed into finished products (classifications).
- Sequence to Classify: The text input represents the statement we want to classify—think of it as the shoe you are trying on for the first time.
- Candidate Labels: These are the potential classifications, similar to selecting different shoe types to see which fits best.
- Output: After processing, the model delivers its verdict, identifying the label that best corresponds to the input sequence.
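Conceptually, the pipeline reduces zero-shot classification to repeated NLI: each candidate label is inserted into a hypothesis template, the model scores how strongly the input entails each hypothesis, and those entailment scores are normalized across labels. Here is a minimal, model-free sketch of that reduction; the template string and the example scores are our own illustration, with the model call replaced by hard-coded numbers:

```python
def build_hypotheses(labels, template="This example is about {}."):
    """Turn candidate labels into NLI hypotheses, as the pipeline does internally."""
    return [template.format(label) for label in labels]

def rank_labels(labels, entailment_scores):
    """Normalize entailment scores across labels and sort best-first."""
    total = sum(entailment_scores)
    probs = [s / total for s in entailment_scores]
    return sorted(zip(labels, probs), key=lambda x: x[1], reverse=True)

labels = ["politics", "economy", "entertainment", "environment"]
hypotheses = build_hypotheses(labels)
# Stand-in entailment scores; in practice these come from the NLI model:
scores = [0.92, 0.15, 0.05, 0.08]
print(rank_labels(labels, scores))
```

With real inputs, the pipeline performs one NLI forward pass per candidate label, which is why classification cost grows linearly with the number of labels.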
NLI Use-Case
To use the model’s NLI head directly, load the tokenizer and model yourself:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model_name = "MoritzLaurer/multilingual-MiniLMv2-L6-mnli-xnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)  # move the model to the same device as the inputs
premise = "Angela Merkel ist eine Politikerin in Deutschland und Vorsitzende der CDU"
hypothesis = "Emmanuel Macron is the President of France"
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt").to(device)
output = model(**inputs)  # pass input_ids and attention_mask together
prediction = torch.softmax(output["logits"][0], -1).tolist()
label_names = ["entailment", "neutral", "contradiction"]
prediction = [{"label": name, "score": round(float(pred) * 100, 1)} for pred, name in zip(prediction, label_names)]
print(prediction)
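The script above prints a list of label/score dictionaries. To turn that into a single decision, pick the highest-scoring label; `top_label` is a small helper of our own, and the example scores below are purely illustrative:

```python
def top_label(prediction):
    """Return the NLI label with the highest score from a list of {label, score} dicts."""
    return max(prediction, key=lambda p: p["score"])["label"]

# Illustrative scores, not actual model output:
example = [{"label": "entailment", "score": 2.1},
           {"label": "neutral", "score": 14.4},
           {"label": "contradiction", "score": 83.5}]
print(top_label(example))  # contradiction
```

For unrelated premise–hypothesis pairs like the Merkel/Macron example, you would typically expect "neutral" to dominate.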
Training Data and Performance
The model has been fine-tuned on the XNLI development dataset, which consists of high-quality, professionally translated examples across multiple languages. Its strength lies in zero-shot cross-lingual transfer: it handles NLI even in languages it was not directly fine-tuned on. Its average accuracy across the XNLI languages is approximately 0.713.
Troubleshooting Tips
While utilizing this model, you might encounter some issues:
- Out of Memory Errors: If your device runs out of memory, reduce the batch size, for example by lowering the per_device_train_batch_size setting when fine-tuning, or by classifying fewer texts at a time during inference.
- Accuracy vs. Speed: MiniLMv2-L6 is the small, fast option; if classification quality matters more than latency, consider the larger mDeBERTa-v3-base-mnli-xnli model, which is slower but more accurate.
- Unexpected Output: Become familiar with how your input text is structured, as the model relies heavily on context and correct phrasing for accurate classification.
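For inference-time memory pressure specifically, a simple remedy is to classify texts in small chunks rather than passing one large list to the classifier. A minimal sketch, where the chunk size of 4 is an arbitrary choice:

```python
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"sample text {i}" for i in range(10)]
batches = list(chunked(texts, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Pass each chunk to the classifier in turn and collect the results, instead of submitting the full list in a single call.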
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.