How to Convert and Use the all-MiniLM-L6-v2 Model with ONNX

Mar 27, 2022 | Educational

The all-MiniLM-L6-v2 model from Hugging Face provides an efficient way to map sentences and paragraphs into a 384-dimensional dense vector space. This model is beneficial for tasks like semantic search and clustering, allowing you to capture the essence of the sentences in a numerical format. In this guide, we will explore how to work with this model using sentence-transformers and Hugging Face Transformers.

Step 1: Installation

To start using the all-MiniLM-L6-v2 model, you need to install the sentence-transformers library. Use the following command:

pip install -U sentence-transformers

Step 2: Using the Model with Sentence-Transformers

Once the library is installed, you can encode sentences using the following Python code:

from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)

In this example, we created a list of sentences, initialized the SentenceTransformer model, and obtained their corresponding embeddings. By default, encode returns a NumPy array of shape (2, 384) — one 384-dimensional vector per input sentence.
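Once you have embeddings, the usual next step for semantic search is to compare them with cosine similarity (sentence-transformers ships a helper for this, `util.cos_sim`, but the arithmetic is simple). The sketch below uses toy 3-dimensional vectors as stand-ins for real 384-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for two sentence embeddings (real ones are 384-dimensional)
emb1 = [0.1, 0.9, 0.2]
emb2 = [0.2, 0.8, 0.1]

print(cosine_similarity(emb1, emb2))                 # close to 1.0: similar direction
print(cosine_similarity(emb1, [-x for x in emb1]))   # approximately -1.0: opposite direction
```

Values near 1 indicate semantically similar sentences; values near 0 or below indicate unrelated or opposing ones.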

Step 3: Using the Model with Hugging Face Transformers

If you prefer to use the model without the sentence-transformers library, you can do so with Hugging Face’s Transformers like this:

from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Sentences we want sentence embeddings for
sentences = ["This is an example sentence", "Each sentence is converted"]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
model = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

# Normalize embeddings
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)

print("Sentence embeddings:")
print(sentence_embeddings)

The above code performs several steps:

  • Defines a function mean_pooling to average token embeddings while considering the attention mask.
  • Loads the model and tokenizer from Hugging Face.
  • Tokenizes the input sentences and computes the embeddings.
  • Normalizes the embeddings to ensure consistent magnitudes.
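To see concretely what the pooling step does, here is the same masked average computed by hand on toy numbers (three tokens with 2-dimensional "embeddings" instead of the model's 384 dimensions). The padding token has attention-mask entry 0, so it contributes nothing to the average:

```python
# Toy token embeddings for one sentence: three tokens, 2 dimensions each.
# The third token is padding (attention mask 0) and must be ignored.
token_embeddings = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]
attention_mask = [1, 1, 0]

dims = len(token_embeddings[0])
summed = [0.0] * dims
count = sum(attention_mask)  # number of real (non-padding) tokens
for emb, mask in zip(token_embeddings, attention_mask):
    for i in range(dims):
        summed[i] += emb[i] * mask  # masked tokens multiply to zero

sentence_embedding = [s / count for s in summed]
print(sentence_embedding)  # [2.0, 2.0]: the mean of the two real tokens
```

This is exactly what the mean_pooling function above does with tensors: multiply by the expanded mask, sum over the token dimension, and divide by the mask's sum.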

Think of sentence embeddings as fingerprints: each sentence is transformed into a distinctive pattern (a vector) that lets the model “recognize” its meaning relative to other sentences.

Troubleshooting

If you encounter issues during installation or while running your code, here are some common troubleshooting tips:

  • Ensure that your Python environment is correctly set up and that you are using a compatible version of Python.
  • If facing issues with model loading, check your internet connection as the models need to be downloaded from the Hugging Face Hub.
  • For any installation-related error messages, consider creating a virtual environment to avoid package conflicts.
  • Verify that your code doesn’t contain any typographical errors, especially in the model name and function calls.

If problems persist, don’t hesitate to reach out for help. For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

In this article, we explored how to convert and use the all-MiniLM-L6-v2 model for sentence similarity tasks. With just a few lines of code, you can translate sentences into dense vector representations. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

Stay Informed with the Newest F(x) Insights and Blogs

Tech News and Blog Highlights, Straight to Your Inbox