How to Use the Sentence Transformer Model: A Simple Guide

May 3, 2024 | Educational

In the world of Natural Language Processing (NLP), sentence transformers play a crucial role in capturing the semantics of sentences. In this blog, we’ll discuss how to use the bespin-global/klue-sroberta-base-continue-learning-by-mnr model for sentence similarity tasks. We will cover installation, usage examples, and, importantly, troubleshooting tips to help you along the way!

What is the Sentence Transformer?

Sentence transformers map sentences and paragraphs into a 768-dimensional vector space, allowing for meaningful comparison between sentences. Think of it like having a dedicated translator that converts sentences into a numerical format that captures their meaning—similar to how people can convey complex ideas in just a few words.
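So what does “meaningful comparison” look like in practice? A common choice is cosine similarity between two embedding vectors. The snippet below is a minimal sketch using NumPy with randomly generated 768-dimensional placeholder vectors; real embeddings come from the model calls shown later in this guide.

import numpy as np

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Placeholder 768-dimensional vectors standing in for real sentence embeddings
vec_a = np.random.rand(768)
vec_b = np.random.rand(768)
print(cosine_similarity(vec_a, vec_b))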

Installation

Before diving into the usage of the model, ensure you have the required library installed. You can do this by running:

pip install -U sentence-transformers

Usage with Sentence-Transformers

Once installed, using the model becomes straightforward:

from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('bespin-global/klue-sroberta-base-continue-learning-by-mnr')
embeddings = model.encode(sentences)
print(embeddings)

In this example, two sentences are converted into their corresponding embeddings, ready for various NLP tasks.
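To actually score how similar the two sentences are, you can compare their embeddings with the cosine-similarity helper that ships with the library. This short follow-up sketch reuses the model and embeddings from the example above:

from sentence_transformers import util

# Cosine similarity between the two embeddings (values close to 1 indicate similar meaning)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(similarity)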

Usage with HuggingFace Transformers

If you prefer not to use the sentence-transformers library, you can also leverage HuggingFace Transformers directly. Here’s how:

from transformers import AutoTokenizer, AutoModel
import torch

# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Sentences we want sentence embeddings for
sentences = ["This is an example sentence", "Each sentence is converted"]

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('bespin-global/klue-sroberta-base-continue-learning-by-mnr')
model = AutoModel.from_pretrained('bespin-global/klue-sroberta-base-continue-learning-by-mnr')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print("Sentence embeddings:")
print(sentence_embeddings)

This code produces the same sentence embeddings, but it works directly with the tokenizer and model classes provided by HuggingFace Transformers.
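If you want cosine similarities from these raw embeddings, a common next step is to L2-normalize them so that dot products become cosine similarities. A minimal sketch, assuming the sentence_embeddings tensor from the code above:

import torch.nn.functional as F

# L2-normalize the embeddings so that dot products equal cosine similarities
normalized = F.normalize(sentence_embeddings, p=2, dim=1)
cosine_scores = normalized @ normalized.T
print(cosine_scores)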

Evaluation Results

The model was evaluated on the STS test dataset using several correlation metrics. Here are the reported results:

  • Cosine-Similarity: Pearson: 0.8901, Spearman: 0.8893
  • Manhattan-Distance: Pearson: 0.8867, Spearman: 0.8818
  • Euclidean-Distance: Pearson: 0.8875, Spearman: 0.8827
  • Dot-Product-Similarity: Pearson: 0.8786, Spearman: 0.8735
  • Average: 0.8892
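For context, correlation metrics like these are computed by scoring each sentence pair with the model and comparing the predicted similarities against human-annotated gold scores. The sketch below is purely illustrative, with made-up pairs and gold scores and SciPy's correlation functions; it does not reproduce the numbers above.

from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('bespin-global/klue-sroberta-base-continue-learning-by-mnr')

# Hypothetical sentence pairs with made-up gold similarity scores
pairs = [
    ("A man is playing guitar", "A person plays a guitar"),
    ("A man is playing guitar", "Someone is strumming an instrument"),
    ("A man is playing guitar", "The weather is cold today"),
]
gold_scores = [4.8, 3.9, 0.2]

predicted = []
for s1, s2 in pairs:
    emb1, emb2 = model.encode(s1), model.encode(s2)
    predicted.append(float(util.cos_sim(emb1, emb2)))

print("Pearson:", pearsonr(gold_scores, predicted)[0])
print("Spearman:", spearmanr(gold_scores, predicted)[0])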

Training Parameters

The model was trained with the following parameters (a sketch of how they map onto the training API follows the list):

  • DataLoader: Length of 329 with a batch size of 32
  • Loss Type: CosineSimilarityLoss
  • Epochs: 4
  • Learning Rate: 2e-05
  • Weight Decay: 0.01
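As an illustration of how such parameters plug into the sentence-transformers training API, here is a minimal sketch. The training pairs below are hypothetical placeholders; the actual dataset and warmup schedule used for this model are not shown here.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('bespin-global/klue-sroberta-base-continue-learning-by-mnr')

# Hypothetical training pairs with similarity labels in [0, 1]
train_examples = [
    InputExample(texts=["This is an example sentence", "This sentence is an example"], label=0.9),
    InputExample(texts=["This is an example sentence", "The weather is cold today"], label=0.1),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
train_loss = losses.CosineSimilarityLoss(model)

# Warmup steps and evaluator omitted for brevity; fit() falls back to its defaults
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
)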

Troubleshooting

While working with the model, you may run into some issues. Here are some common troubleshooting tips:

  • Ensure that you have the sentence-transformers library installed correctly.
  • Check that the model name you are providing is correct and available.
  • If you encounter memory issues, try decreasing the batch size (see the sketch after this list).
  • Make sure your input sentences are correctly formatted and not longer than the model’s maximum sequence length.
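For the memory and sentence-length points above, here is a minimal sketch showing how you might lower the encoding batch size and inspect the model’s maximum sequence length (parameter names follow the sentence-transformers API):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('bespin-global/klue-sroberta-base-continue-learning-by-mnr')

# Check how many tokens the model keeps before truncating input
print("Max sequence length:", model.max_seq_length)

# Encode with a smaller batch size to reduce peak memory usage
sentences = ["This is an example sentence", "Each sentence is converted"]
embeddings = model.encode(sentences, batch_size=8)
print(embeddings.shape)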

If none of these suggestions solve your problem, don’t hesitate to reach out for help. For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
