How to Use BLEURT: A PyTorch Approach

Sep 13, 2024 | Educational


BLEURT, short for Bilingual Evaluation Understudy with Representations from Transformers, is a learned metric for evaluating text generation: it scores how closely a candidate sentence matches a reference sentence. In this article, we will guide you through using the PyTorch version of BLEURT, which is based on the original models described in the ACL paper "BLEURT: Learning Robust Metrics for Text Generation".

Model Setup

To begin, you will need to set up the BLEURT model. Here’s a step-by-step guide to get you started:

Install the necessary libraries: make sure you have PyTorch and Transformers installed.
Import the required packages in your Python script. The BLEURT checkpoint used in this article loads through the Transformers library's Auto classes.
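Before loading the model, it can help to confirm that both libraries are visible to your Python interpreter. The sketch below is one way to do that with the standard library only; the helper name missing_packages is made up for this illustration, and the packages are assumed to be installed under their usual names torch and transformers (typically via pip install torch transformers).

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment.

    Hypothetical helper for illustration; it only looks the packages up and
    does not import them.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# "torch" and "transformers" are the import names assumed for PyTorch and Transformers
missing = missing_packages(["torch", "transformers"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("PyTorch and Transformers are both available.")
```

If anything is reported as missing, install it and rerun the check before moving on to the usage example.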

Step-by-Step Usage Example

Below is a Python code snippet that showcases how to use BLEURT for evaluating text:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Elron/bleurt-tiny-512")
model = AutoModelForSequenceClassification.from_pretrained("Elron/bleurt-tiny-512")

# Set the model to evaluation mode
model.eval()

# Define references and candidates
references = ["hello world", "hello world"]
candidates = ["hi universe", "bye world"]

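# Compute scores: each reference is encoded together with the candidate at the same index
with torch.no_grad():
    # padding=True lets pairs with different token lengths share one batch
    inputs = tokenizer(references, candidates, padding=True, return_tensors="pt")
    scores = model(**inputs)[0].squeeze()  # [0] holds the logits, one score per pair

print(scores)  # Output: tensor([-1.0563, -0.3004])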

Understanding the Code: An Analogy

Imagine you're a librarian running a book recommendation desk. Your references are the books you would recommend, and your candidates are the books that users propose instead. In this analogy:

The references list holds the books you are confident about ("hello world", listed once for each candidate).
The candidates list holds the books that users suggest ("hi universe" and "bye world").
The scores tell you how closely each suggestion matches the recommendation at the same position in the list. A higher score means a closer match, and scores can be negative: in the example output, "bye world" (-0.3004) is judged closer to "hello world" than "hi universe" (-1.0563).
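To make the "higher is better" reading concrete, here is a small sketch that sorts candidates from best to worst. It works on plain floats, so it runs without downloading the model; the helper name rank_candidates is hypothetical, and the two scores are copied from the example output rather than recomputed.

```python
def rank_candidates(candidates, scores):
    """Pair each candidate with its score and sort from best (highest) to worst."""
    pairs = list(zip(candidates, scores))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


candidates = ["hi universe", "bye world"]
scores = [-1.0563, -0.3004]  # values copied from the example output

for text, score in rank_candidates(candidates, scores):
    print(f"{score:+.4f}  {text}")
```

When working with the tensor returned by the model, calling scores.tolist() first gives the list of floats this helper expects.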

Troubleshooting Tips

While using BLEURT, you might encounter obstacles along the way. Here are some common troubleshooting ideas:

If you get an error saying the model cannot be found, check that the model name is typed exactly as shown above ("Elron/bleurt-tiny-512").
If you have issues with the tokenizer, verify that AutoTokenizer has been imported and that the tokenizer was loaded from the same model name as the model.
If the output scores seem off, double-check your reference and candidate texts: both should be lists of strings of the same length, with each candidate at the same index as the reference it is compared against.
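The last check can be automated before the texts reach the tokenizer. The following is a minimal sketch, not part of BLEURT or Transformers: check_pairs is a hypothetical helper, and it assumes you want one candidate per reference, matched by position.

```python
def check_pairs(references, candidates):
    """Validate BLEURT inputs and return the number of (reference, candidate) pairs.

    Hypothetical helper: raises TypeError if either argument is not a list of
    strings, and ValueError if the two lists differ in length.
    """
    for name, texts in (("references", references), ("candidates", candidates)):
        if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
            raise TypeError(f"{name} must be a list of strings")
    if len(references) != len(candidates):
        raise ValueError(
            f"got {len(references)} references but {len(candidates)} candidates"
        )
    return len(references)


references = ["hello world", "hello world"]
candidates = ["hi universe", "bye world"]
print(check_pairs(references, candidates), "pairs ready for scoring")
```

A common mistake this catches is passing a single string instead of a one-element list, which would otherwise surface later as a less obvious error or a misleading score.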

