How to Use ByT5 Small Portuguese Product Reviews for Sentiment Analysis

Sep 13, 2024 | Educational


If you want to understand the sentiment behind Portuguese product reviews, this model is a good place to start. This post walks through the setup and usage of ByT5 Small Portuguese Product Reviews, a version of ByT5 Small fine-tuned for sentiment analysis of product reviews in Portuguese.

Introduction to the ByT5 Small Model

This model is a fine-tuned adaptation of Google’s ByT5 Small, trained on product reviews sourced from Americanas.com and published on the Hugging Face Hub as HeyLucasLeao/byt5-small-pt-product-reviews. It classifies a given Portuguese review as positive or negative.

Before we dive into utilizing this model, let’s take a closer look at its functionality with an analogy:

Understanding the Model: An Analogy

Think of the ByT5 model as a well-trained language tutor who has spent countless hours grading students’ essays as either good or bad. When a new essay (or product review) comes in, the tutor reads it and assigns a grade. How reliable the tutor is can then be measured with four metrics:

Accuracy: Of all the essays graded, what share received the correct grade?
Precision: Of the essays the tutor marked as good, how many really were good?
Recall: Of the essays that really were good, how many did the tutor mark as good?
F1 Score: The harmonic mean of precision and recall, which balances the two.

In the same way, the ByT5 model reads each review and decides whether it is positive or negative, and these metrics tell you how often that decision is right.

Setting Up the Model

To begin using the ByT5 Small Portuguese Product Reviews model, follow these steps:

Step 1: Install Required Libraries

You will need the Hugging Face Transformers library and PyTorch. You can install both via pip:

pip install transformers torch

Step 2: Import Necessary Modules

Next, import the tokenizer and model classes, along with PyTorch:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

Step 3: Device Configuration

Now, check whether a GPU is available. Running on a GPU speeds up inference, and the code falls back to the CPU otherwise:

if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

print(device)

Step 4: Load the Model and Tokenizer

Load the ByT5 model and its tokenizer:

tokenizer = AutoTokenizer.from_pretrained('HeyLucasLeao/byt5-small-pt-product-reviews')
model = AutoModelForSeq2SeqLM.from_pretrained('HeyLucasLeao/byt5-small-pt-product-reviews')
model.to(device)

Step 5: Classifying Reviews

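Now, you can create a function to classify the reviews:

def classificar_review(review):
    # Tokenize the review, padding or truncating it to 512 byte-level tokens
    inputs = tokenizer([review], padding='max_length', truncation=True, max_length=512, return_tensors='pt')
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)
    # Generate the model output for this single review
    output = model.generate(input_ids, attention_mask=attention_mask)
    # Decoding step kept from the original snippet: argmax over the generated ids
    # (torch's argmax is used here so that no NumPy import is required)
    pred = output.cpu().argmax(dim=1)
    dici = {0: 'Review Negativo', 1: 'Review Positivo'}
    return dici[pred.item()]

# Test the function and print the predicted label
print(classificar_review("Este produto é excelente!"))  # Example review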
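Because ByT5 works on raw UTF-8 bytes rather than words or subwords, the max_length=512 limit in the function above counts bytes, so accented Portuguese characters such as "é" take two positions each. The pure-Python sketch below mimics that mapping: each byte is shifted by 3 to leave room for the pad, end-of-sequence, and unknown ids, and an end-of-sequence id is appended. The helper name byt5_style_ids is hypothetical, and the sketch only approximates what the real tokenizer does for plain text; it is not a replacement for it.

```python
# Illustrative approximation of ByT5's byte-level encoding for plain text.
# byt5_style_ids is a hypothetical helper, not part of the transformers API.
BYTE_OFFSET = 3  # ids 0, 1, 2 are reserved for pad, end-of-sequence, unknown
EOS_ID = 1       # end-of-sequence id appended after the text

def byt5_style_ids(text):
    """Map text to its UTF-8 bytes shifted by BYTE_OFFSET, then append EOS."""
    return [b + BYTE_OFFSET for b in text.encode("utf-8")] + [EOS_ID]

review = "Este produto é excelente!"
ids = byt5_style_ids(review)

# The id count exceeds the character count: "é" is two bytes, plus one EOS id.
print(len(review), "characters ->", len(ids), "token ids")
```

A practical consequence is that long reviews reach the 512-position limit sooner than a word count would suggest, and anything beyond it is truncated.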

Evaluating Model Performance

The model’s performance can be evaluated with accuracy, precision, recall, and F1 score on the training, validation, and test splits. For example:

Accuracy on Training Set: 89.74%
F1 Score on Test Set: 92.61%
Validation Accuracy: 89.25%
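To make these metrics concrete, here is a minimal sketch of how accuracy, precision, recall, and F1 are computed for a binary classifier. The function name binary_metrics and the toy labels are made up for illustration (1 = positive review, 0 = negative); they are not the data behind the figures above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy example: hypothetical true labels and predictions for eight reviews
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In practice, y_pred would come from running the classifier over a labeled set of reviews and mapping each returned label to 0 or 1.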

Troubleshooting Tips

If you encounter any issues while running the model, consider the following troubleshooting tips:

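Ensure all necessary libraries (transformers and torch) are installed in the same Python environment you run the code from.
Check that the model identifier is spelled exactly as HeyLucasLeao/byt5-small-pt-product-reviews in both from_pretrained calls.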
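The first check can be automated. The sketch below uses the standard library's importlib.util.find_spec to report which packages cannot be imported in the current environment; the helper name missing_packages is hypothetical, and the list of required names is an assumption based on the imports used in this post.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Assumed requirements for this tutorial (top-level import names)
required = ["transformers", "torch"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package is reported as missing even though you installed it, pip and your Python interpreter are probably pointing at different environments.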

Still facing issues? For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
