In today’s globalized world, communication across languages has become more crucial than ever. One effective tool to bridge language gaps is machine translation. In this article, we’ll explore how to fine-tune a MarianMT model, specifically tailored for English to Vietnamese translations, leveraging the power of pretrained models.
Understanding the Training Phases
The process consists of two distinct training phases that refine the model’s capabilities:
- Mixed Training: This phase involves a dataset that includes both English-Chinese and English-Vietnamese sentences. The model learns fundamental translations across these languages.
- Pure Training: In this phase, the model focuses exclusively on English-Vietnamese sentences. This specialized training allows the model to excel at translating specifically between these two languages.
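The two phases differ only in which dataset the model sees; a target-language prefix token on each source sentence tells the model which language to produce. Here is a minimal sketch of how the two datasets could be assembled (the sentence pairs are made up for illustration; real training would use a parallel corpus):

```python
# Hypothetical toy sentence pairs; a real run would load a parallel corpus.
en_zh = [("Hello", "你好"), ("Thank you", "谢谢")]
en_vi = [("Hello", "Xin chào"), ("Thank you", "Cảm ơn")]

def with_prefix(pairs, lang_token):
    # Prepend the target-language token so one model can emit either language
    return [(f"{lang_token} {src}", tgt) for src, tgt in pairs]

# Phase 1 (mixed): both language pairs in a single dataset
mixed_dataset = with_prefix(en_zh, "2zh") + with_prefix(en_vi, "2vi")

# Phase 2 (pure): English-Vietnamese pairs only
pure_dataset = with_prefix(en_vi, "2vi")
```

The `2zh`/`2vi` prefixes here match the special tokens added later in the article's code; the rest of the pipeline (tokenization, batching, the training loop) is left to the usual Hugging Face tooling.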
Getting Started with Installation
Before diving into the model fine-tuning, you need to install essential packages. Use the following command:
!pip install "transformers[sentencepiece]"
Implementing the Model
Now that you have the necessary packages, let’s walk through the code needed to download and set up your model:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Download the pretrained model for English-Vietnamese available on the hub
model = AutoModelForSeq2SeqLM.from_pretrained("CLAck/en-vi")
tokenizer = AutoTokenizer.from_pretrained("CLAck/en-vi")
# Download a tokenizer that can tokenize English
tokenizer_en = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
# Add special tokens for the target language
tokenizer_en.add_tokens(["2zh", "2vi"], special_tokens=True)
# Input sentence for translation
sentence = "The cat is on the table"
input_sentence = f"2vi {sentence}"
# Generate the translation
translated = model.generate(**tokenizer_en(input_sentence, return_tensors="pt", padding=True))
output_sentence = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]
Analogy for Code Explanation
Imagine you’re a chef who’s learning to cook a new dish. In the first phase, you practice using various ingredients (mixed training). You might use chicken and tofu side by side to understand their differences and combinations (English-Chinese and English-Vietnamese). After you master the basics, you focus solely on perfecting your chicken dish (pure training), ensuring you understand how to bring out the best flavors in every preparation.
Reviewing Training Results
After training, you’ll want to evaluate the performance of your model. We use the BLEU score as a metric for translation quality:
Mixed training phase (BLEU score by epoch):
- Epoch 1: 26.2407
- Epoch 2: 32.6016
- Epoch 3: 35.4060
- Epoch 4: 36.6737
- Epoch 5: 37.3774
Pure training phase (BLEU score by epoch):
- Epoch 1: 37.3169
- Epoch 2: 37.4407
- Epoch 3: 37.6696
- Epoch 4: 37.8765
- Epoch 5: 38.0105
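In practice these scores would come from a standard evaluation toolkit, but to build intuition for what BLEU measures, here is a minimal stdlib-only sketch of corpus-level BLEU (uniform 4-gram weights, whitespace tokenization, brevity penalty); it is an illustrative reimplementation, not the exact tool used to produce the numbers above:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all n-grams in a token sequence
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus BLEU with uniform n-gram weights and a brevity penalty."""
    clipped = [0] * max_n   # n-gram matches, clipped by reference counts
    total = [0] * max_n     # total hypothesis n-grams
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_ng, r_ng = ngrams(h, n), ngrams(r, n)
            clipped[n - 1] += sum(min(c, r_ng[g]) for g, c in h_ng.items())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(total) == 0 or min(clipped) == 0:
        return 0.0  # no n-gram overlap (or hypotheses too short to score)
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

A perfect match scores 100, and scores drop as n-gram overlap with the reference decreases, which is why the table above climbing from the mid-20s to around 38 indicates steadily improving translations.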
Troubleshooting and Tips
If you encounter any issues during the implementation process, here are some common troubleshooting tips:
- Installation Issues: Ensure that you are using a compatible version of Python and have the transformers library correctly installed.
- Model Not Found: Double-check the model name you are using in the code. The specified model name must match exactly with what’s available in the model hub.
- Tokenization Errors: If you experience errors related to tokenization, verify that you are using the correct tokenizer for both English and Vietnamese.
- BLEU Score Expectations: If your BLEU scores seem low, consider experimenting with additional epochs or revisiting your dataset for quality.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
By following these steps, you can effectively fine-tune a MarianMT model to enhance its translation capabilities from English to Vietnamese, ensuring a smoother communication pathway across cultures.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.