In today’s globalized world, communication across languages has become more crucial than ever. One effective tool to bridge language gaps is machine translation. In this article, we’ll explore how to fine-tune a MarianMT model, specifically tailored for English to Vietnamese translations, leveraging the power of pretrained models.
Understanding the Training Phases
The process consists of two distinct training phases that refine the model’s capabilities:
- Mixed Training: This phase involves a dataset that includes both English-Chinese and English-Vietnamese sentences. The model learns fundamental translations across these languages.
- Pure Training: In this phase, the model focuses exclusively on English-Vietnamese sentences. This specialized training allows the model to excel at translating specifically between these two languages.
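The two phases differ only in which dataset the model sees; a target-language prefix token on each source sentence tells the model which language to produce. Here is a minimal sketch of how the two datasets could be assembled (the sentence pairs are made up for illustration; real training would use a parallel corpus):

```python
# Hypothetical toy sentence pairs; a real run would load a parallel corpus.
en_zh = [("Hello", "你好"), ("Thank you", "谢谢")]
en_vi = [("Hello", "Xin chào"), ("Thank you", "Cảm ơn")]

def with_prefix(pairs, lang_token):
    # Prepend the target-language token so one model can emit either language
    return [(f"{lang_token} {src}", tgt) for src, tgt in pairs]

# Phase 1 (mixed): both language pairs in a single dataset
mixed_dataset = with_prefix(en_zh, "2zh") + with_prefix(en_vi, "2vi")

# Phase 2 (pure): English-Vietnamese pairs only
pure_dataset = with_prefix(en_vi, "2vi")
```

The `2zh`/`2vi` prefixes here match the special tokens added later in the article's code; the rest of the pipeline (tokenization, batching, the training loop) is left to the usual Hugging Face tooling.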
Getting Started with Installation
Before diving into the model fine-tuning, you need to install essential packages. Use the following command:
!pip install "transformers[sentencepiece]"
Implementing the Model
Now that you have the necessary packages, let’s walk through the code needed to download and set up your model:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Download the pretrained model for English-Vietnamese available on the hub
model = AutoModelForSeq2SeqLM.from_pretrained("CLAck/en-vi")
tokenizer = AutoTokenizer.from_pretrained("CLAck/en-vi")
# Download a tokenizer that can tokenize English
tokenizer_en = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
# Add special tokens for the target language
tokenizer_en.add_tokens(["2zh", "2vi"], special_tokens=True)
# Input sentence for translation
sentence = "The cat is on the table"
input_sentence = f"2vi {sentence}"
# Generate the translation
translated = model.generate(**tokenizer_en(input_sentence, return_tensors="pt", padding=True))
output_sentence = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]
Analogy for Code Explanation
Imagine you’re a chef who’s learning to cook a new dish. In the first phase, you practice using various ingredients (mixed training). You might use chicken and tofu side by side to understand their differences and combinations (English-Chinese and English-Vietnamese). After you master the basics, you focus solely on perfecting your chicken dish (pure training), ensuring you understand how to bring out the best flavors in every preparation.
Reviewing Training Results
After training, you’ll want to evaluate the performance of your model. We use the BLEU score as a metric for translation quality:
Mixed training phase (BLEU score by epoch):
- Epoch 1: 26.2407
- Epoch 2: 32.6016
- Epoch 3: 35.4060
- Epoch 4: 36.6737
- Epoch 5: 37.3774
Pure training phase (BLEU score by epoch):
- Epoch 1: 37.3169
- Epoch 2: 37.4407
- Epoch 3: 37.6696
- Epoch 4: 37.8765
- Epoch 5: 38.0105
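In practice these scores would come from a standard evaluation toolkit, but to build intuition for what BLEU measures, here is a minimal stdlib-only sketch of corpus-level BLEU (uniform 4-gram weights, whitespace tokenization, brevity penalty); it is an illustrative reimplementation, not the exact tool used to produce the numbers above:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all n-grams in a token sequence
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus BLEU with uniform n-gram weights and a brevity penalty."""
    clipped = [0] * max_n   # n-gram matches, clipped by reference counts
    total = [0] * max_n     # total hypothesis n-grams
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_ng, r_ng = ngrams(h, n), ngrams(r, n)
            clipped[n - 1] += sum(min(c, r_ng[g]) for g, c in h_ng.items())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(total) == 0 or min(clipped) == 0:
        return 0.0  # no n-gram overlap (or hypotheses too short to score)
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

A perfect match scores 100, and scores drop as n-gram overlap with the reference decreases, which is why the table above climbing from the mid-20s to around 38 indicates steadily improving translations.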
Troubleshooting and Tips
If you encounter any issues during the implementation process, here are some common troubleshooting tips:
- Installation Issues: Ensure that you are using a compatible version of Python and have the transformers library correctly installed.
- Model Not Found: Double-check the model name you are using in the code. The specified model name must match exactly with what’s available in the model hub.
- Tokenization Errors: If you experience errors related to tokenization, verify that you are using the correct tokenizer for both English and Vietnamese.
- BLEU Score Expectations: If your BLEU scores seem low, consider experimenting with additional epochs or revisiting your dataset for quality.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
By following these steps, you can effectively fine-tune a MarianMT model to enhance its translation capabilities from English to Vietnamese, ensuring a smoother communication pathway across cultures.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.