If you’re venturing into the world of Natural Language Processing (NLP) in Persian, you’re in just the right place! The Persian-Mistral model is a fine-tuned version of the Mistral-7B model, specifically designed for Persian Question-Answering (QA) and other NLP tasks. In this post, we’ll delve into how to use this powerful model, using clear examples and helpful tips for troubleshooting along the way.
What is the Persian-Mistral Model?
Persian-Mistral is a specialized language model aimed at enhancing NLP tasks for the Persian language. It has undergone a series of refinements to ensure it understands and generates Persian text more accurately, making it an excellent tool for both developers and researchers.
Using the Persian-Mistral Model
Here’s a simple guide to running the Persian-Mistral model with Python and the Hugging Face Transformers library:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("aidal/Persian-Mistral-7B")
model = AutoModelForCausalLM.from_pretrained("aidal/Persian-Mistral-7B")

# Ask a question in Persian: "Where is the capital of Iran?"
input_text = "پایتخت ایران کجاست؟"
input_ids = tokenizer(input_text, return_tensors="pt")

# Generate and decode the answer
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Understanding the Code: An Analogy
Think of the code snippet above as creating a recipe for a delightful Persian dish. In this recipe:
- First, we gather our ingredients with `from transformers import AutoTokenizer, AutoModelForCausalLM`. These are the essential tools that will help us prepare our dish.
- Next, we “prepare” our meal by loading the tokenizer and model using `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained`. This is akin to chopping vegetables and marinating meat.
- Then, we input our question (i.e., the main ingredient) into the model with `input_text = "پایتخت ایران کجاست؟"`, ensuring that our dish will be specific to our tastes.
- Once everything is set, we generate the response using `model.generate`, similar to cooking our meal to perfection.
- Finally, we serve our dish with `print(tokenizer.decode(outputs[0]))`, delighting everyone at the table!
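To make the tokenize → generate → decode flow above concrete without downloading a 7B model, here is a toy sketch. The character-level “tokenizer” below is purely illustrative (the real Persian-Mistral tokenizer uses SentencePiece subword units), but the round trip it performs mirrors steps one and three of the recipe:

```python
# Toy illustration of the tokenize -> decode round trip.
# This stand-in "tokenizer" maps characters to integer IDs; the real
# tokenizer maps subword pieces to IDs, but the interface is the same idea.

class ToyTokenizer:
    def __init__(self, text):
        # Build a vocabulary from the characters we have seen
        self.vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
        self.inverse = {i: ch for ch, i in self.vocab.items()}

    def encode(self, text):
        return [self.vocab[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.inverse[i] for i in ids)

text = "پایتخت ایران کجاست؟"
tok = ToyTokenizer(text)
ids = tok.encode(text)   # step 1: text -> token IDs
# step 2 would be model.generate(...) with the real model
print(tok.decode(ids))   # step 3: token IDs -> text (round trip)
```

Decoding the encoded IDs recovers the original string, which is exactly the contract `tokenizer.decode(outputs[0])` relies on in the real snippet.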
Training and Fine-Tuning the Model
The Persian-Mistral model underwent an extensive training process that included:
- **Extending the Tokenizer:** The original Mistral tokenizer handled Persian poorly, so a new SentencePiece tokenizer was trained on the Farsi Wikipedia corpus.
- **Pre-training:** The embedding layer of the base model was adjusted to accommodate the Persian tokenizer and fine-tuned on various datasets.
- **Instruction Fine-tuning:** The model was further fine-tuned using the LoRA method to improve its question-answering capabilities.
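The LoRA method mentioned in the last step can be sketched in a few lines: instead of updating a full weight matrix W during fine-tuning, two small matrices A and B are trained, and their product forms a low-rank update. The NumPy sketch below is a minimal illustration of that idea only; the ranks, shapes, and scaling are chosen arbitrarily, and the post does not specify which layers or hyperparameters the actual fine-tuning used:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2    # hidden size and LoRA rank (r << d)
alpha = 4      # LoRA scaling factor

W = rng.normal(size=(d, d))         # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection (starts at zero)

# Only A and B are trained, so the update costs 2*d*r extra
# parameters instead of d*d.
W_adapted = W + (alpha / r) * (B @ A)

x = rng.normal(size=d)
y = W_adapted @ x
```

Because B is initialized to zero, the adapted model starts out identical to the frozen base model, and fine-tuning only gradually nudges it away — one reason LoRA is a stable, memory-cheap way to specialize a 7B model.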
Example Outputs
Here are a couple of example prompts and their expected outputs before and after training:
- Example 1:
Input: درمان اصلی برای افراد مبتلا او آر اس، جایگزینی مایعات و الکترولیت ها در بدن (“The main treatment for affected individuals is ORS: replacing the body’s fluids and electrolytes”)
Output (After training): درمان اصلی برای افراد مبتلا او آر اس، جایگزینی مایعات و الکترولیت ها در بدن است. که به طور معمول از طریق تزریق وریدی استفاده می شود. (“…replacing the body’s fluids and electrolytes, which is typically administered intravenously.”)
Output (Before training): درمان اصلی برای افراد مبتلا او آر اس، جایگزینی مایعات و الکترولیتها (“…fluids and electrolytes” — the sentence trails off incomplete)
- Example 2:
Input: سال ۱۹۴۴ متفقین به فرانسه اشغال شده توسط آلمان، در عملیاتی در نرماندی حمله کرده و (“In 1944, the Allies attacked German-occupied France in an operation in Normandy and…”)
Output (After training): سال ۱۹۴۴ متفقین به فرانسه اشغال شده توسط آلمان، در عملیاتی در نرماندی حمله کرده و 150,000 نفر از آنها را کشتند. (“…and killed 150,000 of them.”)
Output (Before training): سال ۱۹۴۴ متفقین به فرانسه اشغال شده توسط آلمان، در عملیاتی در نرماندی حمله کرده و خرج گرفت. (“…and took expense” — an incoherent continuation)
Troubleshooting
If you encounter any issues while implementing the Persian-Mistral model, here are some troubleshooting steps:
- Check your internet connection! The model requires downloading the tokenizer and model data.
- Ensure that you have the latest version of the transformers library installed.
- Look for any typos in your input text or code, as these can lead to errors.
- For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
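The second troubleshooting step — confirming your transformers version — can be checked from Python without importing the whole library. A quick sketch, assuming a standard pip-managed environment:

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # Report the installed transformers version
    print("transformers", version("transformers"))
except PackageNotFoundError:
    # Not installed yet; install or upgrade with: pip install -U transformers
    print("transformers is not installed")
```

If the printed version is old, `pip install -U transformers` brings it up to date before you retry loading the model.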
Final Thoughts
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

