Welcome to the exciting world of language models, where artificial intelligence meets the beauty of the Turkish language! Today, we will explore the Turkcell-LLM-7b-v1, a sophisticated, extended version of a Mistral-based Large Language Model (LLM) specifically designed for Turkish. Buckle up as we take you through the setup, usage examples, and troubleshooting tips!
Overview of Turkcell-LLM-7b-v1
Turkcell-LLM-7b-v1 is trained on an extensive dataset consisting of 5 billion tokens of cleaned Turkish raw text. The model uses a two-step training process: it is first trained with the DoRA method and then fine-tuned with the LoRA method, using custom Turkish instruction sets sourced from both open-source and internal resources.
Model Details
- Base Model: Mistral 7B based LLM
- Tokenizer Extension: Specifically extended for Turkish
- Training Dataset: Cleaned Turkish raw data with 5 billion tokens, custom Turkish instruction sets
- Training Method: Initially with DoRA, followed by fine-tuning with LoRA
Understanding the DoRA and LoRA Configurations
To better grasp the DoRA and LoRA configurations, think of them like baking a cake. The DoRA stage is your basic recipe, providing the crucial ingredients that create the foundation. Once the cake is baked, the LoRA fine-tuning acts like the icing and decoration that tailor it to specific tastes. Here's a rundown of the configurations (a short code sketch follows the two lists below):
DoRA Configuration
- lora_alpha: 128
- lora_dropout: 0.05
- r: 64
- target_modules: all-linear
LoRA Fine-Tuning Configuration
- lora_alpha: 128
- lora_dropout: 0.05
- r: 256
- target_modules: all-linear
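The exact training scripts are not published, but if you are curious how these hyperparameters look in practice, here is a minimal sketch using the Hugging Face peft library. It assumes a recent peft version that supports the use_dora flag and the "all-linear" target-module shortcut, and it only illustrates the settings listed above rather than the authors' actual pipeline:

```python
from peft import LoraConfig, get_peft_model

# Step 1: DoRA configuration (a LoRA config with weight decomposition enabled)
dora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    use_dora=True,            # turns plain LoRA into DoRA
    task_type="CAUSAL_LM",
)

# Step 2: LoRA fine-tuning configuration with a larger rank
lora_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# Wrapping a base model with one of these configs would look like:
# peft_model = get_peft_model(base_model, dora_config)
```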
How to Use Turkcell-LLM-7b-v1
Now that you have a basic understanding of the model and its configurations, let’s dive into the code to see how to implement it!
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Load the model and its Turkish-extended tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")

# A single-turn conversation: "What is the capital of Türkiye?"
messages = [
    {"role": "user", "content": "Türkiye'nin başkenti neresidir?"},
]

# Apply the model's chat template and tokenize the conversation
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

# The model marks the end of its turn with the <|im_end|> token
eos_token = tokenizer("<|im_end|>", add_special_tokens=False)["input_ids"][0]

model_inputs = encodeds.to(device)
model.to(device)

# Generate a reply, stopping at <|im_end|> or after 1024 new tokens
generated_ids = model.generate(model_inputs,
                               max_new_tokens=1024,
                               do_sample=True,
                               eos_token_id=eos_token)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
Explaining the Example Code
In this example, the code serves as a roadmap for building a conversational AI in Turkish. First, we import the necessary libraries and load the model and tokenizer, akin to gathering ingredients before starting to cook. Next, we prepare the user message and pass it through apply_chat_template, which wraps it in the chat format the model was trained on and converts it into token IDs. We then move everything onto the GPU and call generate, telling the model to stop as soon as it emits the <|im_end|> token or reaches 1024 new tokens. Finally, we decode the generated IDs back into text and print the result, just like serving the cake at the end of the meal!
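Note that batch_decode returns the entire sequence, prompt included. If you only want the model's answer, a minimal follow-up sketch (reusing the variables from the example above) is to slice off the prompt tokens before decoding; skip_special_tokens simply hides markers such as <|im_end|>:

```python
# Keep only the tokens generated after the prompt, then decode them
new_tokens = generated_ids[:, model_inputs.shape[1]:]
answer = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(answer)
```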
Troubleshooting Tips
As you embark on your journey with Turkcell-LLM-7b-v1, you might run into some bumps along the way. Here are a few troubleshooting ideas:
- If you encounter a memory error, try reducing the batch size or using a machine with more GPU memory; a quantized-loading sketch follows this list.
- For issues with model loading, ensure that you have the right environment setup for the transformers library.
- If your generated output appears nonsensical, revisit your message formatting to ensure clarity in your queries.
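For the memory tip above, one common workaround is to load the model in 4-bit precision. The sketch below is an illustration rather than an official recommendation from the model authors: it assumes the bitsandbytes and accelerate packages are installed and a CUDA-capable GPU is available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization roughly quarters the GPU memory needed for the weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TURKCELL/Turkcell-LLM-7b-v1",
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place the layers; no .to(device) needed
)
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
```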
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

