Welcome to the exciting world of language models, where artificial intelligence meets the beauty of the Turkish language! Today, we will explore the Turkcell-LLM-7b-v1, a sophisticated, extended version of a Mistral-based Large Language Model (LLM) specifically designed for Turkish. Buckle up as we take you through the setup, usage examples, and troubleshooting tips!
Overview of Turkcell-LLM-7b-v1
Turkcell-LLM-7b-v1 is trained on an extensive dataset consisting of 5 billion tokens of cleaned Turkish raw text. The model uses a two-step training process: it is first trained with the DoRA method and then fine-tuned with the LoRA method, using custom Turkish instruction sets sourced from both open-source and internal resources.
Model Details
- Base Model: Mistral 7B based LLM
- Tokenizer Extension: Specifically extended for Turkish
- Training Dataset: Cleaned Turkish raw data with 5 billion tokens, custom Turkish instruction sets
- Training Method: Initially with DoRA, followed by fine-tuning with LoRA
Understanding the DoRA and LoRA Configurations
To better grasp the DoRA and LoRA configurations, think of them like baking a cake. The DoRA stage is your basic recipe, providing the crucial ingredients that create the foundation. Once the cake is baked, the LoRA fine-tuning acts like the icing and decoration that tailor it to specific tastes. Here's a rundown of the configurations (a short code sketch follows the two lists below):
DoRA Configuration
- lora_alpha: 128
- lora_dropout: 0.05
- r: 64
- target_modules: all-linear
LoRA Fine-Tuning Configuration
- lora_alpha: 128
- lora_dropout: 0.05
- r: 256
- target_modules: all-linear
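The exact training scripts are not published, but if you are curious how these hyperparameters look in practice, here is a minimal sketch using the Hugging Face peft library. It assumes a recent peft version that supports the use_dora flag and the "all-linear" target-module shortcut, and it only illustrates the settings listed above rather than the authors' actual pipeline:

```python
from peft import LoraConfig, get_peft_model

# Step 1: DoRA configuration (a LoRA config with weight decomposition enabled)
dora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    use_dora=True,            # turns plain LoRA into DoRA
    task_type="CAUSAL_LM",
)

# Step 2: LoRA fine-tuning configuration with a larger rank
lora_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# Wrapping a base model with one of these configs would look like:
# peft_model = get_peft_model(base_model, dora_config)
```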
How to Use Turkcell-LLM-7b-v1
Now that you have a basic understanding of the model and its configurations, let’s dive into the code to see how to implement it!
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Load the model and its Turkish-extended tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")

# A single-turn conversation: "What is the capital of Türkiye?"
messages = [
    {"role": "user", "content": "Türkiye'nin başkenti neresidir?"},
]

# Apply the model's chat template and tokenize the conversation
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

# The model marks the end of its turn with the <|im_end|> token
eos_token = tokenizer("<|im_end|>", add_special_tokens=False)["input_ids"][0]

model_inputs = encodeds.to(device)
model.to(device)

# Generate a reply, stopping at <|im_end|> or after 1024 new tokens
generated_ids = model.generate(model_inputs,
                               max_new_tokens=1024,
                               do_sample=True,
                               eos_token_id=eos_token)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
Explaining the Example Code
In this example, the code serves as a roadmap for building a conversational AI in Turkish. First, we import the necessary libraries and load the model and tokenizer, akin to gathering ingredients before starting to cook. Next, we prepare the user message and pass it through apply_chat_template, which wraps it in the chat format the model was trained on and converts it into token IDs. We then move everything onto the GPU and call generate, telling the model to stop as soon as it emits the <|im_end|> token or reaches 1024 new tokens. Finally, we decode the generated IDs back into text and print the result, just like serving the cake at the end of the meal!
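Note that batch_decode returns the entire sequence, prompt included. If you only want the model's answer, a minimal follow-up sketch (reusing the variables from the example above) is to slice off the prompt tokens before decoding; skip_special_tokens simply hides markers such as <|im_end|>:

```python
# Keep only the tokens generated after the prompt, then decode them
new_tokens = generated_ids[:, model_inputs.shape[1]:]
answer = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(answer)
```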
Troubleshooting Tips
As you embark on your journey with Turkcell-LLM-7b-v1, you might run into some bumps along the way. Here are a few troubleshooting ideas:
- If you encounter a memory error, try reducing the batch size or using a machine with more GPU memory; a quantized-loading sketch follows this list.
- For issues with model loading, ensure that you have the right environment setup for the transformers library.
- If your generated output appears nonsensical, revisit your message formatting to ensure clarity in your queries.
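For the memory tip above, one common workaround is to load the model in 4-bit precision. The sketch below is an illustration rather than an official recommendation from the model authors: it assumes the bitsandbytes and accelerate packages are installed and a CUDA-capable GPU is available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization roughly quarters the GPU memory needed for the weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TURKCELL/Turkcell-LLM-7b-v1",
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place the layers; no .to(device) needed
)
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
```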
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

