How to Use the Jais Family of Bilingual Language Models

Aug 3, 2024 | Educational

The Jais family of models, developed by Inception and Cerebras Systems, is a suite of bilingual English-Arabic language models for text generation. This guide shows how to load a model, format a prompt, and troubleshoot common problems, with an analogy to clarify the workflow.

Understanding the Jais Family Models

The Jais family includes two main types of models:

Pre-trained from Scratch: These models are known as jais-family-*.
Adaptively Pre-trained from Llama-2: These are referred to as jais-adapted-*.

In total, there are 20 models, ranging in size from 590 million to 70 billion parameters, all trained on English, Arabic, and code data.

Getting Started With the Jais Models

To use a Jais family model, follow these steps:

Import Needed Libraries: Ensure you have torch and transformers installed in your Python environment.
Model Initialization: Use the following sample code block for initializing the model:


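# -*- coding: utf-8 -*-
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hugging Face model ID of the checkpoint to load.
model_path = "inceptionai/jais-adapted-7b-chat"

# Download (or load from the local cache) the tokenizer and the model weights.
tokenizer = AutoTokenizer.from_pretrained(model_path)
# device_map="auto" places the weights on the available devices (requires the accelerate package);
# trust_remote_code=True allows custom model code shipped with the repository to run.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)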
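
Before running the initialization code above, it can help to confirm that the required packages are importable. The sketch below is one way to do that with the standard library. The helper name `missing_packages` is hypothetical, and `accelerate` is included on the assumption that `device_map="auto"` is used, which relies on it.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch and transformers come from the guide; accelerate is an assumption
    # tied to the use of device_map="auto".
    missing = missing_packages(["torch", "transformers", "accelerate"])
    if missing:
        print("Install first: pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```

Running the check first gives a clearer message than an ImportError raised partway through model loading.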

Input Formatting: Format the input in Arabic or English based on your requirement. For example:


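# The "..." stands for the rest of the system instruction, which is elided here.
prompt_eng = "### Instruction: Your name is 'Jais'... ### Input: [|Human|] {Question}\n[|AI|]\n### Response :"
text = prompt_eng.format_map({'Question': "What is the capital of UAE?"})
# get_response is not defined in this snippet: it must tokenize the prompt,
# call model.generate, and decode the result before this line will run.
print(get_response(text))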
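
The snippet above calls `get_response`, which this post does not define. A minimal sketch of what such a helper might look like follows. It assumes the `tokenizer` and `model` objects from the initialization step are passed in explicitly, and the sampling values (`temperature`, `top_p`, `repetition_penalty`, `max_new_tokens`) are illustrative assumptions rather than recommended settings. `extract_response` is a hypothetical helper that keeps only the text after the `### Response :` marker used in the prompt.

```python
RESPONSE_MARKER = "### Response :"


def extract_response(decoded, marker=RESPONSE_MARKER):
    """Keep only the text after the last response marker in the decoded output."""
    return decoded.split(marker)[-1].strip()


def get_response(text, tokenizer, model, max_new_tokens=256):
    """Tokenize a formatted prompt, generate a continuation, and decode it.

    Hypothetical helper: the sampling values below are illustrative assumptions.
    """
    enc = tokenizer(text, return_tensors="pt")
    # Move the inputs to the device that holds the model's weights.
    input_ids = enc["input_ids"].to(model.device)
    attention_mask = enc["attention_mask"].to(model.device)
    output_ids = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.3,
        top_p=0.9,
        repetition_penalty=1.2,
    )
    # The decoded text contains the prompt followed by the model's answer.
    decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
    return extract_response(decoded)
```

With this signature, the call becomes `get_response(text, tokenizer, model)`. Passing the tokenizer and model as arguments, rather than reading them from globals, keeps the helper usable with any checkpoint in the family.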

Understanding the Code: An Analogy

Think of using the Jais language model as preparing a delicious meal using a gourmet recipe book:

You first gather all the necessary ingredients (this is akin to importing libraries).
Next, you follow the recipe step by step; for instance, setting the oven temperature (initializing the model).
Then you prepare the dish by mixing ingredients in the right order (formatting input prompts in the required structure).
Finally, you can enjoy the meal (getting text generation output from the model) that you’ve carefully crafted!

Troubleshooting Common Issues

While working with Jais models, you may encounter issues. Here are some troubleshooting ideas:

Problem: Model Fails to Load
- Ensure your network connection is stable while the model weights download.
- Pass trust_remote_code=True to from_pretrained when initializing the model.
Problem: Input Data Not Processed
- Check that the input is a string and follows the prompt format the model expects.
- Ensure the tokenizer is loaded correctly, for example from the same model path as the model.
Problem: Unexpected Output
- Adjust generation parameters such as max_length or temperature for better control over the output.
- If the output repeats itself, try increasing repetition_penalty.
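
Two of the checks above can be sketched in code. The helpers below are hypothetical and the default values are assumptions chosen for illustration: `ensure_text_prompt` fails early when the input is not a usable string, and `generation_kwargs` bundles the settings mentioned above into a dictionary that can be unpacked into `model.generate`.

```python
def ensure_text_prompt(prompt):
    """Raise early if the prompt is not a non-empty string."""
    if not isinstance(prompt, str):
        raise TypeError(f"prompt must be a str, got {type(prompt).__name__}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return prompt


def generation_kwargs(max_length=512, temperature=0.3, repetition_penalty=1.2):
    """Bundle generation settings for use as model.generate(**kwargs)."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    if repetition_penalty <= 0:
        raise ValueError("repetition_penalty must be positive")
    return {
        "max_length": max_length,
        "do_sample": True,  # temperature only affects sampling-based decoding
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
    }
```

For example, `generation_kwargs(repetition_penalty=1.3)` yields settings with a stronger penalty, which may help when the output keeps repeating.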

Conclusion

With the Jais family of models, you can build bilingual English-Arabic text generation into a range of applications. By following the steps in this guide, you should be able to load a model, prompt it, and resolve the most common issues.
