The Jais family of models represents a groundbreaking series of bilingual large language models (LLMs) adept in both Arabic and English. These models aim to harness the nuances of the Arabic language while showcasing robust capabilities in English. In this guide, we will explore how to leverage the Jais family models for your own applications, delve into the model architecture, and even provide troubleshooting tips along the way.
Overview of the Jais Family
Developed by Inception and Cerebras Systems, the Jais family includes two foundational variants:
- Pre-trained from scratch: these models are labeled jais-family-*.
- Adaptively pre-trained from Llama-2: these are denoted jais-adapted-*.
With 20 models across 8 sizes, ranging from 590M to a whopping 70B parameters, the Jais family has been trained on up to 1.6 trillion tokens of Arabic, English, and code data.
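Picking a checkpoint is just a matter of combining the variant prefix with a size and, for conversational use, a -chat suffix. The snippet below is a small illustration of that naming scheme; only the 590M chat model is used in this guide, and the other IDs are examples of the pattern, so verify exact names on the Hugging Face Hub before use.

# Checkpoint IDs follow the naming scheme described above. Only the first
# ID appears in this guide; the others are illustrative examples and
# should be verified on the Hugging Face Hub before use.
model_ids = [
    "inceptionai/jais-family-590m-chat",  # from-scratch variant, 590M, chat-tuned (used below)
    "inceptionai/jais-family-6p7b-chat",  # example: from-scratch variant, 6.7B
    "inceptionai/jais-adapted-7b-chat",   # example: Llama-2-adapted variant, 7B
]
model_path = model_ids[0]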
Getting Started with Jais Models
To use the Jais family models in your applications, you'll need a small amount of setup code. Below, we walk through the steps to get started:
# Import the required libraries
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Define the model path (Hugging Face Hub ID or local directory)
model_path = "inceptionai/jais-family-590m-chat"

# Prepare prompt templates in both languages. The instructions are
# truncated here for brevity; the full templates from the model card
# must contain a {Question} placeholder for format_map() to fill in.
prompt_eng = "### Instruction: Your name is 'Jais'..."
# Arabic: "Your name is 'Jais', and you are named after Jebel Jais..."
prompt_ar = "### Instruction: اسمك \"جيس\" وسميت على اسم جبل جيس..."

# Select the device: GPU if available, otherwise CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model. trust_remote_code=True is required
# because the Jais architecture ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)

# Generate a response for a formatted prompt
def get_response(text, tokenizer=tokenizer, model=model):
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    inputs = input_ids.to(device)
    input_len = inputs.shape[-1]
    generate_ids = model.generate(
        inputs,
        top_p=0.9,
        temperature=0.3,
        max_length=2048,
        min_length=input_len + 4,
        repetition_penalty=1.2,
        do_sample=True,
    )
    response = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]
    # Keep only the text after the response marker
    response = response.split("### Response :")[-1]
    return response

# Example questions
ques_eng = "What is the capital of UAE?"
# Arabic: "What is the capital of the Emirates?"
ques_ar = "ما هي عاصمة الامارات؟"

# Print responses in both languages
print(get_response(prompt_ar.format_map({'Question': ques_ar})))
print(get_response(prompt_eng.format_map({'Question': ques_eng})))
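A note on the generation settings: top_p=0.9 with a low temperature of 0.3 keeps answers focused while allowing some variation, and repetition_penalty=1.2 discourages loops. If you prefer reproducible output, a hedged sketch of a greedy alternative is shown below; inside get_response, you could swap the sampling call for this one. do_sample=False and max_new_tokens are standard transformers generate() arguments, and 512 is an illustrative cap, not a value from the model card.

# Hedged variant: swap the sampling call inside get_response for greedy
# decoding, which makes output deterministic for a given prompt.
generate_ids = model.generate(
    inputs,
    do_sample=False,         # greedy decoding: same input -> same output
    max_new_tokens=512,      # illustrative bound on new tokens, independent of prompt length
    repetition_penalty=1.2,
)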
Breaking Down the Code: A Recipe Analogy
Think of using the Jais family models as baking a delicious cake. Here’s how the steps correlate:
- Ingredients (Libraries): just as you gather flour, sugar, and eggs, here we import the necessary libraries, torch and transformers.
- Preparation (Model Path): specify where your recipe comes from, in this case the model path for the Jais model you want to load.
- Mixing (Loading Model): combine your ingredients (load the tokenizer and model) so everything is ready for baking.
- Baking (Generating Response): this is where the magic happens. Just as the oven transforms your mix into a delicious cake, the model processes the input text and generates a response! A compact alternative that bundles these steps is sketched after this list.
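If you would rather not manage each step yourself, the transformers pipeline API bundles tokenization, generation, and decoding into one call. The sketch below is a minimal illustration that reuses the model and tokenizer loaded earlier; the pipeline API is standard transformers, and the generation settings simply mirror the values used above.

# Minimal sketch: wrap the already-loaded model and tokenizer in a
# text-generation pipeline, which handles tokenization and decoding.
from transformers import pipeline

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe(
    prompt_eng.format_map({"Question": ques_eng}),
    max_length=2048,
    do_sample=True,
    top_p=0.9,
    temperature=0.3,
)
print(result[0]["generated_text"])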
Troubleshooting Tips
Sometimes things don’t go as planned. Here are some common issues you might face and how to resolve them:
- Model Fails to Load: Ensure you have the correct model path and that the
trust_remote_code=Trueis enabled. - Out of Memory Error: Decrease the batch size or use smaller model sizes if you’re working with limited resources.
- Unexpected Output: Ensure that your input prompt aligns with the model’s expected format!
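For the out-of-memory case, one common mitigation (a hedged sketch, not the only option) is to load the weights in half precision, which roughly halves GPU memory use. torch_dtype is a standard from_pretrained argument:

# Sketch: load the model in float16 to roughly halve GPU memory use.
# Accuracy impact is usually small for inference, but verify on your own prompts.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.float16,  # or torch.bfloat16 on GPUs that support it
)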
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
The Jais family models hold the potential to revolutionize the Arabic NLP landscape. By following the steps above and understanding the framework, you can effectively incorporate these models into your projects.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

