Welcome to the world of tokenization! Today, we’re diving into the exciting realm of ChatML with the Gemma model. If you’ve ever wondered how to structure a conversation for a language model, this guide will walk you through using a ChatML tokenizer for the [google/gemma-7b](https://huggingface.co/google/gemma-7b) model.
What is ChatML?
ChatML (Chat Markup Language) is a format for structuring conversations between users and language models: each message is tagged with a role and delimited by special markers, so the model can tell who said what. Our focus today is on using a ChatML tokenizer for Gemma to facilitate these interactions seamlessly.
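Before touching any library code, it helps to see what ChatML actually looks like. The sketch below builds the format by hand; this is a simplified illustration only, not the tokenizer’s real template logic, which ships inside the tokenizer’s chat template configuration.

```python
# Simplified sketch of ChatML wrapping (illustration only).
def to_chatml(messages):
    # Each turn becomes: <|im_start|>{role}\n{content}<|im_end|>\n
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

messages = [
    {"role": "system", "content": "You are Gemma."},
    {"role": "user", "content": "Hello, how are you?"},
]
print(to_chatml(messages))
```

Every message is framed by `<|im_start|>{role}` and `<|im_end|>` markers, which is exactly the structure the tokenizer’s template produces for us automatically.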
Setting Up the ChatML Tokenizer
To begin using the ChatML tokenizer, follow these steps:
- Import the AutoTokenizer from the transformers library.
- Load the tokenizer specifically designed for ChatML.
- Create your message template that contains roles and content.
Example Implementation
Let’s look at some code to translate our steps into action. Think of tokenization as assembling a sandwich: we take different ingredients (words) and stack them creatively to create a delightful meal (conversation)!
from transformers import AutoTokenizer
# Load ChatML tokenizer for Gemma
tokenizer = AutoTokenizer.from_pretrained('philschmid/gemma-tokenizer-chatml')
# Create messages
messages = [
{"role": "system", "content": "You are Gemma."},
{"role": "user", "content": "Hello, how are you?"},
{"role": "assistant", "content": "I'm doing great. How can I help you today?"},
]
# Apply ChatML template
chatml = tokenizer.apply_chat_template(messages, add_generation_prompt=False, tokenize=False)
print(chatml)
In this code, we first import the AutoTokenizer and load the ChatML tokenizer. By creating a series of messages, we emulate a conversation and wrap these messages in ChatML. The output will display a structured format that your language model can consume directly.
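One parameter worth noting is add_generation_prompt, which we set to False above. At inference time you would typically set it to True, which for a ChatML template appends an opening assistant header so the model knows to continue the conversation as the assistant. Conceptually, it works like this hand-written sketch (not the tokenizer’s actual template code):

```python
# Sketch of what add_generation_prompt=True contributes for a ChatML template
# (hand-written illustration, not the tokenizer's real template).
chatml = "<|im_start|>user\nHello, how are you?<|im_end|>\n"
generation_prompt = "<|im_start|>assistant\n"  # model continues from here
prompt_for_inference = chatml + generation_prompt
print(prompt_for_inference)
```

With the header in place, the model’s generated text fills in the assistant turn until it emits the closing marker.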
Testing the Tokenizer
Now that we’ve set up our tokenizer, let’s put it to the test!
# Load original tokenizer
original_tokenizer = AutoTokenizer.from_pretrained('google/gemma-7b-it')
# Print special tokens
print(tokenizer.special_tokens_map)
print(original_tokenizer.special_tokens_map)
# Check vocab length
assert len(tokenizer) == len(original_tokenizer), "Tokenizers do not have the same length!"
# Tokenize messages
messages = [
{"role": "user", "content": "Hello, how are you?"},
{"role": "assistant", "content": "I'm doing great. How can I help you today?"},
]
chatml = tokenizer.apply_chat_template(messages, add_generation_prompt=False, tokenize=False)
google_format = original_tokenizer.apply_chat_template(messages, add_generation_prompt=False, tokenize=False)
print(f"ChatML:\n{chatml}\n-------------------\nGoogle:\n{google_format}")
This segment compares the special tokens of the ChatML and the original tokenizer, asserts that their vocabulary sizes match, and then prints both chat formats side by side so you can see exactly how they differ.
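If you are curious what the printed comparison shows: Gemma’s stock template uses `<start_of_turn>`/`<end_of_turn>` markers, while the ChatML tokenizer uses `<|im_start|>`/`<|im_end|>`. Here is a hand-written sketch of one user turn in each style; treat it as illustrative, and take the output of apply_chat_template above as authoritative:

```python
# Hand-written comparison of the two turn formats (illustrative sketch;
# the authoritative strings are whatever apply_chat_template prints).
turn = {"role": "user", "content": "Hello, how are you?"}

chatml_turn = f"<|im_start|>{turn['role']}\n{turn['content']}<|im_end|>\n"
gemma_turn = f"<start_of_turn>{turn['role']}\n{turn['content']}<end_of_turn>\n"

print(chatml_turn)
print(gemma_turn)
```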
Troubleshooting
If you encounter any issues during implementation, consider the following troubleshooting tips:
- Ensure you’ve installed the latest version of the transformers library.
- Double-check your message formatting; the roles should match exactly.
- Remember, the ChatML tokenizer is not 100% ChatML compliant: Gemma still requires its original beginning-of-sequence (bos) token at the start of every input.
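The bos caveat above can be guarded in code. Below is a minimal sketch that assumes the bos token renders as the literal string `<bos>`; confirm the actual value on your setup via tokenizer.special_tokens_map before relying on it:

```python
# Minimal guard: make sure the bos token leads the input string.
# Assumes the bos token renders as "<bos>"; verify this with
# tokenizer.special_tokens_map on your setup.
bos = "<bos>"
chatml = "<|im_start|>user\nHello, how are you?<|im_end|>\n"
model_input = chatml if chatml.startswith(bos) else bos + chatml
print(model_input)
```

Note that when you tokenize with add_special_tokens enabled, the tokenizer typically prepends bos for you; the string-level check is only useful when you assemble prompts by hand.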
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Final Thoughts
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.