Have you ever wondered how to interact with complex machine learning models while keeping things simple? In this blog, we’ll explore the Meta-Llama-3-8B-Instruct model and its remarkable capabilities, specifically focusing on the JSON mode and function calling. Get ready to dive into user-friendly instructions that can boost your AI projects!
Model Description
The Meta-Llama-3-8B-Instruct model has been fine-tuned for function calling and JSON mode. This powerful model allows you to generate structured responses, making it an excellent tool for building applications that require a conversational interface.
Usage of JSON Mode
To get started using the JSON mode of the Meta-Llama-3-8B-Instruct, you’ll first need to set up your environment. Below is a concise guide to doing just that.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_id = "hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Ask the model to answer in JSON with a "message" key
messages = [
    {"role": "system", "content": "You are a helpful assistant, answer in JSON with key 'message'"},
    {"role": "user", "content": "Who are you?"}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop generation at either the EOS token or Llama 3's end-of-turn token
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
Imagine you are a chef preparing a gourmet dish. Each ingredient you add contributes to the final flavor, just as each line of code contributes to the model’s understanding. The tokenizer serves as your trusted sous chef, prepping the ingredients (or in our case, the input data) in a format the model can digest. The model is the oven that combines everything and brings your dish to life, generating meaningful output based on what you’ve set up!
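Because the system prompt pins the output to JSON with a "message" key, you can usually parse the decoded text directly. Here is a minimal sketch of that last step; it assumes the model actually produced valid JSON, which sampling does not strictly guarantee, hence the fallback:

```python
import json

decoded = tokenizer.decode(response, skip_special_tokens=True)
try:
    # Expecting something like {"message": "..."} per the system prompt
    parsed = json.loads(decoded)
    print(parsed["message"])
except (json.JSONDecodeError, KeyError):
    # Fall back to the raw text if the model drifted from strict JSON
    print(decoded)
```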
Function Calling
Function calling is an exciting feature that allows the model to perform specific actions based on user commands. It requires a two-step inference process: first the model decides which function to call, then it folds the function's result into its final answer. Here's how to set it up:
Step 1: Define the Function
```python
# Describe the callable tools the model is allowed to use
functions_metadata = [
    {
        "type": "function",
        "function": {
            "name": "get_temperature",
            "description": "get temperature of a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "name of the city"
                    }
                },
                "required": ["city"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": f"You are a helpful assistant with access to the following functions: {str(functions_metadata)}\n\nTo use these functions respond with:\n<functioncall> {{ \"name\": \"function_name\", \"arguments\": {{ \"arg_1\": \"value_1\", ... }} }} </functioncall>\n\nEdge cases you must handle:\n - If there are no functions that match the user request, respond politely that you cannot help."},
    {"role": "user", "content": "What is the temperature in Tokyo right now?"}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
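At this point the model has only described the call it wants to make; your application still has to extract the function name and arguments from the <functioncall> block and run the real function itself. The sketch below is one way to handle that glue step; the parse_function_call helper, the regular expression, and the hard-coded get_temperature stub are illustrative assumptions, not part of the model card:

```python
import json
import re

def get_temperature(city: str) -> str:
    # Hypothetical stand-in for a real weather API call
    return json.dumps({"temperature": "30 C"})

def parse_function_call(text: str):
    # Pull the JSON payload out of the <functioncall> ... </functioncall> tags
    match = re.search(r"<functioncall>\s*(\{.*\})\s*</functioncall>", text, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))

model_reply = tokenizer.decode(response, skip_special_tokens=True)
call = parse_function_call(model_reply)
if call is not None and call["name"] == "get_temperature":
    function_response = get_temperature(**call["arguments"])
    print(function_response)  # e.g. {"temperature": "30 C"}
```

You would then wrap that result in <function_response> ... </function_response> and append it to the conversation, which is exactly what Step 2 does.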
Step 2: Execute the Function
```python
messages = [
    {"role": "system", "content": f"You are a helpful assistant with access to the following functions: {str(functions_metadata)}\n\nTo use these functions respond with:\n<functioncall> {{ \"name\": \"function_name\", \"arguments\": {{ \"arg_1\": \"value_1\", ... }} }} </functioncall>\n\nEdge cases you must handle:\n - If there are no functions that match the user request, respond politely that you cannot help."},
    {"role": "user", "content": "What is the temperature in Tokyo right now?"},
    # The <functioncall> block extracted from the Step 1 prediction, appended as the assistant turn:
    {"role": "assistant", "content": "<functioncall> {\"name\": \"get_temperature\", \"arguments\": {\"city\": \"Tokyo\"}} </functioncall>"},
    # The result of actually running the function, passed back to the model:
    {"role": "user", "content": "<function_response> {\"temperature\": \"30 C\"} </function_response>"}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
In this two-step dance, the model first decides to ask for the temperature, your code performs the actual lookup, and the model then folds the result into a natural-language answer. This handoff between model and application is what makes the interaction feel smooth and dynamic.
Troubleshooting Ideas
If you encounter any issues while implementing the above code, here are some troubleshooting ideas:
- Ensure that you have all required libraries installed, especially `transformers` and `torch`.
- Check that your `model_id` is set correctly and that the repository is accessible.
- Verify that the `<|eot_id|>` token string is passed correctly when building `terminators`.
- If the model returns errors or malformed output, examine the structure of your messages and function definitions; a mismatch can confuse the model.
- Experiment with the `temperature` and `top_p` parameters to control output variability, as shown in the snippet below.
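For instance, switching to greedy decoding is a quick way to rule out sampling randomness while you debug your prompts; this is standard `generate` usage rather than anything specific to this model:

```python
# Greedy decoding: deterministic output makes it easier to tell whether a
# malformed JSON reply comes from the prompt/schema or from sampling noise.
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=False,
)
```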
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
With the capabilities of the Meta-Llama-3 model at your fingertips, you can create sophisticated applications effortlessly. By leveraging JSON mode and function calling, you can ensure that your interactions are not only efficient but also meaningful.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

