In this article, we’ll explore how to use a Phi-3 mini instruct model that has been fine-tuned for function calling with the MLX-LM library, running it through Hugging Face Transformers. By following this guide, not only will you learn the steps involved, but you’ll also gain insights into troubleshooting common issues that may arise. This is a fantastic opportunity for developers looking to harness AI capabilities in their projects!
What You Will Need
- Python installed on your machine
- Access to the Hugging Face Hub (to download the model and tokenizer)
- A basic understanding of machine learning concepts
Step-by-Step Instructions
Ready to dive in? Let’s break down the process into manageable steps!
1. Set Up Your Environment
First, ensure that you have the necessary libraries installed. You need the `transformers` and `torch` packages, plus `accelerate`, which Transformers uses for the `device_map="auto"` option later in this guide. If you haven’t already, you can install them using:
pip install transformers torch accelerate
2. Import Libraries
Next, import the necessary classes from the Hugging Face Transformers library, along with PyTorch.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
3. Load the Model
Here’s where it gets exciting! Load the Phi-3 model and tokenizer:
model_id = "mzbac/Phi-3-mini-4k-instruct-function-calling"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
4. Define Your Tool
For this walkthrough, define a function tool that enables web searching. The tool is described with a JSON-style schema that the model reads from the prompt:
tool = {
    "name": "search_web",
    "description": "Perform a web search for a given search terms.",
    "parameters": {
        "type": "object",
        "properties": {
            "search_terms": {
                "type": "array",
                "items": {"type": "string"},
                "description": "The search queries for which the search is performed.",
                "required": True
            }
        }
    }
}
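Note that this schema only describes the tool to the model; your application still needs a real Python function to execute when the model asks for it. The helper below is a hypothetical sketch — the name matches the schema, but the body is a placeholder you would replace with a call to an actual search API:

def search_web(search_terms):
    # Placeholder implementation: swap in your preferred search API here.
    # It takes the list of query strings the model provides and returns
    # text that your application (or the model) can consume.
    return [f"Top results for: {query}" for query in search_terms]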
5. Prepare the Input Messages
Here, you will format the messages that the model processes. The first message injects the tool definition into the prompt; the second carries the actual user question:
messages = [
    {"role": "user", "content": "You are a helpful assistant with access to the following functions. Use them if required - " + str(tool)},
    {"role": "user", "content": "Any news in Melbourne today, May 7, 2024?"}
]
6. Tokenize and Generate the Output
Prepare the input for the model and generate the response:
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
terminators = [tokenizer.eos_token_id]
outputs = model.generate(input_ids, max_new_tokens=256, eos_token_id=terminators, do_sample=True, temperature=0.1)
response = outputs[0]
print(tokenizer.decode(response))
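When the model decides the tool is needed, the generated text should contain a function call. The exact wrapper format depends on this particular fine-tune, so treat the snippet below as a sketch: it assumes the call appears as a JSON object with "name" and "arguments" fields in the generated text, and it reuses the hypothetical search_web helper defined earlier.

import json
import re

# Decode only the newly generated tokens so the prompt (which also contains
# braces from the tool schema) does not confuse the JSON extraction below.
generated = outputs[0][input_ids.shape[-1]:]
decoded = tokenizer.decode(generated, skip_special_tokens=True)

# Assumption: this fine-tune emits the call as a JSON object along the lines of
# {"name": "search_web", "arguments": {"search_terms": ["..."]}}.
match = re.search(r"\{.*\}", decoded, re.DOTALL)
if match:
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        call = None
    if call and call.get("name") == "search_web":
        results = search_web(call["arguments"]["search_terms"])
        print(results)
else:
    print(decoded)  # The model answered directly instead of calling the tool.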
Understanding the Code with an Analogy
Imagine a chef (your model) who needs to prepare a unique dish (the output). To begin, the chef gathers all the necessary ingredients (the data). Each ingredient is carefully measured and prepared (the input messages are tokenized) before cooking starts. The cooking process itself (model generation) involves following the recipe (the parameters you set) to create a delicious meal (the final output). Once the dish is ready, the chef serves it to the guests (you, the users) for their satisfaction (the response from the model).
Troubleshooting
If you encounter any issues during implementation, here are a few troubleshooting tips:
- Model Not Loading: Check if the model ID is correct and you have an active internet connection to access Hugging Face.
- Output Errors: Ensure your input messages are formatted correctly and that the combined prompt (including the tool definition) fits within the model’s 4k-token context window.
- Memory Issues: If you run out of GPU or system memory, the fix is on the inference side rather than in training settings — try lowering max_new_tokens, freeing other GPU workloads, or loading the model in a lower-precision quantized form, as sketched below.
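For the memory case specifically, one option is to load the weights in 4-bit precision instead of bfloat16. This is a sketch rather than part of the original walkthrough; it assumes an NVIDIA GPU and that the bitsandbytes package is installed (pip install bitsandbytes):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical low-memory variant of step 3: quantize the weights to 4-bit on load.
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)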
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Congratulations! You’ve successfully learned to run a Phi-3 instruct model fine-tuned for function calling. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

