The world of AI is ever-evolving, and with it comes the need for efficient models that can drive functionality in applications. One such advancement is the Phi-3 model fine-tuned for function calling with MLX-LM. In this article, we’ll explore how to run this fine-tuned model using the Hugging Face Transformers library.
Getting Started
Before diving into the code, ensure you have the following prerequisites:
- Python installed on your machine.
- PyTorch and the Transformers library from Hugging Face.
- The llm-quantizer repository for model quantization.
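If you are starting from a clean environment, the Python pieces can typically be installed with pip. Note that the accelerate package is an assumption here, included because the loading step below uses device_map='auto':

pip install torch transformers accelerate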
Step-by-Step Guide
Now, let’s break down the steps required to run the fine-tuned Phi-3 model:
1. Import Necessary Libraries
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
This is akin to gathering your tools before starting a DIY project. Here, you’re importing the necessary components for your AI model.
2. Load the Model and Tokenizer
model_id = "mzbac/Phi-3-mini-4k-instruct-function-calling"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map='auto',
)
Think of this as setting up the foundation of a house. The model_id is your blueprint, and the tokenizer & model are the raw materials you are using to construct something functional.
3. Define the Tool for Function Calling
tool = {
    "name": "search_web",
    "description": "Perform a web search for given search terms.",
    "parameter": {
        "type": "object",
        "properties": {
            "search_terms": {
                "type": "array",
                "items": {"type": "string"},
                "description": "The search queries for which the search is performed.",
                "required": True,
            },
        },
    },
}
Here, you’re defining a ‘tool’ which is like creating a specialized feature in your house designed for a particular utility, in this case, web searching capabilities.
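The same schema pattern extends to any capability you want to expose to the model. As an illustration only (this hypothetical get_weather tool is not part of the model card or its training data), a second tool could look like this:

get_weather_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a given city.",
    "parameter": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city for which the weather is requested.",
                "required": True,
            },
        },
    },
}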
4. Prepare the Messages for Input
messages = [
    {"role": "user", "content": f"You are a helpful assistant with access to the following functions. Use them if required - {str(tool)}"},
    {"role": "user", "content": "Any news in Melbourne today, May 7, 2024?"}
]
Think of messages as dialogues in a play. Each message has a role that contributes to the overall interaction with your model.
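If you want to check exactly what the model will receive, apply_chat_template can also return the rendered prompt as a plain string rather than token IDs; this quick sanity check is a useful debugging habit:

# Render the chat template to text instead of token IDs
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))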
5. Input Processing and Output Generation
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors='pt').to(model.device)

terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids('<|end|>')]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.1,
)
response = outputs[0]
print(tokenizer.decode(response))
This step is like navigating through a maze: you process your inputs and let the model find its way to a response. The generate call is the key that unlocks the exit.
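In practice you will also want to extract the function call from the generated text so your application can act on it. The exact output format depends on the fine-tuning data, so treat the following as a sketch under the assumption that the model emits a JSON object (with the function name and arguments) somewhere in its reply:

import json
import re

# Decode only the newly generated tokens, not the prompt
generated = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Assumption: the reply contains a JSON object describing the call
match = re.search(r"\{.*\}", generated, re.DOTALL)
if match:
    try:
        call = json.loads(match.group(0))
        print("Function to call:", call)
    except json.JSONDecodeError:
        print("Braces found but not valid JSON:", match.group(0))
else:
    print("No function call detected:", generated)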
Troubleshooting Common Issues
While working on fine-tuning models, you may encounter issues. Here are some common troubleshooting tips:
- Error in Loading Model: Ensure that the correct model_id is provided. Check for typos or incorrect paths.
- Insufficient Memory: If you run into memory issues, consider reducing the max_seq_length or batch size, or loading a quantized version of the model (see the sketch after this list).
- Unexpected Output: If the model outputs irrelevant information, refine your input messages to be more specific.
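For the memory case, one option is loading the model in 4-bit. The sketch below is not part of the original walkthrough and assumes the bitsandbytes package is installed and a CUDA GPU is available:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mzbac/Phi-3-mini-4k-instruct-function-calling",
    quantization_config=bnb_config,
    device_map="auto",
)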
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Success in fine-tuning the Phi-3 model opens up new vistas in the AI landscape, making it a valuable addition to applications requiring sophisticated function calling. Remember, at fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

