How to Use the Phi-3 Portuguese Model for Text Generation

Jun 5, 2024 | Educational

The Phi-3 Portuguese Model, published as phi-3-portuguese-tom-cat-4k-instruct, is a powerful tool for generating text in Portuguese. This guide walks you through the steps needed to get started with the model, helping bridge the gap in Portuguese-language models.

Understanding the Model

Imagine you’re an artist trying to create a masterpiece, but you only have a set of crayons with limited colors. What if you had a full palette at your disposal? The Phi-3 model is that expansive palette, fine-tuned on 300,000 Portuguese instructions, allowing you to create vibrant and nuanced text.

Installation and Setup

Before diving into creating amazing content, you’ll need to install the necessary libraries:

  • Transformers
  • Accelerate
  • BitsAndBytes

Run the following commands in your Python environment:

!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes

Loading the Model

Now that we have everything installed, it’s time to load the model. You will initialize the model and the tokenizer with the following code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "rhaymison/phi-3-portuguese-tom-cat-4k-instruct",
    device_map={"": 0})  # place the whole model on GPU 0
tokenizer = AutoTokenizer.from_pretrained("rhaymison/phi-3-portuguese-tom-cat-4k-instruct")
model.eval()

Generating Text

Time to unleash your creativity! To generate text, we will create a pipeline:

from transformers import pipeline

pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                do_sample=True,
                max_new_tokens=512,
                num_beams=2,
                temperature=0.3,
                top_k=50,
                top_p=0.95,
                early_stopping=True,
                pad_token_id=tokenizer.eos_token_id)
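To build intuition for the sampling knobs above, here is a toy, self-contained sketch (not part of transformers, and operating on probabilities rather than logits) of how top-k and top-p narrow the set of candidate tokens before sampling:

```python
# Toy next-token distribution, for illustration only.
probs = {"sim": 0.50, "não": 0.30, "talvez": 0.15, "carro": 0.05}

def top_k_filter(probs, k):
    # Keep only the k most probable tokens.
    return dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])

def top_p_filter(probs, p):
    # Keep the smallest set of top tokens whose cumulative probability
    # reaches at least p (nucleus sampling).
    kept, total = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = pr
        total += pr
        if total >= p:
            break
    return kept

print(top_k_filter(probs, 2))      # {'sim': 0.5, 'não': 0.3}
print(top_p_filter(probs, 0.95))   # {'sim': 0.5, 'não': 0.3, 'talvez': 0.15}
```

With top_k=50 and top_p=0.95 as in the pipeline above, the model samples only from the most likely tokens, while the low temperature (0.3) further sharpens the distribution toward conservative, on-topic completions.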

Creating Your Prompts

Now, craft your prompt with precision, much like a chef seasoning a dish. A well-formed prompt will guide the model to provide the best responses. Here’s how you can format a question:

def format_template(question: str) -> str:
    # System prompt (in Portuguese): "Below is an instruction that describes
    # a task, along with an input that provides more context. Write a
    # response that appropriately completes the request."
    system_prompt = "Abaixo está uma instrução que descreve uma tarefa, juntamente com uma entrada que fornece mais contexto. Escreva uma resposta que complete adequadamente o pedido."
    return f"{system_prompt} {question}"

# "Is it possible to drive from the United States to Japan?"
question = format_template("É possível ir de carro dos Estados Unidos até o Japão?")
pipe(question)
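The text-generation pipeline returns a list of dictionaries, one per generated sequence, each with a "generated_text" field that contains the prompt followed by the model's completion. A minimal sketch of pulling out just the completion (the pipeline output below is a placeholder, not real model output):

```python
def extract_completion(result, prompt):
    # "generated_text" echoes the prompt; strip it to get only the answer.
    text = result[0]["generated_text"]
    return text[len(prompt):].strip() if text.startswith(prompt) else text

# Placeholder pipeline output, for illustration only:
fake_result = [{"generated_text": "Qual a capital do Brasil? A capital do Brasil é Brasília."}]
print(extract_completion(fake_result, "Qual a capital do Brasil?"))
# → A capital do Brasil é Brasília.
```

In your own code, pass the same `question` string you gave the pipeline so the echoed prompt is removed cleanly.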

Troubleshooting

If you run into a CUDA out-of-memory error or similar resource problems, consider a 4-bit or 8-bit quantization configuration to reduce the model's memory footprint. Adjust your code like this:

from transformers import BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True
)

model = AutoModelForCausalLM.from_pretrained(
    "rhaymison/phi-3-portuguese-tom-cat-4k-instruct",
    quantization_config=bnb_config,
    device_map={"": 0})
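If 4-bit quantization degrades output quality more than you'd like, an 8-bit configuration is a lighter-touch alternative: it roughly halves memory relative to fp16 while staying closer to full-precision quality. A sketch using the same BitsAndBytesConfig API:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit alternative: less memory saving than 4-bit, but smaller quality loss.
bnb_config_8bit = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "rhaymison/phi-3-portuguese-tom-cat-4k-instruct",
    quantization_config=bnb_config_8bit,
    device_map={"": 0})
```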

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

With the Phi-3 Portuguese model installed and ready to go, you can now explore the world of text generation in Portuguese like never before. Whether you are working on educational materials or creative stories, this model will serve as a reliable companion in your projects.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
