The Swallow model, derived from the Llama 2 family, is tuned for text generation, with a particular focus on Japanese. This guide walks through loading and running both the instruction-tuned and the base model with the Transformers library, step by step.
Getting Started
First, make sure the required libraries are installed. The code in this guide loads Swallow through the Hugging Face Transformers library.
- Install the required dependencies:
sh
pip install -r requirements.txt
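Before loading a model, it can help to confirm that the packages used by the code in this guide are importable. The following is a minimal sketch: the package list is an assumption (the contents of requirements.txt are not shown here), and `missing_packages` is a hypothetical helper name. `accelerate` is included because `device_map="auto"` generally depends on it.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list; adjust it to match your requirements.txt.
REQUIRED = ["torch", "transformers", "accelerate"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, rerun the install command above or install the named package directly.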
Using the Swallow Instruct Model
To run the Swallow Instruct model, use the following code. It loads the model and tokenizer, builds a prompt from the instruction template, and samples a response:
python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "tokyotech-llm/Swallow-7b-instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True, device_map="auto")
PROMPT_DICT = {
    "prompt_input": (
        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。\n"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n"
        "### 入力:\n{input}\n\n"
        "### 応答:\n"
    ),
    "prompt_no_input": (
        "以下に、あるタスクを説明する指示があります。\n"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n"
        "### 応答:\n"
    ),
}
def create_prompt(instruction, input=None):
    """Generates a prompt based on the given instruction and an optional input."""
    if input:
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
    else:
        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)
# Example usage
instruction_example = "以下のトピックに関する詳細な情報を提供してください。"
input_example = "東京工業大学の主なキャンパスについて教えてください。"
prompt = create_prompt(instruction_example, input_example)
input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
tokens = model.generate(input_ids.to(device=model.device), max_new_tokens=128, temperature=0.99, top_p=0.95, do_sample=True)
out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
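Because the full token sequence is decoded, the printed text contains the prompt followed by the model's answer. The sketch below shows one way to keep only the answer by splitting on the response marker from the template. `extract_response` is a hypothetical helper, and the sample string is hand-written for illustration, not real model output.

```python
RESPONSE_MARKER = "### 応答:"


def extract_response(generated_text, marker=RESPONSE_MARKER):
    """Return the text after the first occurrence of `marker`, stripped.

    If the marker is absent, the whole text is returned (stripped).
    """
    _, found, tail = generated_text.partition(marker)
    return tail.strip() if found else generated_text.strip()


# Hand-written stand-in for decoded output (prompt + answer).
sample = (
    "以下に、あるタスクを説明する指示があります。\n"
    "リクエストを適切に完了するための回答を記述してください。\n\n"
    "### 指示:\n挨拶してください。\n\n"
    "### 応答:\nこんにちは。"
)
print(extract_response(sample))
```

An alternative that avoids string matching is to decode only the newly generated tokens, for example `tokens[0][input_ids.shape[-1]:]`, since the generated sequence starts with the prompt tokens.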
Using the Base Model
If you prefer the base Swallow model, which has no instruction tuning, give it the beginning of a text and let it continue. Use the following code:
python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "tokyotech-llm/Swallow-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
prompt = "東京工業大学の主なキャンパスは、"
input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
tokens = model.generate(input_ids.to(device=model.device), max_new_tokens=128, temperature=0.99, top_p=0.95, do_sample=True)
out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
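With `max_new_tokens=128`, a continuation from the base model can stop in the middle of a sentence. As an optional post-processing step, the sketch below trims text to its last sentence terminator. `trim_to_last_sentence` and its terminator list are assumptions made for illustration, and the sample string is hand-written rather than real model output.

```python
def trim_to_last_sentence(text, terminators=("。", "！", "？", ".", "!", "?")):
    """Cut `text` after its last sentence terminator.

    Each terminator is a single character. If none occurs, `text` is
    returned unchanged.
    """
    last = max(text.rfind(t) for t in terminators)
    return text[: last + 1] if last != -1 else text


# Hand-written stand-in for a continuation that was cut off mid-sentence.
sample = "東京工業大学の主なキャンパスは、大岡山にあります。また、すずかけ台にも"
print(trim_to_last_sentence(sample))
```

This only tidies the displayed text; raising `max_new_tokens` is the more direct option when longer output is wanted.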
Understanding the Code: An Analogy
Imagine you’re a chef in a restaurant (the model) and you have a detailed recipe book (the tokenizer) that guides you through making intricate dishes (text generation). Each dish requires various ingredients (input) – some might need specific spices (context), while others simply follow basic flavors (general instructions).
When a customer (user) places an order (instruction), you look up the recipe that outlines how to create the dish based on their preferences and any additional context they’ve provided. After gathering the ingredients, you whip up a delightful meal (output) that satisfies the customer’s taste. This entire process reflects how the Swallow model operates, where the instructions guide the generation, resulting in cohesive and meaningful text outputs.
Troubleshooting
If you encounter any issues while using the Swallow model, consider the following troubleshooting ideas:
- Ensure all dependencies are properly installed as specified in requirements.txt.
- Check the model name and make sure it’s spelled correctly in your code.
- Verify that you are using an appropriate environment that supports the necessary hardware for model execution.
- Consider adjusting parameters such as temperature and top_p for better results.
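For the last point, it can be convenient to keep sampling settings in one place and catch out-of-range values before calling the model. This is a minimal sketch: `sampling_kwargs` is a hypothetical helper, the range checks are assumptions about sensible values, and the defaults mirror the values used earlier in this guide.

```python
def sampling_kwargs(temperature=0.99, top_p=0.95, max_new_tokens=128):
    """Build keyword arguments for `model.generate`, rejecting out-of-range values."""
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the interval (0, 1]")
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


# Lower values tend to make output more focused; these numbers are illustrative.
conservative = sampling_kwargs(temperature=0.7, top_p=0.9)
print(conservative)
```

The resulting dictionary can be unpacked into the call shown earlier, for example `model.generate(input_ids.to(device=model.device), **conservative)`.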
Conclusion
The Swallow model is a robust tool for text generation, particularly in Japanese and English. By following the steps in this guide, you can start building your own text generation applications.

