How to Use the 360Zhinao Model Series

May 1, 2024 | Educational

Welcome to the world of 360Zhinao, the advanced AI models designed to handle complex language tasks with ease! In this article, we’ll explore how you can quickly set up and use the various models in the 360Zhinao series.

Introduction to 360Zhinao

We have released several models under the 360Zhinao series, including:

  • 360Zhinao-7B-Base
  • 360Zhinao-7B-Chat-4K
  • 360Zhinao-7B-Chat-32K
  • 360Zhinao-7B-Chat-360K

These models are trained on a high-quality corpus of approximately 3.4 trillion tokens and offer strong chat capabilities with context windows of 4K, 32K, and 360K tokens, making them highly competitive.

Quickstart Guide

To get started with 360Zhinao, follow these steps:

Dependency Installation

First, make sure the required dependencies are installed:

python >= 3.8
pytorch >= 2.0
transformers >= 4.37.2
CUDA >= 11.4

To install the dependencies, run:

pip install -r requirements.txt
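Before loading anything heavy, you can sanity-check the versions above programmatically. A minimal sketch (the helper names here are illustrative; the comparison uses plain tuples so it works without extra packages):

```python
import sys

def parse_version(v):
    """Turn a version string like '4.37.2' into a comparable tuple (4, 37, 2)."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(installed, minimum):
    """True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)

# Check the Python interpreter itself (minimum 3.8 per the list above).
print("python >= 3.8:", sys.version_info[:2] >= (3, 8))

# Check a package version string, e.g. transformers >= 4.37.2.
print(meets_minimum("4.37.2", "4.37.2"))  # True: exactly the minimum
print(meets_minimum("4.30.0", "4.37.2"))  # False: too old
```

In practice you would feed `meets_minimum` the string from `transformers.__version__` or `torch.__version__`.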

Optionally, for better performance, you can also install Flash-Attention:

FLASH_ATTENTION_FORCE_BUILD=TRUE pip install flash-attn==2.3.6

Diving into the Models

Now, let’s see how to use the models for inference:

Example of Base Model Inference

from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

MODEL_NAME_OR_PATH = "qihoo360/360Zhinao-7B-Base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME_OR_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME_OR_PATH, device_map="auto", trust_remote_code=True)
generation_config = GenerationConfig.from_pretrained(MODEL_NAME_OR_PATH, trust_remote_code=True)

# Replace the placeholder below with your own prompt.
inputs = tokenizer("Hello, 360Zhinao!", return_tensors="pt")
inputs = inputs.to(model.device)
pred = model.generate(input_ids=inputs["input_ids"], generation_config=generation_config)

print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))

Example of Chat Model Inference

The chat checkpoints (for example, 360Zhinao-7B-Chat-4K) expose a chat method through trust_remote_code. Load the tokenizer, model, and generation config exactly as above, but point MODEL_NAME_OR_PATH at a chat checkpoint, then:

messages = []
messages.append({"role": "user", "content": "Hello!"})
response = model.chat(tokenizer=tokenizer, messages=messages, generation_config=generation_config)
messages.append({"role": "assistant", "content": response})

print(messages)

Understanding the Code: An Analogy

Think of using the 360Zhinao models like making yourself a fancy sandwich. The ingredients (dependencies) need to be ready first. You gather your fresh bread (Python and PyTorch), your meats and veggies (the required libraries), and then you start layering them (loading the model and tokenizer). Just like making a sandwich requires the right tools (like a knife for cutting), using these models effectively requires the right configurations and code. Finally, you can take a satisfying bite (run the inference), and enjoy the delicious outcome of your creation!

Troubleshooting Tips

  • If you encounter issues with model loading, ensure all dependencies are correctly installed.
  • Check the memory requirements of your GPU, as large models may require significant resources to run.
  • For specific error messages, consider revisiting the documentation or checking forums for similar issues.
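As a rough rule of thumb for the GPU-memory point above, a model's weights alone occupy about parameter-count times bytes-per-parameter (2 bytes in fp16/bf16), before activations and the KV cache. A back-of-the-envelope helper (the 1.2 overhead factor is an illustrative assumption, not a measured figure):

```python
def estimate_weight_memory_gb(num_params, bytes_per_param=2, overhead=1.2):
    """Rough GPU memory estimate for model weights, in GiB.

    num_params: total parameter count (e.g. 7e9 for a 7B model)
    bytes_per_param: 2 for fp16/bf16, 4 for fp32
    overhead: illustrative fudge factor for activations and KV cache
    """
    return num_params * bytes_per_param * overhead / 1024**3

# A 7B model in fp16: about 13 GiB of raw weights,
# roughly 15.6 GiB with the assumed overhead factor.
print(round(estimate_weight_memory_gb(7e9), 1))
```

If the estimate exceeds your GPU's memory, consider loading with `device_map="auto"` across multiple devices or a smaller precision.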


Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
