Welcome to the world of 360Zhinao, a series of advanced AI models designed to handle complex language tasks with ease! In this article, we'll explore how to quickly set up and use the various models in the 360Zhinao series.
Introduction to 360Zhinao
We have released several models under the 360Zhinao series, including:
- 360Zhinao-7B-Base
- 360Zhinao-7B-Chat-4K
- 360Zhinao-7B-Chat-32K
- 360Zhinao-7B-Chat-360K
These models are trained on a high-quality corpus of approximately 3.4 trillion tokens, and the chat variants support context lengths of 4K, 32K, and 360K tokens, making them highly competitive.
Quickstart Guide
To get started with 360Zhinao, follow these steps:
Dependency Installation
First, ensure that you have the required dependencies:
- python >= 3.8
- pytorch >= 2.0
- transformers >= 4.37.2
- CUDA >= 11.4
To install the dependencies, run:
pip install -r requirements.txt
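As a quick sanity check, you can compare installed versions against the minimums above. The helper below is an illustrative sketch (`parse_version` and the `MINIMUMS` table are ours, not part of the 360Zhinao codebase):

```python
# Illustrative sketch: compare installed package versions against the
# minimums listed above. Not part of the 360Zhinao codebase.

def parse_version(version: str) -> tuple:
    """Turn '4.37.2' into (4, 37, 2); non-numeric parts (e.g. '6+cu118') are dropped."""
    return tuple(int(part) for part in version.split(".")[:3] if part.isdigit())

MINIMUMS = {"python": "3.8", "torch": "2.0", "transformers": "4.37.2"}

def meets_minimum(installed: str, required: str) -> bool:
    """True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)
```

In practice you would feed it `sys.version.split()[0]`, `torch.__version__`, and `transformers.__version__`.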
Optionally, for better performance, you can also install Flash-Attention:
FLASH_ATTENTION_FORCE_BUILD=TRUE pip install flash-attn==2.3.6
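Because Flash-Attention is optional, it can be useful to detect at runtime whether it installed successfully and fall back gracefully if not. A minimal sketch:

```python
def flash_attn_available() -> bool:
    """Return True if the optional flash_attn package can be imported."""
    try:
        import flash_attn  # noqa: F401
        return True
    except ImportError:
        return False

print("flash-attn available:", flash_attn_available())
```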
Diving into the Models
Now, let’s see how to use the models for inference:
Example of Base Model Inference
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
MODEL_NAME_OR_PATH = "qihoo360/360Zhinao-7B-Base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME_OR_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME_OR_PATH, device_map="auto", trust_remote_code=True)
generation_config = GenerationConfig.from_pretrained(MODEL_NAME_OR_PATH, trust_remote_code=True)
# Any text prompt works here; the base model simply continues it.
inputs = tokenizer("Hello, 360Zhinao!", return_tensors="pt")
inputs = inputs.to(model.device)
pred = model.generate(input_ids=inputs["input_ids"], generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
Example of Chat Model Inference
# Load a chat checkpoint the same way as the base model above, e.g.
# MODEL_NAME_OR_PATH = "qihoo360/360Zhinao-7B-Chat-4K"
messages = []
messages.append({"role": "user", "content": "Hello!"})
response = model.chat(tokenizer=tokenizer, messages=messages, generation_config=generation_config)
messages.append({"role": "assistant", "content": response})
print(messages)
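Since each chat variant has a fixed context window (4K, 32K, or 360K tokens), long conversations eventually need trimming before being passed back to the model. The sketch below drops the oldest turns first; `count_tokens` is a crude whitespace stand-in, not the real 360Zhinao tokenizer (in practice, use `len(tokenizer(text)["input_ids"])`):

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for the real tokenizer: one token per whitespace-separated word.
    return len(text.split())

def trim_history(messages: list, max_tokens: int) -> list:
    """Drop the oldest messages until the history fits within max_tokens."""
    kept = list(messages)
    while kept and sum(count_tokens(m["content"]) for m in kept) > max_tokens:
        kept.pop(0)
    return kept
```

Dropping whole turns from the front keeps the most recent context intact, which is usually what a chat model needs most.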
Understanding the Code: An Analogy
Think of using the 360Zhinao models like making yourself a fancy sandwich. The ingredients (dependencies) need to be ready first. You gather your fresh bread (Python and PyTorch), your meats and veggies (the required libraries), and then you start layering them (loading the model and tokenizer). Just like making a sandwich requires the right tools (like a knife for cutting), using these models effectively requires the right configurations and code. Finally, you can take a satisfying bite (run the inference), and enjoy the delicious outcome of your creation!
Troubleshooting Tips
- If you encounter issues with model loading, ensure all dependencies are correctly installed.
- Check your GPU's memory: 7B-parameter models require significant VRAM, especially at the longer context lengths.
- For specific error messages, consider revisiting the documentation or checking forums for similar issues.
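To put the GPU-memory tip in numbers, here is a back-of-the-envelope estimate of the weights-only footprint (a sketch; real usage adds activations and the KV cache on top):

```python
def estimate_weights_gib(params_billion: float, bytes_per_param: int) -> float:
    """Approximate weights-only memory in GiB for a model of the given size."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

# A 7B model in fp16/bf16 (2 bytes per parameter) needs roughly 13 GiB
# for the weights alone; fp32 doubles that.
print(round(estimate_weights_gib(7, 2), 1))
```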
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.