How to Get Started with SUS-Chat: Instruction Tuning Done Right

Jul 13, 2024 | Educational

Welcome to your guide to SUS-Chat, a bilingual dialogue model from the Southern University of Science and Technology (SUSTech) and IDEA-CCNL. This article walks through setup, basic usage, and troubleshooting for the issues you are most likely to encounter.

What is SUS-Chat?

SUS-Chat-34B is a 34-billion-parameter bilingual dialogue model built to follow complex instructions in both English and Chinese. It was instruction fine-tuned on multilingual datasets to improve how it responds to user instructions, and it supports multi-turn dialogue and natural conversation.

Getting Started with SUS-Chat

Prerequisites

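  • Python 3.10 or newer (the example code uses the match statement, which was introduced in Python 3.10)
  • A CUDA-capable GPU with enough memory for a 34-billion-parameter model (the example code sends its input tensors to CUDA)
  • The Transformers and PyTorch libraries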

Installation

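To install the necessary libraries, run the following command in your terminal. The accelerate package lets Transformers place a large model across the available devices, and the slow tokenizer selected with use_fast=False typically relies on sentencepiece:

pip install torch transformers accelerate sentencepiece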
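Before downloading a model of this size, it can save time to confirm that the environment is ready. The following is a minimal sketch, not part of SUS-Chat itself: the function name check_environment and the package list are assumptions made for illustration, and the Python 3.10 floor comes from the match statement used in the example code in this guide.

```python
import importlib.util
import sys

# Packages the walkthrough imports directly. This tuple is an assumption
# for illustration, not an official requirements file.
REQUIRED_PACKAGES = ("torch", "transformers")


def check_environment(packages=REQUIRED_PACKAGES, min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")
    for name in packages:
        # find_spec returns None when a top-level package cannot be imported
        if importlib.util.find_spec(name) is None:
            problems.append(f"package not installed: {name}")
    return problems


for problem in check_environment():
    print(problem)
```

If nothing is printed, the interpreter version and the listed packages are in place. This check does not verify GPU availability or memory.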

Setting Up SUS-Chat

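Once the prerequisites are in place, you can begin using SUS-Chat. The following script loads the model, formats a one-message conversation into a prompt, generates a reply, and appends that reply to the conversation history: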

from transformers import AutoModelForCausalLM, AutoTokenizer

def chat_template(messages):
    history = ""
    for message in messages:
        match message:
            case {"role": "user", "content": user_message}:
                history += f"### Human: {user_message}\n\n### Assistant: "
            case {"role": "assistant", "content": assistant_message}:
                history += assistant_message
    return history

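model_path = "SUSTech/SUS-Chat-34B"
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
# device_map="auto" (requires the accelerate package) places the weights on the
# available GPU(s); torch_dtype="auto" keeps the precision stored in the checkpoint
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype="auto"
).eval()

messages = [{"role": "user", "content": "hi"}]
# Move the prompt to the model's device so both sit on the same hardware
input_ids = tokenizer.encode(chat_template(messages), return_tensors="pt").to(model.device)
# max_length counts the prompt tokens plus the generated tokens
output_ids = model.generate(input_ids, max_length=256)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=False)
# Store the reply so the next call to chat_template includes it
messages.append({"role": "assistant", "content": response})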

Understanding the Code

Think of the conversation process with SUS-Chat as if it were a well-coordinated two-person play. Each time a user interacts with the model, a new scene is created.

  • The chat_template function is like the stage director, overseeing the flow of dialogue between the user and the assistant. It manages how each message is formatted for the next scene.
  • The model generates a response based on the conversation history, much like an actor performing their lines while considering what has happened in prior scenes.
  • Every interaction updates the script of the play, reflecting the latest developments in the conversation.
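To see the "script" the model actually receives, the sketch below re-implements the formatting logic of chat_template with plain if/elif, so it also runs on Python versions older than 3.10, and prints the prompt for a three-message conversation. The name render_prompt and the sample messages are hypothetical; the "### Human:" and "### Assistant:" markers are taken from the code above.

```python
def render_prompt(messages):
    """Format a message list the same way chat_template does.

    Entries whose role is neither "user" nor "assistant" are skipped,
    which mirrors the match statement having no default case.
    """
    history = ""
    for message in messages:
        role = message.get("role")
        if role == "user" and "content" in message:
            # Each user turn ends with an open assistant marker
            history += f"### Human: {message['content']}\n\n### Assistant: "
        elif role == "assistant" and "content" in message:
            # Assistant replies are appended verbatim, with no extra separator
            history += message["content"]
    return history


conversation = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "Translate 'cat' into Chinese."},
]
print(repr(render_prompt(conversation)))
```

The prompt always ends with an open "### Assistant: " marker, which is where the model starts writing. Nothing is inserted between an assistant reply and the next "### Human:" marker, so whatever the stored reply contains, including any end-of-sequence token left in by skip_special_tokens=False, is what separates the turns.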

Troubleshooting Common Issues

If you encounter any issues while using SUS-Chat, consider these troubleshooting steps:

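  • Installation errors: Ensure all libraries are correctly installed using the pip command mentioned earlier. A SyntaxError at the match statement in chat_template means the interpreter is older than Python 3.10.
  • Model loading issues: Confirm that model_path is exactly "SUSTech/SUS-Chat-34B" and that the machine can reach the Hugging Face Hub, or that the path points to a complete local copy of the weights.
  • Memory errors: A 34-billion-parameter model needs roughly 68 GB for its weights alone at 16-bit precision (2 bytes per parameter), and about twice that at 32-bit, before counting activations. Ensure your GPU or GPUs have enough combined memory.
  • Incorrect responses: Check that every entry in the messages list is a dict with "role" set to "user" or "assistant" and a string under "content". chat_template silently skips anything else, such as a "system" message.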
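Two of the checks above are easy to automate. The sketch below is an illustration under stated assumptions: both function names are hypothetical, the estimate covers model weights only (not activations or the generation cache), and 34 billion is the nominal parameter count from the model's name rather than an exact figure.

```python
def estimate_weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def find_invalid_messages(messages):
    """Return the indexes of entries that chat_template would skip or choke on."""
    invalid = []
    for index, message in enumerate(messages):
        well_formed = (
            isinstance(message, dict)
            and message.get("role") in ("user", "assistant")
            and isinstance(message.get("content"), str)
        )
        if not well_formed:
            invalid.append(index)
    return invalid


NOMINAL_PARAMS = 34_000_000_000  # assumption: nominal size implied by "34B"
print(estimate_weight_memory_gb(NOMINAL_PARAMS, 2))  # 16-bit weights
print(estimate_weight_memory_gb(NOMINAL_PARAMS, 4))  # 32-bit weights

sample = [
    {"role": "system", "content": "Be brief."},  # role not handled by chat_template
    {"role": "user", "content": "hi"},
    {"role": "assistant"},  # missing content
]
print(find_invalid_messages(sample))
```

Running find_invalid_messages before each generation call makes a silently dropped message visible, which is otherwise hard to spot because chat_template raises no error for it.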

For additional support, you can reach out to the community or explore the source code on GitHub.


