How to Use LongCite-glm4-9b for Long Context Question Answering

Oct 28, 2024 | Educational

In the ever-evolving world of AI and natural language processing, it’s crucial to have tools that can handle longer contexts effectively. LongCite-glm4-9b stands out as a robust model trained to generate fine-grained citations in long-context question-answering scenarios, supporting context windows of up to 128K tokens. This article will guide you through deploying and using the model.

What You Need

Before diving into the code, ensure you have the following:

  • Python installed on your machine.
  • PyTorch and the Transformers library (version 4.43.0 or later).
  • The LongCite-glm4-9b model, available on Hugging Face as THUDM/LongCite-glm4-9b.

Installation and Setup

To get started, you need to set up your environment to accommodate LongCite-glm4-9b. Here’s a straightforward way to do it:

pip install transformers==4.43.0 torch
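
To confirm that the installation succeeded, you can check the installed versions before moving on:

import torch
import transformers

print("Transformers:", transformers.__version__)  # should report 4.43.0 or later
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())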

Loading the Model

The next step is to import the necessary libraries and load the model. To help you visualize this process, think of loading the model as setting up a giant library with all the books in the world at your fingertips. Now, let’s get to the code:

import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('THUDM/LongCite-glm4-9b', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('THUDM/LongCite-glm4-9b', torch_dtype=torch.bfloat16, trust_remote_code=True, device_map='auto')

In this snippet, you’re like a librarian collecting reference cards (tokenizer) and loading the books (model) into your library. This allows you to easily retrieve information later.
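
As an optional sanity check, you can confirm how large the model is and where its weights landed. This uses only standard PyTorch and Transformers attributes:

# Optional sanity check: count parameters and report the primary device.
num_params = sum(p.numel() for p in model.parameters())
print(f"Loaded {num_params / 1e9:.1f}B parameters on {model.device}")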

Working with Context and Queries

After loading your model, it’s time to feed it some context and a question. For instance, if you want to inquire about Robert Geddes’s profession, your setup looks like this:

context = "W. Russell Todd, 94, United States Army general (b. 1928). ... (complete context here)" 
query = "What was Robert Geddes's profession?"

Imagine feeding your librarian a stack of books (the context) along with a specific question about one of the authors. The librarian uses the information in the stack to provide an answer.
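
In a real workflow, the context usually comes from a document on disk rather than an inline string. Here is a minimal sketch; the file path is a placeholder for illustration, not part of the model’s API:

# Load a long document to use as the context (the path here is illustrative).
with open('my_long_document.txt', 'r', encoding='utf-8') as f:
    context = f.read()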

Generating the Answer

Now it’s time to generate an answer with citations:

result = model.query_longcite(context, query, tokenizer=tokenizer, max_input_length=128000, max_new_tokens=1024)

print("Answer:\n", result['answer'])
print("Statements with citations:\n", json.dumps(result['statements_with_citations'], indent=2, ensure_ascii=False))
print("Context (split into sentences):\n", result['splited_context'])

In this step, your librarian not only provides the answer but also shows how they came to that conclusion, giving you a well-documented citation!
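
If you want to process the citations programmatically rather than dump raw JSON, you can walk the returned structure. The key names below follow the sample output in the model card; treat them as assumptions and adjust if your version of the remote code differs:

# Walk the structured output: each entry pairs a statement with its citations.
# Key names ('statement', 'citation', 'cite') are assumed from the model card.
for i, item in enumerate(result['statements_with_citations'], start=1):
    print(f"Statement {i}: {item.get('statement', '')}")
    for cite in item.get('citation') or []:
        print("  evidence:", cite.get('cite', ''))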

Troubleshooting Common Issues

If you encounter issues, here are some common troubleshooting tips:

  • Model Loading Errors: Ensure the model name is spelled exactly as it appears on Hugging Face (THUDM/LongCite-glm4-9b) and that your environment can download the weights.
  • Memory Errors: The 128K-token context window can demand substantial GPU memory; make sure your system has enough, or try quantized loading as sketched after this list.
  • Dependency Issues: If a library is missing, re-check your installation and the Transformers version requirement.
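
If GPU memory is the bottleneck, one option worth trying is quantized loading. This is a hedged sketch, not an officially documented configuration for this model: 4-bit quantization is a standard Transformers feature (it requires pip install bitsandbytes), but whether it interacts cleanly with this model’s trust_remote_code implementation is an assumption you should verify:

# Sketch: 4-bit quantized loading to reduce GPU memory usage.
# Assumption: bitsandbytes quantization works with this model's custom code.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    'THUDM/LongCite-glm4-9b',
    quantization_config=quant_config,
    trust_remote_code=True,
    device_map='auto',
)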

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

With LongCite-glm4-9b, you can now tackle long-context question answering with ease. This model not only provides a comprehensive answer but also supports robust citation capabilities for better referencing. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
