Enhancing conversational abilities in AI is a fascinating endeavor, and fine-tuning models like DialoGPT can bring us closer to engaging open-domain conversations. In this guide, we will walk through the process of fine-tuning the DialoGPT-small model using a subset of Amazon's Topical Chat dataset, which spans a variety of engaging topics.
Understanding the Dataset and Model
Amazon's Topical Chat dataset is designed to help build socialbots that can engage in meaningful conversations across eight broad topics:
- Fashion
- Politics
- Books
- Sports
- General Entertainment
- Music
- Science and Technology
- Movies
Just as an amateur artist learns to paint by studying the masters, our model learns conversational finesse from this dataset, which provides roughly 50,000 dialogue observations. The goal is to equip the model to participate in deep, engaging conversations.
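Before fine-tuning, each multi-turn conversation has to be flattened into a single training string. A common convention for DialoGPT (one reasonable choice, not the only one) is to join the turns with the model's end-of-text token, so the model learns where one speaker stops and the next begins. Here is a minimal sketch using a hypothetical two-turn exchange; the `<|endoftext|>` literal is GPT-2's (and therefore DialoGPT's) EOS token:

```python
EOS = "<|endoftext|>"  # GPT-2 / DialoGPT end-of-text token

# A hypothetical two-turn exchange, as it might appear in the dataset
turns = [
    "Do you like Kanye?",
    "He is okay, I like some of his music",
]

# Join the turns with EOS so the model sees explicit speaker boundaries
training_text = EOS.join(turns) + EOS
print(training_text)
```

In a real pipeline you would run `tokenizer.encode(training_text)` on each flattened conversation to produce the input IDs for fine-tuning.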
Sample Dialogue
Here’s a glimpse of what engaging conversations could look like:
User: Do you like Kanye?
Bot: He is okay, I like some of his music
User: Any other thoughts on him?
Bot: He is not a great person, but I like some of his music, so I'm not too surprised
User: What does that say about you?
Bot: I think he's trying to be funny. It's just that he's always out there and out there. Always out there. That's all I'll say.
User: You're strange.
Bot: Yeah, I guess so.
Using the Model
Now, let’s dive into how you can utilize this fine-tuned model! Below is an example code snippet to start a conversation with the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('satkinson/DialoGPT-small-marvin')
model = AutoModelForCausalLM.from_pretrained('satkinson/DialoGPT-small-marvin')

# Let's chat for 5 turns
for step in range(5):
    # Encode the new user input, append the EOS token, and return a PyTorch tensor
    new_user_input_ids = tokenizer.encode(input("User: ") + tokenizer.eos_token, return_tensors='pt')

    # Append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # Generate a response, limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

    # Print the bot's latest reply (only the tokens generated after the input)
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
```
Analogy for Code Explanation
Think of the code above as a conversation between you (the user) and a personal assistant (the bot). You start by typing a question. The tokenizer acts like a translator, converting your words into token IDs the model can understand. Just as an assistant remembers your earlier questions during a chat, chat_history_ids keeps track of every exchange. The model then generates a response based on everything discussed so far, and the dialogue continues until the loop ends after five turns.
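One practical note: the loop above uses greedy decoding, which can make DialoGPT's replies repetitive. A common tweak is to enable sampling in model.generate. The settings below are illustrative assumptions, not tuned values:

```python
# Sampling settings to pass to model.generate (values are illustrative, not tuned):
sampling_kwargs = dict(
    do_sample=True,   # sample from the distribution instead of greedy argmax
    top_k=50,         # restrict sampling to the 50 most likely next tokens
    top_p=0.95,       # nucleus sampling: smallest token set covering 95% of the mass
    temperature=0.8,  # below 1.0 sharpens the distribution, above 1.0 flattens it
)

# Usage inside the chat loop:
# chat_history_ids = model.generate(
#     bot_input_ids,
#     max_length=1000,
#     pad_token_id=tokenizer.eos_token_id,
#     **sampling_kwargs,
# )
```

With sampling enabled, the same prompt can yield different replies on each run, which usually feels more natural in open-domain chat.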
Troubleshooting Ideas
While working with this model, you may encounter some obstacles. Here are a few troubleshooting ideas to help you get started:
- Memory Errors: If your code runs out of memory, consider reducing the size of your fine-tuning dataset, lowering max_length, or truncating the chat history before generating.
- Token Limit Warnings: If you see warnings about token limits, truncate your chat history so it fits within the model's 1024-token context window.
- Installation Issues: Double-check that the necessary packages are installed: pip install transformers torch
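For the token-limit case, a simple approach is to keep only the most recent tokens of the history before each generate call. This helper is a sketch; the 512-token budget is an assumption chosen to stay well under the model's 1024-token context:

```python
import torch

MAX_HISTORY_TOKENS = 512  # illustrative budget, below DialoGPT's 1024-token context

def truncate_history(history_ids: torch.Tensor, max_tokens: int = MAX_HISTORY_TOKENS) -> torch.Tensor:
    """Keep only the most recent max_tokens token IDs of the chat history."""
    return history_ids[:, -max_tokens:]

# Example: a fake history of 600 token IDs gets trimmed to the last 512
history = torch.arange(600).unsqueeze(0)   # shape (1, 600)
trimmed = truncate_history(history)        # shape (1, 512)
```

In the chat loop, you would apply it right before generating, e.g. bot_input_ids = truncate_history(bot_input_ids). Dropping the oldest turns loses some context, but it keeps generation within the model's limits.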
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Fine-tuning DialoGPT using Amazon’s Topical Chat dataset can lead to building more engaging conversational agents. By following the steps in this guide, you will be well on your way to creating an AI that can discuss a myriad of topics with fluency. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.