How to Implement Transformer Question Generation on SQuAD

May 24, 2021 | Educational

Welcome to the realm of automated question generation! In this blog post, we walk through using Transformer models to generate questions from the SQuAD dataset: given a passage and an answer span inside it, the model produces a question that the span answers. Whether you’re developing a chatbot or just exploring AI, this is a practical way to put Natural Language Processing (NLP) to work.

Understanding the Concept

Before diving deep, let’s get a better grasp of what we’re about to do. Think of the Transformer model as a chef who can take a recipe (the text) and turn it into a buffet of questions. This is not just any chef; it’s one skilled in extracting core concepts and transforming them into engaging queries. The SQuAD dataset serves as the pantry, stocked with rich content that the chef can use to prepare an impressive spread of question dishes.

Input Format

To start, we need to provide the input in a specific manner:

C = [c1, c2, …, [HL], a1, …, aA, [HL], …, cC]

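Here, c1 … cC are the context tokens and a1 … aA are the answer tokens. The answer span sits inside the context and is enclosed by a pair of special [HL] markers that signal to the model which part of the text the generated question should target.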

Input Example

For instance, you may input:

Harry Potter is a series of seven fantasy novels written by British author, [HL]J. K. Rowling[HL].

And the generated question could be: Who wrote Harry Potter?
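
The highlighting step can be sketched in a few lines of Python. This is a minimal illustration, not the original authors' preprocessing code: the helper name highlight_answer and the HL_TOKEN constant are hypothetical, and the sketch assumes the answer appears verbatim in the context.

```python
from typing import Optional

HL_TOKEN = "[HL]"  # highlight marker from the input format above


def highlight_answer(context: str, answer: str,
                     answer_start: Optional[int] = None) -> str:
    """Wrap the answer span inside the context with [HL] markers.

    If answer_start is not given, the first occurrence of the answer is used
    (an assumption; real data usually supplies the character offset).
    """
    if answer_start is None:
        answer_start = context.find(answer)
    end = answer_start + len(answer)
    # Reject offsets that do not actually point at the answer text.
    if answer_start < 0 or context[answer_start:end] != answer:
        raise ValueError("answer not found in context at the given position")
    return context[:answer_start] + HL_TOKEN + answer + HL_TOKEN + context[end:]


context = ("Harry Potter is a series of seven fantasy novels "
           "written by British author, J. K. Rowling.")
highlighted = highlight_answer(context, "J. K. Rowling")
print(highlighted)
```

Passing the character offset explicitly is the safer choice when the answer string occurs more than once in the passage.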

Data Setting

Moving on, it is crucial to set your data correctly. There are two main dataset configurations:

SQuAD
- Train: 87,599
- Validation: 10,570
- Paper: SQuAD: 100,000+ Questions for Machine Comprehension of Text

SQuAD NQG
- Train: 75,722
- Dev: 10,570
- Test: 11,877
- Paper: Learning to Ask: Neural Question Generation for Reading Comprehension
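
To work with either configuration you first need flat (context, question, answer) records. The sketch below assumes the SQuAD v1.1 JSON layout (data, then paragraphs, then qas, then answers); the function name iter_squad_examples and the tiny inline sample are illustrative only and are not part of any released tooling.

```python
from typing import Dict, Iterator


def iter_squad_examples(squad: dict) -> Iterator[Dict[str, object]]:
    """Yield one flat record per question from a SQuAD v1.1-style dict."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if not qa["answers"]:
                    continue  # skip questions that carry no answer span
                first = qa["answers"][0]  # use the first annotated answer
                yield {
                    "id": qa["id"],
                    "context": context,
                    "question": qa["question"],
                    "answer": first["text"],
                    "answer_start": first["answer_start"],
                }


# Tiny hand-made sample in the assumed layout (not real SQuAD data).
sample_context = ("Harry Potter is a series of seven fantasy novels "
                  "written by British author, J. K. Rowling.")
sample = {"data": [{"title": "Harry_Potter", "paragraphs": [{
    "context": sample_context,
    "qas": [{"id": "q1", "question": "Who wrote Harry Potter?",
             "answers": [{"text": "J. K. Rowling",
                          "answer_start": sample_context.index("J. K. Rowling")}]}],
}]}]}

examples = list(iter_squad_examples(sample))
print(len(examples), examples[0]["question"])
```

For a real file you would load the JSON with json.load and pass the resulting dict to the same function.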

Available Models

Several models can be utilized for our task:

- BART
- GPT-2
- T5

Experimental Results

Let’s take a quick look at the performance scores of various models tested on the SQuAD and SQuAD NQG datasets:

SQuAD

Model                      BLEU-1  BLEU-2  BLEU-3  BLEU-4  METEOR  ROUGE-L
BART-HLSQG                 54.67   39.26   30.34   24.15   25.43   52.64
GPT2-HLSQG                 49.31   33.95   25.41   19.69   22.29   48.82
T5-HLSQG                   54.29   39.22   30.43   24.26   25.56   53.11

SQuAD NQG

Model                      BLEU-1  BLEU-2  BLEU-3  BLEU-4  METEOR  ROUGE-L
BERT-HLSQG (Chan et al.)   49.73   34.60   26.13   20.33   23.88   48.23
BART-HLSQG                 54.12   38.19   28.84   22.35   24.55   51.03
GPT2-HLSQG                 49.82   33.69   24.71   18.63   21.90   47.60
T5-HLSQG                   53.13   37.60   28.62   22.38   24.48   51.20
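
To make the BLEU columns more concrete, here is a minimal sketch of sentence-level BLEU with a single reference: clipped n-gram precision combined by a geometric mean and a brevity penalty. It is not the evaluation script behind the table above (published scores are normally computed at corpus level with an established toolkit and reported on a 0 to 100 scale); the function names and example sentences are illustrative assumptions.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate: List[str], reference: List[str], n: int) -> float:
    """Clipped n-gram precision: each candidate n-gram counts at most as
    often as it appears in the reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU in [0, 1] against one reference."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)


generated = "who wrote harry potter ?".split()
gold = "who is the author of harry potter ?".split()
print(modified_precision(generated, gold, 1))  # unigram precision
print(bleu(generated, gold, max_n=1))          # BLEU-1 with brevity penalty
```

Because the sketch applies no smoothing, short questions with no matching 4-gram score zero on BLEU-4, which is one reason corpus-level scoring is preferred in practice.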

Troubleshooting

If you encounter difficulties while implementing the model, consider the following troubleshooting ideas:

- Ensure your dependencies are correctly installed and up to date.
- Check your input format rigorously: the answer span must sit between a pair of [HL] markers, and slight deviations could lead to unexpected results.
- Review the dataset settings and confirm that the data you’re working with (SQuAD or SQuAD NQG) aligns with the model requirements.
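
The input-format check can be automated before anything reaches the model. The following is a small sketch under the assumption that each input should contain exactly one highlighted span; the function name check_highlighted_input is hypothetical.

```python
def check_highlighted_input(text: str, hl_token: str = "[HL]") -> str:
    """Validate a highlighted input and return the answer span it marks.

    Raises ValueError unless the text contains exactly two markers
    enclosing a non-empty answer.
    """
    parts = text.split(hl_token)
    # Two markers split the text into exactly three pieces.
    if len(parts) != 3:
        raise ValueError(
            f"expected exactly 2 {hl_token} markers, found {len(parts) - 1}")
    answer = parts[1]
    if not answer.strip():
        raise ValueError("highlighted answer span is empty")
    return answer


good = ("Harry Potter is a series of seven fantasy novels written by "
        "British author, [HL]J. K. Rowling[HL].")
print(check_highlighted_input(good))
```

Running this over a dataset before training or inference surfaces malformed examples early instead of letting them produce odd questions later.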


Conclusion


Now, you are all set to explore the world of question generation using Transformers! Happy coding!
