How to Implement Yuanfudao’s Three-way Attention for Commonsense Machine Comprehension

Jul 3, 2021 | Data Science

This guide walks through running Yuanfudao’s model from SemEval-2018 Task 11, which uses attention-based LSTM networks to answer machine comprehension questions that require commonsense knowledge. It covers the prerequisites, data preparation, training, and the extra steps needed to reproduce the competition results.

Model Overview

The model uses attention-based LSTM networks, which improve machine comprehension by weighting the most relevant parts of the input. For the details of the architecture, refer to the paper “Yuanfudao at SemEval-2018 Task 11”. For background on the task itself, see the paper “Machine Comprehension Using Commonsense Knowledge”.
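To make the idea of attention concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: it assumes simple dot-product scoring over toy vectors, and the names softmax, attend, question_states, and passage_word are hypothetical. It only shows the general mechanism of scoring each input, normalizing the scores, and taking a weighted sum.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention of one query vector over a list of key vectors.

    Returns the attention weights and the weighted sum of the keys.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]
    return weights, context


# Toy data (assumption): three question word vectors and one passage word vector.
question_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
passage_word = [2.0, 0.0]

weights, context = attend(passage_word, question_states)
print(weights)
print(context)
```

In this toy example the first and third question vectors receive equal, higher weights because they align with the passage word, while the second receives the least.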

Getting Started with the Implementation

To successfully run the model, here are the prerequisites and steps you’ll need to follow:

Prerequisites

  • PyTorch version: Ensure you have PyTorch 0.2, 0.3, or 0.4 installed. Some warnings may appear, but they should not affect functionality.
  • Install spaCy: Make sure to use version 2.0.
  • Avoid Python 3.7: This version will not work because of a conflict with the async keyword, which became a reserved word in Python 3.7.
  • Hardware: It is advisable to use a GPU machine, as training on a CPU will significantly increase training duration.
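The prerequisites above can be checked with a small script before any training starts. This is a sketch rather than part of the original project: the helper names python_ok, installed_version, and report are hypothetical, and the check only reports versions without enforcing them.

```python
import importlib
import keyword
import sys


def python_ok(version_info=sys.version_info):
    """True for Python 3 releases older than 3.7, where async is not reserved."""
    return version_info[0] == 3 and tuple(version_info[:2]) < (3, 7)


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # missing package or a broken installation
        return None
    return getattr(module, "__version__", "unknown")


def report():
    """Build a short, human-readable summary of the environment."""
    major, minor = sys.version_info[:2]
    lines = [
        "Python %d.%d older than 3.7: %s" % (major, minor, python_ok()),
        "async is a reserved keyword here: %s" % keyword.iskeyword("async"),
    ]
    for name in ("torch", "spacy"):
        version = installed_version(name)
        lines.append("%s: %s" % (name, version if version else "not installed"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

Compare the printed torch and spacy versions against the list above by hand; the script deliberately does not guess which patch releases are acceptable.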

Step 1: Download Preprocessed Data

Prepare your data in one of two ways: download the preprocessed files, or preprocess the dataset yourself.

  • To use the preprocessed data, download it from Google Drive or Baidu Cloud Disk.
  • Once downloaded, unzip the data and place it in the “data” folder.
  • To preprocess the dataset yourself instead, execute ./download.sh to fetch GloVe embeddings and ConceptNet.
  • Then run ./run.sh to preprocess the dataset and start model training.
  • To convert the original XML data to JSON format, use xml2json by executing ./xml2json.py --pretty --strip_text -t xml2json -o test-data.json test-data.xml.
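After the XML-to-JSON conversion, it can help to confirm that the output file is valid JSON before it goes into preprocessing. The sketch below makes no assumption about the structure xml2json produces; it only loads the file and summarizes the top level. The function names summarize and summarize_file are hypothetical.

```python
import json


def summarize(data):
    """Describe the top-level shape of already-parsed JSON data."""
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return "array with %d items" % len(data)
    return "scalar of type " + type(data).__name__


def summarize_file(path):
    """Load a JSON file and summarize it; raises ValueError if the JSON is invalid."""
    with open(path, encoding="utf-8") as handle:
        return summarize(json.load(handle))
```

For example, calling summarize_file("test-data.json") on the file produced by the command above should return a short description instead of raising an error.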

Step 2: Train the Model

To start training on the first GPU, run:

python3 src/main.py --gpu 0

After roughly 50 epochs, expect the accuracy on the development set to reach approximately 83%.

How to Reproduce Competition Results

The steps above yield an accuracy of ~81.5% on the test set. To reach the ~83.95% accuracy of the official submission, apply two additional techniques:

  • Pretrain your model using the RACE dataset for 10 epochs.
  • Train 9 distinct models employing different random seeds and ensemble their outputs for enhanced accuracy.
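The article does not say how the nine models' outputs are combined. One common approach, assumed here purely for illustration, is to average the predicted probabilities across models and pick the highest-scoring answer for each question. The function name ensemble_predict and the toy numbers are hypothetical.

```python
def ensemble_predict(per_model_probs):
    """Average probabilities across models and pick the best choice per question.

    per_model_probs[m][q][c] is the probability that model m assigns to
    answer choice c of question q.
    """
    n_models = len(per_model_probs)
    n_questions = len(per_model_probs[0])
    predictions = []
    for q in range(n_questions):
        n_choices = len(per_model_probs[0][q])
        averaged = [
            sum(model[q][c] for model in per_model_probs) / n_models
            for c in range(n_choices)
        ]
        # Index of the choice with the highest averaged probability.
        predictions.append(max(range(n_choices), key=averaged.__getitem__))
    return predictions


# Toy outputs (assumption): three models, two questions, two choices each.
toy = [
    [[0.6, 0.4], [0.3, 0.7]],  # model trained with one seed
    [[0.4, 0.6], [0.2, 0.8]],  # model trained with another seed
    [[0.7, 0.3], [0.6, 0.4]],  # model trained with a third seed
]
print(ensemble_predict(toy))  # prints [0, 1]
```

Averaging lets models that disagree on a question outvote a single model's mistake, which is one reason ensembles trained from different seeds often score higher than any one model.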

Troubleshooting

In case you encounter issues during your setup or training, here are some troubleshooting tips:

  • Make sure your PyTorch, spaCy, and Python versions match those listed in the prerequisites.
  • If training is very slow, check that you are running on a GPU rather than a CPU, or consider a machine with a more powerful GPU.
  • If you see errors related to the async keyword on Python 3.7, switch to an earlier Python 3 release such as 3.6.
  • For data-related issues, double-check that you converted the XML data to JSON exactly as described in Step 1.
