How to Use KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning

Apr 23, 2021 | Data Science

Welcome to the world of commonsense reasoning with KagNet! In this guide, we walk through installing the dependencies, downloading and converting the CommonsenseQA data, preprocessing ConceptNet and the GloVe embeddings, grounding concepts, and starting the training pipeline. Whether you’re a novice or a seasoned programmer, this guide should help you get started.

Getting Started with KagNet

KagNet is a knowledge-aware graph network for commonsense reasoning. It links the concepts in each question and answer to ConceptNet, builds a schema graph from them, and encodes that graph with a combination of graph networks and LSTM-based path encoding. To get started, we need to install the necessary dependencies and prepare our data.

Step-by-Step Installation Guide

  • Install Dependencies:
    • Open your terminal and run the following commands:

sudo apt-get install graphviz libgraphviz-dev pkg-config
conda create -n kagnet_test python==3.6.3
conda activate kagnet_test
pip install torch torchvision
pip install tensorflow-gpu==1.10.0
conda install faiss-gpu cudatoolkit=10.0 -c pytorch -n kagnet_test
pip install nltk
conda install -c conda-forge spacy -n kagnet_test
python -m spacy download en
pip install jsbeautifier
pip install networkx
pip install dgl
pip install pygraphviz
pip install allennlp

These commands create a dedicated Conda environment (kagnet_test) pinned to Python 3.6.3 and install the packages needed for our KagNet implementation.
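
Before moving on, it can help to confirm that the environment really contains the packages installed above. The snippet below is a minimal sketch and not part of KagNet: find_missing is a hypothetical helper, and the module names are the import names we assume correspond to the installed packages (for example, faiss for faiss-gpu and tensorflow for tensorflow-gpu).

```python
import importlib.util
import sys

# Import names assumed to correspond to the packages installed above.
EXPECTED_MODULES = [
    "torch", "torchvision", "tensorflow", "faiss", "nltk", "spacy",
    "jsbeautifier", "networkx", "dgl", "pygraphviz", "allennlp",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    missing = find_missing(EXPECTED_MODULES)
    print("Missing modules:", missing if missing else "none")
```

Run it inside the activated kagnet_test environment. Any name it reports as missing points to an install command worth rerunning.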

Downloading Datasets

Next, we will download the three CommonsenseQA splits (train, dev, and test without answers) and convert each one into a .statements file with convert_csqa.py. Enter the following commands:

cd datasets
mkdir csqa_new
wget -P csqa_new https://s3.amazonaws.com/commensenseqa/train_rand_split.jsonl
wget -P csqa_new https://s3.amazonaws.com/commensenseqa/dev_rand_split.jsonl
wget -P csqa_new https://s3.amazonaws.com/commensenseqa/test_rand_split_no_answers.jsonl
python convert_csqa.py csqa_new/train_rand_split.jsonl csqa_new/train_rand_split.jsonl.statements
python convert_csqa.py csqa_new/dev_rand_split.jsonl csqa_new/dev_rand_split.jsonl.statements
python convert_csqa.py csqa_new/test_rand_split_no_answers.jsonl csqa_new/test_rand_split_no_answers.jsonl.statements
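
To sanity-check a downloaded file before converting it, you can parse a line or two yourself. The sketch below assumes the usual CommonsenseQA JSONL layout (an answerKey field plus a question object with stem and choices); the sample record is hand-written for illustration, and summarize_record is a hypothetical helper, not part of convert_csqa.py.

```python
import json

# One hand-written record in the assumed CommonsenseQA JSONL layout (not taken from the dataset).
SAMPLE_LINE = json.dumps({
    "answerKey": "B",
    "id": "example-0",
    "question": {
        "question_concept": "bird",
        "stem": "Where would a bird most likely build a nest?",
        "choices": [
            {"label": "A", "text": "office"},
            {"label": "B", "text": "tree"},
            {"label": "C", "text": "ocean"},
            {"label": "D", "text": "kitchen"},
            {"label": "E", "text": "car"},
        ],
    },
})


def summarize_record(line):
    """Parse one JSONL line and return (stem, choice texts, answer text or None)."""
    record = json.loads(line)
    question = record["question"]
    choices = {choice["label"]: choice["text"] for choice in question["choices"]}
    # The test split ships without answers, so answerKey may be absent.
    answer = choices.get(record.get("answerKey"))
    return question["stem"], list(choices.values()), answer


stem, choices, answer = summarize_record(SAMPLE_LINE)
print(stem, choices, answer)
```

To inspect real data, read the first line of csqa_new/train_rand_split.jsonl and pass it to summarize_record. If a key is missing, the file layout differs from what this sketch assumes.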

The Analogy of KagNet Components

Think of KagNet as navigating a city. Each building is a concept (a node), and the roads between buildings are the relationships between concepts in our data. A traditional model might visit buildings at random. KagNet instead uses the graph networks to take in the layout of the neighborhood, and the LSTM-based path encoder to read each route between the concepts in the question and the concepts in the answer, one stop at a time. The more relevant the routes, the better the evidence it gathers for choosing an answer.
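
To make the analogy concrete, here is a small sketch that enumerates the "routes" between two concepts in a toy graph. Everything in it is invented for illustration: the graph, the relation labels, and the find_paths helper are assumptions, and the real pipeline works on ConceptNet rather than a hand-written dictionary.

```python
# Toy knowledge graph: concept -> list of (relation, neighbor). Invented for illustration.
TOY_GRAPH = {
    "bird": [("CapableOf", "fly"), ("AtLocation", "nest")],
    "nest": [("AtLocation", "tree")],
    "fly": [("RelatedTo", "sky")],
    "tree": [],
    "sky": [],
}


def find_paths(graph, start, goal, max_hops):
    """Enumerate simple paths from start to goal that use at most max_hops edges.

    Each path alternates concepts and relations: [concept, relation, concept, ...].
    """
    paths = []
    stack = [(start, [start], 0)]
    while stack:
        node, path, hops = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        if hops == max_hops:
            continue
        for relation, neighbor in graph.get(node, []):
            # path[::2] holds only the concepts, so this keeps paths free of repeated concepts.
            if neighbor not in path[::2]:
                stack.append((neighbor, path + [relation, neighbor], hops + 1))
    return paths


for route in find_paths(TOY_GRAPH, "bird", "tree", max_hops=2):
    print(" -> ".join(route))
```

Each returned route is the kind of sequence a path encoder would read: a chain of concepts joined by relations, starting at a question concept and ending at an answer concept.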

Preprocessing ConceptNet and Embedding Files

After downloading the data, we will preprocess our ConceptNet and embedding files to make them ready for training. Use the following commands:

cd ..
mkdir -p conceptnet
cd conceptnet
wget https://s3.amazonaws.com/conceptnet/downloads/2018/edges/conceptnet-assertions-5.6.0.csv.gz
gzip -d conceptnet-assertions-5.6.0.csv.gz
python extract_cpnet.py
cd ..
mkdir -p embeddings/glove
cd embeddings/glove
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip
rm glove.6B.zip
cd ..
python glove_to_npy.py
python create_embeddings_glove.py
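
If you are curious what the ConceptNet dump contains, the sketch below parses two hand-written lines in the tab-separated layout of the assertions file (assertion URI, relation, start node, end node, JSON metadata) and keeps only English-to-English edges. It is an illustration under those assumptions, not the logic of extract_cpnet.py, and english_triples is a hypothetical helper.

```python
import json

# Two hand-written lines in the tab-separated layout of the ConceptNet assertions dump:
# assertion URI, relation, start node, end node, JSON metadata.
SAMPLE_LINES = [
    '/a/[/r/AtLocation/,/c/en/bird/,/c/en/tree/]\t/r/AtLocation\t/c/en/bird\t/c/en/tree\t{"weight": 2.0}',
    '/a/[/r/RelatedTo/,/c/fr/oiseau/,/c/en/bird/]\t/r/RelatedTo\t/c/fr/oiseau\t/c/en/bird\t{"weight": 1.0}',
]


def english_triples(lines):
    """Yield (relation, head, tail, weight) for edges whose two ends are both English concepts."""
    for line in lines:
        _, relation, start, end, meta = line.rstrip("\n").split("\t")
        if not (start.startswith("/c/en/") and end.startswith("/c/en/")):
            continue
        # "/c/en/bird/n" splits into ['', 'c', 'en', 'bird', 'n']; index 3 is the concept itself.
        head = start.split("/")[3]
        tail = end.split("/")[3]
        yield relation.split("/")[-1], head, tail, json.loads(meta)["weight"]


for triple in english_triples(SAMPLE_LINES):
    print(triple)
```

On the real file you would iterate over the opened conceptnet-assertions-5.6.0.csv instead of SAMPLE_LINES. The file is large, so processing it line by line, as the generator does, keeps memory use low.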

Concept Grounding and Schema Graph Construction

Next, we match the concepts mentioned in each statement to ConceptNet (concept grounding) and prune the matches. The grounded concepts are the starting point for building schema graphs:

cd ..
mkdir grounding
cd grounding
python batched_grounding.py generate_bash ../datasets/csqa_new/train_rand_split.jsonl.statements
bash cmd.sh
python batched_grounding.py combine ../datasets/csqa_new/train_rand_split.jsonl.statements
python prune_qc.py ../datasets/csqa_new/train_rand_split.jsonl.statements.mcp
# Repeat these commands for the dev and test .statements files
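
The idea behind grounding can be sketched in a few lines: scan the statement for word sequences that appear in a concept vocabulary, preferring longer matches. This is a deliberately simplified illustration. The ground_concepts helper and the tiny vocabulary are assumptions, and batched_grounding.py may use more careful matching than this.

```python
def ground_concepts(statement, vocabulary, max_len=3):
    """Return vocabulary concepts found in the statement (greedy longest match, left to right)."""
    cleaned = statement.lower()
    for mark in ".,?!":
        cleaned = cleaned.replace(mark, " ")
    tokens = cleaned.split()

    found = []
    i = 0
    while i < len(tokens):
        # Try the longest n-gram first; multiword concepts are joined with underscores.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = "_".join(tokens[i:i + n])
            if candidate in vocabulary:
                found.append(candidate)
                i += n
                break
        else:
            # No n-gram starting at this token matched, so move on by one token.
            i += 1
    return found


# Tiny invented vocabulary standing in for the ConceptNet concept list.
VOCAB = {"bird", "build", "nest", "tree", "living_room"}
print(ground_concepts("A bird would build a nest in a tree.", VOCAB))
print(ground_concepts("The sofa is in the living room.", VOCAB))
```

Matching longer spans first is what lets "living room" ground to a single multiword concept instead of two unrelated single words.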

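Training KagNet

Once all preprocessing steps are complete, it’s time to start training. The first script below fine-tunes BERT on CommonsenseQA, and the second extracts a sentence vector for every training statement from the fine-tuned model: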

cd ..
bash train_csqa_bert.sh
python extract_csqa_bert.py \
  --bert_model bert-large-uncased \
  --do_eval \
  --do_lower_case \
  --data_dir ../datasets/csqa_new \
  --eval_batch_size 60 \
  --learning_rate 1e-4 \
  --max_seq_length 70 \
  --mlp_hidden_dim 16 \
  --output_dir ./models \
  --save_model_name bert_large_b60g4lr1e-4wd0.01wp0.1_1337 \
  --epoch_id 1 \
  --data_split_to_extract train_rand_split.jsonl \
  --output_sentvec_file ../datasets/csqa_new/train_rand_split.jsonl.statements.finetuned.large \
  --layer_id -1
# Repeat for the development split (dev_rand_split.jsonl)
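
Because the extraction command is long and has to be repeated per split, it is easy to mistype a path. The sketch below builds the same argument list for any split so that only the file name changes. build_extract_command is a hypothetical helper; the flags and values are copied from the command above, with the data and output paths written as ../datasets/csqa_new and ./models. It only prints the commands and does not run them.

```python
import shlex


def build_extract_command(split_file, data_dir="../datasets/csqa_new"):
    """Build the extract_csqa_bert.py argument list for one split file."""
    return [
        "python", "extract_csqa_bert.py",
        "--bert_model", "bert-large-uncased",
        "--do_eval",
        "--do_lower_case",
        "--data_dir", data_dir,
        "--eval_batch_size", "60",
        "--learning_rate", "1e-4",
        "--max_seq_length", "70",
        "--mlp_hidden_dim", "16",
        "--output_dir", "./models",
        "--save_model_name", "bert_large_b60g4lr1e-4wd0.01wp0.1_1337",
        "--epoch_id", "1",
        "--data_split_to_extract", split_file,
        "--output_sentvec_file", f"{data_dir}/{split_file}.statements.finetuned.large",
        "--layer_id", "-1",
    ]


for split in ["train_rand_split.jsonl", "dev_rand_split.jsonl"]:
    command = build_extract_command(split)
    # Quote each argument so the printed line can be pasted into a shell as-is.
    print(" ".join(shlex.quote(arg) for arg in command))
```

If you prefer to launch the commands from Python, the same list can be passed to subprocess.run from the directory that contains extract_csqa_bert.py.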

Troubleshooting Tips

  • If you encounter any errors during installation:
    • Check your Python version: the environment in this guide is pinned to Python 3.6.3, so run python --version inside kagnet_test to confirm.
    • Ensure Conda is correctly installed and activated.
    • Validate that you are in the correct directory before executing scripts.
  • For dataset issues:
    • Make sure to have a reliable internet connection for downloading datasets.
    • Verify file permissions if you can’t access the downloaded files.
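
Several of the checks above can be automated with a short pre-flight script. The following is a sketch under our own assumptions: version_matches and missing_paths are hypothetical helpers, and the list of expected files simply mirrors the downloads described earlier, relative to the directory that contains datasets.

```python
import os
import sys


def version_matches(expected=(3, 6)):
    """Return True if the running interpreter has the expected (major, minor) version."""
    return tuple(sys.version_info[:2]) == tuple(expected)


def missing_paths(paths):
    """Return the paths that do not exist relative to the current working directory."""
    return [path for path in paths if not os.path.exists(path)]


# Files expected after the download steps, relative to the directory that contains datasets/.
EXPECTED_FILES = [
    "datasets/csqa_new/train_rand_split.jsonl",
    "datasets/csqa_new/dev_rand_split.jsonl",
    "datasets/csqa_new/test_rand_split_no_answers.jsonl",
]

if __name__ == "__main__":
    print("Working directory:", os.getcwd())
    print("Python 3.6 interpreter:", version_matches())
    print("Missing files:", missing_paths(EXPECTED_FILES) or "none")
```

A missing file usually means either an interrupted download or a script launched from the wrong directory, and the printed working directory helps tell the two apart.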

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

Congratulations! You’ve successfully set up and initiated training for KagNet. Enjoy exploring its capabilities and pushing the envelope in commonsense reasoning!

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
