
Welcome to the exciting world of Knowledge-Augmented Planning for LLM-Based Agents! In this blog, we will explore how advanced techniques enhance the capabilities of large language models (LLMs) to plan and execute complex tasks effectively.
News
[2024-03] We release a new paper: KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents.
Installation
To get started with KnowAgent, follow these simple installation steps:
git clone https://github.com/zjunlp/KnowAgent.git
cd KnowAgent
pip install -r requirements.txt
We have placed the HotpotQA and ALFWorld datasets under Path_Generation/hotpotqa_run/data and Path_Generation/alfworld_run/data, respectively. For further configuration, we recommend following the original setup instructions of ALFWorld and FastChat.
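Before moving on, it can help to confirm the environment is in order. Below is a minimal sketch, assuming ALFWorld and FastChat are importable as the `alfworld` and `fastchat` Python packages (package names are an assumption; adjust to your setup):

```python
# sanity_check.py -- quick environment check (package names are assumptions)
import importlib
import pathlib

# check that the supporting libraries import cleanly
for pkg in ("alfworld", "fastchat"):
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: OK")
    except ImportError as err:
        print(f"{pkg}: missing ({err})")

# confirm the bundled datasets are where the scripts expect them
for data_dir in ("Path_Generation/hotpotqa_run/data",
                 "Path_Generation/alfworld_run/data"):
    print(data_dir, "exists:", pathlib.Path(data_dir).is_dir())
```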
Planning Path Generation
The Planning Path Generation process is integral to KnowAgent. You can find the scripts for running Planning Path Generation in the Path_Generation directory, specifically run_alfworld.sh and run_hotpotqa.sh. These scripts can be executed using bash commands.
To tailor the scripts to your needs, modify the mode parameter to switch between training (train) and testing (test) modes, and change the llm_name parameter to use a different LLM:
cd Path_Generation
# For training with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode train --output_path ../Self-Learning/trajs
# For testing with HotpotQA
python run_hotpotqa.py --llm_name llama-2-13b --max_context_len 4000 --mode test --output_path output
# For training with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode train --output_path ../Self-Learning/trajs
# For testing with ALFWorld
python alfworld_run/run_alfworld.py --llm_name llama-2-13b --mode test --output_path output
Here we release the trajectories synthesized by the Llama-2 7B, 13B, and 70B chat models on Google Drive, prior to filtering.
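Each run writes its trajectories as a JSON Lines file. As a rough sketch for inspecting a few records (the file name mirrors the one used in the Self-Learning step below, and the field names are whatever your run produced, not a documented schema):

```python
# peek_trajs.py -- print the first few generated trajectory records
import json

# adjust this path to wherever your run wrote its output
path = "Self-Learning/trajs/KnowAgent_HotpotQA_llama-2-13b.jsonl"

with open(path) as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        print(f"--- trajectory {i} ---")
        print(sorted(record.keys()))  # show whichever fields the run produced
        if i == 2:
            break
```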
Knowledgeable Self-Learning
After obtaining the planning paths and corresponding trajectories, the process of Knowledgeable Self-Learning begins. The generated trajectories are first converted to the Alpaca format using the scripts in the Self-Learning directory, such as traj_reformat.sh
. For initial iterations, use:
cd Self-Learning
# For HotpotQA
python trainHotpotqa_reformat.py --input_path trajs/KnowAgent_HotpotQA_llama-2-13b.jsonl --output_path traindatas
# For ALFWorld
python trainALFWorld_reformat.py --input_path trajs/KnowAgent_ALFWorld_llama-2-13b.jsonl --output_path traindatas
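For reference, Alpaca-format training data is a list of instruction/input/output records. The reformatting scripts above produce the real data; the record below is purely illustrative of the target shape, with made-up field contents:

```python
# alpaca_shape.py -- illustrative Alpaca-format record (not the repo's exact output)
import json

example = {
    "instruction": "Solve the question with knowledge-augmented planning.",
    "input": "Question: Which magazine was started first, Arthur's Magazine or First for Women?",
    "output": "ActionPath: ... Thought: ... Action: ... Observation: ...",
}

print(json.dumps([example], indent=2))
```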
For subsequent iterations, it’s necessary to perform Knowledge-Based Trajectory Filtering and Merging:
python trajs/traj_merge_and_filter.py --task HotpotQA --input_path1 trajs/datas/KnowAgent_HotpotQA_llama-2-13b_D0.jsonl --input_path2 trajs/datas/KnowAgent_HotpotQA_llama-2-13b_D1.jsonl --output_path trajs/datas
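Conceptually, this step merges the trajectory files from successive iterations and keeps only trajectories that pass the knowledge-based filter. A toy sketch is shown below, under the assumption that each record carries some correctness signal (here a hypothetical "reward" field); the repository's traj_merge_and_filter.py implements the actual filtering criteria:

```python
# toy merge-and-filter sketch (illustrative only)
import json

def load_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

d0 = load_jsonl("trajs/datas/KnowAgent_HotpotQA_llama-2-13b_D0.jsonl")
d1 = load_jsonl("trajs/datas/KnowAgent_HotpotQA_llama-2-13b_D1.jsonl")

# hypothetical "reward" field standing in for the real filtering criterion
merged = [r for r in d0 + d1 if r.get("reward", 0) > 0]

with open("trajs/datas/merged_filtered.jsonl", "w") as f:
    for r in merged:
        f.write(json.dumps(r) + "\n")

print(f"kept {len(merged)} of {len(d0) + len(d1)} trajectories")
```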
Next, commence Self-Learning by running train.sh and train_iter.sh:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed train/train_lora.py \
    --model_name_or_path llama-2-13b-chat \
    --lora_r 8 --lora_alpha 16 --lora_dropout 0.05 \
    --data_path datas/data_knowagent.json \
    --output_dir models/HotpotqaM1 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 2 --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy no \
    --save_strategy steps --save_steps 10000 --save_total_limit 1 \
    --learning_rate 1e-4 --weight_decay 0. --warmup_ratio 0.03 \
    --lr_scheduler_type cosine \
    --logging_steps 1 \
    --fp16 True \
    --model_max_length 4096 \
    --gradient_checkpointing True \
    --q_lora False \
    --deepspeed data/zyq/FastChat/playground/deepspeed_config_s3.json \
    --resume_from_checkpoint False
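After training finishes, the LoRA adapter saved in models/HotpotqaM1 can be merged onto the base model for inference. Here is a minimal sketch using Hugging Face transformers and peft; the model path and adapter directory are assumptions taken from the training command above:

```python
# load_lora.py -- load the fine-tuned LoRA adapter for inference
# (assumes transformers and peft are installed; paths mirror the training command)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "llama-2-13b-chat"           # same base model path as in the training command
adapter_path = "models/HotpotqaM1"       # --output_dir from the training command

base = AutoModelForCausalLM.from_pretrained(base_path)
tokenizer = AutoTokenizer.from_pretrained(base_path)

model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()  # fold the LoRA weights into the base model

prompt = "Question: Which magazine was started first, Arthur's Magazine or First for Women?"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```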
Troubleshooting
If you encounter any issues, here are some troubleshooting ideas:
- Ensure that all repositories and requirements are properly installed.
- Check if the correct LLM name is referenced in your scripts.
- Verify the paths to your datasets are accurate.
- If the scripts do not execute as expected, consider running them one line at a time in your terminal to identify errors.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Citation
Cite our work as follows:
@article{zhu2024knowagent,
  title={KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents},
  author={Zhu, Yuqi and Qiao, Shuofei and Ou, Yixin and Deng, Shumin and Zhang, Ningyu and Lyu, Shiwei and Shen, Yue and Liang, Lei and Gu, Jinjie and Chen, Huajun},
  journal={arXiv preprint arXiv:2403.03101},
  year={2024}
}
Acknowledgement
We express our gratitude to the creators and contributors of the following projects:
- FastChat: Our training module code is adapted from FastChat.
- BOLAA: The inference module code is implemented based on BOLAA.
- Thank you to the teams behind **ReAct**, **Reflexion**, **FireAct**, and others for their foundational work!
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.