How to Use SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool

Jul 17, 2021 | Data Science

Welcome to a user-friendly guide on getting started with SimplyRetrieve, an innovative open-source tool designed for the machine learning community. With its emphasis on Retrieval-Centric Generation (RCG), this tool allows users to create chat tools using custom documents and language models. Let’s dive into how to set it up and troubleshoot any issues you might run into along the way!

What is SimplyRetrieve?

SimplyRetrieve is a lightweight and user-friendly GUI (Graphical User Interface) and API platform designed to enhance your experience with retrieval-centric generation. It boasts features such as:

  • GUI and API based Retrieval-Centric Generation platform
  • Retrieval Tuning Module for Prompt Engineering
  • Private Knowledge Base Constructor
  • Access to various sizes of open-source Large Language Models (LLMs)
  • Multi-user concurrent access

Setting Up SimplyRetrieve

Before you can start exploring SimplyRetrieve, you need to ensure your system meets the prerequisites and is set up correctly.

Prerequisites

  • Clone the repository with git.
  • On a GPU-based Linux machine, activate your favorite Python virtual environment.
  • Install the required packages by running:

pip install -r requirements.txt
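
The install step can also be scripted. The following is a minimal sketch and not part of SimplyRetrieve itself: the helper names in_virtualenv, pip_install_command, and main are hypothetical, and it assumes requirements.txt sits in the current directory. The environment check only recognizes venv/virtualenv-style environments.

```python
from __future__ import annotations

import subprocess
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_install_command(requirements: str = "requirements.txt") -> list[str]:
    """Build the pip command for the interpreter that runs this script."""
    # "python -m pip" guarantees pip installs into the active interpreter.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def main() -> None:
    """Warn if no virtual environment is active, then install the packages."""
    if not in_virtualenv():
        print("Warning: no virtual environment appears to be active.")
    subprocess.run(pip_install_command(), check=True)
```

Calling main() from the repository root would then perform the same install as the pip command above, with a warning first if no environment is active.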

Using Your Own Data

If you’d like to use your own data as a knowledge source, follow these steps (though you can skip to the next section if you’re using the default source):

  • Prepare your knowledge source by placing related documents into the chat/data directory. Supported document formats include PDF, TXT, DOC, DOCX, PPT, PPTX, HTML, MD, CSV, and more.
  • From the chat directory, run the preparation script with the command:

CUDA_VISIBLE_DEVICES=0 python prepare.py --input data --output knowledge --config configs/default_release.json
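
Before running the preparation script, it can help to see which files in the data directory match the formats named in this guide. The sketch below is a hypothetical helper (split_by_support is not part of SimplyRetrieve). It looks only at the top level of the directory, and because the guide says more formats are supported, a file reported as unlisted may still be accepted by prepare.py.

```python
from __future__ import annotations

from pathlib import Path

# Formats named in this guide; the guide says more exist, so this is not exhaustive.
SUPPORTED_SUFFIXES = {
    ".pdf", ".txt", ".doc", ".docx", ".ppt", ".pptx", ".html", ".md", ".csv",
}


def split_by_support(data_dir: str | Path) -> tuple[list[Path], list[Path]]:
    """Return (listed, unlisted) files found directly inside data_dir."""
    listed: list[Path] = []
    unlisted: list[Path] = []
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        # Compare suffixes case-insensitively so "REPORT.PDF" also counts.
        if path.suffix.lower() in SUPPORTED_SUFFIXES:
            listed.append(path)
        else:
            unlisted.append(path)
    return listed, unlisted
```

For example, split_by_support("data"), run from the chat directory, returns the files with a listed format and the files worth double-checking before you build the knowledge base.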

Knowledge base creation is also available through the Knowledge tab of the GUI, which lets you add knowledge on the fly.

How to Run SimplyRetrieve

Once you’ve set up the prerequisites, navigate to the chat directory and execute the following command:

CUDA_VISIBLE_DEVICES=0 python chat.py --config configs/default_release.json

After a few minutes of waiting (grab a coffee!), access the web-based GUI by entering http://LOCAL_SERVER_IP:7860 in your browser, replacing LOCAL_SERVER_IP with your actual GPU server’s IP address.
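
Instead of refreshing the browser while the models load, you could poll the address until it answers. This is a rough sketch using only the Python standard library. The function names gui_url and wait_for_gui, the ten-minute default timeout, and the five-second polling interval are assumptions for illustration; only port 7860 comes from this guide.

```python
import time
import urllib.error
import urllib.request


def gui_url(host: str, port: int = 7860) -> str:
    """Build the GUI address for a given server IP or hostname."""
    return f"http://{host}:{port}"


def wait_for_gui(url: str, timeout: float = 600.0, interval: float = 5.0) -> bool:
    """Poll url until it responds successfully or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Any successful HTTP response means the web server is up.
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet (or returned an error status)
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A call such as wait_for_gui(gui_url("LOCAL_SERVER_IP")), with LOCAL_SERVER_IP replaced by your server's address, returns True once the page can be opened.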

Understanding the Code: An Analogy

Let’s break down the launch command. Think of running this tool like starting a new café. The command is essentially your opening hours and menu. “CUDA_VISIBLE_DEVICES=0” is like knowing which stove (or GPU) to use in your café kitchen. Using “python chat.py” is akin to turning on your coffee maker to start brewing that perfect cup, while the “--config configs/default_release.json” is your recipe book ensuring all ingredients are correct. With everything prepped, your café is officially open, welcoming customers (or users) for a delightful chat experience!
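
The three parts of the launch command map onto three separate things when the same launch is done from Python: an environment variable, a program with its arguments, and a working directory. The sketch below illustrates that split. The names launch_spec and launch are hypothetical, and it assumes the chat directory is reachable at the relative path "chat".

```python
import os
import subprocess
import sys


def launch_spec(gpu: str = "0", config: str = "configs/default_release.json"):
    """Return (argv, env) equivalent to the shell one-liner."""
    # "CUDA_VISIBLE_DEVICES=0" before a command sets the variable for that
    # one process only; copying os.environ mirrors that behavior.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    # "python chat.py --config ..." becomes a plain argument list.
    argv = [sys.executable, "chat.py", "--config", config]
    return argv, env


def launch(chat_dir: str = "chat") -> subprocess.Popen:
    """Start chat.py from the chat directory without blocking the caller."""
    argv, env = launch_spec()
    return subprocess.Popen(argv, env=env, cwd=chat_dir)
```

Passing a different gpu value, for example launch_spec("1"), selects another device without changing the environment of the calling process.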

Troubleshooting Guide

If you encounter issues while setting up or using SimplyRetrieve, here are some troubleshooting ideas:

  • Ensure all dependencies are properly installed and up to date.
  • If you hit NLTK-related errors, follow the configuration tips posted on GitHub.
  • If errors persist, restart your GPU server and re-run the necessary commands.
  • If you don’t have a local GPU server, visit this repository for instructions on using the AWS EC2 cloud platform.
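
For the first item, a quick way to see which required packages are missing is to compare requirements.txt against what is installed. The following is a simplified sketch with hypothetical helper names (requirement_names, missing_packages). It checks only whether a package is present, not whether its version satisfies the requirement, and it skips option lines and URL-based requirements.

```python
from __future__ import annotations

import re
from importlib import metadata

# A distribution name starts the line and ends before any version specifier.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(lines: list[str]) -> list[str]:
    """Extract package names from the lines of a requirements file."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-") or "://" in line:
            continue  # skip blanks, pip options, and URL requirements
        match = _NAME.match(line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Reading requirements.txt with open("requirements.txt").read().splitlines() and passing the result through both functions gives a list of packages to reinstall.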

Conclusion

SimplyRetrieve is a powerful and flexible tool, perfect for those looking to experiment with retrieval-centric generation in a private, local environment. Dive in and start your journey in the exciting world of AI today!
