Chat with Your Documents Privately Using MPT-30B

Jun 2, 2022 | Data Science

Privacy and control over your own data matter. MPT-30B, an open-source model with an 8k-token context length, lets you chat with your documents directly on your computer; once the model file has been downloaded, no internet connection is needed. In this guide, we’ll walk through the requirements, installation process, and usage of MPT-30B for document question answering.

Requirements

Before diving into the installation, ensure your system meets the minimum requirements:

  • 32GB of RAM
  • Python 3.10
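
Both requirements can be checked before installing anything. The following is a minimal sketch, not part of the repository: the function names are illustrative, the RAM lookup relies on os.sysconf (available on Linux and macOS, not on Windows), and the 10% tolerance is an assumption to allow for machines that report slightly less than their nominal memory. The guide lists Python 3.10 without saying whether newer versions work, so the sketch only reports whether the running interpreter matches 3.10.

```python
import os
import sys

LISTED_PYTHON = (3, 10)  # version listed in the requirements above
MIN_RAM_GB = 32          # minimum RAM listed in the requirements above
RAM_TOLERANCE = 0.9      # assumption: accept machines reporting >= 90% of nominal


def python_matches(version=sys.version_info, listed=LISTED_PYTHON):
    """Return True if the interpreter's major.minor equals the listed version."""
    return tuple(version[:2]) == tuple(listed)


def total_ram_gb():
    """Return installed RAM in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. on Windows, where os.sysconf is unavailable
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024**3


def ram_ok(ram_gb, minimum=MIN_RAM_GB, tolerance=RAM_TOLERANCE):
    """Return True if the measured RAM is close to or above the minimum."""
    return ram_gb is not None and ram_gb >= minimum * tolerance


if __name__ == "__main__":
    ram = total_ram_gb()
    print("Python 3.10:", "yes" if python_matches() else "no")
    if ram is None:
        print("RAM: could not be determined on this platform")
    else:
        print(f"RAM: {ram:.1f} GiB ({'ok' if ram_ok(ram) else 'below 32GB'})")
```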

Installation Steps

Ready to get started? Follow these steps to unleash the potential of MPT-30B:

  1. Install Poetry: This will help manage project dependencies.

    pip install poetry

  2. Clone the Repository: Get the source code.

    git clone <insert GitHub repo URL>

  3. Install Project Dependencies: This will set up all required packages.

    poetry install

  4. Set Up Environment Variables: Prepare your environment file.

    cp .env.example .env

  5. Download the Model: Get the model file (approx. 19GB). You can run:

    python download_model.py

    Alternatively, download it here. Create a models folder in the root directory and place the file there.

  6. Ingest Your Documents: Get your documents ready for interaction. The repository creates a folder named source_documents for this purpose. Replace the existing documents with your own in this folder. Supported extensions include:

    • .csv: CSV
    • .docx: Word Document
    • .doc: Word Document
    • .eml: Email
    • .epub: EPub
    • .html: HTML File
    • .md: Markdown
    • .pdf: Portable Document Format (PDF)
    • .pptx: PowerPoint Document
    • .txt: Text file (UTF-8)

    Now run the ingest script:

    python ingest.py

    The output should show the ingestion progress and completion.
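
Before running the ingest script, it can help to see which files in source_documents have a supported extension. The sketch below is an illustration only and does not reproduce what ingest.py does internally: the function name is hypothetical, and searching subfolders and ignoring letter case in extensions are assumptions.

```python
from pathlib import Path

# Extensions taken from the supported list above.
SUPPORTED_EXTENSIONS = {
    ".csv", ".docx", ".doc", ".eml", ".epub",
    ".html", ".md", ".pdf", ".pptx", ".txt",
}


def split_by_support(folder):
    """Return (supported, unsupported) lists of file paths under folder."""
    folder = Path(folder)
    supported, unsupported = [], []
    if not folder.is_dir():
        return supported, unsupported
    for path in sorted(folder.rglob("*")):  # assumption: include subfolders
        if not path.is_file():
            continue
        # Assumption: compare extensions case-insensitively.
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support("source_documents")
    print(f"{len(ok)} file(s) with a supported extension")
    for path in skipped:
        print(f"unsupported extension: {path}")
```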

Chatting with Your Documents

Once your documents are ingested, you can start chatting:

  1. Load the Command Line:

    poetry run python question_answer_docs.py

    Or simply:

    make qa

  2. Enter Your Questions: Wait for the prompt and type your queries. Type “exit” to quit.

Keep in mind that processing time may vary based on your computer’s memory and document size, taking anywhere from 40 to 300 seconds.
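
To make the interaction pattern concrete, here is a minimal sketch of a prompt loop that stops on “exit” and reports how long each answer took. It is not the code in question_answer_docs.py: chat_loop and its answer parameter are hypothetical, and answer stands in for whatever function actually queries the model.

```python
import time


def chat_loop(answer, read=input, write=print):
    """Prompt for questions until the user types "exit".

    answer is a hypothetical callable that maps a question string to a
    reply string. Returns the number of questions answered.
    """
    answered = 0
    while True:
        question = read("Question: ").strip()
        if question.lower() == "exit":
            break
        if not question:
            continue  # ignore empty input and prompt again
        start = time.perf_counter()
        reply = answer(question)
        elapsed = time.perf_counter() - start
        write(f"{reply}\n(answered in {elapsed:.1f} s)")
        answered += 1
    return answered


# Example with a stand-in answer function:
#   chat_loop(lambda q: f"You asked: {q}")
```

Passing read and write as parameters keeps the loop easy to try out without a terminal or a loaded model.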

Optional: Run the Plain Chatbot

If you’d like a simple interaction with the MPT-30B chatbot without ingesting documents, just run:

poetry run python chat.py

Or:

make chat

Troubleshooting

If you encounter any issues during installation or usage, here are some troubleshooting ideas:

  • Ensure that your system meets the minimum specifications.
  • Verify that you have the correct Python version installed.
  • If the model download fails, check your internet connection or try to download it from the alternative link.
  • For any other issues, refer to the repository documentation or community forums.
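
If you suspect an incomplete model download, comparing the file size against the approximate 19GB mentioned earlier is a quick check. The sketch below assumes the model sits in the models folder described in the installation steps; the function names and the 80% threshold are illustrative assumptions, and the model's file name is not assumed.

```python
from pathlib import Path

EXPECTED_SIZE_GB = 19  # approximate model size given in this guide
MIN_FRACTION = 0.8     # assumption: smaller than 80% suggests a partial file


def model_files(folder="models"):
    """Return (name, size in GB) for each file in folder, or [] if missing."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return [
        (path.name, path.stat().st_size / 1e9)
        for path in sorted(folder.iterdir())
        if path.is_file()
    ]


def looks_complete(size_gb, expected_gb=EXPECTED_SIZE_GB, min_fraction=MIN_FRACTION):
    """Return True if the size is plausibly a full download."""
    return size_gb >= expected_gb * min_fraction


if __name__ == "__main__":
    files = model_files()
    if not files:
        print("No files found in the models folder.")
    for name, size_gb in files:
        status = "ok" if looks_complete(size_gb) else "possibly incomplete"
        print(f"{name}: {size_gb:.1f} GB ({status})")
```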

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
