Have you ever wished you could have a chat with a book? With the dr-doc-search package, you can. The tool generates embeddings from a PDF document and lets you ask questions about its content, using either OpenAI or HuggingFace models. In this article, we'll walk you through setup and usage.
Pre-requisites
Before diving in, there are a few things you need to have in place:
- Tesseract OCR
- ImageMagick – Note for Windows users: Set the location of the ImageMagick executable in the IMCONV environment variable.
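Before installing, it can help to confirm that both tools are reachable from your shell. The snippet below is a minimal sketch, not part of dr-doc-search: the have_tool helper is a hypothetical name, and it assumes the usual command names (tesseract for Tesseract OCR, and magick or convert for ImageMagick, depending on the installed version).

```shell
# Hypothetical helper: succeed if the given command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Tesseract OCR is normally installed as `tesseract`.
if have_tool tesseract; then
    echo "tesseract: found"
else
    echo "tesseract: missing"
fi

# ImageMagick 7 installs `magick`; ImageMagick 6 installs `convert`.
if have_tool magick || have_tool convert; then
    echo "ImageMagick: found"
else
    echo "ImageMagick: missing"
fi
```

If either line reports "missing", install the tool or fix your PATH before continuing.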
Installation Steps
Install the package with pip:
pip install dr-doc-search
Example Usage
Using dr-doc-search involves two main steps:
1. Create the Index and Generate Embeddings
First, you need to create an index from a PDF file. Let’s say you want to work with the PDF titled “Parable of a Monetary Economy.” Ensure you set up your OpenAI API key first. If you prefer, you can now use HuggingFace models instead of OpenAI.
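Run the following commands in your shell. The two dr-doc-search commands are alternatives: the first uses OpenAI embeddings and the second uses HuggingFace embeddings, so run only one of them.
export OPENAI_API_KEY=your-openai-api-key
dr-doc-search --train -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf
dr-doc-search --train -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf --embedding huggingface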
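Because the OpenAI route relies on OPENAI_API_KEY being exported, a small guard can catch a missing key before a long indexing run starts. This is only a sketch; require_openai_key is a hypothetical helper name and is not provided by dr-doc-search.

```shell
# Hypothetical helper: fail early when OPENAI_API_KEY is unset or empty.
require_openai_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
}

# Usage: start indexing only when the key is present.
# require_openai_key && dr-doc-search --train -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf
```

The guard only checks that the variable is non-empty; it does not verify that the key is valid.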
When training completes, the tool creates a set of folders and files under OutputDir in your home directory. The result looks something like this:
~/OutputDir/dr-doc-search/parable-of-a-monetary-economy/
├── index
│ ├── docsearch.index
│ └── index.pkl
├── scanned
│ ├── output-1.txt
│ └── ...
└── parable-of-a-monetary-economy-heteconomist.pdf
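To confirm that indexing finished, you can check for the two index files shown in the listing above. The following is a rough sketch that assumes the layout matches that listing; index_ready is a hypothetical helper name, and the folder you pass in is whatever output folder the tool created for your document.

```shell
# Hypothetical helper: succeed only if both index files from the listing exist.
# $1 is the per-document output folder, for example
# ~/OutputDir/dr-doc-search/parable-of-a-monetary-economy
index_ready() {
    [ -f "$1/index/docsearch.index" ] && [ -f "$1/index/index.pkl" ]
}

# Usage:
# index_ready ~/OutputDir/dr-doc-search/parable-of-a-monetary-economy && echo "index found"
```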
2. Ask Questions
Now that you have created your index, you can start querying it. Use this command:
dr-doc-search -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf --input-question "How did the attempt to reduce the debt result in a decrease in employment?"
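If you have several questions, you could wrap the command above in a small loop that reads one question per line from a text file. This is a sketch under assumptions: ask_all, the DR_DOC_SEARCH variable, and the questions.txt file name are all hypothetical, and only the -i and --input-question flags shown above are used.

```shell
# Command to run; override DR_DOC_SEARCH to point at a different executable.
DR_DOC_SEARCH=${DR_DOC_SEARCH:-dr-doc-search}

# Hypothetical helper: ask every non-empty line of a questions file ($2)
# against one PDF ($1), with one invocation per question.
ask_all() {
    while IFS= read -r question || [ -n "$question" ]; do
        # Skip blank lines.
        [ -n "$question" ] || continue
        # Redirect stdin so the command cannot consume the rest of the file.
        "$DR_DOC_SEARCH" -i "$1" --input-question "$question" </dev/null
    done < "$2"
}

# Usage:
# ask_all ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf questions.txt
```

Each question starts a separate run of the tool, so this trades speed for simplicity.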
Alternatively, you can use a web interface to ask questions:
dr-doc-search --web-app -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf
To use HuggingFace models in the web app, you can run:
dr-doc-search --web-app -i ~/Downloads/parable-of-a-monetary-economy-heteconomist.pdf --llm huggingface
Troubleshooting
If you encounter any issues during the setup or execution, here are a few troubleshooting tips:
- Ensure that all pre-requisites are correctly installed.
- On Windows, check that the IMCONV environment variable is set to the location of the ImageMagick executable.
- If the program does not run, verify that your OpenAI API key is correct and that you are connected to the internet.

