Want to chat with your PDF or .docx files right from the Windows CMD console? This article walks you through a Python script that uses Google Gemini Pro models with your own API key, offers an analogy to make the code easier to follow, and ends with troubleshooting tips. Let’s dive in!
Prerequisites
- Python installed on your system.
- A Google API key: You can obtain this from Google AI Studio.
- Basic understanding of using CMD on Windows 10.
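Before running anything, it can help to confirm that the modules the script imports are actually installed. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical; `google.generativeai` is the module the script imports, typically installed with `pip install google-generativeai`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the names from module_names that cannot be found."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name raises if its parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # google.generativeai is the module imported by the script in this article.
    absent = missing_packages(["google.generativeai"])
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All required modules are available.")
```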
Setting Up Your Script
First, let’s talk about what the script will do. Imagine your PDF or document as a book. This script acts as a librarian that opens the book, reads it, understands the key themes, and can answer your questions or summarize sections based on what you ask. Here’s how it breaks down:
Code Overview
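Below is a simple Python script that provides the chat functionality for your document files. As written, it lists only Word files from the input folder; PDF support would need a separate text extractor.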
import os
from datetime import date

import docx  # python-docx; an assumption here, used to read .docx text
import google.generativeai as genai

API_KEY = "YOUR_API_KEY"  # paste the key from Google AI Studio


def main():
    documents_folder = "input_documents_folder"
    log_folder = "log_folder"
    output_folder = "output_responses_folder"
    genai.configure(api_key=API_KEY)

    # Model names change over time; check Google AI Studio for the current list.
    models = ["gemini-pro", "gemini-1.5-pro"]
    for idx, name in enumerate(models):
        print(f"{idx}: {name}")
    model = genai.GenerativeModel(models[int(input("Select a model number: "))])

    # python-docx reads .docx only; legacy .doc and PDF files need another reader.
    doc_files = [f for f in os.listdir(documents_folder) if f.lower().endswith(".docx")]
    print("Documents available:")
    for idx, file in enumerate(doc_files):
        print(f"{idx}: {file}")
    selected_files = input("Select document numbers (comma-separated): ").split(",")

    combined_text = ""
    for file_index in selected_files:
        path = os.path.join(documents_folder, doc_files[int(file_index)])
        combined_text += "\n".join(p.text for p in docx.Document(path).paragraphs) + "\n"
    print("Total tokens:", model.count_tokens(combined_text).total_tokens)

    iterations = int(input("Enter the number of iterations for the AI model: "))
    for _ in range(iterations):
        prompt = input("What do you want to do with the text? (e.g., summarize, key concepts): ")
        response = model.generate_content(f"{prompt}\n\n{combined_text}")
        print("Response:", response.text)
        # Log each prompt/response pair to a per-day log file
        with open(os.path.join(log_folder, f"{date.today()}.log"), "a", encoding="utf-8") as log_file:
            log_file.write(f"Prompt: {prompt} | Response: {response.text}\n")


if __name__ == "__main__":
    main()
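The script filters for Word files, while the introduction also mentions PDFs. One possible way to cover both is to dispatch on the file extension. The sketch below is an assumption rather than part of the original script: it relies on the third-party packages `pypdf` and `python-docx`, which the article does not mention, and the function name `extract_text` is hypothetical.

```python
from pathlib import Path


def extract_text(path):
    """Return the plain text of a .pdf, .docx, or .txt file.

    Third-party readers are imported lazily, so only the package
    needed for the given file type has to be installed.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        from pypdf import PdfReader  # assumption: pypdf is installed
        reader = PdfReader(str(path))
        # extract_text() may yield nothing for scanned (image-only) pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        import docx  # assumption: python-docx is installed
        return "\n".join(p.text for p in docx.Document(str(path)).paragraphs)
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
```

Legacy `.doc` files are rejected here on purpose; converting them to `.docx` first is one option.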
Breaking Down the Code with an Analogy
Let’s take a closer look at the code by analogy. Think of running this script as a visit to a library of documents:
- Documents Folder: This is your library where various books (documents) are stored.
- The Librarian: The script serves as the librarian that takes requests from the reader (you) and brings back the relevant information from the documents.
- Model Selection: Choosing a model is like selecting which librarian you want to assist you—each with different expertise.
- Prompt Input: This is like asking the librarian to “summarize this book” or “explain key concepts”—the librarian responds with what you’ve requested.
- Logging Responses: This is akin to keeping track of the conversations you’ve had with the librarian for future reference.
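The logging step can be sketched as a small standalone helper. This is one possible layout, not the script's actual format: it writes one JSON object per line to a per-day file, so that multi-line responses stay on a single record. The function name `log_exchange` and the field names are illustrative.

```python
import json
import os
from datetime import date, datetime


def log_exchange(log_folder, prompt, response_text):
    """Append one prompt/response pair to today's log file and return its path."""
    os.makedirs(log_folder, exist_ok=True)
    log_path = os.path.join(log_folder, f"{date.today()}.log")
    entry = {
        "time": datetime.now().isoformat(timespec="seconds"),
        "prompt": prompt,
        "response": response_text,
    }
    with open(log_path, "a", encoding="utf-8") as log_file:
        # One JSON object per line keeps each exchange on a single record.
        log_file.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return log_path
```

Reading the log back is then a matter of calling `json.loads` on each line.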
Troubleshooting
If you encounter any issues while using this script, consider the following troubleshooting tips:
- Ensure that all folders (documents, logs, and outputs) exist before running the script. Create them if missing.
- Verify your Google API key is correctly entered in the script.
- If the model call fails or hangs, check your internet connection and confirm that your API key has access to the selected model.
- If you experience errors while reading documents, ensure the files are not corrupted and are in the right format.
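The first two checks in this list can be automated before the script contacts the API. The following is a minimal sketch under stated assumptions: the folder names in the usage note come from the script, while the `GOOGLE_API_KEY` environment variable and the function name `preflight` are hypothetical choices. The article itself enters the key directly in the script.

```python
import os


def preflight(folders, env=None, key_name="GOOGLE_API_KEY"):
    """Create any missing folders and return a list of problems found."""
    env = os.environ if env is None else env
    problems = []
    for folder in folders:
        # exist_ok=True makes this a no-op when the folder already exists.
        os.makedirs(folder, exist_ok=True)
    if not env.get(key_name, "").strip():
        problems.append(f"{key_name} is not set")
    return problems
```

Calling `preflight(["input_documents_folder", "log_folder", "output_responses_folder"])` at the start of a run would create the three folders and return an empty list when the key is present.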
Conclusion
By following the steps outlined above, you can successfully create a Chat-With-PDF script that utilizes powerful AI models. This can open up new avenues for processing and interacting with your documents. Keep experimenting, and you may discover even more functionalities!

