Mercury

Aug 30, 2021 | Educational

Chat with any Document or Website

With Mercury, you can customize your GPT by training it on specific documents and websites. Imagine chatting with your favorite book or website and pulling out accurate information along with its context. The tool holds a dialog that keeps chat history and cites its sources, working like a smart assistant at your fingertips.

Supported Files

  • [x] .pdf
  • [x] .docx
  • [x] .md
  • [x] .txt
  • [x] .png
  • [x] .jpg
  • [x] .html
  • [x] .json

Coming Soon

  • [ ] .csv
  • [ ] .pptx
  • [ ] notion
  • [ ] next 13 app dir
  • [ ] vercel ai sdk

How to Train Your Custom GPT

1. Upload

First, upload your document through the api/embed-file endpoint. The file is cleaned to plain text and split into 1000-character sections. OpenAI’s embedding API then generates an embedding for each section using the text-embedding-ada-002 model, and these embeddings are stored in a Pinecone namespace.
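The splitting step can be sketched as follows. This is a minimal illustration, not the template's actual code: the function name splitIntoSections and the whitespace normalization are assumptions, and only the 1000-character default comes from the description above.

```typescript
// Hypothetical helper: the name and the whitespace normalization are
// assumptions; only the 1000-character default comes from the text above.
function splitIntoSections(text: string, size: number = 1000): string[] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new Error("size must be a positive integer");
  }
  // Collapse runs of whitespace so sections are not padded with blank lines.
  const cleaned = text.replace(/\s+/g, " ").trim();
  const sections: string[] = [];
  for (let start = 0; start < cleaned.length; start += size) {
    sections.push(cleaned.slice(start, start + size));
  }
  return sections;
}

// A short input and a small size keep the example easy to follow.
const demoSections = splitIntoSections(
  "The quick brown fox jumps over the lazy dog",
  10
);
// demoSections holds 5 sections; joined together they equal the input.
```

Fixed-size splitting is simple but can cut a sentence in half. Splitting on paragraph or sentence boundaries is a common refinement, though the description above does not say whether the template does this.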

2. Scrape

For websites, the api/embed-webpage endpoint scrapes web pages and cleans them to plain text. As in the previous step, the cleaned text is split into 1000-character sections, an embedding is generated for each section, and the embeddings are stored in the Pinecone namespace.
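To give a feel for what "cleaning to plain text" involves, here is a deliberately naive sketch. The function htmlToPlainText is hypothetical, and the regular-expression approach is an assumption chosen for brevity; a real scraper would more likely use a proper HTML parser.

```typescript
// Hypothetical, regex-based cleaner for illustration only. It handles
// simple pages but is not a substitute for a real HTML parser.
function htmlToPlainText(html: string): string {
  return html
    // Drop script and style elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Drop all remaining tags, keeping the text between them.
    .replace(/<[^>]+>/g, " ")
    // Decode a handful of common entities; "&amp;" goes last so that
    // text such as "&amp;lt;" is not decoded twice.
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&")
    // Collapse whitespace left behind by the removed markup.
    .replace(/\s+/g, " ")
    .trim();
}

const demoPage =
  "<html><head><style>p{color:red}</style></head>" +
  "<body><h1>Title</h1><p>Tom &amp; Jerry</p>" +
  "<script>alert(1)</script></body></html>";
const demoText = htmlToPlainText(demoPage); // "Title Tom & Jerry"
```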

Querying Your Model

When a user sends a query through api/query, a single embedding is created for the user's prompt. This embedding is compared with the stored embeddings through a similarity search in the vector database. The most similar sections are then used to construct a prompt for GPT-3, and the response is streamed back to the user.
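The prompt-construction step might look roughly like the sketch below. Everything here is an assumption made for illustration: the Match shape, the buildPrompt name, the instruction wording, and the character budget are not taken from the template.

```typescript
// Hypothetical shape of one similarity-search result; the field names
// are assumptions, not the vector database's actual response format.
interface Match {
  text: string;
  source: string;
  score: number;
}

// Builds a prompt from the best-scoring sections, stopping once an
// assumed character budget for the context is used up.
function buildPrompt(
  question: string,
  matches: Match[],
  maxChars: number = 3000
): string {
  const ranked = matches.slice().sort((a, b) => b.score - a.score);
  const context: string[] = [];
  let used = 0;
  for (const match of ranked) {
    if (used + match.text.length > maxChars) break;
    // Number each section so the answer can cite its source.
    context.push(`[${context.length + 1}] (${match.source}) ${match.text}`);
    used += match.text.length;
  }
  const header = [
    "Answer the question using only the context below.",
    "Cite sources by their number.",
    "",
    "Context:",
  ];
  return header
    .concat(context, ["", `Question: ${question}`, "Answer:"])
    .join("\n");
}

const demoPrompt = buildPrompt("What does the upload step do?", [
  { text: "Embeddings are stored in a Pinecone namespace.", source: "b.md", score: 0.5 },
  { text: "Files are split into 1000-character sections.", source: "a.pdf", score: 0.9 },
]);
```

Numbering the context sections is one simple way to let the model cite where an answer came from, which matches the source-citing behavior described earlier.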

Getting Started

1. Clone Repo and Install Dependencies

Use degit to create your project from the template:

npx degit https://github.com/Jordan-Gilliam/ai-template ai-template
cd ai-template
code .
# install dependencies
npm i

2. Set Up Pinecone

Go to Pinecone, create a free account, and:

  • Set up a new Pinecone Index with Dimensions 1536
  • Copy your API key
  • Note your Environment name (e.g., us-central1-gcp)
  • Record your index name (e.g., mercury)

3. Set Up OpenAI API

Visit OpenAI to create and copy your API key from the API Keys section.

4. Configure Environment Variables

Copy the example environment file to .env.local:

cp .env.example .env.local

Then open .env.local and fill in your values:

# OpenAI
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Pinecone
PINECONE_API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
PINECONE_ENVIRONMENT=us-central1-gcp
PINECONE_INDEX_NAME=mercury
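
A small check like the one below can catch missing or mistyped values before the app starts. It is a sketch only: the helper findEnvProblems is hypothetical and is not part of the template; the four variable names are the ones listed above.

```typescript
// The four variable names come from the configuration above.
const REQUIRED_KEYS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
];

// Hypothetical helper: returns a list of human-readable problems,
// or an empty list when every required value looks usable.
function findEnvProblems(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_KEYS) {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing`);
    } else if (value !== value.trim()) {
      problems.push(`${key} has leading or trailing whitespace`);
    }
  }
  return problems;
}

const demoProblems = findEnvProblems({
  OPENAI_API_KEY: "sk-xxxx ",
  PINECONE_API_KEY: "xxxx",
  PINECONE_ENVIRONMENT: "us-central1-gcp",
});
// demoProblems reports the trailing space in OPENAI_API_KEY and the
// missing PINECONE_INDEX_NAME.
```

In a Node.js process the same function could be called with process.env as its argument.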

5. Start the App

Run the following command:

npm run dev

Visit http://localhost:3000 in your browser to see your app in action!

Template Features

  • OpenAI API for generating embeddings and GPT-3 responses
  • Pinecone integration
  • Next.js API Routes (Edge runtime) with streaming
  • Styled with Tailwind CSS
  • Fonts with @next/font
  • Icons from Lucide
  • Dark mode support with next-themes
  • Radix UI Primitives
  • Automatic import sorting with @ianvs/prettier-plugin-sort-imports

Inspiration & Acknowledgments

Special thanks to @gannonh and @mayooear, whose excellent work inspired this template.

How Embeddings Work

Think of ChatGPT as a librarian who can answer general questions but struggles with niche topics when no reliable texts are at hand. This app addresses that with embeddings, which act like a map that guides you to the exact resources you need. Each block of text is transformed into a vector of floating-point numbers, so the app can measure how close two pieces of text are by comparing their vectors.
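
One common way to measure that closeness is cosine similarity, sketched below. Whether the index in this setup uses cosine or another metric is an assumption, and the three-dimensional vectors are made-up toy values; real embeddings from text-embedding-ada-002 have 1536 dimensions.

```typescript
// Cosine similarity: the dot product of two vectors divided by the
// product of their lengths. 1 means the same direction, 0 means the
// vectors are orthogonal (no overlap).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors invented for this example, not real embeddings.
const queryVector = [0.9, 0.1, 0.0];
const sectionVectors = [
  { text: "Tailwind styles the UI", vector: [0.0, 0.1, 0.9] },
  { text: "Pinecone stores vectors", vector: [0.8, 0.2, 0.1] },
];

// Rank sections from most to least similar to the query.
const rankedSections = sectionVectors
  .map((s) => ({ text: s.text, score: cosineSimilarity(queryVector, s.vector) }))
  .sort((x, y) => y.score - x.score);
// The first entry is the Pinecone section, whose vector points in
// nearly the same direction as the query.
```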

Troubleshooting

If you encounter issues during setup or usage, consider these tips:

  • Ensure your API keys are correctly entered without extra spaces.
  • Check if your Pinecone account is active and properly set up.
  • Verify that your files conform to the supported formats listed above.
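
The last tip can be automated with a check along these lines. The helper isSupportedFile is hypothetical, and treating extensions as case-insensitive is an assumption; the extension list mirrors the Supported Files section above.

```typescript
// Extensions copied from the Supported Files list above.
const SUPPORTED_EXTENSIONS = [
  ".pdf", ".docx", ".md", ".txt", ".png", ".jpg", ".html", ".json",
];

// Hypothetical helper: checks a filename against the supported list.
function isSupportedFile(filename: string): boolean {
  const dot = filename.lastIndexOf(".");
  // No extension at all, or a dotfile such as ".env".
  if (dot <= 0) return false;
  // Assumption: "REPORT.PDF" should be treated like "report.pdf".
  const extension = filename.slice(dot).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(extension) !== -1;
}

const demoUploads = ["report.PDF", "data.csv", "notes.md"];
const rejectedUploads = demoUploads.filter((name) => !isSupportedFile(name));
// rejectedUploads contains only "data.csv", since .csv is still listed
// under Coming Soon.
```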

