How to Create Your Own Custom GPT Using the GPT Crawler

Jun 15, 2024 | Educational

Are you looking to create a custom GPT (Generative Pre-trained Transformer) from specific websites? The GPT Crawler is a tool that crawls a website and generates knowledge files you can upload to build your own custom GPT or assistant. No model training is involved: the crawled content is attached as reference knowledge. In this post, we’ll walk through the steps to get started, along with a few troubleshooting tips.

Getting Started

Follow the steps below to set up GPT Crawler locally.

Running Locally

Clone the Repository

First, clone the GPT Crawler repository. Make sure you have Node.js version 16 or later installed on your machine.

git clone https://github.com/BuilderIO/gpt-crawler

Install Dependencies

Next, navigate to the cloned repository and install the necessary dependencies using npm.

npm i

Configure the Crawler

Open the config.ts file and update the url, match, and selector properties according to your needs. For example, if you’re crawling the Builder.io documentation, it might look like the following:

export const defaultConfig: Config = {
  url: "https://www.builder.io/docs/developers",
  match: "https://www.builder.io/docs/**",
  selector: ".docs-builder-container",
  maxPagesToCrawl: 50,
  outputFileName: "output.json",
};

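This setup tells the crawler where to start (url), which links to follow (match), which page element to extract text from (selector), how many pages to visit at most (maxPagesToCrawl), and where to save the output data (outputFileName).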
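To build intuition for how a match pattern such as "https://www.builder.io/docs/**" narrows the crawl, here is a minimal sketch of glob matching. It is a simplified approximation written for illustration, not the crawler's actual matching code: the helper names globToRegExp and shouldCrawl are hypothetical, and the sketch assumes that ** matches any characters while a single * stops at a slash.

```typescript
// Simplified glob matcher for illustration only (hypothetical helper,
// not taken from the GPT Crawler source).
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let i = 0;
  while (i < glob.length) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        pattern += ".*"; // "**" spans path segments
        i += 2;
      } else {
        pattern += "[^/]*"; // "*" stays within one path segment
        i += 1;
      }
    } else {
      // Escape characters that have a special meaning in regular expressions.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${pattern}$`);
}

// Returns true when a discovered link falls inside the match pattern.
function shouldCrawl(link: string, match: string): boolean {
  return globToRegExp(match).test(link);
}

const match = "https://www.builder.io/docs/**";
console.log(shouldCrawl("https://www.builder.io/docs/developers", match));
console.log(shouldCrawl("https://www.builder.io/blog/some-post", match));
```

With this pattern, links under /docs/ are accepted and links elsewhere on the site are skipped, which is why a too-narrow match value often explains a crawl that returns fewer pages than expected.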

Run Your Crawler

With your configuration ready, initiate the crawling process.

npm start

Alternative Methods

You can also run GPT Crawler in different environments.

Running in a Container with Docker

If you prefer Docker, go into the containerapp directory and adjust its config.ts in the same way as above. Then run the container to generate output.json; in the containerized setup the file is written to the data folder.

Running as an API

To run the crawler as an API server, install the dependencies with npm install and then start the server. The server listens on port 3000 by default.

npm run start:server
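Once the server is running, a crawl can be triggered over HTTP. The sketch below only builds the request so it can be inspected before sending. It assumes the server exposes a POST /crawl endpoint that accepts the same fields as config.ts as a JSON body, as the project's README describes at the time of writing; verify this against your checkout. The CrawlConfig type and the buildCrawlRequest helper are hypothetical names introduced for illustration.

```typescript
// Fields mirror the config.ts example; the type name is hypothetical.
interface CrawlConfig {
  url: string;
  match: string;
  selector: string;
  maxPagesToCrawl: number;
  outputFileName: string;
}

interface CrawlRequest {
  endpoint: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumption: the server accepts the config as JSON on POST /crawl.
function buildCrawlRequest(
  config: CrawlConfig,
  baseUrl: string = "http://localhost:3000",
): CrawlRequest {
  return {
    endpoint: `${baseUrl}/crawl`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(config),
  };
}

const request = buildCrawlRequest({
  url: "https://www.builder.io/docs/developers",
  match: "https://www.builder.io/docs/**",
  selector: ".docs-builder-container",
  maxPagesToCrawl: 50,
  outputFileName: "output.json",
});

// To send it (Node.js 18+ ships a global fetch):
//   fetch(request.endpoint, { method: request.method, headers: request.headers, body: request.body })
console.log(request.endpoint);
```

Keeping request construction separate from the network call makes it easy to log or unit test the payload before pointing it at a real server.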

Upload Your Data to OpenAI

After crawling, you’ll have a file named output.json. This file can be uploaded to OpenAI to create your custom assistant or GPT.
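Before uploading, it can help to sanity-check what the crawl produced. The following sketch assumes output.json is a JSON array of page objects with title, url, and html fields; confirm this against your own file, since the shape is an assumption here. The CrawledPage type, the summarizePages helper, and the sample data are hypothetical and exist only for illustration.

```typescript
// Assumed shape of one entry in output.json.
interface CrawledPage {
  title: string;
  url: string;
  html: string;
}

interface CrawlSummary {
  pageCount: number;
  totalChars: number;
  largest: CrawledPage | undefined;
}

// Counts pages and extracted characters, and finds the largest page.
function summarizePages(pages: CrawledPage[]): CrawlSummary {
  let totalChars = 0;
  let largest: CrawledPage | undefined;
  for (const page of pages) {
    totalChars += page.html.length;
    if (largest === undefined || page.html.length > largest.html.length) {
      largest = page;
    }
  }
  return { pageCount: pages.length, totalChars, largest };
}

// Sample data standing in for JSON.parse of the real output.json contents.
const samplePages: CrawledPage[] = [
  { title: "Intro", url: "https://example.com/docs/intro", html: "Welcome to the docs." },
  { title: "Setup", url: "https://example.com/docs/setup", html: "Install, configure, run." },
];

const summary = summarizePages(samplePages);
console.log(`${summary.pageCount} pages, ${summary.totalChars} characters`);
```

A page count far below maxPagesToCrawl, or pages with empty html, usually points to a match or selector value that needs adjusting.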

Create a Custom GPT

1. Go to ChatGPT.
2. Click your name in the bottom left corner.
3. Select “My GPTs” from the menu.
4. Click “Create a GPT.”
5. Choose “Configure,” then under “Knowledge” select “Upload a file” to upload your generated output.json.

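If you encounter an error about the file being too large, split the output into multiple files and upload them separately, or reduce the number of tokens it contains. The crawler's config.ts supports this through the maxFileSize and maxTokens options.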
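If you would rather split an existing output file by hand, the idea can be sketched as follows. This is a minimal illustration, not part of GPT Crawler: it assumes output.json is a JSON array of page objects with title, url, and html fields, and it approximates size by counting the characters of each page's JSON rather than its exact byte size. The splitPages helper is a hypothetical name.

```typescript
// Assumed shape of one entry in output.json.
interface CrawledPage {
  title: string;
  url: string;
  html: string;
}

// Greedily groups pages into chunks whose serialized size stays within maxChars.
// A single page larger than maxChars still gets a chunk of its own.
function splitPages(pages: CrawledPage[], maxChars: number): CrawledPage[][] {
  const chunks: CrawledPage[][] = [];
  let current: CrawledPage[] = [];
  let currentSize = 0;

  for (const page of pages) {
    const size = JSON.stringify(page).length; // character count, not bytes
    if (current.length > 0 && currentSize + size > maxChars) {
      chunks.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(page);
    currentSize += size;
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}

// Three equally sized sample pages; each chunk would be saved as its own file.
const pages: CrawledPage[] = ["A", "B", "C"].map((title) => ({
  title,
  url: "u",
  html: "x".repeat(100),
}));

const chunks = splitPages(pages, 300);
chunks.forEach((chunk, index) => {
  console.log(`output-${index + 1}.json would hold ${chunk.length} page(s)`);
});
```

Each chunk can then be written to its own file and uploaded separately. Splitting on page boundaries keeps every file valid JSON.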

Create a Custom Assistant

This option provides API access to your indexed knowledge. To create one:

1. Navigate to OpenAI Assistants.
2. Click “+ Create.”
3. Select “upload” and upload your output.json.

Troubleshooting Tips

  • If the crawler isn’t fetching the expected data, double-check the URL and selector in the config.ts file.
  • Make sure you are running Node.js 16 or later (check with node -v); upgrade if necessary.
  • If the upload to OpenAI fails, check whether the file exceeds OpenAI's file size or token limits.
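The Node.js version check from the tips above can also be automated. This is a small sketch under the assumption that version 16 is the minimum, as stated earlier in this post; the helper names parseMajorVersion and isSupportedNode are hypothetical.

```typescript
// Extracts the major version from strings such as "v18.17.0" or "18.17.0".
function parseMajorVersion(version: string): number | null {
  const match = /^v?(\d+)\./.exec(version.trim());
  return match ? Number(match[1]) : null;
}

// Returns true when the version meets the assumed minimum of Node.js 16.
function isSupportedNode(version: string, minMajor: number = 16): boolean {
  const major = parseMajorVersion(version);
  return major !== null && major >= minMajor;
}

// In a Node.js script, pass process.version instead of a literal string.
for (const version of ["v14.21.3", "v16.20.2", "v20.11.0"]) {
  const status = isSupportedNode(version) ? "ok" : "too old";
  console.log(`${version}: ${status}`);
}
```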

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

Happy crawling and creating your custom GPT!
