Creating Knowledge Graphs from Unstructured Data: A How-to Guide

Aug 16, 2022 | Data Science

Welcome to this guide on utilizing the Knowledge Graph Builder App! In this tutorial, we’ll explore how to transform unstructured data from various sources into insightful knowledge graphs. If you’ve ever wondered how to structure a pile of information into a coherent graph, you’ve landed in the right place!

Overview of the Knowledge Graph Builder App

The Knowledge Graph Builder App converts unstructured data (PDFs, documents, text files, YouTube videos, and web pages) into structured knowledge graphs stored in Neo4j. It uses Large Language Models (LLMs) from providers such as OpenAI and Gemini, through the LangChain framework, to extract nodes, relationships, and their properties from the text.

Key Features

  • Knowledge Graph Creation: Convert unstructured data into structured knowledge graphs.
  • Providing Schema: Create custom schemas or use existing ones to guide graph generation.
  • View Graph: Visualize graphs for individual or multiple sources using Bloom.
  • Chat with Data: Engage with your data through conversational queries and retrieve metadata about the responses.

Getting Started

Warning: Make sure you have a Neo4j Database version 5.15 or later with APOC installed to use this Knowledge Graph Builder. You can opt for any Neo4j Aura database, including the free version.

If you’re using Neo4j Desktop, follow the instructions under “Running Backend and Frontend Separately” instead of the Docker-compose steps.
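
The version requirement can also be checked in a script rather than by eye. The following is a minimal sketch and not part of the app: parse_version and meets_minimum are hypothetical helper names, and the two Cypher strings are standard Neo4j and APOC calls that you would run yourself (in Neo4j Browser or through a driver) to obtain the values. Versions are compared as numbers because, as plain text, "5.9" would sort after "5.15".

```python
import re

MIN_VERSION = (5, 15)  # minimum Neo4j version stated in this guide

# Cypher to run against your database to obtain the inputs (not executed here).
VERSION_QUERY = "CALL dbms.components() YIELD name, versions RETURN name, versions"
APOC_QUERY = "RETURN apoc.version() AS apoc_version"  # fails if APOC is not installed


def parse_version(text):
    """Return the leading (major, minor) numbers of a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(text, minimum=MIN_VERSION):
    """True when the reported version is at least the required minimum."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    for reported in ("5.9.0", "5.15.0", "5.26-aura"):
        print(reported, "ok" if meets_minimum(reported) else "too old")
```

Paste the version string returned by the first query into meets_minimum; if the second query raises an error, APOC is most likely missing.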

Deployment

Local Deployment

Running through Docker-compose

To run the application, set your API keys in a .env file:

env
OPENAI_API_KEY=your-openai-key
DIFFBOT_API_KEY=your-diffbot-key

If you only want to use OpenAI or Diffbot, adjust the file accordingly. After setting your keys, run:

docker-compose up --build
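
Before starting the containers, it can help to confirm that the .env file actually contains the keys. Here is a small sketch under stated assumptions: it reads a file named .env in the current directory, treats the two key names shown above as required, and uses hypothetical helper names (parse_env, missing_keys). If you only use one provider, pass a shorter list as required.

```python
from pathlib import Path

REQUIRED_KEYS = ("OPENAI_API_KEY", "DIFFBOT_API_KEY")  # key names from this guide


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

This only checks that a value is present; it cannot tell whether the key itself is valid.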

Additional Configs

By default, the app accepts local files, YouTube videos, Wikipedia, AWS S3, and web pages as sources. To enable Google Cloud Storage (GCS), add gcs to the source list in the .env file and set your Google client ID:

env
VITE_REACT_APP_SOURCES=local,youtube,wiki,s3,gcs,web
VITE_GOOGLE_CLIENT_ID=xxxx
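
The two settings depend on each other: listing gcs without a client ID leaves the source unusable. A short sketch of that consistency check follows, assuming the variables have already been loaded into a dictionary; enabled_sources and gcs_config_problems are hypothetical names used only for illustration.

```python
def enabled_sources(env):
    """Split the comma-separated source list into clean names."""
    raw = env.get("VITE_REACT_APP_SOURCES", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def gcs_config_problems(env):
    """Return human-readable problems with the GCS-related settings."""
    problems = []
    if "gcs" in enabled_sources(env) and not env.get("VITE_GOOGLE_CLIENT_ID"):
        problems.append("gcs is enabled but VITE_GOOGLE_CLIENT_ID is empty")
    return problems


if __name__ == "__main__":
    env = {"VITE_REACT_APP_SOURCES": "local,youtube,wiki,s3,gcs,web"}
    print(gcs_config_problems(env))
```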

Chat Modes

All chat modes are available by default. To restrict the app to specific modes, list them in your .env file:

env
VITE_CHAT_MODES=vector,graph+vector

Running Backend and Frontend Separately (Dev Environment)

Should you choose to run the backend and frontend separately:

  1. For the frontend, create a frontend.env file and modify it as required. Then run:

    bash
    cd frontend
    yarn
    yarn run dev

  2. For the backend, create a backend.env file in the same way, then run:

    bash
    cd backend
    python -m venv envName
    source envName/bin/activate
    pip install -r requirements.txt
    uvicorn score:app --reload

Deployment in Cloud

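To deploy on Google Cloud Run, run each of the following commands from the directory that holds the corresponding code, since --source . deploys the current directory: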

# Frontend deploy
gcloud run deploy --source . --region us-central1 --allow-unauthenticated
# Backend deploy
gcloud run deploy --set-env-vars OPENAI_API_KEY=your-openai-key --set-env-vars DIFFBOT_API_KEY=your-diffbot-key --set-env-vars NEO4J_URI=your-neo4j-uri --set-env-vars NEO4J_PASSWORD=your-password --set-env-vars NEO4J_USERNAME=your-username --source . --region us-central1 --allow-unauthenticated
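
The backend command is long and easy to mistype. One option is to assemble it from a list of variable names, as in the sketch below. This is an illustration only: backend_deploy_args is a hypothetical helper, the flags are exactly those in the command above, and the values are read from your current shell environment so that secrets do not need to be typed into the command line.

```python
import os
import shlex

# Variable names taken from the backend deploy command in this guide.
BACKEND_VARS = (
    "OPENAI_API_KEY",
    "DIFFBOT_API_KEY",
    "NEO4J_URI",
    "NEO4J_PASSWORD",
    "NEO4J_USERNAME",
)


def backend_deploy_args(values, region="us-central1"):
    """Build the argument list for the backend `gcloud run deploy` call."""
    args = ["gcloud", "run", "deploy"]
    for name in BACKEND_VARS:
        if name not in values:
            raise KeyError(f"missing value for {name}")
        args += ["--set-env-vars", f"{name}={values[name]}"]
    args += ["--source", ".", "--region", region, "--allow-unauthenticated"]
    return args


if __name__ == "__main__":
    # Unset variables fall back to a visible placeholder instead of failing.
    values = {name: os.environ.get(name, f"your-{name.lower()}") for name in BACKEND_VARS}
    print(shlex.join(backend_deploy_args(values)))
```

The script only prints the command, quoted for the shell; review it before running it yourself.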

Usage Instructions

  1. Connect to your Neo4j Aura Instance using the URI and password.
  2. Select your source from the unstructured data list.
  3. Adjust the LLM from the dropdown if necessary.
  4. Optionally define your node and relationship labels.
  5. Select files to generate your graph, or process all files in New status.
  6. Preview your graph using “View in Grid.”
  7. Use the chatbot to ask questions about your processed sources.

Troubleshooting

If you encounter any issues during the setup or execution, here are a few troubleshooting ideas:

  • Ensure that your API keys are correctly set in your .env file.
  • Verify that your Neo4j database is running and accessible.
  • Check your Docker setup for any misconfiguration.
  • If you aren’t seeing data in your knowledge graph, double-check the input sources and settings.
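
To check whether the database is reachable at all, a plain TCP connection test is often enough to separate network problems from credential problems. The sketch below assumes the default Bolt port 7687 when the URI does not name one; host_and_port and is_reachable are hypothetical helpers, and the URI in the example is a placeholder for your own.

```python
import socket
from urllib.parse import urlparse

DEFAULT_BOLT_PORT = 7687  # Neo4j's default Bolt port


def host_and_port(uri):
    """Extract (host, port) from a URI such as neo4j+s://host or bolt://host:7687."""
    parsed = urlparse(uri)
    if not parsed.hostname:
        raise ValueError(f"no host found in URI: {uri!r}")
    return parsed.hostname, parsed.port or DEFAULT_BOLT_PORT


def is_reachable(uri, timeout=3.0):
    """Return True if a TCP connection to the database host can be opened."""
    host, port = host_and_port(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    uri = "bolt://localhost:7687"  # placeholder; replace with your NEO4J_URI
    print(uri, "reachable" if is_reachable(uri) else "not reachable")
```

A successful connection shows only that something is listening on that port; a wrong username or password will still fail later, at login.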

Conclusion

Congratulations! You have now learned how to harness the power of the Knowledge Graph Builder App. With all the components in place, you can easily turn unstructured data into structured knowledge graphs, enhancing your data analysis capabilities. Happy graph building!
