What is Neum AI?
Neum AI is a robust data platform that empowers developers to leverage their data to contextualize Large Language Models using Retrieval Augmented Generation (RAG). It simplifies the extraction of data from existing data sources such as document stores and NoSQL databases, transforms the contents into vector embeddings, and efficiently manages these embeddings in vector databases for similarity search. This comprehensive solution is designed to grow alongside your application, minimizing the time spent integrating services such as data connectors, embedding models, and vector databases.
Features of Neum AI
- High throughput distributed architecture: It can handle billions of data points, optimizing embedding generation and ingestion through high degrees of parallelization.
- Built-in data connectors: Easily connect to common data sources, embedding services, and vector stores.
- Real-time synchronization: Ensures your data is always current.
- Customizable data pre-processing: Options for loading, chunking, and selecting data (see the sketch after this list).
- Cohesive data management: Supports hybrid retrieval with metadata, automatically augmenting and tracking metadata for a richer retrieval experience.
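To make that pre-processing point concrete, here is a minimal sketch of a selector that routes some fields into the embedding and others into metadata. The to_metadata field also appears in the pipeline example further down; to_embed and the field names used here are illustrative assumptions, so check the documentation for your version of the package.

from neumai.Shared.Selector import Selector

# Route fields either into the embedded text or along as metadata.
# to_embed and the field names are assumptions for illustration;
# to_metadata is used the same way in the full pipeline example below.
selector = Selector(
    to_embed=["title", "body"],
    to_metadata=["url", "author"]
)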
Getting Started with Neum AI
To get started, you have two choices: use the Neum AI Cloud or set up Local Development. Let’s dive into both!
1. Neum AI Cloud
Sign up today at dashboard.neum.ai. For a quick start, check out our quickstart guide. The Neum AI Cloud supports a large-scale, distributed architecture to run millions of documents through vector embedding.
2. Local Development
To initiate local development, install the neumai package by using the following command:
pip install neumai
To create your first data pipeline, visit our quickstart guide. A pipeline consists of one or multiple sources for data input, one embed connector to vectorize the content, and one sink connector to store the vectors.
Creating and Running a Pipeline
Let’s explore how to craft a pipeline through an analogy. Imagine you want to create a high-end pizza. First, you need:
- Dough (Data Sources): This is your base, similar to the SourceConnector in Neum AI that collects your ingredients from web pages, databases, or files.
- Ingredients (Embed Connectors): These are the toppings that give your pizza its flavor, akin to the OpenAIEmbed connector that turns your data into vector embeddings.
- Oven (Sink Connectors): The oven is where everything comes together, like the WeaviateSink, which stores your final product in a vector database.
This combination allows you to create a delicious pizza or, in our case, a well-functioning data retrieval system! Here is how that maps to code:
from neumai.DataConnectors.WebsiteConnector import WebsiteConnector
from neumai.Shared.Selector import Selector
from neumai.Loaders.HTMLLoader import HTMLLoader
from neumai.Chunkers.RecursiveChunker import RecursiveChunker
from neumai.Sources.SourceConnector import SourceConnector
from neumai.EmbedConnectors import OpenAIEmbed
from neumai.SinkConnectors import WeaviateSink
from neumai.Pipelines import Pipeline
# Collect content from a web page and carry its URL along as metadata
website_connector = WebsiteConnector(
    url="https://www.neum.ai/post/retrieval-augmented-generation-at-scale",
    selector=Selector(to_metadata=["url"])
)
# Define how the source is loaded and chunked before embedding
source = SourceConnector(
    data_connector=website_connector,
    loader=HTMLLoader(),
    chunker=RecursiveChunker()
)
openai_embed = OpenAIEmbed(api_key="<OPENAI API KEY>")
weaviate_sink = WeaviateSink(
    url="your-weaviate-url",
    api_key="your-api-key",
    class_name="your-class-name",
)
# Wire the source, embed, and sink into a pipeline and run it
pipeline = Pipeline(sources=[source], embed=openai_embed, sink=weaviate_sink)
pipeline.run()
# Query the vector store for the most relevant chunks
results = pipeline.search(
    query="What are the challenges with scaling RAG?",
    number_of_results=3
)
for result in results:
    print(result.metadata)
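To close the RAG loop described above, the retrieved chunks can be handed to an LLM as context. The following is a minimal sketch using the official openai Python client; the assumption that each result exposes its chunk text under a "text" key in result.metadata is illustrative and depends on how your sink stores chunks.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Assumption: chunk text is stored under a "text" key in each result's metadata;
# the exact key depends on how your sink stores chunks.
context = "\n\n".join(result.metadata.get("text", "") for result in results)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: What are the challenges with scaling RAG?"}
    ]
)
print(response.choices[0].message.content)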
Troubleshooting
Should you encounter any issues while setting up or running your pipeline, the following tips can guide you:
- Ensure that all API keys and URL endpoints are accurate and valid.
- Check the compatibility of your data source with Neum AI’s connectors.
- Verify network connectivity, especially when accessing external data sources.
- If the embeddings appear incorrect, analyze the chunking and loading parameters to ensure data is processed as desired (see the sketch below).
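As an example of that last tip, here is a hedged sketch of tightening the chunking step before re-running the pipeline. The chunk_size and chunk_overlap parameters are assumptions about RecursiveChunker's options, so verify the exact names against the documentation.

from neumai.Chunkers.RecursiveChunker import RecursiveChunker
from neumai.Loaders.HTMLLoader import HTMLLoader
from neumai.Sources.SourceConnector import SourceConnector

# Smaller chunks with a bit of overlap often make retrieval more precise.
# chunk_size / chunk_overlap are assumed parameter names - confirm in the docs.
source = SourceConnector(
    data_connector=website_connector,  # the website connector defined earlier
    loader=HTMLLoader(),
    chunker=RecursiveChunker(chunk_size=500, chunk_overlap=50)
)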
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
Connect with Us
If you have questions or need assistance, feel free to reach out via email at founders@tryneum.com or on Discord. You can also schedule a call with us to discuss your needs.
Available Connectors
For a complete list of available connectors, please check our documentation.

