MyScaleDB is a SQL vector database for developers who want to build production-grade, scalable AI applications. It lets you use your existing SQL skills to manage large volumes of data, including vectors. In this guide, you'll learn how to set up MyScaleDB, run a first vector search, and troubleshoot common issues.
What is MyScaleDB?
MyScaleDB is built on top of ClickHouse and optimized for AI applications. It manages and processes structured, text, and vector data in a single system. The key benefits of using MyScaleDB are:
- Fully SQL-Compatible: Use familiar SQL syntax with vector-related functions, so vector search and SQL-vector join queries are written like any other SQL query.
- Production-Ready for AI applications: A unified platform for managing various data formats, enhancing retrieval accuracy through metadata filtering.
- Unmatched Performance and Scalability: Leverage cutting-edge architecture for lightning-fast vector operations.
Quick Start with MyScaleDB
Using MyScale Cloud
The simplest way to use MyScaleDB is through the MyScale Cloud service. You can sign up for a free pod supporting 5 million vectors and access the MyScaleDB QuickStart documentation directly for additional guidance.
Self-Hosted Installation
If you prefer self-hosting, you can use Docker to run MyScaleDB. Follow these steps:
Using MyScaleDB Docker Image
To pull and run the latest MyScaleDB Docker image, execute the following command:
docker run --name myscaledb --net=host myscale/myscaledb:1.7.1
Note: The default configuration only allows access from localhost IP addresses, so make sure to pass the --net=host option to docker run, as in the command above.
Using Docker Compose
Set up your directory structure as follows, including the docker-compose.yaml file:
myscaledb
|-- docker-compose.yaml
|-- volumes
    |-- config
        |-- users.d
            |-- custom_users_config.xml
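The directory layout can also be prepared with a short script. The following is a minimal sketch, not part of the MyScaleDB tooling: the helper name prepare_layout is hypothetical, and it deliberately does not write custom_users_config.xml, since the contents of that file depend on your user settings. It only creates the directories and returns the path where the file is expected. Checking for the file before starting the containers is worthwhile because Docker typically creates a directory in place of a bind-mounted file that does not exist.

```python
from pathlib import Path


def prepare_layout(root: Path) -> Path:
    """Create <root>/volumes/config/users.d and return the expected config path.

    Hypothetical helper for illustration only. The config file itself is
    NOT created here; you still have to provide its contents.
    """
    users_dir = root / "volumes" / "config" / "users.d"
    # parents=True builds the whole chain; exist_ok=True makes reruns safe.
    users_dir.mkdir(parents=True, exist_ok=True)
    return users_dir / "custom_users_config.xml"


# Example usage (commented out so that nothing is created on import):
# config_path = prepare_layout(Path("myscaledb"))
# if not config_path.is_file():
#     print(f"Missing {config_path}; create it before running docker-compose")
```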
Here’s a sample configuration for your docker-compose.yaml:
version: '3.7'

services:
  myscaledb:
    image: myscale/myscaledb:1.7.1
    tty: true
    ports:
      - "8123:8123"
      - "9000:9000"
      - "8998:8998"
      - "9363:9363"
      - "9116:9116"
    networks:
      myscaledb_network:
        ipv4_address: 10.0.0.2
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/data:/var/lib/clickhouse
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/log:/var/log/clickhouse-server
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/config/users.d/custom_users_config.xml:/etc/clickhouse-server/users.d/custom_users_config.xml
    deploy:
      resources:
        limits:
          cpus: "16.00"
          memory: "32Gb"

networks:
  myscaledb_network:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 10.0.0.0/24
After updating your configuration file, execute the following commands to start your MyScaleDB instance:
cd myscaledb
docker-compose up -d
Access your MyScaleDB command line interface:
docker exec -it myscaledb-myscaledb-1 clickhouse-client
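Before opening the client, you may want to confirm from the host that the server is answering. The sketch below assumes that MyScaleDB keeps ClickHouse's HTTP interface on port 8123, including its /ping health endpoint; the function name server_is_up is hypothetical. It bypasses any configured HTTP proxy because the server is expected to be local.

```python
import http.client
import urllib.request


def server_is_up(base_url: str = "http://localhost:8123", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/ping answers with HTTP 200.

    Assumes the ClickHouse-style /ping endpoint is available (an assumption,
    not something confirmed by this guide).
    """
    # An empty ProxyHandler disables proxies taken from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, and bad replies.
        return False


# Example usage:
# print("MyScaleDB reachable:", server_is_up())
```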
Now, let’s jump into executing SQL statements!
MyScaleDB Example Usage
Creating a table with a vector column, loading data, building a vector index, and running a vector search can all be done with SQL commands. Here's how you can approach this:
- Create a table with a vector column of length 384:
CREATE TABLE default.wiki_abstract(
    id UInt64,
    body String,
    title String,
    url String,
    body_vector Array(Float32),
    CONSTRAINT check_length CHECK length(body_vector) = 384
) ENGINE = MergeTree ORDER BY id;
- Insert data from a Parquet file on S3:
INSERT INTO default.wiki_abstract
SELECT * FROM s3('https://myscale-datasets.s3.ap-southeast-1.amazonaws.com/wiki_abstract_with_vector.parquet', 'Parquet');
- Build a SCANN vector index with the Cosine metric on body_vector:
ALTER TABLE default.wiki_abstract
ADD VECTOR INDEX vec_idx body_vector TYPE SCANN(metric_type=Cosine);
- Run a vector search (array(...) is a placeholder for your 384-dimensional query vector):
SELECT id, title, distance(body_vector, array(...)) AS distance
FROM default.wiki_abstract
ORDER BY distance ASC
LIMIT 5;
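In practice, the array(...) argument in the search query is filled in by application code with a query embedding. The following is a minimal sketch of how that SQL string could be assembled in Python. The function name build_search_query and the constant EXPECTED_DIM are hypothetical; the only facts taken from this guide are the table name, the column names, and the 384-element CHECK constraint. It only builds the query text and leaves execution to whichever ClickHouse-compatible client you use.

```python
import math
from typing import Sequence

# Matches the CHECK constraint length(body_vector) = 384 on the table.
EXPECTED_DIM = 384


def build_search_query(query_vector: Sequence[float], limit: int = 5) -> str:
    """Return the vector search SQL with query_vector inlined as array(...).

    Hypothetical helper: validates the vector locally so that malformed
    input fails before a query is sent to the server.
    """
    if len(query_vector) != EXPECTED_DIM:
        raise ValueError(
            f"expected {EXPECTED_DIM} values, got {len(query_vector)}"
        )
    if limit < 1:
        raise ValueError("limit must be at least 1")
    values = [float(x) for x in query_vector]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("query vector must not contain NaN or infinity")
    # repr() keeps the shortest float text that round-trips exactly.
    vector_literal = ", ".join(repr(v) for v in values)
    return (
        f"SELECT id, title, distance(body_vector, array({vector_literal})) AS distance\n"
        "FROM default.wiki_abstract\n"
        "ORDER BY distance ASC\n"
        f"LIMIT {int(limit)}"
    )
```

Because every value is converted to float before formatting, only numeric text ends up in the query, which keeps this string-building approach safe for vectors that come from an embedding model.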
Troubleshooting
If you face issues while setting up MyScaleDB, here are some troubleshooting tips:
- Check the Docker service status if your containers aren’t starting.
- Ensure your configuration file paths are correct and directories are mounted properly.
- If you encounter connection problems, verify that the correct port is exposed and accessible.
- Inspect logs for any error messages that can provide insights into the issue.
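The port check from the tips above can be automated. This is a minimal sketch using only the Python standard library; the names port_is_open and check_ports are hypothetical, and the port list is simply the set of host ports published in the sample docker-compose.yaml. A successful TCP connection only shows that something is listening, not that the service behind it is healthy.

```python
import socket
from typing import Dict, Iterable

# Host ports published by the sample docker-compose.yaml in this guide.
PUBLISHED_PORTS = (8123, 9000, 8998, 9363, 9116)


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and unreachable hosts all land here.
        return False


def check_ports(host: str = "localhost",
                ports: Iterable[int] = PUBLISHED_PORTS) -> Dict[int, bool]:
    """Map each port to whether it currently accepts TCP connections."""
    return {port: port_is_open(host, port) for port in ports}


# Example usage:
# for port, is_open in check_ports().items():
#     print(port, "open" if is_open else "closed or filtered")
```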
Conclusion
With MyScaleDB, you can effectively manage both structured and vectorized data, making it a robust option for developers in the AI landscape. Utilize the tools and documentation provided to start building sophisticated datasets and applications now!

