Welcome to the exciting world of AI model serving! Today, we will delve into how to quickly set up and run your AI models using LitServe, a flexible serving engine built on FastAPI, designed for high performance and ease of use.
What is LitServe?
LitServe is a powerful tool that simplifies serving AI models. With its lightning-fast performance, it lets you serve everything from traditional ML models to large language models (LLMs) without the hassle of rebuilding a FastAPI server for each one. By leveraging batching, streaming, and GPU autoscaling, LitServe makes serving your AI models easy and efficient.
Getting Started with LitServe
Here’s a step-by-step guide on how to set up a server with LitServe:
Step 1: Installation
Install LitServe using pip. It’s that simple!
pip install litserve
Step 2: Define Your Server
In this toy example, we’ll define a server that handles two mathematical models (a compound AI system):
import litserve as ls

class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # Load the two "models" once per worker; here they are simple lambdas.
        self.model1 = lambda x: x**2
        self.model2 = lambda x: x**3

    def decode_request(self, request):
        # Pull the numeric input out of the JSON request body.
        return request["input"]

    def predict(self, x):
        # Run both models and combine their results.
        squared = self.model1(x)
        cubed = self.model2(x)
        output = squared + cubed
        return {"output": output}

    def encode_response(self, output):
        # predict() already returns a JSON-serializable dict, so pass it through.
        return output

if __name__ == "__main__":
    server = ls.LitServer(SimpleLitAPI(), accelerator="auto", max_batch_size=1)
    server.run(port=8000)
In this code snippet, we set up a compound AI system with two simple mathematical models. Here’s how you can think of it:
Imagine you are a chef in a kitchen with two separate cooking stations. At station one, you’re preparing a dish that requires squaring ingredients (like squaring a number), while at station two, you’re preparing another dish that requires cubing the same ingredients. LitServe acts as the head chef who coordinates these two stations efficiently, serving up a delicious compound meal based on user requests.
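To make the data flow concrete, here is the same request/predict/response pipeline traced by hand for the input 4.0, without starting a server:

```python
# Trace of the pipeline from the snippet above for the request {"input": 4.0}.
request = {"input": 4.0}

x = request["input"]                      # decode_request: extract the number
squared = x ** 2                          # model1 ("station one"): 16.0
cubed = x ** 3                            # model2 ("station two"): 64.0
prediction = {"output": squared + cubed}  # predict: combine both results
response = prediction                     # encode_response: pass-through

print(response)  # {'output': 80.0}
```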
Step 3: Run Your Server
Now, you can run your server using the command:
python server.py
Your AI server is now live!
Step 4: Test Your Server
You can test your server using the auto-generated test client or by sending a request using cURL:
curl -X POST http://127.0.0.1:8000/predict -H "Content-Type: application/json" -d '{"input": 4.0}'
This command POSTs the input 4.0 to the /predict endpoint; with the models above, the server should respond with {"output": 80.0} (4.0 squared plus 4.0 cubed).
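The same request can be sent from Python. This minimal client uses only the standard library and assumes the server from Step 2 is running locally on port 8000:

```python
import json
import urllib.request

def build_request(x):
    """Build the POST request the /predict endpoint expects."""
    body = json.dumps({"input": x}).encode("utf-8")
    return urllib.request.Request(
        "http://127.0.0.1:8000/predict",
        data=body,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # The network call happens only when run as a script, against a live server.
    with urllib.request.urlopen(build_request(4.0)) as resp:
        print(json.load(resp))
```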
Troubleshooting Common Issues
If you encounter issues while setting up or running LitServe, try these troubleshooting steps:
- Make sure all dependencies (litserve and its requirements) are installed in the active environment.
- Check for typos in the LitAPI hook names: setup, decode_request, predict, and encode_response must be spelled exactly, or they will never be called.
- Ensure your models run without errors on their own before wiring them into the server.
- If the server is not responding, confirm that nothing else is using the port and that you are querying the one you started the server on (8000 in this example).
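When the server does not respond, it helps to distinguish "server not running" from "request malformed". This small standard-library check tells you whether anything is listening on the expected host and port at all (127.0.0.1:8000, matching the example above):

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print(port_open("127.0.0.1", 8000))  # True if the LitServe server is up
```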
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Advanced Features of LitServe
LitServe boasts several advanced features, including:
- Batching and Streaming
- GPU Autoscaling
- Self-hosting or managed hosting options
- Support for various model types (LLMs, vision, etc.)
- OpenAPI compliant
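Batching, for example, is enabled at server construction time. The sketch below assumes the max_batch_size and batch_timeout arguments of your installed LitServe version (argument names can vary across releases, so check the documentation for the version you have):

```python
import litserve as ls

# Assumes SimpleLitAPI from Step 2 is defined in this file. With
# max_batch_size > 1, LitServe groups concurrent requests together, so
# predict() then receives batched inputs and must handle them accordingly.
server = ls.LitServer(
    SimpleLitAPI(),
    accelerator="auto",   # pick CPU/GPU automatically
    max_batch_size=4,     # up to 4 requests per batch
    batch_timeout=0.05,   # wait at most 50 ms to fill a batch
)
# server.run(port=8000)
```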
Conclusion
With LitServe, deploying AI models has never been easier. Its flexibility and performance enhancements make it an excellent choice for developers looking to leverage the power of AI.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

