Easily Serve AI Models Lightning Fast with LitServe

Oct 11, 2020 | Educational

Welcome to the world of AI model serving! In this guide, we will walk through how to quickly set up and run your AI models using LitServe, a flexible serving engine built on FastAPI and designed for high performance and ease of use.

What is LitServe?

LitServe is a powerful tool that simplifies serving AI models. It lets you serve everything from traditional ML models to large language models (LLMs) without rebuilding a FastAPI server for each one, and its batching, streaming, and GPU autoscaling features keep serving fast and efficient.

Getting Started with LitServe

Here’s a step-by-step guide on how to set up a server with LitServe:

Step 1: Installation

pip install litserve

Install LitServe using pip. It’s that simple!

Step 2: Define Your Server

In this toy example, we’ll define a server that handles two mathematical models (a compound AI system):

import litserve as ls

class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # Define (or load) the models once, when the server starts.
        self.model1 = lambda x: x**2
        self.model2 = lambda x: x**3

    def decode_request(self, request):
        # Extract the input value from the JSON request body.
        return request['input']

    def predict(self, x):
        # Run both models and combine their outputs.
        squared = self.model1(x)
        cubed = self.model2(x)
        output = squared + cubed
        return {'output': output}

    def encode_response(self, output):
        # The prediction is already a JSON-serializable dict.
        return output

if __name__ == "__main__":
    server = ls.LitServer(SimpleLitAPI(), accelerator='auto', max_batch_size=1)
    server.run(port=8000)

In this code snippet, we set up a compound AI system with two simple mathematical models. Here’s how you can think of it:

Imagine you are a chef in a kitchen with two separate cooking stations. At station one, you’re preparing a dish that requires squaring ingredients (like squaring a number), while at station two, you’re preparing another dish that requires cubing the same ingredients. LitServe acts as the head chef who coordinates these two stations efficiently, serving up a delicious compound meal based on user requests.
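Before deploying, you can trace the same decode → predict → encode pipeline by hand. The standalone sketch below mirrors the math from `SimpleLitAPI` (no server or litserve install required), so you can verify the arithmetic in isolation:

```python
# Standalone sketch of the decode -> predict -> encode pipeline
# from SimpleLitAPI above (no server required).

def decode_request(request):
    # Pull the raw number out of the JSON-style request body.
    return request["input"]

def predict(x):
    # "Station one" squares, "station two" cubes; combine the two dishes.
    squared = x ** 2
    cubed = x ** 3
    return {"output": squared + cubed}

def encode_response(output):
    # The prediction is already a JSON-serializable dict.
    return output

response = encode_response(predict(decode_request({"input": 4.0})))
print(response)  # {'output': 80.0}
```

For an input of 4.0, the pipeline computes 4.0² + 4.0³ = 16 + 64 = 80.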

Step 3: Run Your Server

Now, you can run your server using the command:

python server.py

Your AI server is now live!

Step 4: Test Your Server

You can test your server using the auto-generated test client or by sending a request using cURL:

curl -X POST http://127.0.0.1:8000/predict -H "Content-Type: application/json" -d '{"input": 4.0}'

This command sends the number 4.0 to your server. With the example models above, the server computes 4.0² + 4.0³ = 16 + 64 and returns {"output": 80.0}.
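If you prefer to test from Python, here is a small client sketch using only the standard library. It assumes the server from Step 2 is running locally on port 8000; the helper names (`build_payload`, `query_server`) are our own, not part of LitServe:

```python
import json
import urllib.request

def build_payload(x):
    # Matches what decode_request expects: a JSON object with an "input" key.
    return {"input": x}

def query_server(x, url="http://127.0.0.1:8000/predict"):
    # POST the payload as JSON and return the decoded response.
    data = json.dumps(build_payload(x)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (with the server from Step 2 running):
#   query_server(4.0)
```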

Troubleshooting Common Issues

If you encounter issues while setting up or running LitServe, try these troubleshooting steps:

  • Make sure litserve and its dependencies installed correctly (for example, check with pip show litserve).
  • Confirm the request body matches what decode_request expects (here, a JSON object with an "input" key).
  • Ensure your models run without errors in setup and predict; check the server logs for tracebacks.
  • If the server is not responding, verify you are using the right port and that nothing else is bound to it.
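For the last point, a quick way to check whether anything is listening on a port is a plain TCP probe. The helper below uses only the standard library (the function name is our own, not part of LitServe):

```python
import socket

def port_open(host, port, timeout=1.0):
    # Try a TCP connection; True means something is listening on that port.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the default LitServe port before debugging further.
print(port_open("127.0.0.1", 8000))
```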

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Advanced Features of LitServe

LitServe boasts several advanced features, including:

  • Batching and Streaming
  • GPU Autoscaling
  • Self-hosting or managed hosting options
  • Support for various model types (LLMs, vision, etc.)
  • OpenAPI compliant
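Batching is worth a closer look: it groups concurrent requests so the model runs once per batch rather than once per request. The sketch below illustrates the idea in plain Python using the example models from Step 2; it is a conceptual illustration, not LitServe's internal implementation:

```python
def predict_one(x):
    # The per-request model from the example above.
    return x ** 2 + x ** 3

def predict_batch(xs):
    # One call handles the whole batch; with a real model this is where
    # vectorization (e.g. a single GPU forward pass) pays off.
    return [x ** 2 + x ** 3 for x in xs]

# Four concurrent requests served in a single batched call:
batch = [1.0, 2.0, 3.0, 4.0]
results = predict_batch(batch)
print(results)  # [2.0, 12.0, 36.0, 80.0]
```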

Conclusion

With LitServe, deploying AI models has never been easier. Its flexibility and performance enhancements make it an excellent choice for developers looking to leverage the power of AI.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
