Meet LiteLLM Agent Platform: A Kubernetes-Based, Self-Hosted Infrastructure Layer for Isolated Agent Sandboxes and Persistent Session Management in Production

Learn how to set up and deploy the LiteLLM Agent Platform using Docker and Kubernetes for isolated AI agent sandboxes and persistent session management.

Introduction

Running a simple AI agent from a local script is easy. Deploying agents to production across teams, with isolated environments and persistent sessions, is considerably harder. The LiteLLM Agent Platform is an open-source project that targets this problem. In this tutorial, you'll set up and run a basic LiteLLM Agent Platform on a local Kubernetes cluster, using Docker and Minikube, so that you can create isolated agent sandboxes and manage persistent sessions.

This tutorial will guide you through:

Installing and setting up the necessary tools
Creating a simple Kubernetes cluster
Deploying the LiteLLM Agent Platform
Testing the platform with a sample agent

By the end of this tutorial, you'll have a working LiteLLM Agent Platform that you can use to run AI agents in isolated environments.

Prerequisites

Before starting this tutorial, ensure you have the following installed on your system:

Docker - For containerization
kubectl - The Kubernetes command-line tool, for managing resources in the cluster
Minikube - For creating a local Kubernetes cluster
Git - For cloning repositories

Why these tools? Docker allows us to package our application and its dependencies into containers, making deployment consistent across environments. Kubernetes orchestrates these containers at scale, and Minikube provides a local Kubernetes environment for testing. Git is needed to download the LiteLLM Agent Platform codebase.
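
Before continuing, it can help to confirm that all four tools are actually on your PATH. The sketch below is one way to do that; the function name missing_tools is an illustrative helper and not part of any of these tools. It relies only on the standard command -v shell builtin.

```shell
#!/bin/sh
# Print the name of every given tool that is not found on PATH.
# missing_tools is an illustrative helper, not part of Docker, kubectl,
# Minikube, or Git.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

Calling missing_tools docker kubectl minikube git prints nothing when everything is installed, and otherwise prints one missing tool name per line.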

Step-by-Step Instructions

1. Install Required Tools

First, make sure you have all the necessary tools installed. For this tutorial, we'll use Minikube to create a local Kubernetes cluster.

For macOS:

brew install minikube kubectl

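For Ubuntu (x86-64):

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Why? These commands install Minikube and kubectl, the two tools you need to run and control a local Kubernetes cluster. On Ubuntu, kubectl is downloaded as a binary because it is not available from the default apt repositories.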

2. Start a Local Kubernetes Cluster

Now, start a local Kubernetes cluster using Minikube:

minikube start

This command creates a local Kubernetes cluster. You can check if it's running with:

kubectl cluster-info

Why? Minikube creates a single-node Kubernetes cluster that's perfect for testing and development purposes.
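
If you script this step, you may want to block until the node is actually ready rather than checking by hand. A minimal sketch using kubectl wait follows; the function name wait_for_cluster and the 120-second default timeout are assumptions chosen for illustration.

```shell
#!/bin/sh
# Block until every node reports the Ready condition, or fail on timeout.
# wait_for_cluster and its default timeout are illustrative choices.
wait_for_cluster() {
  timeout="${1:-120s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}
```

Run wait_for_cluster right after minikube start; a non-zero exit status means the node did not become ready in time.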

3. Clone the LiteLLM Agent Platform Repository

Next, clone the LiteLLM Agent Platform repository from GitHub:

git clone https://github.com/BerriAI/litellm-agent-platform.git
cd litellm-agent-platform

Why? This repository contains the necessary configuration files and scripts to deploy the LiteLLM Agent Platform on Kubernetes.

4. Deploy LiteLLM Agent Platform

Now, deploy the LiteLLM Agent Platform to your local Kubernetes cluster:

kubectl apply -f deploy/

This command applies all the Kubernetes manifests in the deploy directory, which will create the necessary resources for the platform.

Why? The manifests define the desired state of the platform, including services, deployments, and other Kubernetes resources.
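
To catch a wrong path or a malformed manifest before anything changes in the cluster, you can preview the apply first. The sketch below assumes the manifests live in the deploy directory used above; the function name apply_manifests is hypothetical. It uses the --dry-run=client flag of kubectl apply, which prints what would be created without sending it to the cluster.

```shell
#!/bin/sh
# Preview the manifests with a client-side dry run, then apply them.
# apply_manifests is an illustrative helper; the default directory is the
# one used in this step.
apply_manifests() {
  dir="${1:-deploy/}"
  if [ ! -d "$dir" ]; then
    echo "manifest directory not found: $dir" >&2
    return 1
  fi
  # The real apply only runs if the dry run succeeds.
  kubectl apply -f "$dir" --dry-run=client && kubectl apply -f "$dir"
}
```

Run apply_manifests from the repository root. If the directory layout in your checkout differs, pass the actual manifest directory as the first argument.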

5. Check Deployment Status

After deploying, check if all pods are running:

kubectl get pods

You should see pods related to the LiteLLM Agent Platform, such as the gateway, agent, and database pods.

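Why? A pod stuck in a state such as Pending or CrashLoopBackOff means a component failed to start, so confirming that every pod shows Running catches problems before you send any requests.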
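
Instead of re-running kubectl get pods until everything settles, you can let kubectl wait do the polling. This is a sketch under two assumptions: that the platform's pods run in the default namespace (the commands above do not specify one), and that a 300-second timeout is long enough. The function name wait_for_pods is illustrative.

```shell
#!/bin/sh
# Block until every pod in the namespace reports Ready, or fail on timeout.
# wait_for_pods, the default namespace, and the timeout are assumptions.
wait_for_pods() {
  namespace="${1:-default}"
  timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all \
    --namespace "$namespace" --timeout="$timeout"
}
```

Note that pods belonging to completed Jobs never report Ready, so this check will time out if the namespace contains any.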

6. Access the Platform

To access the LiteLLM Agent Platform, we'll port-forward to the gateway service:

kubectl port-forward svc/litellm-gateway 8000:80

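You can now access the platform at http://localhost:8000. The port-forward command keeps running in the foreground, so leave it open and use a second terminal for the next step.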

Why? Port-forwarding allows you to access services running inside the Kubernetes cluster from your local machine.
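
When scripting, the port-forward is usually started in the background, and the script then has to wait until the local port accepts connections. The helper below is a minimal sketch of that polling loop; wait_for_url is a hypothetical name, and it treats any HTTP response, including an error status, as proof that the port is open.

```shell
#!/bin/sh
# Poll a URL until curl can connect, or give up after a number of attempts.
# wait_for_url is an illustrative helper; any HTTP response counts as "up".
wait_for_url() {
  url="$1"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without -f, curl exits 0 for any HTTP response and non-zero when
    # it cannot connect.
    if curl -s -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

A typical use is to start the port-forward command with a trailing & so that it runs in the background, call wait_for_url http://localhost:8000, and stop the forward with kill once you are done.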

7. Test the Platform

To test if the platform works, send a simple request to the gateway:

curl -X POST http://localhost:8000/agent/ \
  -H "Content-Type: application/json" \
  -d '{"task": "Hello, LiteLLM Agent!"}'

You should receive a response indicating that the agent processed your task.

Why? This confirms that the platform is correctly routing requests to the agent and processing them.
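
For an automated check, it is often easier to look at the HTTP status code than at the response body. The sketch below reuses the URL and payload from this step; the function names post_task and is_success are illustrative, and the exact status code the gateway returns on success is not documented here, so any 2xx code is treated as success.

```shell
#!/bin/sh
# Send the sample task and print only the HTTP status code.
# post_task is an illustrative name; the URL and payload come from this step.
post_task() {
  url="${1:-http://localhost:8000/agent/}"
  curl -s -o /dev/null -w '%{http_code}' -X POST "$url" \
    -H "Content-Type: application/json" \
    -d '{"task": "Hello, LiteLLM Agent!"}'
}

# Succeed only for 2xx status codes. curl reports 000 when it cannot
# connect, which is treated as a failure.
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}
```

Combined, is_success "$(post_task)" exits with status 0 only when the gateway answered with a 2xx code.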

Summary

In this tutorial, you've learned how to set up and deploy the LiteLLM Agent Platform using Docker and Kubernetes. You created a local Kubernetes cluster with Minikube, deployed the platform, and tested it with a simple request. This platform allows you to run AI agents in isolated environments and manage persistent sessions, which is crucial for production deployments.

Remember, this is just a basic setup. In production, you'll want to configure more advanced features like authentication, monitoring, and scaling. The LiteLLM Agent Platform provides a solid foundation for building robust AI agent systems.