Deploy Letta on a VPS
A self-hosted framework for stateful AI agents with persistent long-term memory. Letta orchestrates LLM calls — no GPU required — and stores everything (messages, memory blocks, archival data) in Postgres so agents survive restarts and remember context indefinitely.
At a Glance
| Project | letta-ai/letta (formerly MemGPT) |
| Stack | Python + bundled Postgres + pgvector — image letta/letta |
| Recommended Plan | Cloud VPS 2GB / 2 vCPU (dev); 4GB / 4 vCPU (multi-user) |
| OS | Ubuntu 24.04 LTS |
| LLM Backend | OpenAI / Anthropic / Gemini / Ollama (configurable) |
No GPU required
Letta orchestrates calls to OpenAI, Anthropic, or any other provider — token generation does not happen locally. Resource needs come from the Python server (~500MB), bundled Postgres (~300MB), and the working set of agent memory blocks + pgvector embeddings, which grows with usage. A single active agent generates roughly 50 to 200 MB/month of database growth depending on conversation length.
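Those per-agent growth figures make capacity planning simple multiplication. A quick sketch (the agent count and horizon are made-up inputs; the 50–200 MB/month range comes from the paragraph above):

```shell
# rough archival-growth estimate: agents x months x per-agent MB/month
agents=3
months=12
low=$(( agents * months * 50 ))    # quiet agents
high=$(( agents * months * 200 ))  # chatty agents
echo "expected Postgres growth: ${low}-${high} MB"
```

Three moderately active agents land somewhere between ~2 GB and ~7 GB of database growth per year, which is why the disk on the 2GB plan is usually the first thing to watch.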
Sudo User + SSH Hardening + UFW
adduser vanessa
usermod -aG sudo vanessa
rsync --archive --chown=vanessa:vanessa ~/.ssh /home/vanessa
# log out and back in as vanessa

Then edit /etc/ssh/sshd_config to lock down logins:

PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes

sudo systemctl reload ssh
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
sudo apt update
sudo apt install -y fail2ban
sudo systemctl enable --now fail2ban

Letta's API listens on 8283; we bind it to 127.0.0.1 only and front it with Nginx. Do not open 8283 in UFW.
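Fail2ban's packaged defaults already cover sshd, but you can tighten them in an override file. A sketch for /etc/fan2ban-style /etc/fail2ban/jail.local (the retry and ban values here are illustrative choices, not Letta requirements):

```ini
[sshd]
enabled = true
# ban for 1 hour after 5 failures within 10 minutes (illustrative values)
maxretry = 5
findtime = 10m
bantime  = 1h
```

Apply with sudo systemctl restart fail2ban and inspect the jail with sudo fail2ban-client status sshd.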
Install Docker + Compose
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER
newgrp docker
docker run --rm hello-world
docker compose version

Project Layout + .env
mkdir -p ~/letta/{data/pgdata,data/letta}
cd ~/letta
# Generate a strong server password
# Generate a strong server password
openssl rand -base64 32

Create ~/letta/.env:

# Authentication
SECURE=true
LETTA_SERVER_PASSWORD=paste-your-generated-password-here
# LLM Providers (set at least one)
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
# Optional
GEMINI_API_KEY=
# OLLAMA_BASE_URL=http://host.docker.internal:11434/v1
# Optional: tool sandboxing via E2B (recommended for production)
# E2B_API_KEY=
# E2B_SANDBOX_TEMPLATE_ID=

Restrict permissions on the file, since it holds API keys:

chmod 600 ~/letta/.env

docker-compose.yml
services:
  letta:
    image: letta/letta:latest
    container_name: letta
    restart: unless-stopped
    ports:
      - "127.0.0.1:8283:8283"
    volumes:
      - ./data/pgdata:/var/lib/postgresql/data
      - ./data/letta:/root/.letta
    env_file:
      - .env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8283/v1/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 90s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"

The 90-second start_period accommodates the initial Alembic migration, which creates ~42 tables on first launch and takes 1 to 2 minutes on a 2GB VPS. Log rotation matters — chatty agents fill JSON logs fast.
Launch + Health Check
cd ~/letta
docker compose pull
docker compose up -d
docker compose logs -f letta

Wait for "Database migration completed successfully" and "Starting Letta Server at http://0.0.0.0:8283".
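If you script the rollout, a small retry loop saves watching logs by hand. This helper is a generic bash sketch (not part of Letta) that polls until a command succeeds:

```shell
# wait_for RETRIES CMD...: run CMD up to RETRIES times, 2s apart;
# returns 0 on the first success, 1 if every attempt fails.
wait_for() {
  local retries=$1; shift
  local i
  for (( i = 1; i <= retries; i++ )); do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    sleep 2
  done
  return 1
}

# e.g. block for up to ~3 minutes while migrations run:
# wait_for 90 curl -fsS http://127.0.0.1:8283/v1/health
```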
curl http://127.0.0.1:8283/v1/health

Nginx Reverse Proxy + Let's Encrypt
Point a DNS A record (and AAAA if you have IPv6) for letta.yourdomain.com at the VPS before continuing.
sudo apt install -y nginx certbot python3-certbot-nginx

Create /etc/nginx/sites-available/letta:

server {
    listen 80;
    listen [::]:80;
    server_name letta.yourdomain.com;

    client_max_body_size 25M;

    location / {
        proxy_pass http://127.0.0.1:8283;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Agent invocations can be long-running
        proxy_read_timeout 600s;
        proxy_send_timeout 600s;

        # Streaming responses
        proxy_buffering off;
        proxy_cache off;
    }
}

sudo ln -s /etc/nginx/sites-available/letta /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d letta.yourdomain.com
sudo certbot renew --dry-run

The 600-second timeouts handle long agent reasoning chains (multi-tool sequences); proxy_buffering off matters for streaming endpoints.
Connect from the ADE or SDK
The Letta Agent Development Environment (ADE) is a graphical UI for building, debugging, and observing agents. Use the desktop app or the hosted version at chat.letta.com:
- Select "add a self-hosted server"
- Server URL: https://letta.yourdomain.com
- Password: LETTA_SERVER_PASSWORD from your .env
Or connect programmatically with the Python SDK:

pip install letta-client
from letta_client import Letta

client = Letta(
    base_url="https://letta.yourdomain.com",
    token="your-letta-server-password",
)

agent = client.agents.create(
    model="anthropic/claude-sonnet-4-5",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "human", "value": "Name: Vanessa. Role: Senior Operations Manager."},
        {"label": "persona", "value": "I am a helpful assistant with persistent memory."},
    ],
)

Note: the Docker server (unlike Letta Cloud) requires explicit embedding configuration per agent, or set a default with LETTA_DEFAULT_EMBEDDING_MODEL.
Production Hardening
Tool sandboxing. If agents will run user-defined or third-party tool code, enable E2B by signing up at e2b.dev and adding E2B_API_KEY + E2B_SANDBOX_TEMPLATE_ID to .env, then docker compose restart letta.
Resource limits. Add to the letta service in docker-compose.yml to cap memory before the host's OOM killer steps in:

deploy:
  resources:
    limits:
      cpus: '2.0'
      memory: 3G
    reservations:
      memory: 1G

Rate limiting. Define the zone in Nginx's http context (e.g. a file in /etc/nginx/conf.d/) and apply it inside the existing location block:

limit_req_zone $binary_remote_addr zone=letta_api:10m rate=30r/m;

location / {
    limit_req zone=letta_api burst=10 nodelay;
    # ... existing proxy_pass directives
}

Pin your version in production — change letta/letta:latest to a specific tag (e.g. letta/letta:0.6.x) so an upstream release does not break compatibility on a routine pull.
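Flipping the pin is a one-line edit; the sed below rehearses it on a scratch copy (0.6.x is the placeholder tag from the paragraph above — substitute a real release):

```shell
# demo on a scratch compose file; in practice run the sed in ~/letta
cd "$(mktemp -d)"
printf 'services:\n  letta:\n    image: letta/letta:latest\n' > docker-compose.yml
cp docker-compose.yml docker-compose.yml.bak   # keep a rollback copy
sed -i 's|letta/letta:latest|letta/letta:0.6.x|' docker-compose.yml
grep 'image:' docker-compose.yml
```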
Backups
Use pg_dump from inside the container — never copy raw Postgres data files while the DB is running.
Add a crontab entry (crontab -e) for a nightly dump with 14-day retention:

0 3 * * * cd /home/vanessa/letta && mkdir -p backups && \
  docker compose exec -T letta pg_dump -U letta -d letta | \
  gzip > backups/letta-$(date +\%Y\%m\%d).sql.gz && \
  find backups -name "letta-*.sql.gz" -mtime +14 -delete

To restore from a dump, recreate the stack and pipe the file into psql:

docker compose down
docker compose up -d
sleep 60 # wait for migrations
gunzip < backups/letta-20260415.sql.gz | \
docker compose exec -T letta psql -U letta -d letta

For offsite copies, point Restic or rclone at ~/letta/backups and ~/letta/data/letta.
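The -mtime +14 retention in the cron line can be rehearsed safely before real dumps depend on it; this dry run fakes file ages with touch -d in a scratch directory:

```shell
dir=$(mktemp -d)
touch -d '20 days ago' "$dir/letta-20260326.sql.gz"   # should be pruned
touch "$dir/letta-20260415.sql.gz"                    # should survive

# preview first with -print, then apply the cron job's exact rule
find "$dir" -name "letta-*.sql.gz" -mtime +14 -print
find "$dir" -name "letta-*.sql.gz" -mtime +14 -delete
ls "$dir"
```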
Updates
cd ~/letta
docker compose pull
docker compose up -d
docker compose logs -f letta

Always take a database backup before upgrading. Letta runs Alembic migrations automatically on startup, but a backup gives you a rollback path if a migration fails.
Troubleshooting
- Container starts but API returns 401: add Authorization: Bearer your-password to every request, or pass token when initializing the SDK.
- Migrations hang on first start: Alembic on a 2GB VPS can take 2–3 minutes. Check free disk space and that data/pgdata permissions are correct.
- "No embedding model configured": the Docker server requires explicit embedding per agent or a LETTA_DEFAULT_EMBEDDING_MODEL default.
- Nginx 504 on long agent calls: raise proxy_read_timeout and proxy_send_timeout.
- Container restarts every few minutes: check for OOM kills. Postgres + Python + pgvector index loads can push past 1.5GB under load — upgrade plan or tune shared_buffers / work_mem.
