Deploy Letta on a VPS
A self-hosted framework for stateful AI agents with persistent long-term memory. Letta orchestrates LLM calls — no GPU required — and stores everything (messages, memory blocks, archival data) in Postgres so agents survive restarts and remember context indefinitely.
At a Glance
| Project | letta-ai/letta (formerly MemGPT) |
| Stack | Python + bundled Postgres + pgvector — image letta/letta |
| Recommended Plan | Cloud VPS 2GB / 2 vCPU (dev); 4GB / 4 vCPU (multi-user) |
| OS | Ubuntu 24.04 LTS |
| LLM Backend | OpenAI / Anthropic / Gemini / Ollama (configurable) |
No GPU required
Letta orchestrates calls to OpenAI, Anthropic, or any other provider — token generation does not happen locally. Resource needs come from the Python server (~500MB), bundled Postgres (~300MB), and the working set of agent memory blocks + pgvector embeddings, which grows with usage. A single active agent generates roughly 50 to 200 MB/month of database growth depending on conversation length.
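Those per-agent growth figures make capacity planning simple multiplication. A quick sketch (the agent count and horizon are made-up inputs; the 50–200 MB/month range comes from the paragraph above):

```shell
# rough archival-growth estimate: agents x months x per-agent MB/month
agents=3
months=12
low=$(( agents * months * 50 ))    # quiet agents
high=$(( agents * months * 200 ))  # chatty agents
echo "expected Postgres growth: ${low}-${high} MB"
```

Three moderately active agents land somewhere between ~2 GB and ~7 GB of database growth per year, which is why the disk on the 2GB plan is usually the first thing to watch.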
Sudo User + SSH Hardening + UFW
adduser vanessa
usermod -aG sudo vanessa
rsync --archive --chown=vanessa:vanessa ~/.ssh /home/vanessa
# log out and back in as vanessa

Then edit /etc/ssh/sshd_config to lock down logins:

PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes

sudo systemctl reload ssh
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
sudo apt update
sudo apt install -y fail2ban
sudo systemctl enable --now fail2ban

Letta's API listens on 8283; we bind it to 127.0.0.1 only and front it with Nginx. Do not open 8283 in UFW.
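Fail2ban's packaged defaults already cover sshd, but you can tighten them in an override file. A sketch for /etc/fan2ban-style /etc/fail2ban/jail.local (the retry and ban values here are illustrative choices, not Letta requirements):

```ini
[sshd]
enabled = true
# ban for 1 hour after 5 failures within 10 minutes (illustrative values)
maxretry = 5
findtime = 10m
bantime  = 1h
```

Apply with sudo systemctl restart fail2ban and inspect the jail with sudo fail2ban-client status sshd.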
Install Docker + Compose
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER
newgrp docker
docker run --rm hello-world
docker compose version

Project Layout + .env
mkdir -p ~/letta/{data/pgdata,data/letta}
cd ~/letta
# Generate a strong server password
# Generate a strong server password
openssl rand -base64 32

Create ~/letta/.env:

# Authentication
SECURE=true
LETTA_SERVER_PASSWORD=paste-your-generated-password-here
# LLM Providers (set at least one)
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
# Optional
GEMINI_API_KEY=
# OLLAMA_BASE_URL=http://host.docker.internal:11434/v1
# Optional: tool sandboxing via E2B (recommended for production)
# E2B_API_KEY=
# E2B_SANDBOX_TEMPLATE_ID=

Restrict permissions on the file, since it holds API keys:

chmod 600 ~/letta/.env

docker-compose.yml
services:
  letta:
    image: letta/letta:latest
    container_name: letta
    restart: unless-stopped
    ports:
      - "127.0.0.1:8283:8283"
    volumes:
      - ./data/pgdata:/var/lib/postgresql/data
      - ./data/letta:/root/.letta
    env_file:
      - .env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8283/v1/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 90s
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"

The 90-second start_period accommodates the initial Alembic migration, which creates ~42 tables on first launch and takes 1 to 2 minutes on a 2GB VPS. Log rotation matters — chatty agents fill JSON logs fast.
Launch + Health Check
cd ~/letta
docker compose pull
docker compose up -d
docker compose logs -f letta

Wait for "Database migration completed successfully" and "Starting Letta Server at http://0.0.0.0:8283".
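If you script the rollout, a small retry loop saves watching logs by hand. This helper is a generic bash sketch (not part of Letta) that polls until a command succeeds:

```shell
# wait_for RETRIES CMD...: run CMD up to RETRIES times, 2s apart;
# returns 0 on the first success, 1 if every attempt fails.
wait_for() {
  local retries=$1; shift
  local i
  for (( i = 1; i <= retries; i++ )); do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    sleep 2
  done
  return 1
}

# e.g. block for up to ~3 minutes while migrations run:
# wait_for 90 curl -fsS http://127.0.0.1:8283/v1/health
```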
curl http://127.0.0.1:8283/v1/health

Nginx Reverse Proxy + Let's Encrypt
Point a DNS A record (and AAAA if you have IPv6) for letta.yourdomain.com at the VPS before continuing.
sudo apt install -y nginx certbot python3-certbot-nginx

Create /etc/nginx/sites-available/letta:

server {
    listen 80;
    listen [::]:80;
    server_name letta.yourdomain.com;

    client_max_body_size 25M;

    location / {
        proxy_pass http://127.0.0.1:8283;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Agent invocations can be long-running
        proxy_read_timeout 600s;
        proxy_send_timeout 600s;

        # Streaming responses
        proxy_buffering off;
        proxy_cache off;
    }
}

sudo ln -s /etc/nginx/sites-available/letta /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d letta.yourdomain.com
sudo certbot renew --dry-run

The 600-second timeouts handle long agent reasoning chains (multi-tool sequences); proxy_buffering off matters for streaming endpoints.
Connect from the ADE or SDK
The Letta Agent Development Environment (ADE) is a graphical UI for building, debugging, and observing agents. Use the desktop app or the hosted version at chat.letta.com:
- Select "add a self-hosted server"
- Server URL: https://letta.yourdomain.com
- Password: LETTA_SERVER_PASSWORD from your .env
Or connect programmatically with the Python SDK:

pip install letta-client
from letta_client import Letta

client = Letta(
    base_url="https://letta.yourdomain.com",
    token="your-letta-server-password",
)

agent = client.agents.create(
    model="anthropic/claude-sonnet-4-5",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "human", "value": "Name: Vanessa. Role: Senior Operations Manager."},
        {"label": "persona", "value": "I am a helpful assistant with persistent memory."},
    ],
)

Note: the Docker server (unlike Letta Cloud) requires explicit embedding configuration per agent, or set a default with LETTA_DEFAULT_EMBEDDING_MODEL.
Production Hardening
Tool sandboxing. If agents will run user-defined or third-party tool code, enable E2B by signing up at e2b.dev and adding E2B_API_KEY + E2B_SANDBOX_TEMPLATE_ID to .env, then docker compose restart letta.
Resource limits. Add to the letta service in docker-compose.yml to cap memory before the host's OOM killer steps in:

deploy:
  resources:
    limits:
      cpus: '2.0'
      memory: 3G
    reservations:
      memory: 1G

Rate limiting. Define the zone in Nginx's http context (e.g. a file in /etc/nginx/conf.d/) and apply it inside the existing location block:

limit_req_zone $binary_remote_addr zone=letta_api:10m rate=30r/m;

location / {
    limit_req zone=letta_api burst=10 nodelay;
    # ... existing proxy_pass directives
}

Pin your version in production — change letta/letta:latest to a specific tag (e.g. letta/letta:0.6.x) so an upstream release does not break compatibility on a routine pull.
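Flipping the pin is a one-line edit; the sed below rehearses it on a scratch copy (0.6.x is the placeholder tag from the paragraph above — substitute a real release):

```shell
# demo on a scratch compose file; in practice run the sed in ~/letta
cd "$(mktemp -d)"
printf 'services:\n  letta:\n    image: letta/letta:latest\n' > docker-compose.yml
cp docker-compose.yml docker-compose.yml.bak   # keep a rollback copy
sed -i 's|letta/letta:latest|letta/letta:0.6.x|' docker-compose.yml
grep 'image:' docker-compose.yml
```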
Backups
Use pg_dump from inside the container — never copy raw Postgres data files while the DB is running.
Add a crontab entry (crontab -e) for a nightly dump with 14-day retention:

0 3 * * * cd /home/vanessa/letta && mkdir -p backups && \
  docker compose exec -T letta pg_dump -U letta -d letta | \
  gzip > backups/letta-$(date +\%Y\%m\%d).sql.gz && \
  find backups -name "letta-*.sql.gz" -mtime +14 -delete

To restore from a dump, recreate the stack and pipe the file into psql:

docker compose down
docker compose up -d
sleep 60 # wait for migrations
gunzip < backups/letta-20260415.sql.gz | \
docker compose exec -T letta psql -U letta -d letta

For offsite copies, point Restic or rclone at ~/letta/backups and ~/letta/data/letta.
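The -mtime +14 retention in the cron line can be rehearsed safely before real dumps depend on it; this dry run fakes file ages with touch -d in a scratch directory:

```shell
dir=$(mktemp -d)
touch -d '20 days ago' "$dir/letta-20260326.sql.gz"   # should be pruned
touch "$dir/letta-20260415.sql.gz"                    # should survive

# preview first with -print, then apply the cron job's exact rule
find "$dir" -name "letta-*.sql.gz" -mtime +14 -print
find "$dir" -name "letta-*.sql.gz" -mtime +14 -delete
ls "$dir"
```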
Updates
cd ~/letta
docker compose pull
docker compose up -d
docker compose logs -f letta

Always take a database backup before upgrading. Letta runs Alembic migrations automatically on startup, but a backup gives you a rollback path if a migration fails.
Troubleshooting
- Container starts but API returns 401: add Authorization: Bearer your-password to every request, or pass token when initializing the SDK.
- Migrations hang on first start: Alembic on a 2GB VPS can take 2–3 minutes. Check free disk space and that data/pgdata permissions are correct.
- "No embedding model configured": the Docker server requires explicit embedding per agent or a LETTA_DEFAULT_EMBEDDING_MODEL default.
- Nginx 504 on long agent calls: raise proxy_read_timeout and proxy_send_timeout.
- Container restarts every few minutes: check for OOM kills. Postgres + Python + pgvector index loads can push past 1.5GB under load — upgrade plan or tune shared_buffers / work_mem.
