Self-Hosted AI Stack Series
    Part 8 of 8

    Production Hardening

    Reverse proxy, centralized auth, automated backups, and the complete unified stack.

    Prerequisites

    Completed Parts 1–7, domain name, basic Linux admin

    Time to Complete

    45–60 minutes

    Recommended Plan

    8GB ($40/mo) for the full production stack

    Introduction

    You've built a comprehensive AI platform across seven guides. Now it's time to make it production-ready: secure, resilient, monitored, and optimized. This final part transforms your development setup into infrastructure you can rely on.

    Nginx Reverse Proxy

    Route all services through a single Nginx instance with subdomain routing:

    Install Nginx
    sudo apt install -y nginx
    /etc/nginx/sites-available/ai-stack
    # Open WebUI — chat.yourdomain.com
    server {
        server_name chat.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:3000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # WebSocket support for streaming
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_read_timeout 300s;
        }
    }
    
    # AnythingLLM — apps.yourdomain.com
    server {
        server_name apps.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:3001;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
    
    # Tabby — code.yourdomain.com
    server {
        server_name code.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
    
    # n8n — auto.yourdomain.com
    server {
        server_name auto.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:5678;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # WebSocket for n8n
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
    
    # Qdrant Dashboard — vectors.yourdomain.com
    server {
        server_name vectors.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:6333;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
    sudo ln -s /etc/nginx/sites-available/ai-stack /etc/nginx/sites-enabled/
    sudo nginx -t
    sudo systemctl reload nginx

    TLS with Let's Encrypt

    Install Certbot and get certificates
    sudo apt install -y certbot python3-certbot-nginx
    
    # Get certificates for all subdomains
    sudo certbot --nginx \
      -d chat.yourdomain.com \
      -d apps.yourdomain.com \
      -d code.yourdomain.com \
      -d auto.yourdomain.com \
      -d vectors.yourdomain.com

    Certbot automatically modifies your Nginx configs and sets up auto-renewal.

    Security Headers

    Add to each server block
    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;

    Verify that certificate auto-renewal is working:
    sudo certbot renew --dry-run

    Centralized Authentication

    For single sign-on across all services, consider deploying Authelia or Authentik as an authentication gateway:

    • Authelia — Lightweight, YAML-configured, perfect for small teams. See our Authelia guide
    • Authentik — Full-featured IdP with admin dashboard. See our Authentik guide

    Both support MFA/2FA. Configure each AI service to authenticate through the central gateway for a unified login experience.
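
    As an illustration of how such a gateway wires in, Nginx's auth_request module can ask the auth server to approve each request before proxying it. The snippet below is a sketch only, assuming Authelia listens on 127.0.0.1:9091; endpoint and header names follow Authelia's published Nginx integration, but verify them against the docs for your Authelia version:

    ```nginx
    # Sketch: protect one service behind Authelia via auth_request.
    # Assumes Authelia at 127.0.0.1:9091 — check the Authelia docs for
    # the full, version-correct configuration.
    server {
        server_name chat.yourdomain.com;

        # Internal endpoint Nginx calls to ask "is this request authorized?"
        location /authelia-verify {
            internal;
            proxy_pass http://127.0.0.1:9091/api/verify;
            proxy_set_header X-Original-URL $scheme://$http_host$request_uri;
        }

        location / {
            auth_request /authelia-verify;   # a 401 from Authelia blocks the request
            proxy_pass http://localhost:3000;
            proxy_set_header Host $host;
        }
    }
    ```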

    Firewall Configuration

    Lock down all services behind Nginx
    # Reset UFW
    sudo ufw reset
    
    # Default policies
    sudo ufw default deny incoming
    sudo ufw default allow outgoing
    
    # Allow SSH
    sudo ufw allow 22/tcp
    
    # Allow HTTP/HTTPS only (Nginx handles routing)
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
    
    # Enable firewall
    sudo ufw enable
    sudo ufw status verbose

    All application ports (3000, 3001, 5678, 6333, 8080, 11434) are now unreachable from the internet; external traffic can only enter through Nginx on ports 80/443, which proxies to the services over localhost.
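
    To double-check from the VPS itself, list the listening sockets (ss ships with iproute2 on Ubuntu):

    ```shell
    # Each matched line should show 127.0.0.1:PORT (loopback only).
    # A 0.0.0.0:PORT or [::]:PORT entry means the port is bound on all interfaces.
    ss -tln | grep -E ':(3000|3001|5678|6333|8080|11434)\s'
    ```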

    Fail2ban for Brute-Force Protection

    sudo apt install -y fail2ban
    
    # Configure Nginx jail
    sudo tee /etc/fail2ban/jail.local << 'EOF'
    [nginx-http-auth]
    enabled = true
    port = http,https
    filter = nginx-http-auth
    logpath = /var/log/nginx/error.log
    maxretry = 5
    bantime = 3600
    EOF
    
    sudo systemctl restart fail2ban

    Automated Backups

    Backup strategy for each component:

    Component      | Data             | Frequency              | Size
    Ollama models  | ~/.ollama/models | Weekly (rarely change) | 2–10 GB
    Qdrant vectors | Docker volume    | Daily (incremental)    | Varies
    Open WebUI     | Docker volume    | Daily                  | Small
    n8n workflows  | Docker volume    | Daily                  | Small
    AnythingLLM    | Docker volume    | Daily                  | Small
    backup-ai-stack.sh
    #!/bin/bash
    # Automated backup script for the AI stack
    BKDIR="/backups/ai-stack/$(date +%Y-%m-%d)"
    mkdir -p "$BKDIR"
    
    echo "=== AI Stack Backup: $(date) ==="
    
    # Backup Docker volumes
    for vol in open-webui-data qdrant-data anythingllm-data n8n-data tabby-data; do
      echo "Backing up $vol..."
      docker run --rm \
        -v "$vol":/source:ro \
        -v "$BKDIR":/backup \
        alpine tar czf "/backup/$vol.tar.gz" -C /source .
    done
    
    # Backup Ollama models (weekly, on Mondays)
    # NOTE: $USER is not set under cron — point this at the home of the user that runs Ollama
    OLLAMA_HOME="${OLLAMA_HOME:-/home/youruser}"
    if [ "$(date +%u)" = "1" ]; then
      echo "Weekly: Backing up Ollama models..."
      tar czf "$BKDIR/ollama-models.tar.gz" -C "$OLLAMA_HOME" .ollama/models
    fi
    
    # Backup Nginx configs
    cp -r /etc/nginx/sites-available "$BKDIR/nginx-configs"
    
    # Retention: keep 30 days
    find /backups/ai-stack -maxdepth 1 -mtime +30 -exec rm -rf {} +
    
    echo "=== Backup complete: $BKDIR ==="
    chmod +x /usr/local/bin/backup-ai-stack.sh
    
    # Add to system crontab (daily at 3 AM; /etc/crontab requires a user field — here, root)
    echo "0 3 * * * root /usr/local/bin/backup-ai-stack.sh >> /var/log/ai-stack-backup.log 2>&1" | sudo tee -a /etc/crontab
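
    A backup you have never restored is not a backup. As a sketch of the reverse operation, the same alpine-container trick unpacks an archive back into a volume; the volume name and archive path below are examples to adapt:

    ```shell
    # Sketch: restore a Docker volume from one of the backup archives.
    # VOL and ARCHIVE are hypothetical — substitute your own volume name and backup date.
    VOL=open-webui-data
    ARCHIVE=/backups/ai-stack/2025-01-01/open-webui-data.tar.gz

    docker run --rm \
      -v "$VOL":/restore \
      -v "$(dirname "$ARCHIVE")":/backup:ro \
      alpine sh -c "rm -rf /restore/* && tar xzf /backup/$(basename "$ARCHIVE") -C /restore"
    ```

    Stop the service (docker compose stop open-webui) before restoring, then start it again afterwards.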

    Resource Tuning & Monitoring

    Memory Allocation Strategy (8GB VPS)

    Service        | RAM Allocation | Notes
    Ollama + model | 3–4 GB         | Largest consumer; varies by model
    Open WebUI     | ~500 MB        | Lightweight
    Qdrant         | ~500 MB        | Depends on vector count
    AnythingLLM    | ~300 MB        | Lightweight
    Tabby          | 1–2 GB         | Depends on code model
    n8n            | ~300 MB        | Grows with active workflows
    Nginx + OS     | ~500 MB        | Overhead

    Docker Resource Limits

    Add memory limits to prevent any single service from consuming all RAM:

    Add to each docker-compose service
        deploy:
          resources:
            limits:
              memory: 512M
            reservations:
              memory: 256M

    Quick Monitoring with Uptime Kuma

    For lightweight monitoring, add Uptime Kuma to check service health:

    Add to your stack
      uptime-kuma:
        image: louislam/uptime-kuma:latest
        container_name: uptime-kuma
        restart: unless-stopped
        ports:
          - "127.0.0.1:3002:3001"
        volumes:
          - uptime-kuma-data:/app/data

    Configure health checks for each service endpoint (Ollama :11434, Open WebUI :3000, Qdrant :6333, etc.).
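
    For an even lighter check, a small script can poll each local endpoint and report status. This is a sketch; the root paths below are assumptions, so adjust them to whatever health endpoints your service versions expose:

    ```shell
    #!/bin/bash
    # Sketch: poll each service on localhost and print OK/DOWN.
    # Endpoint paths are assumptions — adjust to your versions.
    check() {
      if curl -fsS -m 5 "$2" > /dev/null 2>&1; then
        echo "OK   $1"
      else
        echo "DOWN $1"
      fi
    }

    check "Ollama"     "http://127.0.0.1:11434/"
    check "Open WebUI" "http://127.0.0.1:3000/"
    check "Qdrant"     "http://127.0.0.1:6333/"
    check "n8n"        "http://127.0.0.1:5678/"
    ```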

    The Complete Stack

    Here's the unified Docker Compose file combining all services from Parts 1–7 with production settings:

    docker-compose.prod.yml
    version: "3.8"
    
    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        container_name: open-webui
        restart: unless-stopped
        ports:
          - "127.0.0.1:3000:8080"
        environment:
          - OLLAMA_BASE_URL=http://host.docker.internal:11434
          - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
          - ENABLE_SIGNUP=false
        volumes:
          - open-webui-data:/app/backend/data
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
      qdrant:
        image: qdrant/qdrant:latest
        container_name: qdrant
        restart: unless-stopped
        ports:
          - "127.0.0.1:6333:6333"
        environment:
          - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
        volumes:
          - qdrant-data:/qdrant/storage
        deploy:
          resources:
            limits:
              memory: 512M
    
      anythingllm:
        image: mintplexlabs/anythingllm:latest
        container_name: anythingllm
        restart: unless-stopped
        ports:
          - "127.0.0.1:3001:3001"
        environment:
          - LLM_PROVIDER=ollama
          - OLLAMA_BASE_PATH=http://host.docker.internal:11434
          - EMBEDDING_ENGINE=ollama
          - EMBEDDING_MODEL_PREF=nomic-embed-text
          - VECTOR_DB=qdrant
          - QDRANT_ENDPOINT=http://qdrant:6333
          - QDRANT_API_KEY=${QDRANT_API_KEY}
        volumes:
          - anythingllm-data:/app/server/storage
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
      tabby:
        image: tabbyml/tabby:latest
        container_name: tabby
        restart: unless-stopped
        command: serve --model StarCoder-1B --device cpu
        ports:
          - "127.0.0.1:8080:8080"
        volumes:
          - tabby-data:/data
        environment:
          - TABBY_DISABLE_USAGE_COLLECTION=1
        deploy:
          resources:
            limits:
              memory: 2G
    
      n8n:
        image: n8nio/n8n:latest
        container_name: n8n
        restart: unless-stopped
        ports:
          - "127.0.0.1:5678:5678"
        environment:
          - N8N_BASIC_AUTH_ACTIVE=true
          - N8N_BASIC_AUTH_USER=${N8N_USER}
          - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
          - WEBHOOK_URL=https://auto.yourdomain.com/
          - GENERIC_TIMEZONE=UTC
        volumes:
          - n8n-data:/home/node/.n8n
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
    volumes:
      open-webui-data:
      qdrant-data:
      anythingllm-data:
      tabby-data:
      n8n-data:
    .env (keep this secure!)
    WEBUI_SECRET_KEY=your-generated-secret-key
    QDRANT_API_KEY=your-generated-api-key
    N8N_USER=admin
    N8N_PASSWORD=your-secure-password

    Note that every service binds to 127.0.0.1, so the containers are reachable only through the Nginx reverse proxy, never directly from the internet.
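
    Rather than inventing the secrets by hand, you can generate them with openssl (standard on most distros); the admin username and file layout below mirror the .env above:

    ```shell
    # Generate strong random values and write the .env file
    WEBUI_SECRET_KEY=$(openssl rand -hex 32)
    QDRANT_API_KEY=$(openssl rand -hex 32)
    N8N_PASSWORD=$(openssl rand -base64 24)

    printf 'WEBUI_SECRET_KEY=%s\nQDRANT_API_KEY=%s\nN8N_USER=admin\nN8N_PASSWORD=%s\n' \
      "$WEBUI_SECRET_KEY" "$QDRANT_API_KEY" "$N8N_PASSWORD" > .env

    chmod 600 .env   # readable by your user only
    ```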

    Maintenance Playbook

    Monthly Checklist

    • Update Docker images: docker compose pull && docker compose up -d
    • Apply OS security patches: sudo apt update && sudo apt upgrade
    • Verify certificate renewal: sudo certbot renew --dry-run
    • Test backup restoration on a separate instance
    • Review Ollama models — pull newer versions if available
    • Check disk space: df -h
    • Review fail2ban bans: sudo fail2ban-client status
    • Rotate logs: verify logrotate is running
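
    The disk-space check is easy to automate so it doesn't wait for the monthly pass. A minimal sketch, with 80% as an arbitrary threshold you can tune:

    ```shell
    # Warn when the root filesystem passes a usage threshold (80% is an arbitrary choice)
    usage=$(df --output=pcent / | tail -1 | tr -dc '0-9')

    if [ "$usage" -gt 80 ]; then
      echo "WARNING: root filesystem at ${usage}% — prune old backups or Docker images"
    else
      echo "Disk OK: ${usage}% used"
    fi
    ```

    Dropped into cron alongside the backup job, it gives you an early signal before backups start failing for lack of space.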

    🎉 Series Complete!

    You've built a complete, production-ready, private AI platform:

    Part  | Service      | Replaces
    1     | Ollama       | OpenAI API ($50–200/mo)
    2     | Open WebUI   | ChatGPT Team ($125/mo)
    3     | Qdrant RAG   | Pinecone ($70+/mo)
    4     | AnythingLLM  | Custom AI apps ($)
    5     | Tabby        | GitHub Copilot ($95/mo)
    6     | CrewAI       | AI agent platforms ($)
    7     | n8n + Ollama | Zapier AI ($50+/mo)
    Total | 8GB RamNode VPS | $40/mo vs $390–690+/mo

    Zero data exposure. Unlimited usage. Complete sovereignty over your AI infrastructure.