Self-Hosted AI Stack Series
    Part 8 of 8

    Production Hardening

    Reverse proxy, centralized auth, automated backups, and the complete unified stack.

    Prerequisites

    Completed Parts 1–7, domain name, basic Linux admin

    Time to Complete

    45–60 minutes

    Recommended Plan

    8GB ($40/mo) for the full production stack

    Introduction

    You've built a comprehensive AI platform across seven guides. Now it's time to make it production-ready: secure, resilient, monitored, and optimized. This final part transforms your development setup into infrastructure you can rely on.

    Nginx Reverse Proxy

    Route all services through a single Nginx instance with subdomain routing:

    Install Nginx
    sudo apt install -y nginx
    /etc/nginx/sites-available/ai-stack
    # Open WebUI — chat.yourdomain.com
    server {
        server_name chat.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:3000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # WebSocket support for streaming
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_read_timeout 300s;
        }
    }
    
    # AnythingLLM — apps.yourdomain.com
    server {
        server_name apps.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:3001;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
    
    # Tabby — code.yourdomain.com
    server {
        server_name code.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
    
    # n8n — auto.yourdomain.com
    server {
        server_name auto.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:5678;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            
            # WebSocket for n8n
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
    
    # Qdrant Dashboard — vectors.yourdomain.com
    server {
        server_name vectors.yourdomain.com;
        
        location / {
            proxy_pass http://localhost:6333;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
    sudo ln -s /etc/nginx/sites-available/ai-stack /etc/nginx/sites-enabled/
    sudo nginx -t
    sudo systemctl reload nginx

    TLS with Let's Encrypt

    Install Certbot and get certificates
    sudo apt install -y certbot python3-certbot-nginx
    
    # Get certificates for all subdomains
    sudo certbot --nginx \
      -d chat.yourdomain.com \
      -d apps.yourdomain.com \
      -d code.yourdomain.com \
      -d auto.yourdomain.com \
      -d vectors.yourdomain.com

    Certbot automatically modifies your Nginx configs and sets up auto-renewal.

    Security Headers

    Add to each server block
    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;

    Verify that certificate auto-renewal is working:
    sudo certbot renew --dry-run

    Centralized Authentication

    For single sign-on across all services, consider deploying Authelia or Authentik as an authentication gateway:

    • Authelia — Lightweight, YAML-configured, perfect for small teams. See our Authelia guide
    • Authentik — Full-featured IdP with admin dashboard. See our Authentik guide

    Both support MFA/2FA. Configure each AI service to authenticate through the central gateway for a unified login experience.
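
    As an illustration of how such a gateway wires in, Nginx's auth_request module can ask the auth server to approve each request before proxying it. The snippet below is a sketch only, assuming Authelia listens on 127.0.0.1:9091; endpoint and header names follow Authelia's published Nginx integration, but verify them against the docs for your Authelia version:

    ```nginx
    # Sketch: protect one service behind Authelia via auth_request.
    # Assumes Authelia at 127.0.0.1:9091 — check the Authelia docs for
    # the full, version-correct configuration.
    server {
        server_name chat.yourdomain.com;

        # Internal endpoint Nginx calls to ask "is this request authorized?"
        location /authelia-verify {
            internal;
            proxy_pass http://127.0.0.1:9091/api/verify;
            proxy_set_header X-Original-URL $scheme://$http_host$request_uri;
        }

        location / {
            auth_request /authelia-verify;   # a 401 from Authelia blocks the request
            proxy_pass http://localhost:3000;
            proxy_set_header Host $host;
        }
    }
    ```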

    Firewall Configuration

    Lock down all services behind Nginx
    # Reset UFW
    sudo ufw reset
    
    # Default policies
    sudo ufw default deny incoming
    sudo ufw default allow outgoing
    
    # Allow SSH
    sudo ufw allow 22/tcp
    
    # Allow HTTP/HTTPS only (Nginx handles routing)
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
    
    # Enable firewall
    sudo ufw enable
    sudo ufw status verbose

    All application ports (3000, 3001, 5678, 6333, 8080, 11434) are now unreachable from the internet; external traffic can only enter through Nginx on ports 80/443, which proxies to the services over localhost.
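
    To double-check from the VPS itself, list the listening sockets (ss ships with iproute2 on Ubuntu):

    ```shell
    # Each matched line should show 127.0.0.1:PORT (loopback only).
    # A 0.0.0.0:PORT or [::]:PORT entry means the port is bound on all interfaces.
    ss -tln | grep -E ':(3000|3001|5678|6333|8080|11434)\s'
    ```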

    Fail2ban for Brute-Force Protection

    sudo apt install -y fail2ban
    
    # Configure Nginx jail
    sudo tee /etc/fail2ban/jail.local << 'EOF'
    [nginx-http-auth]
    enabled = true
    port = http,https
    filter = nginx-http-auth
    logpath = /var/log/nginx/error.log
    maxretry = 5
    bantime = 3600
    EOF
    
    sudo systemctl restart fail2ban

    Automated Backups

    Backup strategy for each component:

    Component      | Data             | Frequency              | Size
    Ollama models  | ~/.ollama/models | Weekly (rarely change) | 2–10 GB
    Qdrant vectors | Docker volume    | Daily (incremental)    | Varies
    Open WebUI     | Docker volume    | Daily                  | Small
    n8n workflows  | Docker volume    | Daily                  | Small
    AnythingLLM    | Docker volume    | Daily                  | Small
    backup-ai-stack.sh
    #!/bin/bash
    # Automated backup script for the AI stack
    BKDIR="/backups/ai-stack/$(date +%Y-%m-%d)"
    mkdir -p "$BKDIR"
    
    echo "=== AI Stack Backup: $(date) ==="
    
    # Backup Docker volumes
    for vol in open-webui-data qdrant-data anythingllm-data n8n-data tabby-data; do
      echo "Backing up $vol..."
      docker run --rm \
        -v "$vol":/source:ro \
        -v "$BKDIR":/backup \
        alpine tar czf "/backup/$vol.tar.gz" -C /source .
    done
    
    # Backup Ollama models (weekly, on Mondays)
    # NOTE: $USER is not set under cron — point this at the home of the user that runs Ollama
    OLLAMA_HOME="${OLLAMA_HOME:-/home/youruser}"
    if [ "$(date +%u)" = "1" ]; then
      echo "Weekly: Backing up Ollama models..."
      tar czf "$BKDIR/ollama-models.tar.gz" -C "$OLLAMA_HOME" .ollama/models
    fi
    
    # Backup Nginx configs
    cp -r /etc/nginx/sites-available "$BKDIR/nginx-configs"
    
    # Retention: keep 30 days
    find /backups/ai-stack -maxdepth 1 -mtime +30 -exec rm -rf {} +
    
    echo "=== Backup complete: $BKDIR ==="
    chmod +x /usr/local/bin/backup-ai-stack.sh
    
    # Add to system crontab (daily at 3 AM; /etc/crontab requires a user field — here, root)
    echo "0 3 * * * root /usr/local/bin/backup-ai-stack.sh >> /var/log/ai-stack-backup.log 2>&1" | sudo tee -a /etc/crontab
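
    A backup you have never restored is not a backup. As a sketch of the reverse operation, the same alpine-container trick unpacks an archive back into a volume; the volume name and archive path below are examples to adapt:

    ```shell
    # Sketch: restore a Docker volume from one of the backup archives.
    # VOL and ARCHIVE are hypothetical — substitute your own volume name and backup date.
    VOL=open-webui-data
    ARCHIVE=/backups/ai-stack/2025-01-01/open-webui-data.tar.gz

    docker run --rm \
      -v "$VOL":/restore \
      -v "$(dirname "$ARCHIVE")":/backup:ro \
      alpine sh -c "rm -rf /restore/* && tar xzf /backup/$(basename "$ARCHIVE") -C /restore"
    ```

    Stop the service (docker compose stop open-webui) before restoring, then start it again afterwards.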

    Resource Tuning & Monitoring

    Memory Allocation Strategy (8GB VPS)

    Service        | RAM Allocation | Notes
    Ollama + model | 3–4 GB         | Largest consumer; varies by model
    Open WebUI     | ~500 MB        | Lightweight
    Qdrant         | ~500 MB        | Depends on vector count
    AnythingLLM    | ~300 MB        | Lightweight
    Tabby          | 1–2 GB         | Depends on code model
    n8n            | ~300 MB        | Grows with active workflows
    Nginx + OS     | ~500 MB        | Overhead

    Docker Resource Limits

    Add memory limits to prevent any single service from consuming all RAM:

    Add to each docker-compose service
        deploy:
          resources:
            limits:
              memory: 512M
            reservations:
              memory: 256M

    Quick Monitoring with Uptime Kuma

    For lightweight monitoring, add Uptime Kuma to check service health:

    Add to your stack
      uptime-kuma:
        image: louislam/uptime-kuma:latest
        container_name: uptime-kuma
        restart: unless-stopped
        ports:
          - "127.0.0.1:3002:3001"
        volumes:
          - uptime-kuma-data:/app/data

    Configure health checks for each service endpoint (Ollama :11434, Open WebUI :3000, Qdrant :6333, etc.).
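
    For an even lighter check, a small script can poll each local endpoint and report status. This is a sketch; the root paths below are assumptions, so adjust them to whatever health endpoints your service versions expose:

    ```shell
    #!/bin/bash
    # Sketch: poll each service on localhost and print OK/DOWN.
    # Endpoint paths are assumptions — adjust to your versions.
    check() {
      if curl -fsS -m 5 "$2" > /dev/null 2>&1; then
        echo "OK   $1"
      else
        echo "DOWN $1"
      fi
    }

    check "Ollama"     "http://127.0.0.1:11434/"
    check "Open WebUI" "http://127.0.0.1:3000/"
    check "Qdrant"     "http://127.0.0.1:6333/"
    check "n8n"        "http://127.0.0.1:5678/"
    ```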

    The Complete Stack

    Here's the unified Docker Compose file combining all services from Parts 1–7 with production settings:

    docker-compose.prod.yml
    version: "3.8"
    
    services:
      open-webui:
        image: ghcr.io/open-webui/open-webui:main
        container_name: open-webui
        restart: unless-stopped
        ports:
          - "127.0.0.1:3000:8080"
        environment:
          - OLLAMA_BASE_URL=http://host.docker.internal:11434
          - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
          - ENABLE_SIGNUP=false
        volumes:
          - open-webui-data:/app/backend/data
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
      qdrant:
        image: qdrant/qdrant:latest
        container_name: qdrant
        restart: unless-stopped
        ports:
          - "127.0.0.1:6333:6333"
        environment:
          - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
        volumes:
          - qdrant-data:/qdrant/storage
        deploy:
          resources:
            limits:
              memory: 512M
    
      anythingllm:
        image: mintplexlabs/anythingllm:latest
        container_name: anythingllm
        restart: unless-stopped
        ports:
          - "127.0.0.1:3001:3001"
        environment:
          - LLM_PROVIDER=ollama
          - OLLAMA_BASE_PATH=http://host.docker.internal:11434
          - EMBEDDING_ENGINE=ollama
          - EMBEDDING_MODEL_PREF=nomic-embed-text
          - VECTOR_DB=qdrant
          - QDRANT_ENDPOINT=http://qdrant:6333
          - QDRANT_API_KEY=${QDRANT_API_KEY}
        volumes:
          - anythingllm-data:/app/server/storage
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
      tabby:
        image: tabbyml/tabby:latest
        container_name: tabby
        restart: unless-stopped
        command: serve --model StarCoder-1B --device cpu
        ports:
          - "127.0.0.1:8080:8080"
        volumes:
          - tabby-data:/data
        environment:
          - TABBY_DISABLE_USAGE_COLLECTION=1
        deploy:
          resources:
            limits:
              memory: 2G
    
      n8n:
        image: n8nio/n8n:latest
        container_name: n8n
        restart: unless-stopped
        ports:
          - "127.0.0.1:5678:5678"
        environment:
          - N8N_BASIC_AUTH_ACTIVE=true
          - N8N_BASIC_AUTH_USER=${N8N_USER}
          - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
          - WEBHOOK_URL=https://auto.yourdomain.com/
          - GENERIC_TIMEZONE=UTC
        volumes:
          - n8n-data:/home/node/.n8n
        extra_hosts:
          - "host.docker.internal:host-gateway"
        deploy:
          resources:
            limits:
              memory: 512M
    
    volumes:
      open-webui-data:
      qdrant-data:
      anythingllm-data:
      tabby-data:
      n8n-data:
    .env (keep this secure!)
    WEBUI_SECRET_KEY=your-generated-secret-key
    QDRANT_API_KEY=your-generated-api-key
    N8N_USER=admin
    N8N_PASSWORD=your-secure-password

    Note that every service binds to 127.0.0.1, so the containers are reachable only through the Nginx reverse proxy, never directly from the internet.
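
    Rather than inventing the secrets by hand, you can generate them with openssl (standard on most distros); the admin username and file layout below mirror the .env above:

    ```shell
    # Generate strong random values and write the .env file
    WEBUI_SECRET_KEY=$(openssl rand -hex 32)
    QDRANT_API_KEY=$(openssl rand -hex 32)
    N8N_PASSWORD=$(openssl rand -base64 24)

    printf 'WEBUI_SECRET_KEY=%s\nQDRANT_API_KEY=%s\nN8N_USER=admin\nN8N_PASSWORD=%s\n' \
      "$WEBUI_SECRET_KEY" "$QDRANT_API_KEY" "$N8N_PASSWORD" > .env

    chmod 600 .env   # readable by your user only
    ```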

    Maintenance Playbook

    Monthly Checklist

    • Update Docker images: docker compose pull && docker compose up -d
    • Apply OS security patches: sudo apt update && sudo apt upgrade
    • Verify certificate renewal: sudo certbot renew --dry-run
    • Test backup restoration on a separate instance
    • Review Ollama models — pull newer versions if available
    • Check disk space: df -h
    • Review fail2ban bans: sudo fail2ban-client status
    • Rotate logs: verify logrotate is running
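
    The disk-space check is easy to automate so it doesn't wait for the monthly pass. A minimal sketch, with 80% as an arbitrary threshold you can tune:

    ```shell
    # Warn when the root filesystem passes a usage threshold (80% is an arbitrary choice)
    usage=$(df --output=pcent / | tail -1 | tr -dc '0-9')

    if [ "$usage" -gt 80 ]; then
      echo "WARNING: root filesystem at ${usage}% — prune old backups or Docker images"
    else
      echo "Disk OK: ${usage}% used"
    fi
    ```

    Dropped into cron alongside the backup job, it gives you an early signal before backups start failing for lack of space.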

    🎉 Series Complete!

    You've built a complete, production-ready, private AI platform:

    Part  | Service      | Replaces
    1     | Ollama       | OpenAI API ($50–200/mo)
    2     | Open WebUI   | ChatGPT Team ($125/mo)
    3     | Qdrant RAG   | Pinecone ($70+/mo)
    4     | AnythingLLM  | Custom AI apps ($)
    5     | Tabby        | GitHub Copilot ($95/mo)
    6     | CrewAI       | AI agent platforms ($)
    7     | n8n + Ollama | Zapier AI ($50+/mo)
    Total | 8GB RamNode VPS | $40/mo vs $390–690+/mo

    Zero data exposure. Unlimited usage. Complete sovereignty over your AI infrastructure.