    Deploy OLake on a VPS

    Self-host OLake on a RamNode VPS: a Docker Compose stack with Temporal, PostgreSQL, and Elasticsearch that replicates databases into Iceberg and Parquet, served behind Caddy with TLS.

    OLake is an open-source, Go-based extract-and-load platform that replicates operational databases (PostgreSQL, MySQL, MongoDB, Oracle, MSSQL, DB2) and other sources into open lakehouse formats such as Apache Iceberg and Parquet. It runs full loads and change data capture without Spark, Flink, Kafka, or Debezium, and ships a self-serve web UI for configuring sources, destinations, and jobs. This guide deploys the full OLake UI stack on a single RamNode VPS using Docker Compose, then locks it behind Caddy with TLS so the UI is never directly exposed.

    OLake is licensed under Apache 2.0.

    What you are deploying

    The OLake UI is not a single container. The published compose stack brings up several services that work together:

    Service          Role
    OLake UI         Web interface for sources, destinations, and jobs
    Temporal worker  Runs the actual replication jobs
    Temporal server  Workflow orchestration engine
    Temporal UI      Workflow monitoring and debugging
    PostgreSQL       Stores job configs and sync state
    Elasticsearch    Backing store for Temporal workflow data
    Signup init      One-time job that creates the default admin user

    The UI is exposed on port 8000. The replication jobs themselves spin up additional short-lived connector containers, which is why this stack needs the Docker socket and a bit more memory than a typical web app.

    Prerequisites

    Because the stack includes Elasticsearch and Temporal alongside Postgres, give it room. A RamNode KVM VPS with at least 4 GB RAM and 2 vCPU is a sensible floor, and 8 GB is more comfortable if you run several concurrent jobs or large tables. Provision generous disk for the persistence directory and any local Parquet output.
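
    The sizing floor above can be checked before installing anything. The sketch below is an illustration only: the function name and the MIN_MEM_KB margin are assumptions, chosen because a plan sold as 4 GB reports somewhat less than 4194304 kB to the kernel.

```shell
# Return 0 when the host meets the guide's 4 GB / 2 vCPU floor.
# MIN_MEM_KB sits a little below 4 GiB because part of the RAM is reserved,
# so a "4 GB" plan reports less than 4194304 kB; the exact margin is a guess.
MIN_MEM_KB=3500000
MIN_CPUS=2

meets_floor() {
  mem_kb=$1
  cpus=$2
  [ "$mem_kb" -ge "$MIN_MEM_KB" ] && [ "$cpus" -ge "$MIN_CPUS" ]
}

# Usage on the VPS:
#   meets_floor "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" "$(nproc)" \
#     && echo "meets the floor" || echo "below the recommended floor"
```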

    This guide assumes Ubuntu 24.04 LTS, a non-root sudo user, and a DNS A record (for example olake.example.com) pointing at the VPS before you begin so Caddy can issue a certificate.

    Your destination (Iceberg on S3, MinIO, Glue, a REST catalog such as Lakekeeper or Nessie, or local Parquet) is configured later inside the UI and is not part of this server build. If you write to remote object storage, those credentials are entered in the UI and encrypted at rest.

    1. Initial server preparation and hardening

    shell
    sudo adduser deploy
    sudo usermod -aG sudo deploy
    sudo apt update && sudo apt -y upgrade

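    Harden SSH by setting PermitRootLogin no and PasswordAuthentication no in /etc/ssh/sshd_config. On Ubuntu, drop-in files under /etc/ssh/sshd_config.d/ are read first and can override these values, so check them as well. Confirm that key-based login works for the deploy user before you run sudo systemctl restart ssh, or you can lock yourself out.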
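
    To confirm the settings actually took effect, you can inspect the effective configuration that sshd -T prints (lowercase keyword, then value). A minimal sketch, with ssh_hardened as a hypothetical helper name:

```shell
# Read `sshd -T` output on stdin; succeed only if both hardening options
# are effective in the merged configuration.
ssh_hardened() {
  awk '
    $1 == "permitrootlogin"        && $2 == "no" { root = 1 }
    $1 == "passwordauthentication" && $2 == "no" { pass = 1 }
    END { exit !(root && pass) }
  '
}

# Usage on the VPS:
#   sudo sshd -T | ssh_hardened && echo "ssh hardening is effective"
```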

    Configure the firewall so only SSH and the web ports are reachable. Port 8000, the Postgres port, Elasticsearch, and the Temporal ports all stay closed to the internet.

    shell
    sudo ufw default deny incoming
    sudo ufw default allow outgoing
    sudo ufw allow OpenSSH
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp
    sudo ufw enable

    Enable automatic security updates:

    shell
    sudo apt -y install unattended-upgrades
    sudo dpkg-reconfigure --priority=low unattended-upgrades

    Elasticsearch needs an elevated vm.max_map_count. Set it permanently:

    shell
    echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-olake.conf
    sudo sysctl --system
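
    A quick check that the running kernel picked up the value can save a debugging session later. The helper name below is hypothetical; the threshold is the one set above.

```shell
# Succeed when the given value meets the threshold this guide sets for Elasticsearch.
REQUIRED_MAP_COUNT=262144

map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Usage on the VPS:
#   map_count_ok "$(sysctl -n vm.max_map_count)" || echo "vm.max_map_count is too low"
```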

    2. Install Docker

    shell
    sudo apt -y install ca-certificates curl gnupg
    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
      | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
      https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" \
      | sudo tee /etc/apt/sources.list.d/docker.list
    sudo apt update
    sudo apt -y install docker-ce docker-ce-cli containerd.io docker-compose-plugin
    sudo usermod -aG docker deploy

    Log out and back in so the group change takes effect.

    3. Fetch and configure the OLake stack

    OLake distributes a versioned compose file. Pull it into a working directory so you can edit the defaults before first launch rather than running the one-line quickstart, which would start with insecure defaults.

    shell
    mkdir -p ~/olake && cd ~/olake
    curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose-v1.yml -o docker-compose.yml

    Before starting anything, edit three blocks at the top of the file.

    Change the default admin credentials. The stack creates this user on first startup, so set it now:

    yaml
    x-signup-defaults:
      username: &defaultUsername "your-admin-username"
      password: &defaultPassword "a-long-random-password"
      email: &defaultEmail "admin@example.com"

    Set an explicit persistence path so you know exactly which directory to back up. By default data lands in ${PWD}/olake-data:

    yaml
    x-app-defaults:
      host_persistence_path: &hostPersistencePath /var/lib/olake

    Create that directory and make it writable:

    shell
    sudo install -d -o $USER -g $USER /var/lib/olake

    Set an encryption key so source and destination credentials are not stored in plaintext in the metadata database. For a single VPS a passphrase is fine (OLake hashes it with SHA-256); for stricter setups point this at a KMS key ARN:

    yaml
    x-encryption:
      key: &encryptionKey "a-strong-passphrase-here"
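
    Both the admin password and this passphrase are best generated rather than invented. One possible way, sketched with standard tools only (gen_secret is a hypothetical helper, and hex output is an arbitrary choice that avoids YAML quoting surprises):

```shell
# Print N random bytes from /dev/urandom as 2N lowercase hex characters.
gen_secret() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage: gen_secret 24   # prints a 48-character value to paste into the compose file
```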

    4. Bind the UI to localhost only

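    The default compose file publishes the UI on 0.0.0.0:8000, which exposes it on the public IP. The ufw rules from section 1 do not prevent this, because Docker publishes ports through its own iptables rules that bypass ufw's default policy. Pin the port to loopback so only the reverse proxy can reach it: find the ports entry of the OLake UI service and change it from 8000:8000 to: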

    yaml
        ports:
          - "127.0.0.1:8000:8000"

    Do the same for the Temporal UI service if it publishes a host port. Nothing in this stack needs a publicly bound port once Caddy is in front.
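
    Once the stack is running, it is worth verifying that nothing unexpected listens on a public address. The sketch below filters the output of ss -tlnH, whose fourth column is the local address and port; public_listeners is a hypothetical helper name.

```shell
# Read `ss -tlnH` output on stdin and print every listener that is not
# bound to an IPv4 or IPv6 loopback address.
public_listeners() {
  awk '$4 !~ /^(127\.|\[::1\]:)/ { print $4 }'
}

# Usage on the VPS:
#   ss -tlnH | public_listeners
```

    On a host configured as in this guide, the expectation is that only ports 22, 80, and 443 appear; anything else points at a compose ports entry that still needs the 127.0.0.1 prefix.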

    5. Start the stack

    shell
    cd ~/olake
    docker compose up -d

    Give Elasticsearch and Temporal a minute to become healthy, then check:

    shell
    docker compose ps

    All services should report healthy or running. The signup-init container runs once and exits, which is expected. The UI is now answering on 127.0.0.1:8000.
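
    If you script the deployment, a small retry loop avoids guessing how long startup takes. This is a generic sketch; wait_for is a hypothetical helper, and probing the UI root with curl assumes it answers with a non-error status once ready.

```shell
# Retry a command until it succeeds.
# Usage: wait_for ATTEMPTS DELAY_SECONDS CMD [ARGS...]
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage on the VPS (up to 30 tries, 5 seconds apart):
#   wait_for 30 5 curl -fsS -o /dev/null http://127.0.0.1:8000/ && echo "UI is up"
```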

    6. Reverse proxy and TLS with Caddy

    Install Caddy:

    shell
    sudo apt -y install debian-keyring debian-archive-keyring apt-transport-https curl
    curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' \
      | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
    curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' \
      | sudo tee /etc/apt/sources.list.d/caddy-stable.list
    sudo apt update && sudo apt -y install caddy

    OLake's UI login is your first line of defense, but you should add a second one at the proxy because the stack also exposes a Temporal UI with no auth of its own. Generate a bcrypt hash for an extra basic-auth gate:

    shell
    caddy hash-password --plaintext 'your-proxy-password'

    Put the site block in /etc/caddy/Caddyfile:

    caddyfile
    olake.example.com {
        encode gzip
        basic_auth {
            opsuser PASTE_THE_BCRYPT_HASH_HERE
        }
        reverse_proxy 127.0.0.1:8000
    }

    Then reload Caddy:

    shell
    sudo systemctl reload caddy

    For an internal-only tool like this, consider also restricting port 443 to known IP addresses, either in the Caddyfile with a remote-IP matcher or at the firewall. OLake holds credentials to your production databases, so treat access to it as sensitive.
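
    At the firewall, the restriction could look like the sketch below, which only prints the ufw commands so you can review them before running anything. The helper name and the example addresses are placeholders. Port 80 is deliberately left untouched, on the assumption that Caddy still needs a publicly reachable port to obtain and renew its certificate.

```shell
# Print the ufw commands that replace the open 443 rule with one rule
# per allowed source address or CIDR range.
ufw_restrict_https() {
  echo "ufw delete allow 443/tcp"
  for ip in "$@"; do
    echo "ufw allow from $ip to any port 443 proto tcp"
  done
}

# Usage: review the output, then run each line with sudo.
#   ufw_restrict_https 203.0.113.10 198.51.100.0/24
```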

    7. Backups

    OLake keeps its state in two places: the persistence directory you set in section 3, and the metadata Postgres database inside the stack. Back up both.

    The persistence directory is a straightforward file copy:

    shell
    sudo tar czf /var/backups/olake-data-$(date +%F).tar.gz -C /var/lib/olake .

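    Dump the internal Postgres from within its container. The service name postgresql and the superuser postgres below are assumptions: confirm the service name with docker compose ps --services and the user in the compose file's Postgres environment. The dump is written through sudo tee because /var/backups is normally writable only by root, and a plain redirect would run as your unprivileged user:

    shell
    cd ~/olake
    docker compose exec -T postgresql pg_dumpall -U postgres \
      | gzip | sudo tee /var/backups/olake-meta-$(date +%F).sql.gz > /dev/null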

    Schedule both with cron or a systemd timer and push the archives off the VPS to RamNode object storage or another remote target. Your replicated data itself lives in the destination (S3, Iceberg, or local Parquet) and should be backed up according to that system's own practices. If you write Parquet locally on the VPS, include that output directory in your backup plan as well.
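
    For the scheduled job, the file-level part could be wrapped in two small functions. This is a sketch under assumptions: the function names are hypothetical, and the retention step is an addition that this guide does not prescribe.

```shell
# Archive the contents of SRC_DIR into DEST_DIR/olake-data-YYYY-MM-DD.tar.gz.
# Usage: archive_data SRC_DIR DEST_DIR
archive_data() {
  tar czf "$2/olake-data-$(date +%F).tar.gz" -C "$1" .
}

# Delete olake-*.gz archives in DIR that were last modified more than DAYS days ago.
# Usage: prune_old DIR DAYS
prune_old() {
  find "$1" -maxdepth 1 -type f -name 'olake-*.gz' -mtime +"$2" -delete
}

# Usage from a root-owned script run by cron or a systemd timer:
#   archive_data /var/lib/olake /var/backups
#   prune_old /var/backups 14
```

    Pruning locally only makes sense after the archives have been copied off the VPS, so run the upload step between the two calls.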

    8. Monitoring and alerting

    The Temporal UI is the authoritative view of job health: it shows running, completed, and failed workflows and lets you inspect why a sync failed. The temporal-worker logs also surface the periodic log-cleaner activity, which is worth watching to confirm old job logs are being pruned.

    For automated alerting, watch container health and failed Temporal workflows. A simple approach is a cron job that runs docker compose ps and flags any unhealthy service, plus a query against the Temporal API for failed workflow counts.
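
    The container-health half of that approach can be sketched as a filter over docker ps output. It relies on Docker's status strings such as "Up 5 minutes (unhealthy)", "Restarting (1) ...", and "Exited (1) ..."; the helper name and the compose project name olake (derived here from the ~/olake directory) are assumptions.

```shell
# Read "name|status" lines on stdin and print the names of containers that
# are unhealthy, restarting, or exited with a non-zero code.
# Containers that exited with code 0 (such as a one-shot init job) are ignored.
unhealthy_containers() {
  awk -F '|' '
    $2 ~ /\(unhealthy\)/ || $2 ~ /^Restarting/ { print $1; next }
    $2 ~ /^Exited \([1-9][0-9]*\)/             { print $1 }
  '
}

# Usage on the VPS:
#   docker ps -a --filter label=com.docker.compose.project=olake \
#     --format '{{.Names}}|{{.Status}}' | unhealthy_containers
```

    Any output from the pipeline is a reason to alert; empty output means every container is either running normally or finished cleanly.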

    On alert delivery, keep RamNode's mail restrictions in mind. RamNode blocks or throttles direct outbound SMTP on port 25 by default, so any alerting that relies on a local mailer or a raw port-25 connection will fail silently. Route notifications through a transactional email API over HTTPS, a chat webhook (Slack, Discord, or similar), or an authenticated relay on port 587 rather than expecting the VPS to deliver mail directly.
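
    A chat webhook is usually the shortest path. The sketch below assumes a Slack-style incoming webhook that accepts a JSON body with a text field (Discord expects content instead) and a WEBHOOK_URL variable that you define yourself; both function names are hypothetical, and only single-line messages are handled.

```shell
# Build a {"text": ...} JSON body for a single-line message,
# escaping backslashes and double quotes.
webhook_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$escaped"
}

# POST the message over HTTPS to the address held in WEBHOOK_URL.
send_alert() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(webhook_payload "$1")" "$WEBHOOK_URL"
}

# Usage: WEBHOOK_URL=... send_alert "OLake: elasticsearch is unhealthy"
```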

    9. Upgrades

    OLake changed its compose layout at the end of January 2026, moving to the docker-compose-v1.yml file used in this guide. If you are coming from an older docker-compose.yml, follow OLake's documented migration path rather than swapping files in place, since the persistence layout differs.

    For routine upgrades, pull the latest images and recreate:

    shell
    cd ~/olake
    docker compose pull
    docker compose up -d

    Your data and configuration survive because they live in the persistence directory and named volumes, not in the containers. Connector images used by jobs are pulled on demand; pin connector versions to stable releases (for example v0.1.8) rather than latest if you need reproducible job behavior.
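
    Whether an image reference is pinned can be checked mechanically. The sketch below treats a reference as pinned when it carries a digest or a tag other than latest; is_pinned is a hypothetical helper, and it only covers images named in the compose file, not connector versions chosen in the UI.

```shell
# Succeed when an image reference carries a digest or a tag other than "latest".
is_pinned() {
  ref=$1
  case "$ref" in
    *@sha256:*) return 0 ;;
  esac
  # Drop the registry and namespace so a registry port is not mistaken for a tag.
  name=${ref##*/}
  case "$name" in
    *:latest) return 1 ;;
    *:*)      return 0 ;;
    *)        return 1 ;;   # no tag at all means "latest"
  esac
}

# Usage on the VPS:
#   docker compose config --images | while read -r img; do
#     is_pinned "$img" || echo "unpinned: $img"
#   done
```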

    10. Troubleshooting

    If the UI never comes up, the most common cause is Elasticsearch failing to start because vm.max_map_count is too low. Confirm the sysctl from section 1 took effect with sysctl vm.max_map_count.

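    If the stack starts but you cannot log in, the signup-init container may have run before you set custom credentials, in which case the admin user was created with the file's original defaults. Check its logs with docker compose logs signup-init. If needed, wipe the stack's data and recreate it with the corrected defaults; this discards all existing OLake configuration, so it is only appropriate on a fresh install.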

    If a sync job fails immediately, open the Temporal UI and inspect the workflow. Source connection errors (wrong host, missing replication permissions, CDC not enabled on the source) are the usual culprits and show up clearly there.

    If the box runs out of memory under concurrent jobs, reduce job concurrency or move to a larger RamNode plan. Elasticsearch and Temporal together set a meaningful baseline before any replication work begins.