What is OpenTelemetry?
OpenTelemetry (OTel) is the industry-standard open-source observability framework for collecting, processing, and exporting telemetry data. It provides a unified approach to traces, metrics, and logs, eliminating the need for multiple vendor-specific agents.
- Distributed Tracing: Track requests across services with Jaeger
- Metrics Collection: Time-series data via Prometheus
- Host Metrics: CPU, memory, disk, and network out of the box
- Vendor Neutral: Single SDK for any backend
- Grafana Dashboards: Unified visualization across all signals
- Batch Processing: Efficient data pipeline with configurable limits
This guide builds on other RamNode guides, including our Docker Basics Guide and our Grafana + Prometheus Guide.
Prerequisites
Recommended VPS Specifications
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 2 vCPUs | 4 vCPUs |
| RAM | 2 GB | 4 GB |
| Storage | 20 GB SSD | 40 GB+ SSD |
| OS | Ubuntu 22.04 LTS | Ubuntu 24.04 LTS |
Software Requirements
- Docker Engine 24.0+ and Docker Compose v2 — see our Docker guide
- `curl` and `wget` for downloading binaries
Initial Server Setup
Update your system and install Docker:
```bash
sudo apt update && sudo apt upgrade -y
sudo apt install -y ca-certificates curl gnupg lsb-release
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker
docker --version
docker compose version
```
For a detailed Docker walkthrough, see our Docker Basics Guide.
Architecture Overview
Component Breakdown
| Component | Role | Default Port |
|---|---|---|
| OTel Collector | Receives, processes, and exports telemetry | 4317 (gRPC), 4318 (HTTP) |
| Prometheus | Time-series metrics storage and querying | 9090 |
| Jaeger | Distributed trace storage and UI | 16686 (UI), 14250 |
| Grafana | Visualization dashboards and alerting | 3000 |
Data Flow
Your application sends traces and metrics to the OpenTelemetry Collector via gRPC or HTTP. The Collector processes the data (batching, filtering, enriching) and exports it to the appropriate backends — metrics go to Prometheus, traces go to Jaeger. Grafana connects to each backend as a data source, providing unified dashboards across all three telemetry signals.
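The OTLP wire format itself is easy to inspect. As a rough sketch, this is approximately the JSON body an SDK POSTs to the Collector's HTTP endpoint on port 4318 (field names follow the OTLP/HTTP JSON encoding; the ids, span name, and attribute values here are illustrative):

```python
import json
import os
import time

# Approximate OTLP/HTTP JSON body for a single server span.
# Field names follow the OTLP JSON encoding; values are illustrative.
now = time.time_ns()
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-flask-app"}}
        ]},
        "scopeSpans": [{
            "spans": [{
                "traceId": os.urandom(16).hex(),  # 32 hex chars
                "spanId": os.urandom(8).hex(),    # 16 hex chars
                "name": "GET /",
                "kind": 2,                        # SPAN_KIND_SERVER
                "startTimeUnixNano": str(now - 5_000_000),
                "endTimeUnixNano": str(now),
            }]
        }]
    }]
}
body = json.dumps(payload)
# To actually send it: POST `body` to http://localhost:4318/v1/traces
# with the header Content-Type: application/json.
print(len(payload["resourceSpans"][0]["scopeSpans"][0]["spans"]))  # 1
```

In practice the SDK builds and batches these payloads for you; seeing the shape helps when debugging with `curl -v` against port 4318.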
Deploy with Docker Compose
```bash
mkdir -p ~/otel-stack/{config,data/prometheus,data/grafana}
cd ~/otel-stack
```
Create `docker-compose.yml` (the obsolete `version` key is omitted; Compose v2 ignores it):
```yaml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    command: ['--config=/etc/otel/config.yaml']
    volumes:
      - ./config/otel-collector.yaml:/etc/otel/config.yaml:ro
    ports:
      - '4317:4317'   # OTLP gRPC
      - '4318:4318'   # OTLP HTTP
      - '8888:8888'   # Collector metrics
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./data/prometheus:/prometheus
    ports:
      - '9090:9090'
    restart: unless-stopped

  jaeger:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - '16686:16686'   # Jaeger UI
      - '14250:14250'   # gRPC
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    volumes:
      - ./data/grafana:/var/lib/grafana
    ports:
      - '3000:3000'
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=changeme
    restart: unless-stopped
```
Replace `GF_SECURITY_ADMIN_PASSWORD` with a strong, unique password before deploying to production.
Configure the OpenTelemetry Collector
The Collector is the central hub. Its configuration defines receivers, processors, and exporters.
Create `config/otel-collector.yaml` (note: recent Collector releases replaced the deprecated `logging` exporter with `debug`):
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      memory: {}
      disk: {}
      network: {}
      load: {}

processors:
  batch:
    send_batch_size: 1024
    timeout: 5s
  memory_limiter:
    check_interval: 5s
    limit_mib: 512
    spike_limit_mib: 128
  resourcedetection:
    detectors: [system]
    system:
      hostname_sources: [os]

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: otel
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  debug:
    verbosity: normal

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger, debug]
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [memory_limiter, batch, resourcedetection]
      exporters: [prometheus]
```
Key Configuration Decisions
- Memory limiter: Prevents the Collector from consuming excessive RAM. On a 4 GB VPS, capping at 512 MiB with a 128 MiB spike limit provides a good balance.
- Batch processor: Batching reduces network overhead and improves backend write performance. A batch size of 1024 with a 5-second timeout works well for most workloads.
- Host metrics: The `hostmetrics` receiver provides system-level CPU, memory, disk, and network metrics without any application instrumentation.
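To see how the batch processor's two triggers interact, here is a small back-of-the-envelope model (illustrative arithmetic, not Collector source code): a batch is flushed either when it reaches `send_batch_size` spans or when `timeout` elapses, whichever fires first.

```python
# Rough model of batch-processor flush behavior: under high traffic,
# batches fill before the timeout; under low traffic, the timeout
# flushes partial batches. (Illustrative only.)
def flushes_per_second(spans_per_second: float,
                       send_batch_size: int = 1024,
                       timeout_s: float = 5.0) -> float:
    size_driven = spans_per_second / send_batch_size
    timeout_driven = 1.0 / timeout_s
    return max(size_driven, timeout_driven)

print(flushes_per_second(10))      # 0.2  — low traffic, timeout-driven
print(flushes_per_second(10240))   # 10.0 — high traffic, size-driven
```

This is why raising the timeout (as suggested later for low-traffic apps) directly cuts export frequency when spans trickle in slowly.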
Configure Prometheus
Prometheus scrapes the metrics endpoint exposed by the Collector.
Create `config/prometheus.yml`:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']
  - job_name: 'collector-internal'
    static_configs:
      - targets: ['otel-collector:8888']
```
The first job collects application and host metrics (port 8889). The second scrapes the Collector's own internal metrics (port 8888) for monitoring pipeline health.
Launch the Stack
```bash
cd ~/otel-stack
docker compose up -d
docker compose ps
docker compose logs otel-collector --tail 50
```
Verify Endpoints
| Service | URL | Expected |
|---|---|---|
| Grafana | http://YOUR_IP:3000 | Login page |
| Jaeger UI | http://YOUR_IP:16686 | Search page |
| Prometheus | http://YOUR_IP:9090 | Query interface |
| Collector Health | http://YOUR_IP:8888/metrics | Metrics output |
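The table above can be checked from a script as well. Here is a minimal standard-library sketch; since your stack's real IP isn't known here, it is demonstrated against a throwaway local HTTP server standing in for one of the endpoints.

```python
import http.server
import threading
import urllib.error
import urllib.request

def check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint responds at all (any HTTP status)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # server answered, just with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure

# Demo against a throwaway local server — swap in your stack's URLs
# (e.g. http://YOUR_IP:3000) when running this for real.
srv = http.server.HTTPServer(('127.0.0.1', 0),
                             http.server.SimpleHTTPRequestHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
port = srv.server_address[1]
print(check(f'http://127.0.0.1:{port}/'))  # True
srv.shutdown()
```

Treating any HTTP response as "up" is deliberate: a 401 from Grafana still proves the service is listening.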
Instrument a Sample Application
To verify the full pipeline, instrument a simple Python Flask application. The same principles apply to any language supported by OpenTelemetry SDKs.
```bash
pip install flask \
  opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-instrumentation-flask \
  opentelemetry-exporter-otlp
```
Create `app.py`:
```python
from flask import Flask
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.sdk.resources import Resource

# Configure the tracer
resource = Resource.create({'service.name': 'my-flask-app'})
provider = TracerProvider(resource=resource)
exporter = OTLPSpanExporter(endpoint='http://localhost:4317', insecure=True)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)

@app.route('/')
def hello():
    return 'Hello from OpenTelemetry!'

@app.route('/health')
def health():
    return {'status': 'ok'}

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```
Run the app and generate a few requests:
```bash
python app.py &
curl http://localhost:5000/
curl http://localhost:5000/health
```
After sending requests, open the Jaeger UI at http://YOUR_IP:16686 and select "my-flask-app" from the service dropdown. You should see traces for each HTTP request.
Set Up Grafana Dashboards
Grafana provides the visualization layer. For a comprehensive walkthrough of Grafana setup, see our Grafana + Prometheus Guide.
Adding Data Sources
- Prometheus: Navigate to Configuration → Data Sources → Add data source. Select Prometheus and set the URL to `http://prometheus:9090`. Click Save & Test.
- Jaeger: Add another data source, select Jaeger, and set the URL to `http://jaeger:16686`. This enables trace visualization within Grafana panels.
Recommended Dashboards
- Node Exporter Full (Dashboard ID 1860) for system-level metrics
- OpenTelemetry Collector (Dashboard ID 15983) for pipeline health
- Custom application dashboards using PromQL queries against your OTel metrics
To import a community dashboard, go to Dashboards → Import, enter the dashboard ID, select your Prometheus data source, and click Import.
Firewall & Security Hardening
```bash
# Allow SSH
sudo ufw allow 22/tcp

# Allow OTel Collector (restrict to app servers)
sudo ufw allow from YOUR_APP_IP to any port 4317
sudo ufw allow from YOUR_APP_IP to any port 4318

# Allow Grafana (restrict to your IP)
sudo ufw allow from YOUR_ADMIN_IP to any port 3000

# Block public access to Prometheus and Jaeger
# Access via Grafana data sources instead
sudo ufw enable
```
Additional Security Measures
- Place Grafana behind a reverse proxy (Nginx or Caddy) with TLS
- Enable Grafana authentication with OAuth or LDAP for team access
- Set resource limits on Docker containers to prevent runaway memory usage
- Rotate the Grafana admin password and store credentials in environment files excluded from version control
- Use Docker network isolation so Prometheus and Jaeger are only accessible within the Compose network
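The last point can be expressed directly in Compose. This fragment is a sketch to merge into the full compose file: Prometheus and Jaeger get no published `ports:` entries, so they are reachable only by container name over the shared network, while Grafana stays exposed.

```yaml
# Sketch (merge into docker-compose.yml): drop the public port mappings
# for prometheus and jaeger; Grafana reaches them by container name.
services:
  prometheus:
    networks: [monitoring]
    # no "ports:" — reachable only inside the Compose network
  jaeger:
    networks: [monitoring]
  grafana:
    networks: [monitoring]
    ports:
      - '3000:3000'

networks:
  monitoring: {}
```

Grafana's data source URLs (`http://prometheus:9090`, `http://jaeger:16686`) keep working unchanged, since they resolve over the internal network.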
Resource Tuning for VPS Environments
Memory Allocation Guidelines
| Component | 2 GB VPS | 4 GB VPS | 8 GB VPS |
|---|---|---|---|
| OTel Collector | 256 MiB | 512 MiB | 1 GiB |
| Prometheus | 512 MiB | 1 GiB | 2 GiB |
| Jaeger | 256 MiB | 512 MiB | 1 GiB |
| Grafana | 128 MiB | 256 MiB | 512 MiB |
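The table above can be enforced per container. Compose v2 honors `deploy.resources.limits` for plain `docker compose up` (not just Swarm); this fragment sketches the 4 GB VPS column, to be merged into the compose file:

```yaml
# Sketch: hard memory caps matching the 4 GB VPS column above.
services:
  otel-collector:
    deploy:
      resources:
        limits:
          memory: 512M
  prometheus:
    deploy:
      resources:
        limits:
          memory: 1G
  jaeger:
    deploy:
      resources:
        limits:
          memory: 512M
  grafana:
    deploy:
      resources:
        limits:
          memory: 256M
```

Pairing these caps with the Collector's `memory_limiter` gives two layers of defense: the processor sheds load gracefully before the kernel OOM-killer ever gets involved.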
Prometheus Retention Settings
Adjust storage retention to match your available disk space. Add these flags to the Prometheus service:
```yaml
command:
  - '--config.file=/etc/prometheus/prometheus.yml'
  - '--storage.tsdb.retention.time=15d'
  - '--storage.tsdb.retention.size=5GB'
```
Reducing Collector Overhead
For low-traffic applications, increase the batch timeout and reduce scrape frequency:
```yaml
processors:
  batch:
    send_batch_size: 512
    timeout: 10s
```
Troubleshooting
Common Issues
- Collector won't start: check `docker compose logs otel-collector`. The most common issue is incorrect indentation in pipeline definitions.
- No metrics in Prometheus: open http://YOUR_IP:9090/targets. If the otel-collector target shows as down, confirm port 8889 is exposed.
- High memory usage: lower the `memory_limiter` thresholds and add Docker memory limits using `deploy.resources.limits.memory` in your Compose file.
- Grafana can't reach backends: use container names (`prometheus`, `jaeger`) rather than localhost in Grafana data source URLs.
```bash
docker stats --no-stream
curl -v http://localhost:4318/v1/traces
curl http://localhost:8888/metrics | grep otel
docker compose restart otel-collector
```
Next Steps
- Add Loki for centralized log aggregation, completing the three pillars of observability
- Set up Grafana alerting rules for latency spikes, error rates, or resource exhaustion
- Instrument additional services and use trace context propagation across microservices
- Implement the OTel Collector's `tail_sampling` processor to reduce storage costs
- Explore Grafana Tempo as an alternative to Jaeger for Grafana-native tracing
- Automate the deployment with Ansible or Terraform for reproducible infrastructure
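As a starting point for the tail sampling idea, this hedged sketch of a `tail_sampling` processor block (from the contrib Collector) keeps all error traces and anything slower than 500 ms; the policy names and thresholds are illustrative:

```yaml
# Sketch: sample traces after completion, keeping errors and slow requests.
# Add to the processors section and reference it in the traces pipeline.
processors:
  tail_sampling:
    decision_wait: 10s          # buffer spans this long before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}
```

Because decisions are made per whole trace after `decision_wait`, memory use grows with trace volume; size the buffer with your `memory_limiter` settings in mind.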
