Event Streaming Guide

    Self-Hosted Apache Kafka

    Deploy Apache Kafka, the distributed event streaming platform, on a RamNode VPS and build real-time data pipelines and event-driven applications on infrastructure you control.

    Ubuntu 20.04+
    Kafka 3.7+
    ⏱️ 20-30 minutes

    Why Apache Kafka?

    Apache Kafka is a distributed event streaming platform capable of handling trillions of events per day. Originally developed by LinkedIn, it has become the backbone for real-time data pipelines and event-driven architectures.

    • Real-time data streaming and event processing
    • Log aggregation and centralized logging
    • Microservices communication and decoupling
    • Change data capture (CDC) and replication

    Prerequisites

    Development

    • 2+ CPU cores
    • 4 GB RAM minimum
    • 20 GB SSD storage
    • Java 11+

    Production

    • 4+ CPU cores
    • 8+ GB RAM recommended
    • 100+ GB NVMe SSD
    • Java 17+ (LTS)

    Initial Server Setup

    Update system and install dependencies:

    System Setup
    # Update system packages
    sudo apt update && sudo apt upgrade -y
    
    # Install essential tools
    sudo apt install -y wget curl net-tools
    
    # Set timezone (adjust as needed)
    sudo timedatectl set-timezone America/New_York

    Java Installation

    Step 1: Install OpenJDK 17

    Apache Kafka requires Java to run:

    Install Java
    # Install OpenJDK 17
    sudo apt install -y openjdk-17-jdk
    
    # Verify installation
    java -version
    
    # Set JAVA_HOME environment variable
    echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64' >> ~/.bashrc
    echo 'export PATH=$PATH:$JAVA_HOME/bin' >> ~/.bashrc
    source ~/.bashrc

    Kafka Installation

    Step 1: Create Kafka User

    Create a dedicated user for security:

    Create User
    # Create kafka user
    sudo useradd -r -s /bin/false kafka
    
    # Create directories
    sudo mkdir -p /opt/kafka /var/kafka/data /var/kafka/logs
    
    # Set ownership
    sudo chown -R kafka:kafka /opt/kafka /var/kafka

    Step 2: Download and Extract Kafka

    Download the latest Kafka release:

    Download Kafka
    # Download Kafka (check apache.org for latest version)
    cd /tmp
    wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
    
    # Extract to installation directory
    sudo tar -xzf kafka_2.13-3.7.0.tgz -C /opt/kafka --strip-components=1
    
    # Set ownership
    sudo chown -R kafka:kafka /opt/kafka

    Note: Kafka 3.7+ includes KRaft mode which eliminates the ZooKeeper dependency. This guide covers KRaft configuration.
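
    It's also worth confirming the download wasn't corrupted by comparing it against the checksum Apache publishes alongside each release (a sketch; the filename assumes the 3.7.0 archive used above):

    ```shell
    # Compute the local SHA-512 digest of the downloaded archive
    sha512sum /tmp/kafka_2.13-3.7.0.tgz

    # Compare the printed digest against the .sha512 file published
    # next to the release at https://downloads.apache.org/kafka/3.7.0/
    ```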

    KRaft Mode Configuration

    Recommended: KRaft (Kafka Raft) mode eliminates the ZooKeeper dependency, simplifying operations and reducing resource requirements.

    Step 1: Generate Cluster ID

    Generate a unique cluster identifier:

    Generate Cluster ID
    # Generate unique cluster ID
    KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
    echo $KAFKA_CLUSTER_ID
    
    # Save for reference
    echo $KAFKA_CLUSTER_ID | sudo tee /var/kafka/cluster-id

    Step 2: Configure KRaft Properties

    Create the KRaft configuration file:

    Edit Configuration
    sudo nano /opt/kafka/config/kraft/server.properties
    server.properties
    # Server Basics
    process.roles=broker,controller
    node.id=1
    controller.quorum.voters=1@localhost:9093
    
    # Listeners
    listeners=PLAINTEXT://:9092,CONTROLLER://:9093
    advertised.listeners=PLAINTEXT://YOUR_VPS_IP:9092
    controller.listener.names=CONTROLLER
    inter.broker.listener.name=PLAINTEXT
    
    # Log Directories
    log.dirs=/var/kafka/data
    
    # Topic Defaults
    num.partitions=3
    default.replication.factor=1
    min.insync.replicas=1
    
    # Log Retention
    log.retention.hours=168
    log.retention.bytes=1073741824
    log.segment.bytes=1073741824
    
    # Performance Tuning
    num.network.threads=3
    num.io.threads=8
    socket.send.buffer.bytes=102400
    socket.receive.buffer.bytes=102400
    socket.request.max.bytes=104857600

    Important: Replace YOUR_VPS_IP with your actual RamNode VPS public IP address.

    Step 3: Format Storage Directory

    Format storage with the cluster ID:

    Format Storage
    # Format storage with cluster ID
    sudo -u kafka /opt/kafka/bin/kafka-storage.sh format \
      -t $(cat /var/kafka/cluster-id) \
      -c /opt/kafka/config/kraft/server.properties
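
    If formatting succeeded, the data directory now contains a meta.properties file recording the cluster ID. A quick sanity check (a sketch, using the cluster ID saved earlier):

    ```shell
    # The formatted directory should record the same cluster ID we saved
    grep "cluster.id=$(cat /var/kafka/cluster-id)" /var/kafka/data/meta.properties \
      && echo "storage formatted with expected cluster ID"
    ```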

    Systemd Service

    Step 1: Create Service File

    Create a systemd service for automatic startup:

    Create Service
    sudo nano /etc/systemd/system/kafka.service
    kafka.service
    [Unit]
    Description=Apache Kafka Server
    Documentation=https://kafka.apache.org/documentation/
    Requires=network.target
    After=network.target
    
    [Service]
    Type=simple
    User=kafka
    Group=kafka
    Environment="JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64"
    Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
    ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
    ExecStop=/opt/kafka/bin/kafka-server-stop.sh
    Restart=on-failure
    RestartSec=10
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target

    Step 2: Enable and Start Kafka

    Start the Kafka service:

    Start Service
    # Reload systemd
    sudo systemctl daemon-reload
    
    # Enable on boot
    sudo systemctl enable kafka
    
    # Start Kafka
    sudo systemctl start kafka
    
    # Check status
    sudo systemctl status kafka
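
    Startup takes a few seconds, so rather than assuming the broker is up immediately, a small readiness loop (a sketch) can poll the broker API until it answers:

    ```shell
    # Poll the broker API until it responds, up to ~30 seconds
    for i in $(seq 1 15); do
      if /opt/kafka/bin/kafka-broker-api-versions.sh \
          --bootstrap-server localhost:9092 >/dev/null 2>&1; then
        echo "broker is accepting connections"
        break
      fi
      sleep 2
    done
    ```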

    Basic Operations

    Step 1: Creating Topics

    Create and manage Kafka topics:

    Topic Management
    # Create a topic with 3 partitions
    /opt/kafka/bin/kafka-topics.sh --create \
      --topic my-first-topic \
      --partitions 3 \
      --replication-factor 1 \
      --bootstrap-server localhost:9092
    
    # List all topics
    /opt/kafka/bin/kafka-topics.sh --list \
      --bootstrap-server localhost:9092
    
    # Describe topic details
    /opt/kafka/bin/kafka-topics.sh --describe \
      --topic my-first-topic \
      --bootstrap-server localhost:9092

    Step 2: Producing Messages

    Send messages to a topic:

    Console Producer
    # Start console producer
    /opt/kafka/bin/kafka-console-producer.sh \
      --topic my-first-topic \
      --bootstrap-server localhost:9092
    
    # Type messages and press Enter
    # Press Ctrl+C to exit

    Step 3: Consuming Messages

    Read messages from a topic:

    Console Consumer
    # Start console consumer (from beginning)
    /opt/kafka/bin/kafka-console-consumer.sh \
      --topic my-first-topic \
      --from-beginning \
      --bootstrap-server localhost:9092
    
    # Consume with consumer group
    /opt/kafka/bin/kafka-console-consumer.sh \
      --topic my-first-topic \
      --group my-consumer-group \
      --bootstrap-server localhost:9092

    Step 4: Managing Consumer Groups

    View and manage consumer groups:

    Consumer Groups
    # List consumer groups
    /opt/kafka/bin/kafka-consumer-groups.sh --list \
      --bootstrap-server localhost:9092
    
    # Describe consumer group
    /opt/kafka/bin/kafka-consumer-groups.sh --describe \
      --group my-consumer-group \
      --bootstrap-server localhost:9092
    
    # Reset offsets to earliest
    /opt/kafka/bin/kafka-consumer-groups.sh \
      --group my-consumer-group \
      --topic my-first-topic \
      --reset-offsets --to-earliest \
      --execute \
      --bootstrap-server localhost:9092

    Security Configuration

    Step 1: Firewall Configuration

    Configure UFW to allow only necessary ports:

    UFW Rules
    # Allow SSH first so enabling the firewall cannot lock you out
    sudo ufw allow 22/tcp
    
    # Enable UFW if not already enabled
    sudo ufw enable
    
    # Allow Kafka broker port
    sudo ufw allow 9092/tcp
    
    # Better: restrict 9092 to a trusted IP range instead of opening it wide
    # sudo ufw allow from 10.0.0.0/8 to any port 9092
    
    # Verify rules
    sudo ufw status verbose

    SSL/TLS Encryption

    Step 1: Generate SSL Certificates

    Create SSL certificates for encrypted connections:

    Generate Certificates
    # Create SSL directory (take ownership while generating;
    # ownership is handed back to the kafka user below)
    sudo mkdir -p /opt/kafka/ssl
    sudo chown $USER /opt/kafka/ssl
    cd /opt/kafka/ssl
    
    # Generate CA
    openssl req -new -x509 -keyout ca-key -out ca-cert -days 365 \
      -subj '/CN=kafka-ca' -nodes
    
    # Generate server keystore
    keytool -keystore kafka.server.keystore.jks -alias localhost \
      -keyalg RSA -validity 365 -genkey \
      -storepass changeme -keypass changeme \
      -dname 'CN=YOUR_VPS_IP'
    
    # Create certificate signing request
    keytool -keystore kafka.server.keystore.jks -alias localhost \
      -certreq -file cert-file -storepass changeme
    
    # Sign certificate with CA
    openssl x509 -req -CA ca-cert -CAkey ca-key \
      -in cert-file -out cert-signed \
      -days 365 -CAcreateserial
    
    # Import CA and signed cert to keystore
    keytool -keystore kafka.server.keystore.jks -alias CARoot \
      -import -file ca-cert -storepass changeme -noprompt
    keytool -keystore kafka.server.keystore.jks -alias localhost \
      -import -file cert-signed -storepass changeme
    
    # Create truststore
    keytool -keystore kafka.server.truststore.jks -alias CARoot \
      -import -file ca-cert -storepass changeme -noprompt
    
    # Set permissions
    sudo chown -R kafka:kafka /opt/kafka/ssl
    sudo chmod 600 /opt/kafka/ssl/*

    Step 2: Configure SSL in Kafka

    Update server.properties for SSL:

    SSL Configuration
    # Add SSL listener (the controller moves to 9094, so the
    # quorum voters entry must be updated to match)
    listeners=PLAINTEXT://:9092,SSL://:9093,CONTROLLER://:9094
    controller.quorum.voters=1@localhost:9094
    advertised.listeners=PLAINTEXT://YOUR_VPS_IP:9092,SSL://YOUR_VPS_IP:9093
    
    # SSL Configuration
    ssl.keystore.location=/opt/kafka/ssl/kafka.server.keystore.jks
    ssl.keystore.password=changeme
    ssl.key.password=changeme
    ssl.truststore.location=/opt/kafka/ssl/kafka.server.truststore.jks
    ssl.truststore.password=changeme
    ssl.client.auth=required
    ssl.enabled.protocols=TLSv1.3,TLSv1.2
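
    Clients then need matching settings. A minimal truststore-only sketch is shown below; because ssl.client.auth=required is set above, clients must additionally present a certificate signed by the same CA (the commented ssl.keystore.* lines), or you can relax that setting to none. Paths and the changeme passwords match the keystores generated earlier and should be replaced with strong values:

    ```shell
    # Write a client-side SSL properties file (sketch)
    cat > /opt/kafka/ssl/client-ssl.properties << 'EOF'
    security.protocol=SSL
    ssl.truststore.location=/opt/kafka/ssl/kafka.server.truststore.jks
    ssl.truststore.password=changeme
    # With ssl.client.auth=required, also supply a client keystore:
    # ssl.keystore.location=/opt/kafka/ssl/kafka.client.keystore.jks
    # ssl.keystore.password=changeme
    EOF

    # Point the console tools at the SSL listener with the client config
    /opt/kafka/bin/kafka-console-producer.sh \
      --topic my-first-topic \
      --bootstrap-server YOUR_VPS_IP:9093 \
      --producer.config /opt/kafka/ssl/client-ssl.properties
    ```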

    Performance Tuning

    Step 1: JVM Configuration

    Optimize JVM settings for your VPS resources:

    VPS RAM      Heap Settings
    4 GB VPS     KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
    8 GB VPS     KAFKA_HEAP_OPTS="-Xmx4G -Xms4G"
    16 GB+ VPS   KAFKA_HEAP_OPTS="-Xmx6G -Xms6G"
    JVM Performance Options
    KAFKA_JVM_PERFORMANCE_OPTS="-XX:+UseG1GC \
      -XX:MaxGCPauseMillis=20 \
      -XX:InitiatingHeapOccupancyPercent=35 \
      -XX:+ExplicitGCInvokesConcurrent"
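
    One way to apply these values (a sketch; heap values assume the 8 GB row above) is a systemd drop-in, which keeps your changes separate from the original unit file:

    ```shell
    # Create a drop-in override for the kafka unit
    sudo mkdir -p /etc/systemd/system/kafka.service.d
    sudo tee /etc/systemd/system/kafka.service.d/override.conf << 'EOF'
    [Service]
    Environment="KAFKA_HEAP_OPTS=-Xmx4G -Xms4G"
    Environment="KAFKA_JVM_PERFORMANCE_OPTS=-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent"
    EOF

    # Reload and restart so the new environment takes effect
    sudo systemctl daemon-reload
    sudo systemctl restart kafka
    ```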

    Step 2: OS-Level Tuning

    Optimize kernel parameters:

    File Descriptor Limits
    # Increase file descriptor limits
    echo 'kafka soft nofile 65536' | sudo tee -a /etc/security/limits.conf
    echo 'kafka hard nofile 65536' | sudo tee -a /etc/security/limits.conf
    Kernel Parameters
    # Add to /etc/sysctl.conf
    sudo tee -a /etc/sysctl.conf << EOF
    # Kafka performance tuning
    vm.swappiness=1
    vm.dirty_ratio=80
    vm.dirty_background_ratio=5
    net.core.wmem_default=131072
    net.core.rmem_default=131072
    net.core.wmem_max=2097152
    net.core.rmem_max=2097152
    net.ipv4.tcp_wmem=4096 65536 2048000
    net.ipv4.tcp_rmem=4096 65536 2048000
    EOF
    
    # Apply changes
    sudo sysctl -p

    Monitoring

    Step 1: Enable JMX Monitoring

    Add JMX settings to the kafka.service file. Note that this configuration disables JMX authentication, so keep port 9999 firewalled from the public internet:

    JMX Configuration
    Environment="JMX_PORT=9999"
    Environment="KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false \
      -Djava.rmi.server.hostname=YOUR_VPS_IP"
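
    The new Environment lines only take effect after a restart; a quick check (a sketch) that the JMX port is then listening:

    ```shell
    # Reload the unit and restart Kafka so the JMX settings apply
    sudo systemctl daemon-reload
    sudo systemctl restart kafka

    # Confirm something is listening on the JMX port
    ss -tln | grep 9999
    ```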

    Step 2: Log Management

    View and manage Kafka logs:

    View Logs
    # View Kafka logs
    sudo journalctl -u kafka -f
    
    # Check server log
    tail -f /opt/kafka/logs/server.log

    Step 3: Health Check Script

    Create a simple health check script:

    /opt/kafka/scripts/health-check.sh
    #!/bin/bash
    BOOTSTRAP=localhost:9092
    
    # Check if Kafka is responding
    if /opt/kafka/bin/kafka-broker-api-versions.sh \
        --bootstrap-server $BOOTSTRAP &>/dev/null; then
        echo "Kafka is healthy"
        exit 0
    else
        echo "Kafka is not responding"
        exit 1
    fi
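
    To run the check on a schedule, one option (a sketch; the log path is an example) is a cron entry:

    ```shell
    # Make the script executable
    sudo chmod +x /opt/kafka/scripts/health-check.sh

    # Append a cron entry that runs the check every 5 minutes
    ( crontab -l 2>/dev/null; \
      echo '*/5 * * * * /opt/kafka/scripts/health-check.sh >> /var/log/kafka-health.log 2>&1' ) | crontab -
    ```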

    Troubleshooting

    Kafka Fails to Start

    Check journalctl for errors and verify JAVA_HOME is set correctly:

    Debug Startup
    sudo journalctl -u kafka -n 50
    echo $JAVA_HOME

    Connection Refused

    Verify advertised.listeners matches your VPS IP and check firewall rules:

    Check Configuration
    grep advertised.listeners /opt/kafka/config/kraft/server.properties
    sudo ufw status

    Out of Memory Errors

    Reduce KAFKA_HEAP_OPTS values or increase VPS RAM allocation. Edit the systemd service file to adjust heap settings.

    Disk Space Issues

    Adjust log.retention.hours and log.retention.bytes settings in server.properties:

    Check Disk
    df -h /var/kafka/data
    du -sh /var/kafka/data/*
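
    Broker-wide retention changes affect every topic. To reclaim space on one busy topic without touching the defaults, a per-topic override can be set instead (example values: 24 hours = 86400000 ms, using the topic created earlier):

    ```shell
    # Set a 24-hour retention override on one topic
    /opt/kafka/bin/kafka-configs.sh --alter \
      --entity-type topics \
      --entity-name my-first-topic \
      --add-config retention.ms=86400000 \
      --bootstrap-server localhost:9092

    # Verify the override took effect
    /opt/kafka/bin/kafka-configs.sh --describe \
      --entity-type topics \
      --entity-name my-first-topic \
      --bootstrap-server localhost:9092
    ```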

    Quick Reference

    Default Ports

    • Broker (PLAINTEXT): 9092
    • Controller: 9093 (9094 if SSL is configured)
    • SSL: 9093 (if configured)
    • JMX: 9999

    Important Paths

    • Install: /opt/kafka/
    • Data: /var/kafka/data/
    • Config: /opt/kafka/config/kraft/
    • Logs: /opt/kafka/logs/

    Common Commands

    • kafka-topics.sh
    • kafka-console-producer.sh
    • kafka-console-consumer.sh
    • kafka-consumer-groups.sh

    Next Steps

    • Explore Kafka Connect
    • Set up Kafka Streams
    • Implement Prometheus monitoring
    • Consider Schema Registry