Why Apache Kafka?
Apache Kafka is a distributed event streaming platform capable of handling trillions of events per day. Originally developed by LinkedIn, it has become the backbone for real-time data pipelines and event-driven architectures.
Prerequisites
Development
- 2+ CPU cores
- 4 GB RAM minimum
- 20 GB SSD storage
- Java 11+
Production
- 4+ CPU cores
- 8+ GB RAM recommended
- 100+ GB NVMe SSD
- Java 17+ (LTS)
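A quick way to check a fresh VPS against these minimums (a sketch; output formats vary slightly between distributions):

```shell
#!/usr/bin/env bash
# Print the resources the prerequisites above care about
echo "CPU cores:   $(nproc)"
echo "RAM (MB):    $(free -m | awk '/^Mem:/ {print $2}')"
echo "Disk free /: $(df -h / | awk 'NR==2 {print $4}')"
# Java may not be installed yet on a fresh server
java -version 2>&1 | head -n 1 || echo "Java not installed yet"
```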
Initial Server Setup
Update system and install dependencies:
# Update system packages
sudo apt update && sudo apt upgrade -y
# Install essential tools
sudo apt install -y wget curl net-tools
# Set timezone (adjust as needed)
sudo timedatectl set-timezone America/New_York
Java Installation
Install OpenJDK 17
Apache Kafka requires Java to run:
# Install OpenJDK 17
sudo apt install -y openjdk-17-jdk
# Verify installation
java -version
# Set JAVA_HOME environment variable
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64' >> ~/.bashrc
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> ~/.bashrc
source ~/.bashrc
Kafka Installation
Create Kafka User
Create a dedicated user for security:
# Create kafka user
sudo useradd -r -s /bin/false kafka
# Create directories
sudo mkdir -p /opt/kafka /var/kafka/data /var/kafka/logs
# Set ownership
sudo chown -R kafka:kafka /opt/kafka /var/kafka
Download and Extract Kafka
Download the latest Kafka release:
# Download Kafka (check apache.org for latest version)
cd /tmp
wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
# Extract to installation directory
sudo tar -xzf kafka_2.13-3.7.0.tgz -C /opt/kafka --strip-components=1
# Set ownership
sudo chown -R kafka:kafka /opt/kafka
Note: Kafka 3.7+ includes KRaft mode, which eliminates the ZooKeeper dependency. This guide covers KRaft configuration.
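If you want to double-check the archive you downloaded, Apache publishes a SHA-512 digest alongside each release artifact (a sketch; the digest file's layout varies between Apache projects, so compare the two digests by eye rather than relying on `sha512sum -c`):

```shell
cd /tmp
# Fetch the published digest for the release downloaded above
if wget -q https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz.sha512; then
  # Compute the local digest and compare it with the published one
  sha512sum kafka_2.13-3.7.0.tgz
  cat kafka_2.13-3.7.0.tgz.sha512
else
  echo "Could not fetch the digest; verify manually via downloads.apache.org"
fi
```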
KRaft Mode Configuration
Recommended: KRaft (Kafka Raft) mode eliminates the ZooKeeper dependency, simplifying operations and reducing resource requirements.
Generate Cluster ID
Generate a unique cluster identifier:
# Generate unique cluster ID
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
echo $KAFKA_CLUSTER_ID
# Save for reference
echo $KAFKA_CLUSTER_ID | sudo tee /var/kafka/cluster-id
Configure KRaft Properties
Create the KRaft configuration file:
sudo nano /opt/kafka/config/kraft/server.properties
# Server Basics
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
# Listeners
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://YOUR_VPS_IP:9092
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
# Log Directories
log.dirs=/var/kafka/data
# Topic Defaults
num.partitions=3
default.replication.factor=1
min.insync.replicas=1
# Log Retention
log.retention.hours=168
log.retention.bytes=1073741824
log.segment.bytes=1073741824
# Performance Tuning
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
Important: Replace YOUR_VPS_IP with your actual RamNode VPS public IP address.
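Rather than editing the file by hand, the placeholder can be patched with sed (a sketch; ifconfig.me is one of several public-IP lookup services, and you can simply paste the IP yourself instead):

```shell
#!/usr/bin/env bash
CONF=/opt/kafka/config/kraft/server.properties
# Detect the public IP; fall back to the placeholder if the lookup fails
PUBLIC_IP=$(curl -fs https://ifconfig.me || echo "YOUR_VPS_IP")
if [ -w "$CONF" ]; then
  # Patch in place, keeping a .bak copy of the original
  sed -i.bak "s/YOUR_VPS_IP/${PUBLIC_IP}/" "$CONF"
  grep '^advertised.listeners' "$CONF"
else
  echo "Run with sudo, or edit $CONF manually (detected IP: ${PUBLIC_IP})"
fi
```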
Format Storage Directory
Format storage with the cluster ID:
# Format storage with cluster ID
sudo -u kafka /opt/kafka/bin/kafka-storage.sh format \
-t $(cat /var/kafka/cluster-id) \
-c /opt/kafka/config/kraft/server.properties
Systemd Service
Create Service File
Create a systemd service for automatic startup:
sudo nano /etc/systemd/system/kafka.service
[Unit]
Description=Apache Kafka Server
Documentation=https://kafka.apache.org/documentation/
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
RestartSec=10
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Enable and Start Kafka
Start the Kafka service:
# Reload systemd
sudo systemctl daemon-reload
# Enable on boot
sudo systemctl enable kafka
# Start Kafka
sudo systemctl start kafka
# Check status
sudo systemctl status kafka
Basic Operations
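Right after startup the broker can take a few seconds to begin accepting requests. A small readiness loop (a sketch using kafka-broker-api-versions.sh, which ships with Kafka) can confirm it is up before you try the operations below:

```shell
#!/usr/bin/env bash
# Poll the broker until it answers API-version requests (up to ~30 s)
PROBE=/opt/kafka/bin/kafka-broker-api-versions.sh
if [ ! -x "$PROBE" ]; then
  echo "Kafka not found at /opt/kafka; adjust the path" >&2
  exit 0
fi
for i in $(seq 1 30); do
  if "$PROBE" --bootstrap-server localhost:9092 >/dev/null 2>&1; then
    echo "Broker is ready"
    exit 0
  fi
  sleep 1
done
echo "Broker did not become ready in time" >&2
exit 1
```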
Creating Topics
Create and manage Kafka topics:
# Create a topic with 3 partitions
/opt/kafka/bin/kafka-topics.sh --create \
--topic my-first-topic \
--partitions 3 \
--replication-factor 1 \
--bootstrap-server localhost:9092
# List all topics
/opt/kafka/bin/kafka-topics.sh --list \
--bootstrap-server localhost:9092
# Describe topic details
/opt/kafka/bin/kafka-topics.sh --describe \
--topic my-first-topic \
--bootstrap-server localhost:9092
Producing Messages
Send messages to a topic:
# Start console producer
/opt/kafka/bin/kafka-console-producer.sh \
--topic my-first-topic \
--bootstrap-server localhost:9092
# Type messages and press Enter
# Press Ctrl+C to exit
Consuming Messages
Read messages from a topic:
# Start console consumer (from beginning)
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-first-topic \
--from-beginning \
--bootstrap-server localhost:9092
# Consume with consumer group
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-first-topic \
--group my-consumer-group \
--bootstrap-server localhost:9092
Managing Consumer Groups
View and manage consumer groups:
# List consumer groups
/opt/kafka/bin/kafka-consumer-groups.sh --list \
--bootstrap-server localhost:9092
# Describe consumer group
/opt/kafka/bin/kafka-consumer-groups.sh --describe \
--group my-consumer-group \
--bootstrap-server localhost:9092
# Reset offsets to earliest (the group must have no active members)
/opt/kafka/bin/kafka-consumer-groups.sh \
--group my-consumer-group \
--topic my-first-topic \
--reset-offsets --to-earliest \
--execute \
--bootstrap-server localhost:9092
Security Configuration
Firewall Configuration
Configure UFW to allow only necessary ports:
# Allow SSH first so enabling the firewall cannot lock you out
sudo ufw allow 22/tcp
# Enable UFW if not already enabled
sudo ufw enable
# Allow Kafka broker port
sudo ufw allow 9092/tcp
# Allow from specific IP range (recommended)
# sudo ufw allow from 10.0.0.0/8 to any port 9092
# Verify rules
sudo ufw status verbose
SSL/TLS Encryption
Generate SSL Certificates
Create SSL certificates for encrypted connections:
# Create SSL directory
sudo mkdir -p /opt/kafka/ssl
cd /opt/kafka/ssl
# Generate CA
openssl req -new -x509 -keyout ca-key -out ca-cert -days 365 \
-subj '/CN=kafka-ca' -nodes
# Generate server keystore
keytool -keystore kafka.server.keystore.jks -alias localhost \
-keyalg RSA -validity 365 -genkey \
-storepass changeme -keypass changeme \
-dname 'CN=YOUR_VPS_IP'
# Create certificate signing request
keytool -keystore kafka.server.keystore.jks -alias localhost \
-certreq -file cert-file -storepass changeme
# Sign certificate with CA
openssl x509 -req -CA ca-cert -CAkey ca-key \
-in cert-file -out cert-signed \
-days 365 -CAcreateserial
# Import CA and signed cert to keystore
keytool -keystore kafka.server.keystore.jks -alias CARoot \
-import -file ca-cert -storepass changeme -noprompt
keytool -keystore kafka.server.keystore.jks -alias localhost \
-import -file cert-signed -storepass changeme
# Create truststore
keytool -keystore kafka.server.truststore.jks -alias CARoot \
-import -file ca-cert -storepass changeme -noprompt
# Set permissions
sudo chown -R kafka:kafka /opt/kafka/ssl
sudo chmod 600 /opt/kafka/ssl/*
Configure SSL in Kafka
Update server.properties for SSL:
# Add SSL listener
listeners=PLAINTEXT://:9092,SSL://:9093,CONTROLLER://:9094
advertised.listeners=PLAINTEXT://YOUR_VPS_IP:9092,SSL://YOUR_VPS_IP:9093
# The controller moves to 9094, so update the quorum voters to match:
controller.quorum.voters=1@localhost:9094
# SSL Configuration
ssl.keystore.location=/opt/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=changeme
ssl.key.password=changeme
ssl.truststore.location=/opt/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=changeme
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.3,TLSv1.2
Performance Tuning
JVM Configuration
Optimize JVM settings for your VPS resources:
| VPS RAM | Heap Settings |
|---|---|
| 4GB VPS | KAFKA_HEAP_OPTS="-Xmx1G -Xms1G" |
| 8GB VPS | KAFKA_HEAP_OPTS="-Xmx4G -Xms4G" |
| 16GB+ VPS | KAFKA_HEAP_OPTS="-Xmx6G -Xms6G" |
Also add G1 garbage-collector options to the kafka.service environment:
KAFKA_JVM_PERFORMANCE_OPTS="-XX:+UseG1GC \
-XX:MaxGCPauseMillis=20 \
-XX:InitiatingHeapOccupancyPercent=35 \
-XX:+ExplicitGCInvokesConcurrent"
OS-Level Tuning
Optimize kernel parameters:
# Increase file descriptor limits
echo 'kafka soft nofile 65536' | sudo tee -a /etc/security/limits.conf
echo 'kafka hard nofile 65536' | sudo tee -a /etc/security/limits.conf
# Add to /etc/sysctl.conf
sudo tee -a /etc/sysctl.conf << EOF
# Kafka performance tuning
vm.swappiness=1
vm.dirty_ratio=80
vm.dirty_background_ratio=5
net.core.wmem_default=131072
net.core.rmem_default=131072
net.core.wmem_max=2097152
net.core.rmem_max=2097152
net.ipv4.tcp_wmem=4096 65536 2048000
net.ipv4.tcp_rmem=4096 65536 2048000
EOF
# Apply changes
sudo sysctl -p
Monitoring
Enable JMX Monitoring
Add JMX settings to the kafka.service file:
Environment="JMX_PORT=9999"
Environment="KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Djava.rmi.server.hostname=YOUR_VPS_IP"
Log Management
View and manage Kafka logs:
# View Kafka logs
sudo journalctl -u kafka -f
# Check server log
tail -f /opt/kafka/logs/server.log
Health Check Script
Create a simple health check script:
#!/bin/bash
BOOTSTRAP=localhost:9092
# Check if Kafka is responding
if /opt/kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server $BOOTSTRAP &>/dev/null; then
echo "Kafka is healthy"
exit 0
else
echo "Kafka is not responding"
exit 1
fi
Troubleshooting
Kafka Fails to Start
Check journalctl for errors and verify JAVA_HOME is set correctly:
sudo journalctl -u kafka -n 50
echo $JAVA_HOME
Connection Refused
Verify advertised.listeners matches your VPS IP and check firewall rules:
grep advertised.listeners /opt/kafka/config/kraft/server.properties
sudo ufw status
Out of Memory Errors
Reduce KAFKA_HEAP_OPTS values or increase VPS RAM allocation. Edit the systemd service file to adjust heap settings.
Disk Space Issues
Adjust log.retention.hours and log.retention.bytes settings in server.properties:
df -h /var/kafka/data
du -sh /var/kafka/data/*
Quick Reference
Default Ports
- Broker: 9092
- Controller: 9093 (moves to 9094 when SSL is enabled)
- SSL: 9093 (if configured)
- JMX: 9999
Important Paths
- Install: /opt/kafka/
- Data: /var/kafka/data/
- Config: /opt/kafka/config/kraft/
- Logs: /opt/kafka/logs/
Common Commands
- kafka-topics.sh
- kafka-console-producer.sh
- kafka-console-consumer.sh
- kafka-consumer-groups.sh
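The commands above can be combined into a quick end-to-end smoke test (a sketch; assumes the broker is reachable on localhost:9092):

```shell
#!/usr/bin/env bash
# Create a throwaway topic, produce one message, read it back, then clean up.
set -euo pipefail
KAFKA=/opt/kafka/bin
if [ ! -x "$KAFKA/kafka-topics.sh" ]; then
  echo "Kafka not found at $KAFKA; adjust the path" >&2
  exit 0
fi
TOPIC="smoke-test-$$"
"$KAFKA/kafka-topics.sh" --create --topic "$TOPIC" --partitions 1 \
  --replication-factor 1 --bootstrap-server localhost:9092
echo "hello kafka" | "$KAFKA/kafka-console-producer.sh" \
  --topic "$TOPIC" --bootstrap-server localhost:9092
# --max-messages 1 makes the consumer exit after the first record
"$KAFKA/kafka-console-consumer.sh" --topic "$TOPIC" --from-beginning \
  --max-messages 1 --bootstrap-server localhost:9092
"$KAFKA/kafka-topics.sh" --delete --topic "$TOPIC" \
  --bootstrap-server localhost:9092
```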
Next Steps
- Explore Kafka Connect
- Set up Kafka Streams
- Implement Prometheus monitoring
- Consider Schema Registry
