Monitoring multiple remote servers from a central location is essential for maintaining system health and preventing downtime. This guide will show you how to set up Grafana, Prometheus, and Node Exporter to monitor all your servers from one dashboard.

What You’ll Build

A monitoring system where:

  • Central monitoring server runs Grafana and Prometheus
  • Remote servers run Node Exporter to collect metrics
  • Single dashboard shows metrics from all servers
  • Alert rules flag issues such as downed servers, high CPU or memory usage, and low disk space

Architecture

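Central Server                    Remote Servers
┌─────────────────┐              ┌─────────────────┐
│ Grafana :3000   │       ┌─────►│ Server 1 :9100  │
│ Prometheus :9090├───────┤      └─────────────────┘
└─────────────────┘       │      ┌─────────────────┐
                          └─────►│ Server 2 :9100  │
                                 └─────────────────┘

Prometheus pulls (scrapes) metrics from Node Exporter on each remote server over HTTP on port 9100. Grafana reads the collected data from Prometheus.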

Prerequisites

  • Central monitoring server (2 GB RAM minimum)
  • Debian or Ubuntu on all servers (the commands below use apt and ufw)
  • SSH access to all remote servers
  • Basic Linux command line knowledge

Set Up Remote Servers

SSH into each remote server and run these commands:

Install Node Exporter

# Create user
sudo useradd --no-create-home --shell /bin/false node_exporter

# Download and install
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
tar xvf node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
rm -rf node_exporter-1.6.1.linux-amd64*
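
Before extracting a downloaded release, it is worth confirming that the tarball is intact. The sketch below assumes the release page publishes a sha256sums.txt file next to the tarball (check the release page to confirm); the function name verify_release is made up for this example.

```shell
# verify_release FILE SUMS_FILE
# Succeeds only if FILE's SHA-256 digest matches its entry in SUMS_FILE.
# SUMS_FILE is expected in sha256sum format: "<digest>  <filename>" per line.
verify_release() {
    vr_file="$1"
    vr_sums="$2"
    # Look up the expected digest for this file name (exact match on column 2).
    vr_expected="$(awk -v name="$(basename "$vr_file")" '$2 == name { print $1 }' "$vr_sums")"
    # Compute the actual digest of the downloaded file.
    vr_actual="$(sha256sum "$vr_file" | awk '{ print $1 }')"
    [ -n "$vr_expected" ] && [ "$vr_expected" = "$vr_actual" ]
}

# Example (run in /tmp after the wget step and before tar; the URL is an assumption):
#   wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/sha256sums.txt
#   verify_release node_exporter-1.6.1.linux-amd64.tar.gz sha256sums.txt && echo "checksum OK"
```

If the function fails, delete the tarball and download it again rather than installing it.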

Create Service File

sudo nano /etc/systemd/system/node_exporter.service

Add this content:

[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter --web.listen-address=0.0.0.0:9100

[Install]
WantedBy=multi-user.target

Start the Service

sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter

# Verify it's working (-s hides curl's progress output)
curl -s http://localhost:9100/metrics | head -10

Configure Firewall

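Allow access to port 9100 from your monitoring server only. Allow SSH before enabling ufw, or you may lock yourself out of the server:

# Keep SSH reachable before enabling the firewall
sudo ufw allow ssh

# Replace MONITORING_SERVER_IP with your actual monitoring server IP
sudo ufw allow from MONITORING_SERVER_IP to any port 9100 proto tcp
sudo ufw enable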

Repeat these steps on all remote servers you want to monitor.
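
Once several servers are set up, a quick loop run from the monitoring server can confirm that every exporter is reachable through the firewall. This is a minimal sketch: check_exporters and the PROBE variable are made-up names, and the IP addresses in the example are placeholders.

```shell
# Command used to probe one URL; override PROBE to change how targets are checked.
PROBE="${PROBE:-curl -fsS --max-time 5 -o /dev/null}"

# check_exporters HOST...
# Prints OK or FAIL per host and returns non-zero if any host failed.
check_exporters() {
    ce_failed=0
    for ce_host in "$@"; do
        # $PROBE is intentionally unquoted so its options split into words.
        if $PROBE "http://$ce_host:9100/metrics"; then
            echo "OK   $ce_host"
        else
            echo "FAIL $ce_host"
            ce_failed=1
        fi
    done
    return "$ce_failed"
}

# Example (run on the monitoring server):
#   check_exporters 10.0.1.10 10.0.1.11 10.0.1.12
```

A FAIL line usually points at the firewall rule or at a stopped node_exporter service on that host.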

Set Up Central Monitoring Server

Install Prometheus

# Create user
sudo useradd --no-create-home --shell /bin/false prometheus

# Create directories
sudo mkdir /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus /var/lib/prometheus

# Download Prometheus
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz
tar xvf prometheus-2.47.0.linux-amd64.tar.gz

# Install
sudo cp prometheus-2.47.0.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.47.0.linux-amd64/promtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/prometheus /usr/local/bin/promtool

# Copy console files
sudo cp -r prometheus-2.47.0.linux-amd64/consoles /etc/prometheus
sudo cp -r prometheus-2.47.0.linux-amd64/console_libraries /etc/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus/

# Clean up
rm -rf prometheus-2.47.0.linux-amd64*

Configure Prometheus

sudo nano /etc/prometheus/prometheus.yml

Add this configuration (replace IP addresses with your actual server IPs):

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'servers'
    static_configs:
      - targets: 
          - '10.0.1.10:9100'  # Server 1
          - '10.0.1.11:9100'  # Server 2
          - '10.0.1.12:9100'  # Server 3
        labels:
          group: 'production'

Create Prometheus Service

sudo nano /etc/systemd/system/prometheus.service

Add this content:

[Unit]
Description=Prometheus
After=network.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target

Start Prometheus

sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus
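
Once Prometheus is running, its HTTP API reports the state of every scrape target at /api/v1/targets. The sketch below counts targets by health state with grep. It assumes the compact JSON that Prometheus returns (no spaces around the colon), and count_targets is a made-up helper name; a JSON tool such as jq would be more robust.

```shell
# count_targets STATE
# Reads the /api/v1/targets JSON on stdin and prints how many targets
# report the given health state ("up" or "down").
count_targets() {
    # grep -o prints each match on its own line; wc -l counts those lines.
    grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Example (run on the monitoring server):
#   curl -s http://localhost:9090/api/v1/targets | count_targets down
```

A non-zero count for down means at least one server is not being scraped.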

Install Grafana

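# Add Grafana repository (apt-key is deprecated, so store the key in a keyring file)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list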

# Install Grafana
sudo apt update
sudo apt install grafana -y

# Start Grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

Configure Firewall

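# Keep SSH reachable before enabling the firewall
sudo ufw allow ssh

# Allow Grafana access
sudo ufw allow 3000/tcp

# Optional: allow direct access to the Prometheus UI.
# Prometheus has no authentication by default, so limit this to a trusted address.
# TRUSTED_IP is a placeholder for your own workstation or VPN address.
sudo ufw allow from TRUSTED_IP to any port 9090 proto tcp

sudo ufw enable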

Set Up Dashboards

Access Grafana

  1. Open http://your-monitoring-server-ip:3000
  2. Login with username: admin, password: admin
  3. Change the password when prompted

Add Prometheus Data Source

  1. Click the gear icon (⚙️) → Data Sources (in newer Grafana versions: Connections → Data sources)
  2. Click “Add data source”
  3. Select “Prometheus”
  4. Set URL to http://localhost:9090
  5. Click “Save & Test”

Import Dashboard

  1. Click the “+” icon → Import (in newer Grafana versions: Dashboards → New → Import)
  2. Enter dashboard ID: 1860 (Node Exporter Full)
  3. Click “Load”
  4. Select your Prometheus data source
  5. Click “Import”

You’ll now see metrics from all your servers!

Basic Alerting

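The rules below mark alerts as firing inside Prometheus, where they appear at http://your-monitoring-server-ip:9090/alerts. Sending notifications by email or chat additionally requires Alertmanager, which this guide does not cover. Create an alert rules file: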

sudo nano /etc/prometheus/alert_rules.yml

Add these basic alerts:

groups:
  - name: basic_alerts
    rules:
      - alert: ServerDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Server {{ $labels.instance }} is down"

      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"

      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory on {{ $labels.instance }}"

      - alert: LowDiskSpace
        expr: 100 - ((node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"} / node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}) * 100) > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
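
To see what the HighMemoryUsage expression evaluates to on a given machine, the same arithmetic can be applied to /proc/meminfo, which is where Node Exporter reads MemTotal and MemAvailable on Linux. This is a minimal sketch, and mem_used_percent is a made-up helper name.

```shell
# mem_used_percent
# Reads /proc/meminfo-style text on stdin and prints
# (1 - MemAvailable / MemTotal) * 100, the value the alert compares to 85.
mem_used_percent() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%.1f\n", (1 - avail / total) * 100 }
    '
}

# Example (run on a monitored server):
#   mem_used_percent < /proc/meminfo
```

If a healthy server regularly sits close to the threshold, raise the threshold or lengthen the for: duration to avoid noisy alerts.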

Update Prometheus config to include alerts:

sudo nano /etc/prometheus/prometheus.yml

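Add this as a top-level key, at the same indentation level as global and scrape_configs (for example, directly after the global block). The path is resolved relative to the directory that contains prometheus.yml: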

rule_files:
  - "alert_rules.yml"

Restart Prometheus:

sudo systemctl restart prometheus
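
A syntax error in either file will stop Prometheus from starting, so it helps to validate before restarting. promtool was installed next to prometheus earlier in this guide, and its check config command also checks the rule files the config references. In the sketch below, validate_prometheus and the PROMTOOL variable are made-up names.

```shell
# Path to promtool; override PROMTOOL if it is not on your PATH.
PROMTOOL="${PROMTOOL:-promtool}"

# validate_prometheus [CONFIG]
# Runs "promtool check config" on CONFIG (default: /etc/prometheus/prometheus.yml)
# and returns promtool's exit status.
validate_prometheus() {
    "$PROMTOOL" check config "${1:-/etc/prometheus/prometheus.yml}"
}

# Example: restart only if validation succeeds.
#   validate_prometheus && sudo systemctl restart prometheus
```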

Key Dashboards to Create

Server Overview Panel

Create a table showing all servers:

  • Panel Type: Table
  • Query: up
  • Shows which servers are online/offline

CPU Usage Panel

  • Panel Type: Time series
  • Query: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
  • Shows CPU usage for each server

Memory Usage Panel

  • Panel Type: Time series
  • Query: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
  • Shows memory usage percentage

Disk Usage Panel

  • Panel Type: Gauge
  • Query: 100 - ((node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"} / node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}) * 100)
  • Shows disk usage percentage
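
The disk query can be checked against df output on the server itself. Note that the query divides available space (what non-root users can still write) by total size, so the result may differ slightly from the Use% column that df prints. This is a minimal sketch, and disk_used_percent is a made-up helper name.

```shell
# disk_used_percent
# Reads "df -P" output on stdin (header line plus one filesystem line) and
# prints 100 - (available / size) * 100, mirroring the panel query.
disk_used_percent() {
    awk 'NR == 2 && $2 > 0 { printf "%.1f\n", 100 - ($4 / $2) * 100 }'
}

# Example (run on a monitored server):
#   df -P / | disk_used_percent
```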

Troubleshooting

Servers Not Showing in Prometheus

  1. Check if Node Exporter is running (on the remote server):
    sudo systemctl status node_exporter
    
  2. Test network connectivity (from the monitoring server):
    curl http://REMOTE_SERVER_IP:9100/metrics
    
  3. Check firewall rules (on the remote server):
    sudo ufw status
    

Prometheus Not Starting

Check logs:

sudo journalctl -u prometheus -f

Common issues:

  • Config file syntax errors
  • Permission problems
  • Port already in use

Grafana Dashboard Empty

  1. Verify Prometheus data source is working
  2. Check that every target shows as UP in Prometheus: http://your-ip:9090/targets
  3. Verify queries in dashboard panels

Maintenance

Regular Tasks

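  1. Update software monthly. apt upgrades the OS packages and Grafana; Prometheus and Node Exporter were installed from release tarballs, so update them by repeating the download and install steps with the newer version and restarting the service:
    sudo apt update && sudo apt upgrade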
    
  2. Check disk space on monitoring server:
    df -h /var/lib/prometheus
    
  3. Review alerts and adjust thresholds as needed
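  4. Back up configurations and the Grafana database (with the default SQLite setup, dashboards are stored in /var/lib/grafana/grafana.db, not in /etc/grafana/):
    sudo tar czf monitoring-backup-$(date +%Y%m%d).tar.gz \
        /etc/prometheus/ /etc/grafana/ /var/lib/grafana/grafana.db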
    

Performance Tips

  • Increase scrape intervals for non-critical servers
  • Use shorter retention periods to save disk space (set --storage.tsdb.retention.time in the Prometheus service file; the default is 15d)
  • Monitor the monitoring server itself
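
A rough way to judge how much disk a retention period needs is the sizing rule from the Prometheus storage documentation: retention time in seconds × samples ingested per second × bytes per sample, with roughly 1 to 2 bytes per sample. The sketch below applies it with integer arithmetic. The series count in the example is an assumption; replace it with the value of prometheus_tsdb_head_series on your own instance. estimate_tsdb_bytes is a made-up helper name.

```shell
# estimate_tsdb_bytes RETENTION_DAYS ACTIVE_SERIES SCRAPE_INTERVAL_SECONDS BYTES_PER_SAMPLE
# Prints an approximate on-disk size in bytes.
estimate_tsdb_bytes() {
    # Samples per second = active series / scrape interval.
    echo $(( $1 * 86400 * $2 / $3 * $4 ))
}

# Example: 15 days of retention, 3000 active series (assumed),
# a 15s scrape interval, and 2 bytes per sample:
#   estimate_tsdb_bytes 15 3000 15 2
```

With those example numbers the estimate is 518400000 bytes, or about 518 MB. Treat it as an order-of-magnitude figure and keep checking actual usage with df.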

Security Best Practices

  1. Use HTTPS for Grafana (set up nginx reverse proxy with SSL)
  2. Restrict access to monitoring ports using firewall rules
  3. Change default passwords for Grafana
  4. Regularly update all components
  5. Use strong authentication for Grafana users

Remember to regularly review your dashboards and fine-tune alert thresholds based on your specific infrastructure needs.