Netmaker with Kernel WireGuard & EMQX
Kernel WireGuard performance, MQTT control plane, explicit egress and ingress gateways. The control-heavy mesh for operators who want every knob.
Ubuntu 24.04, kernel WireGuard module, wildcard DNS
~75 minutes
Kernel WireGuard mesh with explicit gateways and ACLs
Architecture
- Netmaker server: Go binary, REST API, SQLite or PostgreSQL state.
- EMQX: MQTT broker. The server publishes config; netclients subscribe and apply.
- CoreDNS: internal DNS for nodes.
- netclient: agent on every peer, manages the WireGuard interface.
- netmaker-ui: the dashboard.
The MQTT control plane is the architectural difference from Netbird: instead of long-lived gRPC streams, peers maintain MQTT subscriptions and config updates push near-instantly to thousands of nodes without per-connection state on the server.
Sizing on RamNode
Deployment size      RAM       Notes
Lab, < 20 nodes      2 GB      SQLite, single VPS
20-100 nodes         4 GB      SQLite still fine, EMQX is the bottleneck
100-500 nodes        4-8 GB    PostgreSQL, tune EMQX acceptors
500+ nodes           8 GB+     EMQX on its own instance
Install with nm-quick.sh
wget -O /root/nm-quick.sh https://raw.githubusercontent.com/gravitl/netmaker/master/scripts/nm-quick.sh
chmod +x /root/nm-quick.sh
sudo /root/nm-quick.sh
The script asks for a base domain (e.g. nm.example.com), a Let's Encrypt email, and Community vs Pro. Choose Community for this guide. Open ports: 80/tcp, 443/tcp, 8883/tcp, and 51821-51830/udp.
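The port list above can be opened with a small helper. This is a minimal sketch that assumes the VPS uses ufw, which the guide does not specify; the function name and the DRY_RUN switch are illustrative, not part of nm-quick.sh.

```shell
# Ports named in this guide; ufw writes ranges as "start:end".
NETMAKER_PORTS="80/tcp 443/tcp 8883/tcp 51821:51830/udp"

# Print the ufw commands when DRY_RUN=1, otherwise apply them.
# Assumption: ufw is the host firewall.
open_netmaker_ports() {
  local rule
  for rule in $NETMAKER_PORTS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ufw allow $rule"
    else
      sudo ufw allow "$rule"
    fi
  done
}
```

Run DRY_RUN=1 open_netmaker_ports first to review the rules, then open_netmaker_ports to apply them. If the provider also has a cloud firewall, the same ports need to be opened there.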
What nm-quick Deployed
The script wrote /root/docker-compose.yml with these key services:
services:
  netmaker:
    image: gravitl/netmaker:v0.24.x
    cap_add: [NET_ADMIN, NET_RAW, SYS_MODULE]
    sysctls:
      - net.ipv4.ip_forward=1
      - net.ipv4.conf.all.src_valid_mark=1
    environment:
      SERVER_NAME: "nm.example.com"
      SERVER_HOST: "<public-ip>"
      BROKER_ENDPOINT: "wss://broker.nm.example.com"
      MQ_PASSWORD: "<generated>"
      MQ_USERNAME: "netmaker"
      MASTER_KEY: "<generated>"
      DATABASE: "sqlite"
      DNS_MODE: "on"
      MANAGE_IPTABLES: "on"
      DEFAULT_PROXY_MODE: "auto"
      DEFAULT_LISTEN_PORT: "51821"
    ports: ["51821-51830:51821-51830/udp"]
  mq:
    image: emqx/emqx:5.6
    ports: ["8883:8883", "8084:8084"]
  coredns:
    image: coredns/coredns
  caddy:
    image: caddy:2
    ports: ["80:80", "443:443"]
Caddy terminates TLS for the API, dashboard, and EMQX WSS endpoints. First visit to https://dashboard.nm.example.com creates the admin account.
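Certificate issuance generally depends on each subdomain resolving to the server, so it can help to check DNS before opening the dashboard. The sketch below is a hedged helper, not something nm-quick provides: the dashboard and broker prefixes come from this guide, while the api prefix is an assumed name for the API endpoint.

```shell
# Build the hostnames expected under the base domain.
# "dashboard" and "broker" appear in this guide; "api" is an assumption.
netmaker_hosts() {
  local base="$1" sub
  for sub in api dashboard broker; do
    echo "$sub.$base"
  done
}

# Report which of those hostnames resolve from this machine.
check_netmaker_dns() {
  local host
  for host in $(netmaker_hosts "$1"); do
    if getent hosts "$host" >/dev/null; then
      echo "ok      $host"
    else
      echo "missing $host"
    fi
  done
}
```

For example, check_netmaker_dns nm.example.com prints one line per hostname. A "missing" entry usually points at the wildcard DNS record rather than at Caddy.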
Creating a Network
In the dashboard, Networks → New: name corp, address range 10.50.0.0/24, optional IPv6 fd00:50::/64, default UDP port 51821, MTU 1420, NAT enabled.
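The address range caps how many nodes the network can hold, which matters when comparing it with the sizing tiers. A quick arithmetic sketch, assuming the usual convention of reserving the network and broadcast addresses (the function name is illustrative):

```shell
# Usable IPv4 host addresses in a prefix: 2^(32 - prefix) minus the
# network and broadcast addresses. Meaningful for prefixes /1 through /30.
usable_hosts() {
  local prefix="$1"
  echo $(( (1 << (32 - $prefix)) - 2 ))
}
```

usable_hosts 24 gives 254, so 10.50.0.0/24 fits the lab and mid-size tiers. A deployment heading toward 500 or more nodes would need a wider range such as a /22 (1022 addresses).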
Adding Nodes
Generate an Enrollment Key bound to corp. On the host:
curl -sL 'https://gravitl.com/scripts/netclient-install.sh' | sudo bash
sudo netclient join -t <enrollment-key>
sudo wg show
sudo netclient list
ping 10.50.0.<other-node-ip>
The agent registers, the server pushes WireGuard config via MQTT, and a netmaker interface appears within seconds.
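When enrollment is scripted, it can be useful to wait for the interface instead of checking by hand. The following is a minimal sketch; the function name and the 30-second default are assumptions, and it relies on the Linux /sys/class/net directory.

```shell
# Wait until a network interface exists, polling once per second.
# Usage: wait_for_iface <name> [timeout_seconds]
wait_for_iface() {
  local iface="$1" timeout="${2:-30}" waited=0
  while [ ! -e "/sys/class/net/$iface" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "interface $iface did not appear within ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "interface $iface is present"
}
```

A provisioning script could run wait_for_iface netmaker 60 right after netclient join and only continue to the ping test if it succeeds.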
ACLs
Netmaker ACLs operate at the network level. Default is allow-all. To enforce policy: open the network's ACL view, toggle default-allow off (now deny-all), add explicit allow entries between specific nodes or tags.
The model is simpler than Netbird's: allow or deny per node pair, optionally tag-scoped. There is no port-level filtering at the ACL layer — for that, use nftables/iptables on each node. Common mistake: setting deny-all without first ensuring server-to-node allow entries exist (the server is itself a node).
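Port-level filtering on a node can be sketched with iptables on the overlay interface. The helper below only prints rules so they can be reviewed first; the source address 10.50.0.10 and port 5432 used in the usage note are hypothetical values, and the interface name netmaker is the one this guide reports after joining.

```shell
# Print iptables rules that let one overlay source reach one TCP port on
# the overlay interface and drop all other traffic aimed at that port.
# Usage: overlay_port_rules <iface> <allowed-source> <tcp-port>
overlay_port_rules() {
  local iface="$1" src="$2" port="$3"
  echo "iptables -A INPUT -i $iface -p tcp --dport $port -s $src -j ACCEPT"
  echo "iptables -A INPUT -i $iface -p tcp --dport $port -j DROP"
}
```

overlay_port_rules netmaker 10.50.0.10 5432 prints the two rules; pipe the output to sudo sh once they look right. Because -A appends, any earlier ACCEPT rule in the INPUT chain still wins, so check the existing order with iptables -S INPUT.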
Egress Gateways
An egress gateway makes a non-overlay subnet reachable from inside the overlay (e.g., a private subnet at 192.168.50.0/24 in another cloud). Mark the node as egress, specify the advertised subnets; Netmaker pushes routing updates so other nodes route via the gateway's overlay IP. The gateway needs ip_forward=1 and a route to the target subnet via its non-overlay interface; the agent handles MASQUERADE when MANAGE_IPTABLES=on.
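The forwarding prerequisite on the gateway can be checked and set as follows. This is a sketch under stated assumptions: the function names and the drop-in file name 99-netmaker-egress.conf are arbitrary choices, not something Netmaker creates.

```shell
# Succeed when the given sysctl file contains "1".
# The optional argument lets the check be pointed at a different file.
forwarding_enabled() {
  local file="${1:-/proc/sys/net/ipv4/ip_forward}"
  [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

# Enable IPv4 forwarding now and persist it across reboots.
# Assumption: the drop-in name below is arbitrary.
enable_forwarding() {
  sudo sysctl -w net.ipv4.ip_forward=1
  echo "net.ipv4.ip_forward=1" | sudo tee /etc/sysctl.d/99-netmaker-egress.conf >/dev/null
}
```

On the gateway, forwarding_enabled || enable_forwarding covers the first requirement. For the second, ip route get 192.168.50.10 (any address in the advertised subnet) should show a non-overlay interface as the outgoing device.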
Ingress Gateways and External Clients
The inverse of egress: external clients (a contractor's laptop) connect via standard WireGuard config without running netclient. Mark a node as ingress, create an external client bound to it, hand off the config or QR code. Traffic from the external client routes through the gateway into the rest of the overlay according to ACLs.
Relays
When two nodes cannot establish a direct WireGuard connection, designate a node with a public endpoint as a relay (toggle "Is Relay" and list which nodes relay through it). The relay decrypts and re-encrypts traffic at the WireGuard layer, so relayed packets are visible in plaintext on the relay node. Plan placement accordingly.
DNS via CoreDNS
Every node registers a hostname against its overlay IP. Other nodes resolve <node>.<network> (e.g. db1.corp). For systemd-resolved hosts, add the following to /etc/systemd/resolved.conf and restart systemd-resolved:
[Resolve]
DNS=10.50.0.1
Domains=~corp
Failover and HA
Community Edition runs as a single server. Existing nodes keep working when the server is down — WireGuard configs are local and peer-to-peer traffic does not need the server. Only enrollment and policy changes require it. For HA control plane, the Pro Edition supports server clustering.
Backup and Restore
Back up SQLite at /root/data/netmaker.db, the compose file with env vars (master key, MQ password, JWT secret), and the Caddyfile.
docker compose exec -T netmaker \
sqlite3 /root/data/netmaker.db ".backup /tmp/netmaker.db.bak"
docker compose cp netmaker:/tmp/netmaker.db.bak ./
tar czf netmaker-$(date +%F).tar.gz netmaker.db.bak docker-compose.yml Caddyfile
Hardening Checklist
- MQTT auth is non-negotiable. An open broker on 8883 is a control-plane takeover.
- Master key custody. MASTER_KEY is root authority; treat it like a database password.
- Disable the default admin after creating real admins.
- Enable TLS on EMQX itself even though Caddy terminates WSS.
- Restrict the API with an IP allowlist covering the addresses operators connect from.
- Verify kernel WireGuard. On minimal images install linux-modules-extra-$(uname -r); verify with modprobe wireguard.
- Ship audit logs off the Netmaker host itself.
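Since the compose file carries MASTER_KEY and the MQ password, one concrete custody step is making sure only its owner can read it. A minimal sketch, assuming GNU stat as shipped on Ubuntu; the function name is illustrative.

```shell
# Succeed only if group and others have no permissions on the file
# (for example mode 600 or 400). Relies on GNU stat's -c option.
is_private_file() {
  local mode
  mode="$(stat -c '%a' "$1")" || return 1
  case "$mode" in
    0|*00) return 0 ;;
    *)     return 1 ;;
  esac
}
```

For example, is_private_file /root/docker-compose.yml || sudo chmod 600 /root/docker-compose.yml tightens the file only when needed. The same check applies to any backup archive that contains the compose file.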
Troubleshooting
- Node enrolls but is invisible in the dashboard. Check journalctl -u netclient; the cause is usually MQTT DNS or TLS chain trust on minimal hosts.
- Two nodes cannot ping. Check ACLs first, then the handshake timestamps in wg show.
- MQTT keeps disconnecting. EMQX is overloaded or WSS is being killed by an intermediate proxy. Tune EMQX_NODE__PROCESS_LIMIT.
- DNS fails inside the overlay. Is CoreDNS up? Is the resolver actually pointed at it? Check resolvectl status (the successor to systemd-resolve --status).
- WireGuard interface missing. The kernel module is not loaded; check dmesg and lsmod | grep wireguard.
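The handshake check from the list above can be automated by parsing wg show <interface> latest-handshakes, which prints one public key and one Unix timestamp per peer (0 when no handshake has happened). This is a sketch: the function name and the 180-second default threshold are assumptions, not Netmaker settings.

```shell
# Read "pubkey<TAB>unix-timestamp" lines on stdin and print the peers whose
# last handshake never happened (0) or is older than max_age seconds.
# Usage: stale_peers <now-epoch> [max_age_seconds]
stale_peers() {
  local now="$1" max_age="${2:-180}"
  awk -v now="$now" -v max="$max_age" '
    $2 == 0          { print $1, "never"; next }
    (now - $2) > max { print $1, (now - $2) "s ago" }
  '
}
```

On a node, sudo wg show netmaker latest-handshakes | stale_peers "$(date +%s)" lists the peers worth investigating. A peer reported as "never" usually points at ACLs or NAT traversal, while an old handshake on an idle peer may simply mean there has been no recent traffic.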
