Overwatch Deployment Runbook
This runbook provides step-by-step instructions for deploying OpenGSLB Overwatch nodes that serve authoritative DNS and validate agent health claims.
Overview
Overwatch nodes are the core DNS-serving components of OpenGSLB:
Serve authoritative DNS with GSLB routing decisions
Receive health updates from agents via gossip
Perform external validation of agent health claims
Sign DNS responses with DNSSEC
Operate independently (no cluster coordination)
Prerequisites
System Requirements
| Resource | Minimum | Recommended | High Traffic |
|---|---|---|---|
| CPU | 2 cores | 4 cores | 8 cores |
| Memory | 512 MB | 1 GB | 2 GB |
| Disk | 1 GB | 5 GB | 10 GB |
| Network | Gigabit | Gigabit | 10 Gigabit |
Network Requirements
| Direction | Port | Protocol | Purpose |
|---|---|---|---|
| Inbound | 53 | UDP/TCP | DNS queries |
| Inbound | 7946 | TCP/UDP | Gossip from agents |
| Inbound | 8080 | TCP | API endpoint (default: localhost only) |
| Inbound | 9090 | TCP | Metrics endpoint |
| Outbound | 9090 | TCP | DNSSEC key sync (to peers) |
| Outbound | Backend ports | TCP | Health validation |
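The inbound rows of the table could be turned into host firewall rules along the following lines. This is a minimal sketch: `print_inbound_rules` is a hypothetical helper (not part of OpenGSLB), it only prints `iptables` commands for review rather than applying them, and it leaves out port 8080 because the API listens on localhost by default. Source-address restrictions are not shown and would normally be added.

```shell
# Print (do not apply) iptables rules for the inbound ports listed above.
# print_inbound_rules is a hypothetical helper, not an OpenGSLB command.
print_inbound_rules() {
  # "port:protocols" pairs taken from the Network Requirements table.
  for entry in "53:udp tcp" "7946:tcp udp" "9090:tcp"; do
    port=${entry%%:*}
    protos=${entry#*:}
    for proto in $protos; do
      echo "iptables -A INPUT -p $proto --dport $port -j ACCEPT"
    done
  done
}

print_inbound_rules
```

Review the printed commands and adapt them to your firewall tooling before running anything as root.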
DNS Integration Considerations
Before deployment, plan how DNS will be integrated:
Direct Resolution: Clients point directly to Overwatch nodes
Conditional Forwarding: Corporate DNS forwards GSLB zones to Overwatch
Stub Zone: Authoritative DNS delegates GSLB subdomain
Information Needed
DNS zones to serve (e.g., gslb.example.com)
Gossip encryption key (generate one if this is the first Overwatch)
Service tokens for each application
GeoIP database (for geolocation routing)
Peer Overwatch addresses (for HA/DNSSEC sync)
Installation
Step 1: Download and Install Binary
# Set version
VERSION="1.0.0"
# Download for your platform
curl -Lo opengslb https://github.com/loganrossus/OpenGSLB/releases/download/v${VERSION}/opengslb-linux-amd64
chmod +x opengslb
sudo mv opengslb /usr/local/bin/
# Also install CLI tool
curl -Lo opengslb-cli https://github.com/loganrossus/OpenGSLB/releases/download/v${VERSION}/opengslb-cli-linux-amd64
chmod +x opengslb-cli
sudo mv opengslb-cli /usr/local/bin/
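The download commands above hard-code `linux-amd64`. If you script the install across mixed hardware, the architecture suffix can be derived from `uname -m`, as sketched below. Only the `amd64` asset name appears in this runbook; the `arm64` name is an assumption and should be checked against the release page. `release_arch` is a hypothetical helper.

```shell
# Map `uname -m` output to the architecture suffix used in release file names.
# Only "amd64" is confirmed by this runbook; "arm64" is an assumed asset name.
release_arch() {
  case "$1" in
    x86_64|amd64) echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Print the binary name that would be downloaded on this machine.
ARCH=$(release_arch "$(uname -m)") && echo "opengslb-linux-${ARCH}"
```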
Step 2: Create System User
# Create opengslb user and group
sudo useradd --system --no-create-home --shell /bin/false opengslb
# Create data directory
sudo mkdir -p /var/lib/opengslb
sudo chown opengslb:opengslb /var/lib/opengslb
sudo chmod 700 /var/lib/opengslb
# Create config directory
sudo mkdir -p /etc/opengslb
sudo chown root:opengslb /etc/opengslb
sudo chmod 750 /etc/opengslb
# Create GeoIP database directory
sudo mkdir -p /var/lib/opengslb/geoip
sudo chown opengslb:opengslb /var/lib/opengslb/geoip
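After creating the directories, it can be worth confirming that the modes ended up as intended. The sketch below compares the permission string reported by `ls -ld` with an expected value; `perm_string` and `check_perms` are hypothetical helper names, and ownership is not checked.

```shell
# Print the type-and-permission string for a path, e.g. "drwx------".
# Only the first 10 characters of `ls -ld` are kept, so ACL or SELinux
# suffixes ("+", ".") are ignored.
perm_string() {
  ls -ld "$1" | cut -c1-10
}

# Succeed only if the path has exactly the expected permission string.
check_perms() {
  actual=$(perm_string "$1")
  if [ "$actual" = "$2" ]; then
    echo "OK   $1 ($actual)"
  else
    echo "FAIL $1 (expected $2, got $actual)"
    return 1
  fi
}

# Usage (paths and modes from Step 2):
#   check_perms /var/lib/opengslb drwx------
#   check_perms /etc/opengslb     drwxr-x---
```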
Step 3: Generate Secrets
# Generate gossip encryption key (save this securely!)
GOSSIP_KEY=$(openssl rand -base64 32)
echo "Gossip Key: $GOSSIP_KEY"
# Generate service tokens for each application
WEBAPP_TOKEN=$(openssl rand -base64 32)
API_TOKEN=$(openssl rand -base64 32)
echo "WebApp Token: $WEBAPP_TOKEN"
echo "API Token: $API_TOKEN"
Important: Store these secrets in a secure location (vault, secrets manager). You’ll need:
Gossip key: Shared between all Overwatches and agents
Service tokens: Shared with respective agent deployments
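Secrets are easily truncated when copied between terminals and secret stores. Since `openssl rand -base64 32` encodes 32 random bytes, a quick length check can catch a damaged value before it reaches a config file. This is a sketch; whether Overwatch itself enforces a 32-byte key is an assumption not confirmed by this runbook, and the helper names are hypothetical.

```shell
# Print the number of raw bytes encoded in a base64 string.
decoded_len() {
  printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# Succeed if the value decodes to exactly 32 bytes, which is what
# `openssl rand -base64 32` in Step 3 produces.
is_32_byte_secret() {
  [ "$(decoded_len "$1")" = "32" ]
}

# Usage:
#   is_32_byte_secret "$GOSSIP_KEY" && echo "gossip key length OK"
```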
Step 4: Set Up GeoIP Database (Optional)
For geolocation routing, download the MaxMind GeoLite2 database:
# Register at https://www.maxmind.com/en/geolite2/signup
# Download GeoLite2-Country database
# Place database in the correct location
sudo mv GeoLite2-Country.mmdb /var/lib/opengslb/geoip/
sudo chown opengslb:opengslb /var/lib/opengslb/geoip/GeoLite2-Country.mmdb
Step 5: Create Configuration File
sudo tee /etc/opengslb/overwatch.yaml << 'EOF'
mode: overwatch
overwatch:
  identity:
    node_id: overwatch-us-east-1
    region: us-east
  # Agent authentication tokens
  # REPLACE with your actual tokens
  agent_tokens:
    webapp: "YOUR_WEBAPP_TOKEN_HERE"
    api: "YOUR_API_TOKEN_HERE"
  gossip:
    bind_address: "0.0.0.0:7946"
    encryption_key: "YOUR_GOSSIP_KEY_HERE"
    probe_interval: 1s
    probe_timeout: 500ms
    gossip_interval: 200ms
  validation:
    enabled: true
    check_interval: 30s
    check_timeout: 5s
  stale:
    threshold: 30s
    remove_after: 5m
  dnssec:
    enabled: true
    algorithm: ECDSAP256SHA256
    key_sync:
      peers: [] # Add peer Overwatch URLs for HA
      poll_interval: 1h
      timeout: 30s
  # Geolocation configuration (optional)
  geolocation:
    database_path: "/var/lib/opengslb/geoip/GeoLite2-Country.mmdb"
    default_region: us-east
    ecs_enabled: true
    custom_mappings:
      - cidr: "10.0.0.0/8"
        region: us-east
        comment: "Internal networks default to us-east"
  data_dir: /var/lib/opengslb
# DNS server configuration
dns:
  listen_address: "0.0.0.0:53"
  default_ttl: 30
  return_last_healthy: false
  zones:
    - gslb.example.com
# Region definitions (for static backends or region mapping)
regions:
  - name: us-east
    countries: ["US", "CA", "MX"]
    continents: ["NA", "SA"]
    servers: [] # Populated dynamically from agents
  - name: eu-west
    countries: ["GB", "DE", "FR", "ES", "IT"]
    continents: ["EU"]
    servers: []
  - name: ap-southeast
    continents: ["AS", "OC"]
    servers: []
# Domain routing configuration
domains:
  - name: webapp.gslb.example.com
    routing_algorithm: geolocation
    regions:
      - us-east
      - eu-west
      - ap-southeast
    ttl: 30
  - name: api.gslb.example.com
    routing_algorithm: latency
    regions:
      - us-east
      - eu-west
    ttl: 15
    latency_config:
      smoothing_factor: 0.3
      max_latency_ms: 500
      min_samples: 3
logging:
  level: info
  format: json
metrics:
  enabled: true
  address: ":9090"
api:
  enabled: true
  address: "127.0.0.1:8080" # Localhost only by default for security
  allowed_networks:
    - 10.0.0.0/8
    - 192.168.0.0/16
    - 127.0.0.1/32
  trust_proxy_headers: false
EOF
# Set secure permissions
sudo chown root:opengslb /etc/opengslb/overwatch.yaml
sudo chmod 640 /etc/opengslb/overwatch.yaml
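The template contains `YOUR_..._HERE` placeholders for the tokens and the gossip key. A simple check like the sketch below can confirm they were all replaced before the service is started. `check_placeholders` is a hypothetical helper, not an OpenGSLB command.

```shell
# List any lines that still contain YOUR_..._HERE placeholders from the
# template. Returns non-zero when placeholders remain.
check_placeholders() {
  if grep -n 'YOUR_[A-Z_]*_HERE' "$1"; then
    echo "Replace the placeholders listed above before starting Overwatch" >&2
    return 1
  fi
  echo "No placeholders found in $1"
}

# Usage (the file is mode 640, so run as root or as a member of the
# opengslb group):
#   check_placeholders /etc/opengslb/overwatch.yaml
```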
Step 6: Create systemd Service
sudo tee /etc/systemd/system/opengslb-overwatch.service << 'EOF'
[Unit]
Description=OpenGSLB Overwatch
Documentation=https://opengslb.org/docs
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=opengslb
Group=opengslb
ExecStart=/usr/local/bin/opengslb --config=/etc/opengslb/overwatch.yaml
ExecReload=/bin/kill -SIGHUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
# Required for binding to port 53
AmbientCapabilities=CAP_NET_BIND_SERVICE
# Security hardening
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
ReadWritePaths=/var/lib/opengslb
# Environment
Environment="GOMAXPROCS=4"
[Install]
WantedBy=multi-user.target
EOF
Step 7: Allow DNS Port Binding
To bind to port 53 as a non-root user, use one of the following (only one is needed):
# Option 1: Using capabilities (recommended)
sudo setcap 'cap_net_bind_service=+ep' /usr/local/bin/opengslb
# Option 2: Use systemd AmbientCapabilities (already in service file above)
Step 8: Start Overwatch
# Reload systemd
sudo systemctl daemon-reload
# Enable and start Overwatch
sudo systemctl enable opengslb-overwatch
sudo systemctl start opengslb-overwatch
# Check status
sudo systemctl status opengslb-overwatch
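`systemctl start` returns as soon as the process is launched, which may be before Overwatch is ready to answer queries. In automation, a small retry loop can bridge that gap. The sketch below defines a generic `wait_for` helper (a hypothetical name); the usage example polls the readiness endpoint used later in this runbook and assumes it returns a non-2xx status until the node is ready.

```shell
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the command quietly; stop at the first success.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: poll the readiness endpoint for up to about 30 seconds.
#   wait_for 30 1 curl -sf http://localhost:8080/api/v1/ready \
#     && echo "Overwatch is ready" \
#     || echo "Overwatch did not become ready in time"
```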
DNS Integration Patterns
Pattern 1: Direct Client Resolution
Configure clients to use Overwatch directly:
# Client /etc/resolv.conf
nameserver 10.0.1.53 # Overwatch 1
nameserver 10.0.1.54 # Overwatch 2
nameserver 10.0.1.55 # Overwatch 3
options timeout:2 attempts:3
Pattern 2: BIND Conditional Forwarding
# named.conf
zone "gslb.example.com" {
    type forward;
    forward only;
    forwarders {
        10.0.1.53;
        10.0.1.54;
        10.0.1.55;
    };
};
Pattern 3: Unbound Stub Zone
# unbound.conf
stub-zone:
    name: "gslb.example.com"
    stub-addr: 10.0.1.53
    stub-addr: 10.0.1.54
    stub-addr: 10.0.1.55
Pattern 4: Parent Zone Delegation
In your parent zone (e.g., example.com):
; NS records for delegation
gslb IN NS ns1.gslb.example.com.
gslb IN NS ns2.gslb.example.com.
gslb IN NS ns3.gslb.example.com.
; Glue records
ns1.gslb IN A 10.0.1.53
ns2.gslb IN A 10.0.1.54
ns3.gslb IN A 10.0.1.55
; DS record for DNSSEC (get from Overwatch API)
gslb IN DS 12345 13 2 abc123...
DNSSEC Setup
DNSSEC is enabled by default. After starting Overwatch, retrieve the DS records and add them to the parent zone.
Get DS Records for Parent Zone
# Using CLI
opengslb-cli dnssec ds --zone gslb.example.com --api http://localhost:8080
# Using curl
curl http://localhost:8080/api/v1/dnssec/ds | jq .
Output:
{
  "enabled": true,
  "ds_records": [
    {
      "zone": "gslb.example.com.",
      "key_tag": 12345,
      "algorithm": 13,
      "digest_type": 2,
      "digest": "abc123def456...",
      "ds_record": "gslb.example.com. IN DS 12345 13 2 abc123def456..."
    }
  ]
}
Add the DS record to your parent zone to establish the DNSSEC chain of trust.
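Before pasting a DS record into the parent zone, a quick format check can catch copy errors. The sketch below assumes the `ds_record` layout shown in the output above (no TTL field) and the configured algorithm: algorithm 13 is ECDSAP256SHA256, and digest type 2 is SHA-256, whose digest is 32 bytes, or 64 hexadecimal characters. `check_ds_record` is a hypothetical helper.

```shell
# Sanity-check a DS record line of the form:
#   <owner> IN DS <key_tag> <algorithm> <digest_type> <digest>
# Expects algorithm 13 (ECDSAP256SHA256) and digest type 2 (SHA-256).
check_ds_record() {
  echo "$1" | awk '
    $2 != "IN" || $3 != "DS" { print "not a DS record"; exit 1 }
    $4 !~ /^[0-9]+$/         { print "key tag is not numeric"; exit 1 }
    $5 != 13                 { print "unexpected algorithm: " $5; exit 1 }
    $6 != 2                  { print "unexpected digest type: " $6; exit 1 }
    length($7) != 64 || $7 !~ /^[0-9A-Fa-f]+$/ {
      print "digest is not 64 hex characters"; exit 1
    }
    { print "DS record looks well-formed" }'
}

# Usage (jq path taken from the API output shown above):
#   check_ds_record "$(curl -s http://localhost:8080/api/v1/dnssec/ds \
#     | jq -r '.ds_records[0].ds_record')"
```

The check covers only the shape of the record; it does not verify that the digest matches the zone's key.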
DNSSEC Key Synchronization
For multiple Overwatches, configure key sync:
dnssec:
  enabled: true
  key_sync:
    peers:
      - "https://overwatch-2.internal:9090"
      - "https://overwatch-3.internal:9090"
    poll_interval: 1h
    timeout: 30s
API Security Configuration
Network Restrictions
api:
  enabled: true
  address: ":8080" # All interfaces; access is limited by allowed_networks
  allowed_networks:
    - 10.0.0.0/8 # Internal network
    - 192.168.0.0/16 # VPN/corporate
    - 127.0.0.1/32 # Localhost
  trust_proxy_headers: false
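Each `allowed_networks` entry is a CIDR range: a client address matches when its leading prefix bits equal those of the network. The sketch below illustrates that rule for IPv4 so you can predict whether a given client will be admitted. It is not Overwatch's implementation, and `ip_to_int` and `ip_in_cidr` are hypothetical helper names.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if IPv4 address $1 falls inside CIDR range $2 (e.g. 10.0.0.0/8).
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Netmask with the top "bits" bits set, truncated to 32 bits.
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

ip_in_cidr 10.20.30.40 10.0.0.0/8 && echo "10.20.30.40 is allowed by 10.0.0.0/8"
ip_in_cidr 172.16.0.1 10.0.0.0/8 || echo "172.16.0.1 is not in 10.0.0.0/8"
```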
Behind a Load Balancer
If the API is behind a reverse proxy:
api:
  trust_proxy_headers: true
  allowed_networks:
    - 10.0.0.0/8
The proxy must set the X-Forwarded-For header.
Metrics and Monitoring
Prometheus Configuration
# prometheus.yml
scrape_configs:
  - job_name: 'opengslb-overwatch'
    static_configs:
      - targets:
          - 'overwatch-1.internal:9090'
          - 'overwatch-2.internal:9090'
          - 'overwatch-3.internal:9090'
    scrape_interval: 15s
Key Metrics to Monitor
# DNS query rate
rate(opengslb_dns_queries_total[5m])
# DNS error rate
sum(rate(opengslb_dns_queries_total{status!="success"}[5m])) / sum(rate(opengslb_dns_queries_total[5m]))
# Healthy backends
opengslb_overwatch_backends_healthy
# Stale agents
opengslb_overwatch_stale_agents
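For a quick spot check without Prometheus, the error ratio can be approximated from a single scrape of the metrics endpoint. The sketch below assumes every `opengslb_dns_queries_total` sample carries a `status` label (as the error-rate query above implies) and has no trailing timestamp. It reports the cumulative ratio since the process started, not a 5-minute rate. `dns_error_ratio` is a hypothetical helper.

```shell
# Read Prometheus text-format metrics on stdin and print the fraction of
# DNS queries whose status label is not "success".
dns_error_ratio() {
  awk '
    /^opengslb_dns_queries_total[{ ]/ {
      total += $NF                                   # sample value is the last field
      if ($0 !~ /status="success"/) errors += $NF
    }
    END {
      if (total > 0) printf "%.4f\n", errors / total
      else print "n/a"
    }'
}

# Usage:
#   curl -s http://localhost:9090/metrics | dns_error_ratio
```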
Alert Examples
groups:
  - name: opengslb-overwatch
    rules:
      - alert: OpenGSLBLowHealthyBackends
        expr: opengslb_overwatch_backends_healthy < 2
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Fewer than 2 healthy backends"
      - alert: OpenGSLBStaleAgents
        expr: opengslb_overwatch_stale_agents > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Agents are stale"
Verification Steps
1. Check Service Status
sudo systemctl status opengslb-overwatch
2. Verify DNS is Responding
# Query Overwatch directly
dig @localhost webapp.gslb.example.com +short
# Query with DNSSEC records requested (look for RRSIG in the answer)
dig @localhost webapp.gslb.example.com +dnssec
3. Check API is Accessible
# Health check
curl http://localhost:8080/api/v1/live
# Readiness check
curl http://localhost:8080/api/v1/ready
# List backends
curl http://localhost:8080/api/v1/overwatch/backends | jq .
4. Check Metrics Endpoint
curl http://localhost:9090/metrics | grep opengslb
5. Verify Gossip is Listening
ss -tulnp | grep 7946
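The Network Requirements table lists gossip on both TCP and UDP, and a plain `grep 7946` passes if either one is present. The sketch below checks each protocol separately by parsing `ss -tuln` output; it assumes the usual `ss` column layout (protocol in column 1, local address in column 5). `listening_on` is a hypothetical helper.

```shell
# Read `ss -tuln` output on stdin and succeed if something is listening on
# the given port for the given protocol ("tcp" or "udp").
listening_on() {
  awk -v proto="$1" -v port="$2" '
    $1 == proto && $5 ~ (":" port "$") { found = 1 }
    END { exit(found ? 0 : 1) }'
}

# Usage: gossip needs both protocols on 7946.
#   ss -tuln | listening_on tcp 7946 && echo "gossip TCP: listening"
#   ss -tuln | listening_on udp 7946 && echo "gossip UDP: listening"
```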
Smoke Tests
Run these after deployment to verify functionality:
#!/bin/bash
# smoke-test.sh
OVERWATCH="localhost"
DNS_PORT="53"
API_PORT="8080"
METRICS_PORT="9090"
DOMAIN="webapp.gslb.example.com"
# Tracks whether any test failed so the script can return a useful exit code
FAILED=0
echo "=== OpenGSLB Overwatch Smoke Test ==="
# Test 1: DNS query returns at least one record
echo -n "DNS Query: "
if dig "@${OVERWATCH}" -p "${DNS_PORT}" "${DOMAIN}" +short | grep -q "."; then
  echo "PASS"
else
  echo "FAIL"
  FAILED=1
fi
# Test 2: API liveness
echo -n "API Liveness: "
if curl -s "http://${OVERWATCH}:${API_PORT}/api/v1/live" | grep -q "alive"; then
  echo "PASS"
else
  echo "FAIL"
  FAILED=1
fi
# Test 3: API readiness
echo -n "API Readiness: "
if curl -s "http://${OVERWATCH}:${API_PORT}/api/v1/ready" | grep -q "ready"; then
  echo "PASS"
else
  echo "FAIL"
  FAILED=1
fi
# Test 4: DNSSEC (a signed answer includes RRSIG records)
echo -n "DNSSEC: "
if dig "@${OVERWATCH}" -p "${DNS_PORT}" "${DOMAIN}" +dnssec | grep -q "RRSIG"; then
  echo "PASS"
else
  echo "FAIL (no RRSIG in response; check that DNSSEC is enabled)"
  FAILED=1
fi
# Test 5: Metrics
echo -n "Metrics: "
if curl -s "http://${OVERWATCH}:${METRICS_PORT}/metrics" | grep -q "opengslb_dns_queries_total"; then
  echo "PASS"
else
  echo "FAIL"
  FAILED=1
fi
echo "=== Smoke Test Complete ==="
# Exit non-zero if any test failed
exit "${FAILED}"
Troubleshooting
DNS Not Resolving
Check Overwatch is listening:
ss -tulnp | grep :53
Check for port conflicts:
sudo lsof -i :53
# May need to disable systemd-resolved
sudo systemctl stop systemd-resolved
Test directly:
dig @127.0.0.1 webapp.gslb.example.com
Agents Not Registering
Check gossip is listening:
ss -tulnp | grep 7946
Verify encryption key:
Must match between Overwatch and agents
Check agent tokens:
Tokens in agent_tokens must match the agent configuration
API Not Accessible
Check binding:
ss -tulnp | grep 8080
Check allowed networks:
Your IP must be in one of the allowed_networks CIDR ranges
Check firewall:
sudo iptables -L -n | grep 8080
DNSSEC Issues
Verify keys exist:
curl http://localhost:8080/api/v1/dnssec/status | jq .
Check DS record in parent:
dig DS gslb.example.com +trace
Configuration Reference
See Configuration Reference for complete configuration options.