# Grafana + Prometheus Stack: Complete Docker Setup

## What Is the Grafana + Prometheus Stack?
Grafana and Prometheus together form the most popular open-source monitoring stack in the self-hosting world. Prometheus scrapes and stores time-series metrics. Grafana visualizes them with dashboards. Add node_exporter for host metrics, cAdvisor for Docker container metrics, and Alertmanager for notifications — and you have enterprise-grade monitoring for free.
*Updated March 2026: Verified with the latest Docker images and configurations.*
This guide deploys the complete stack with a single `docker compose up -d`.
## Prerequisites
- A Linux server (Ubuntu 22.04+ recommended)
- Docker and Docker Compose installed (guide)
- 2 GB of free RAM (minimum for all 5 services)
- 10 GB of free disk space (metrics storage)
- Basic understanding of Docker networking
## Architecture Overview
| Component | Role | Port |
|---|---|---|
| Prometheus | Metrics collection and storage | 9090 |
| Grafana | Dashboard visualization | 3000 |
| node_exporter | Host system metrics (CPU, RAM, disk) | 9100 |
| cAdvisor | Docker container metrics | 8080 |
| Alertmanager | Alert routing and notifications | 9093 |
Data flows in one direction: exporters → Prometheus → Grafana. Prometheus scrapes metrics from exporters at regular intervals and stores them. Grafana queries Prometheus to render dashboards. Alertmanager receives firing alerts from Prometheus and sends notifications.
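To make the flow concrete, you can issue the same kind of HTTP query Grafana sends to Prometheus yourself (a sketch, assuming the stack from this guide is running and port 9090 is published on localhost):

```shell
# Query Prometheus's HTTP API directly, just as Grafana does.
# `up` is a built-in metric: 1 if a target's last scrape succeeded, 0 otherwise.
curl -s 'http://localhost:9090/api/v1/query?query=up'
```

Each exporter Prometheus scrapes appears as one `up` series in the JSON response.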
## Docker Compose Configuration

Create a project directory with the following structure:

```bash
mkdir -p monitoring/{prometheus,grafana/provisioning/datasources,alertmanager}
cd monitoring
```

Create `docker-compose.yml`:
```yaml
services:
  prometheus:
    image: prom/prometheus:v3.10.0
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./prometheus/rules.yml:/etc/prometheus/rules.yml:ro
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    networks:
      - monitoring
    depends_on:
      - node-exporter
      - cadvisor

  grafana:
    image: grafana/grafana:12.4.1
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
    environment:
      GF_SECURITY_ADMIN_USER: admin  # Default admin username
      GF_SECURITY_ADMIN_PASSWORD: changeme  # CHANGE THIS
      GF_SERVER_ROOT_URL: http://localhost:3000  # Set to your domain if using a reverse proxy
      GF_INSTALL_PLUGINS: grafana-clock-panel  # Optional plugins
    networks:
      - monitoring
    depends_on:
      - prometheus

  node-exporter:
    image: prom/node-exporter:v1.10.2
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    networks:
      - monitoring

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.51.0
    container_name: cadvisor
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    networks:
      - monitoring

  alertmanager:
    image: prom/alertmanager:v0.31.1
    container_name: alertmanager
    restart: unless-stopped
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
      - alertmanager-data:/alertmanager
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/alertmanager'
    networks:
      - monitoring

networks:
  monitoring:
    driver: bridge

volumes:
  prometheus-data:
  grafana-data:
  alertmanager-data:
```
## Prometheus Configuration

Create `prometheus/prometheus.yml`:
```yaml
global:
  scrape_interval: 15s  # How often to scrape targets
  evaluation_interval: 15s  # How often to evaluate alert rules
  scrape_timeout: 10s  # Timeout per scrape

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

rule_files:
  - rules.yml

scrape_configs:
  # Monitor Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['prometheus:9090']

  # Host system metrics via node_exporter
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  # Docker container metrics via cAdvisor
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # Grafana metrics
  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000']
```
Create `prometheus/rules.yml` for alerting rules:
```yaml
groups:
  - name: system-alerts
    rules:
      # Alert when CPU usage exceeds 80% for 5 minutes
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 80% for more than 5 minutes."

      # Alert when available memory drops below 15%
      - alert: LowMemory
        expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100) < 15
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low memory on {{ $labels.instance }}"
          description: "Available memory is below 15%."

      # Alert when disk usage exceeds 85%
      - alert: HighDiskUsage
        expr: (1 - node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"}) * 100 > 85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk almost full on {{ $labels.instance }}"
          description: "Disk usage exceeds 85% on {{ $labels.mountpoint }}."

      # Alert when cAdvisor has not seen a container for over a minute.
      # (absent() would only fire if *no* containers existed at all, and it
      # carries no labels, so the per-container template below would be empty.)
      - alert: ContainerDown
        expr: time() - container_last_seen{name!=""} > 60
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.name }} is down"

  - name: service-alerts
    rules:
      # Alert when a Prometheus target is down
      - alert: TargetDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} is down"
          description: "{{ $labels.job }} target {{ $labels.instance }} has been unreachable for 2 minutes."
```
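Before reloading, it is worth validating the rule file. The Prometheus image bundles `promtool`, so you can run the check inside the container (a sketch, assuming the container name and mount path from this guide):

```shell
# Validate alerting-rule syntax with promtool (ships in the Prometheus image)
docker compose exec prometheus promtool check rules /etc/prometheus/rules.yml
```

A syntax error here is reported per rule group, which is much easier to debug than a failed reload.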
## Alertmanager Configuration

Create `alertmanager/alertmanager.yml`:
```yaml
global:
  resolve_timeout: 5m
  # Uncomment and configure for email alerts:
  # smtp_smarthost: 'smtp.gmail.com:587'
  # smtp_from: '[email protected]'
  # smtp_auth_username: '[email protected]'
  # smtp_auth_password: 'app-specific-password'
  # smtp_require_tls: true

templates: []

route:
  receiver: 'default'
  group_by: ['alertname', 'instance']
  group_wait: 30s  # Wait before sending the first notification
  group_interval: 5m  # Interval between grouped notifications
  repeat_interval: 4h  # Resend if the alert is still firing

receivers:
  - name: 'default'
    # Email notifications (uncomment the global SMTP settings above):
    # email_configs:
    #   - to: '[email protected]'
    #     send_resolved: true
    # Slack notifications:
    # slack_configs:
    #   - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    #     channel: '#alerts'
    #     send_resolved: true
    #     title: '{{ .GroupLabels.alertname }}'
    #     text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
    # Discord notifications (via webhook):
    # discord_configs:
    #   - webhook_url: 'https://discord.com/api/webhooks/YOUR/WEBHOOK'
    #     send_resolved: true
```
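As with the Prometheus rules, you can validate this file before (re)starting: the `prom/alertmanager` image includes `amtool` (a sketch, assuming the container name and mount path from this guide):

```shell
# Validate the Alertmanager config (amtool ships in the prom/alertmanager image)
docker compose exec alertmanager amtool check-config /etc/alertmanager/alertmanager.yml
```

This catches indentation mistakes in the commented-out receiver blocks before they take your notifications down silently.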
## Grafana Datasource Provisioning

Create `grafana/provisioning/datasources/prometheus.yml`:
```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false
```
This automatically configures Prometheus as the default datasource when Grafana starts — no manual setup needed.
## Start the Stack

```bash
docker compose up -d
```

Verify all services are running:

```bash
docker compose ps
```
| Service | URL | Purpose |
|---|---|---|
| Prometheus | http://your-server:9090 | Query metrics, check targets |
| Grafana | http://your-server:3000 | Dashboards |
| node_exporter | http://your-server:9100/metrics | Raw host metrics |
| cAdvisor | http://your-server:8080 | Container metrics UI |
| Alertmanager | http://your-server:9093 | Alert status and silences |
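Beyond `docker compose ps`, each service exposes a health endpoint you can probe from the host (a sketch, assuming the ports published in the compose file above):

```shell
# Liveness checks for each service in the stack
curl -s http://localhost:9090/-/healthy     # Prometheus
curl -s http://localhost:9093/-/healthy     # Alertmanager
curl -s http://localhost:3000/api/health    # Grafana (returns JSON)
curl -s http://localhost:8080/healthz       # cAdvisor
curl -s http://localhost:9100/metrics | head -n 5   # node_exporter raw metrics
```

These are also handy targets for an external uptime monitor once the stack is in production.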
## Initial Grafana Setup
1. Open Grafana at `http://your-server:3000`
2. Log in with `admin` / `changeme` (change the password on first login)
3. The Prometheus datasource is already configured via provisioning
4. Import community dashboards:
   - Go to **Dashboards → Import**
   - Enter dashboard ID `1860` for "Node Exporter Full" (host metrics)
   - Enter dashboard ID `193` for "Docker Monitoring" (container metrics)
   - Select "Prometheus" as the datasource
These two dashboards give you immediate visibility into host system health and Docker container resource usage.
## Adding More Scrape Targets

To monitor additional services that expose Prometheus metrics, add entries to `prometheus/prometheus.yml`:
```yaml
scrape_configs:
  # ... existing configs ...

  # Example: Monitor your Nextcloud instance
  - job_name: 'nextcloud'
    static_configs:
      - targets: ['nextcloud:9090']
    metrics_path: '/ocs/v2.php/apps/serverinfo/api/v1/info'
    params:
      format: ['prometheus']

  # Example: Monitor another host via node_exporter
  - job_name: 'remote-server'
    static_configs:
      - targets: ['192.168.1.100:9100']
```
After editing, reload Prometheus without restarting it (this works because the compose file starts Prometheus with `--web.enable-lifecycle`):

```bash
curl -X POST http://localhost:9090/-/reload
```
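A reload with a broken config fails silently from curl's point of view, so it's worth checking the file first with `promtool` (a sketch, assuming the container name and mount path from this guide):

```shell
# Catch syntax errors before asking Prometheus to reload
docker compose exec prometheus promtool check config /etc/prometheus/prometheus.yml
```

If the check fails, Prometheus keeps running on its last good config, and the error appears in `docker compose logs prometheus`.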
## Data Retention and Storage
| Setting | Default | Recommended |
|---|---|---|
| Retention time | 15 days | 30-90 days |
| Retention size | Unlimited | Set based on disk capacity |
| Disk usage per day | ~50-200 MB | Depends on scrape targets and interval |
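The per-day figure can be sanity-checked with a back-of-the-envelope calculation: Prometheus typically compresses samples to about 1-2 bytes each, so daily disk use is roughly active series × samples per day per series × bytes per sample. The inputs below are illustrative assumptions, not measurements from this stack:

```shell
# Rough TSDB disk estimate -- all three inputs are assumptions to adjust
SERIES=3000          # active time series across all scrape targets
INTERVAL=15          # scrape interval in seconds
BYTES_PER_SAMPLE=2   # ~1-2 bytes per sample after compression
BYTES_PER_DAY=$(( SERIES * (86400 / INTERVAL) * BYTES_PER_SAMPLE ))
echo "$(( BYTES_PER_DAY / 1024 / 1024 )) MiB/day"   # prints "32 MiB/day" with these inputs
```

Plug in your own series count (visible on Prometheus's `/tsdb-status` page) to size retention for your disk.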
Adjust retention in the Prometheus `command` section:

```yaml
    command:
      - '--storage.tsdb.retention.time=90d'   # Keep 90 days
      - '--storage.tsdb.retention.size=10GB'  # Or cap at 10 GB
```
## Common Mistakes
- **Not exposing host filesystems to node_exporter.** Without `/proc`, `/sys`, and `/` mounted read-only, node_exporter reports container metrics instead of host metrics.
- **Using `localhost` in scrape configs.** Inside Docker, services reference each other by container name (`prometheus`, `grafana`), not `localhost`.
- **Forgetting to reload Prometheus after config changes.** Use `curl -X POST http://localhost:9090/-/reload` or restart the container.
- **Setting scrape intervals too low.** A 5-second interval generates massive data volumes. Start with 15 seconds and lower only if needed.
- **Not setting retention limits.** Prometheus defaults to only 15 days of `retention.time`; if you raise it without a `retention.size` cap, Prometheus can fill your disk.
## Resource Requirements
| Service | RAM (idle) | RAM (load) | CPU |
|---|---|---|---|
| Prometheus | 200 MB | 500 MB+ | Low-Medium |
| Grafana | 100 MB | 300 MB | Low |
| node_exporter | 10 MB | 20 MB | Very Low |
| cAdvisor | 50 MB | 100 MB | Low |
| Alertmanager | 30 MB | 50 MB | Very Low |
| Total | ~400 MB | ~1 GB | Low |
## Next Steps
- Add Loki for log aggregation alongside metrics
- Set up Grafana alerts for unified alerting
- Monitor remote servers by running node_exporter on each host
- Create custom dashboards for your specific self-hosted applications