How to Self-Host Prometheus with Docker Compose
What Is Prometheus?
Prometheus is an open-source monitoring and alerting toolkit. It scrapes metrics from configured targets at regular intervals, stores them in a time-series database, and provides a powerful query language (PromQL) for analysis. Prometheus is the backbone of most self-hosted monitoring stacks — it collects the data that Grafana visualizes. It replaces the data collection layer of paid tools like Datadog, New Relic, and Pingdom.
Prerequisites
- A Linux server (Ubuntu 22.04+ recommended)
- Docker and Docker Compose installed
- 1 GB of RAM minimum (2 GB+ recommended for many scrape targets)
- Services or exporters to monitor
Docker Compose Configuration
Create a docker-compose.yml file:
services:
  prometheus:
    image: prom/prometheus:v3.9.1
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./alert-rules.yml:/etc/prometheus/alert-rules.yml:ro
      - prometheus-data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=30d"
      - "--web.enable-lifecycle"
    networks:
      - monitoring

  node-exporter:
    image: prom/node-exporter:v1.9.1
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"
      - "--path.sysfs=/host/sys"
      - "--path.rootfs=/rootfs"
      - "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
    networks:
      - monitoring

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.52.1
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    networks:
      - monitoring

volumes:
  prometheus-data:

networks:
  monitoring:
    driver: bridge
Create a prometheus.yml configuration file:
global:
  scrape_interval: 15s      # How often to scrape targets
  evaluation_interval: 15s  # How often to evaluate alerting rules

rule_files:
  - "alert-rules.yml"

scrape_configs:
  # Monitor Prometheus itself
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  # Monitor host system via Node Exporter
  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]

  # Monitor Docker containers via cAdvisor
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]
Create a basic alert-rules.yml:
groups:
  - name: host-alerts
    rules:
      - alert: HighCPU
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"

      - alert: HighMemory
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: DiskAlmostFull
        expr: (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 85
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Disk almost full on {{ $labels.instance }}"

      - alert: ContainerDown
        # Fires per container once cAdvisor has not seen it for more than 60 seconds.
        # absent() with a regex matcher carries no labels, so it could not name the container
        # and would only fire when no containers are reported at all.
        expr: time() - container_last_seen{name=~".+"} > 60
        for: 0m  # the 60-second threshold already acts as the delay
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.name }} is down"
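To make the thresholds concrete, here is a minimal Python sketch of the arithmetic behind the HighCPU and HighMemory expressions. The function names and sample values are illustrative assumptions, not output from a real host.

```python
def cpu_busy_percent(idle_rates):
    """Mirror 100 - avg(rate(idle)) * 100 for one instance.

    idle_rates: per-core idle rates, i.e. the fraction of each second spent idle.
    """
    return 100 - (sum(idle_rates) / len(idle_rates)) * 100


def memory_used_percent(mem_available_bytes, mem_total_bytes):
    """Mirror (1 - MemAvailable / MemTotal) * 100."""
    return (1 - mem_available_bytes / mem_total_bytes) * 100


# Hypothetical 4-core host whose cores are idle 5% of the time on average.
cpu = cpu_busy_percent([0.04, 0.06, 0.05, 0.05])
# Hypothetical host with 8 GiB total memory and 1 GiB available.
mem = memory_used_percent(1 * 1024**3, 8 * 1024**3)

print(f"CPU busy: {cpu:.1f}% (HighCPU threshold is 90%)")
print(f"Memory used: {mem:.1f}% (HighMemory threshold is 90%)")
```

With these sample numbers the CPU value is above its threshold and the memory value is below it. Either alert still only fires once the condition has held for the whole "for" duration.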
Start the stack:
docker compose up -d
Initial Setup
- Open the Prometheus UI at http://your-server-ip:9090
- Go to Status > Targets. All scrape targets should show as “UP”
- Test a query in the Graph tab:
  - up: shows which targets are reachable
  - node_memory_MemAvailable_bytes: available memory
  - rate(node_cpu_seconds_total{mode="idle"}[5m]): CPU idle rate
Prometheus is a data backend — for dashboards, connect it to Grafana.
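The same queries can also be run outside the UI through the Prometheus HTTP API (/api/v1/query). The sketch below is a minimal example using only the Python standard library. The helper names are hypothetical, and the sample payload is hand-written in the shape the API returns, so the block runs without a live server.

```python
import json
import urllib.parse
import urllib.request


def instant_query(base_url, promql):
    """Run an instant query against the Prometheus HTTP API and return the parsed JSON."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_up(response):
    """Map each 'job/instance' pair to True (reachable) or False from an `up` query result."""
    status = {}
    for series in response["data"]["result"]:
        labels = series["metric"]
        # Each instant-vector sample is [unix_timestamp, "value-as-string"].
        _, value = series["value"]
        status[f'{labels.get("job")}/{labels.get("instance")}'] = value == "1"
    return status


# Hand-written sample response (an assumption for illustration, not real output).
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
             "value": [1700000000, "1"]},
            {"metric": {"__name__": "up", "job": "node", "instance": "node-exporter:9100"},
             "value": [1700000000, "0"]},
        ],
    },
}

print(summarize_up(sample))
```

With the stack running, summarize_up(instant_query("http://localhost:9090", "up")) should give the same kind of mapping for your real targets.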
Configuration
Adding Scrape Targets
Add any service that exposes a /metrics endpoint:
scrape_configs:
  # Custom application
  - job_name: "my-app"
    static_configs:
      - targets: ["my-app:8080"]
    metrics_path: "/metrics"  # default, can be changed

  # Multiple targets for the same job
  - job_name: "web-servers"
    static_configs:
      - targets:
          - "server1:9100"
          - "server2:9100"
          - "server3:9100"
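For a custom application, "exposing a /metrics endpoint" means serving plain text in the Prometheus exposition format. The following is a minimal sketch using only the Python standard library. The metric name myapp_queue_depth, its label, and the helper names are made up for illustration. A real application would more commonly use the official prometheus_client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge in the Prometheus text exposition format.

    samples: list of (labels_dict, numeric_value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}" if label_str else f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve a single hypothetical gauge at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_gauge(
            "myapp_queue_depth", "Jobs waiting in the queue.", [({"queue": "default"}, 3)]
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Block forever, serving /metrics on the given port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()


print(render_gauge("myapp_queue_depth", "Jobs waiting in the queue.", [({"queue": "default"}, 3)]))
```

Calling serve() inside a container named my-app on the monitoring network would match the "my-app:8080" target above.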
After editing prometheus.yml, reload the config:
curl -X POST http://localhost:9090/-/reload
This works because --web.enable-lifecycle is enabled.
Common Exporters
| Exporter | What It Monitors | Image | Port |
|---|---|---|---|
| Node Exporter | Host CPU, memory, disk, network | prom/node-exporter | 9100 |
| cAdvisor | Docker containers | gcr.io/cadvisor/cadvisor | 8080 |
| Blackbox Exporter | HTTP/TCP/DNS/ICMP probes | prom/blackbox-exporter | 9115 |
| SNMP Exporter | Network devices | prom/snmp-exporter | 9116 |
| PostgreSQL Exporter | PostgreSQL databases | prometheuscommunity/postgres-exporter | 9187 |
Storage and Retention
Prometheus stores data on disk. Control how much with:
command:
  - "--storage.tsdb.retention.time=30d"   # Keep 30 days of data
  - "--storage.tsdb.retention.size=10GB"  # Or cap at 10 GB
You can set both — whichever limit is hit first triggers cleanup.
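To see which limit is likely to bind first, compare the time limit with how long the size cap lasts at your daily data growth. The sketch below is a rough illustration only. The function name and the 500 MiB per day growth figure are assumptions, the 10 GB cap is taken here as 10 × 1024³ bytes, and the model ignores details such as the write-ahead log.

```python
def effective_retention_days(time_limit_days, size_limit_bytes, bytes_per_day):
    """Return roughly how many days of data survive when both limits are set."""
    days_until_size_cap = size_limit_bytes / bytes_per_day
    # Cleanup is triggered by whichever limit is reached first.
    return min(time_limit_days, days_until_size_cap)


MIB = 1024**2
GIB = 1024**3

# Hypothetical instance that writes about 500 MiB of blocks per day.
days = effective_retention_days(
    time_limit_days=30,
    size_limit_bytes=10 * GIB,
    bytes_per_day=500 * MIB,
)
print(f"Effective retention: about {days:.1f} days")
```

In this example the size cap is reached before the 30-day limit, so the instance would keep less history than the time setting alone suggests.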
Alerting with Alertmanager
Prometheus evaluates alert rules but needs Alertmanager to route notifications. Add it to your stack:
  alertmanager:
    image: prom/alertmanager:v0.28.1
    container_name: alertmanager
    restart: unless-stopped
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    networks:
      - monitoring
Create alertmanager.yml:
route:
  receiver: "email"
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: "email"
    email_configs:
      - to: "[email protected]"
        from: "[email protected]"
        smarthost: "smtp.example.com:587"
        auth_username: "[email protected]"
        auth_password: "your-smtp-password"
Add Alertmanager to prometheus.yml:
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]
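To check that notifications are routed before a real alert fires, you can push a test alert straight to Alertmanager's v2 API (POST /api/v2/alerts). This is a minimal sketch using only the Python standard library. The helper names and the TestAlert label values are hypothetical, and it assumes Alertmanager is reachable on port 9093 as published above.

```python
import json
import urllib.request


def build_test_alert(alertname="TestAlert", severity="warning", summary="Manual test alert"):
    """Build the JSON body Alertmanager expects: a list of alerts with labels and annotations."""
    return [
        {
            "labels": {"alertname": alertname, "severity": severity},
            "annotations": {"summary": summary},
        }
    ]


def post_alerts(base_url, alerts):
    """Send alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


payload = build_test_alert()
print(json.dumps(payload, indent=2))
```

With the stack running, post_alerts("http://localhost:9093", payload) should queue the alert, and the email should arrive after group_wait has passed.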
Reverse Proxy
Behind Nginx Proxy Manager or Traefik, forward traffic to port 9090. Add --web.external-url=https://prometheus.example.com to the command to fix redirect URLs.
Security: Prometheus does not require authentication by default. Either enable basic auth through a web configuration file (--web.config.file), put it behind a reverse proxy with basic auth, or restrict access to your local network. Do not expose port 9090 publicly without authentication.
See Reverse Proxy Setup.
Backup
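Back up the Prometheus data volume. Compose prefixes named volumes with the project name (by default the directory name), so check the exact volume name with docker volume ls first:
docker compose stop prometheus
# "myproject" is a placeholder; use the volume name shown by docker volume ls
docker run --rm \
  -v myproject_prometheus-data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/prometheus-backup.tar.gz /data
docker compose start prometheus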
Also back up your configuration files (prometheus.yml, alert-rules.yml, alertmanager.yml).
See Backup Strategy.
Troubleshooting
Target shows as “DOWN”
Symptom: Targets page shows a red “DOWN” status.
Fix:
- Verify the target container is running: docker ps
- Check that containers are on the same Docker network
- Test connectivity: docker exec prometheus wget -qO- http://node-exporter:9100/metrics | head
- Ensure the target’s /metrics endpoint is accessible
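Prometheus also reports why a scrape failed through its targets API (/api/v1/targets), which can be quicker than reading the UI. The sketch below uses only the Python standard library. The helper names are hypothetical, and the sample payload is hand-written in the shape the API returns, so the block runs without a live server.

```python
import json
import urllib.request


def fetch_targets(base_url):
    """Fetch the current scrape targets from the Prometheus HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as resp:
        return json.load(resp)


def unhealthy_targets(targets_response):
    """Return (scrape_url, last_error) for every active target that is not 'up'."""
    problems = []
    for target in targets_response["data"]["activeTargets"]:
        if target["health"] != "up":
            problems.append((target["scrapeUrl"], target.get("lastError", "")))
    return problems


# Hand-written sample response (an assumption for illustration, not real output).
sample_targets = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://localhost:9090/metrics", "health": "up", "lastError": ""},
            {"scrapeUrl": "http://node-exporter:9100/metrics", "health": "down",
             "lastError": "connection refused"},
        ],
        "droppedTargets": [],
    },
}

for url, error in unhealthy_targets(sample_targets):
    print(f"{url}: {error}")
```

With the stack running, unhealthy_targets(fetch_targets("http://localhost:9090")) lists the failing targets and the last scrape error for each.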
“out of order sample” errors in logs
Symptom: Prometheus logs show “out of order sample” warnings.
Fix: This usually happens when two targets expose the same series or when the host clock jumps, for example after a large time-sync correction. Ensure NTP is configured and the clock is stable:
timedatectl status
High memory usage
Symptom: Prometheus uses several GB of RAM.
Fix: Memory scales with the number of active time series. Reduce by:
- Increasing scrape_interval (30s instead of 15s)
- Reducing the number of metrics collected (use metric_relabel_configs to drop unneeded metrics)
- Setting --storage.tsdb.retention.size to cap disk usage
Metrics missing after restart
Symptom: Historical data is gone after container restart.
Fix: Ensure prometheus-data volume is persisted. If using bind mounts, check permissions — Prometheus runs as UID 65534 (nobody).
Resource Requirements
- RAM: 500 MB for a few targets, 2-4 GB for 50+ targets with many metrics
- CPU: Low at rest, moderate during scrapes and queries
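- Disk: ~2 bytes per sample. Assuming roughly 150 series per target, 10 targets scraped every 15s for 30 days ≈ 500 MB. 100 targets ≈ 5 GB. Exporters that expose more series need proportionally more.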
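The disk figures follow from a simple product: retention time in seconds × ingested samples per second × bytes per sample. Here is a minimal sketch of that arithmetic. The function name is hypothetical, and the figure of 150 series per target is an assumption chosen to reproduce the numbers above. Real exporters often expose more, so measure your own series count.

```python
def estimate_disk_bytes(targets, series_per_target, scrape_interval_s,
                        retention_days, bytes_per_sample=2):
    """Estimate TSDB disk usage from ingestion rate and retention."""
    samples_per_second = targets * series_per_target / scrape_interval_s
    retention_seconds = retention_days * 86400
    return retention_seconds * samples_per_second * bytes_per_sample


small = estimate_disk_bytes(targets=10, series_per_target=150,
                            scrape_interval_s=15, retention_days=30)
large = estimate_disk_bytes(targets=100, series_per_target=150,
                            scrape_interval_s=15, retention_days=30)

print(f"10 targets:  {small / 1e6:.0f} MB")
print(f"100 targets: {large / 1e9:.1f} GB")
```

Doubling the scrape interval or halving the retention period halves the estimate, which is why those two settings are the usual levers when disk is tight.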
Frequently Asked Questions
Do I need Prometheus if I have Grafana?
Grafana is a visualization tool — it doesn’t collect data. You need a data source like Prometheus to feed it metrics. Prometheus scrapes and stores the data; Grafana displays it.
Prometheus vs InfluxDB — which should I use?
Prometheus is pull-based (it scrapes targets) and purpose-built for monitoring. InfluxDB is push-based (services send data to it) and more general-purpose. For infrastructure monitoring, Prometheus has a larger ecosystem of exporters and community dashboards. Use InfluxDB for IoT or custom application metrics.
How many targets can Prometheus handle?
A single Prometheus instance can comfortably scrape 500+ targets with 15s intervals. Beyond that, consider federation (multiple Prometheus instances) or Thanos/Cortex for long-term storage.
Verdict
Prometheus is the standard for self-hosted metrics collection. The exporter ecosystem covers everything — servers, containers, databases, network devices, and custom applications. PromQL is powerful once learned, and the community has built thousands of pre-made dashboards and alert rules. Pair it with Grafana for visualization and you have an enterprise-grade monitoring stack for free. The learning curve is steeper than simpler tools like Uptime Kuma or Netdata, but the flexibility and scalability are worth it.