Prometheus doesn’t "scrape" targets in any exotic sense: it simply sends HTTP requests to targets that expose their metrics at an endpoint, and parses whatever comes back.

Let’s see Prometheus in action. Imagine we have a simple web service running on 192.168.1.100:8080. This service exposes a /metrics endpoint. Prometheus, configured to know about this service, will periodically send an HTTP GET request to http://192.168.1.100:8080/metrics. The response is a plain text body containing metric names, their values, and associated labels.

Here’s a snippet of what that might look like:

# HELP http_requests_total Total number of HTTP requests received.
# TYPE http_requests_total counter
http_requests_total{method="POST",path="/users",status="201"} 150
http_requests_total{method="GET",path="/users",status="200"} 475
# HELP http_response_duration_seconds Duration of HTTP requests in seconds.
# TYPE http_response_duration_seconds histogram
http_response_duration_seconds_bucket{le="0.1",method="GET",path="/users"} 300
http_response_duration_seconds_bucket{le="0.5",method="GET",path="/users"} 450
http_response_duration_seconds_bucket{le="+Inf",method="GET",path="/users"} 475
http_response_duration_seconds_sum{method="GET",path="/users"} 120.5
http_response_duration_seconds_count{method="GET",path="/users"} 475
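To make the exposition format concrete, here is a minimal sketch in plain Python, with no Prometheus client library involved. The `render_counter` helper is hypothetical, written only to show how counter samples map onto HELP/TYPE/sample lines:

```python
def render_counter(name, help_text, samples):
    """Render counter samples in the Prometheus text exposition format.

    samples: list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Labels are rendered as key="value" pairs inside curly braces
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)

body = render_counter(
    "http_requests_total",
    "Total number of HTTP requests received.",
    [({"method": "GET", "path": "/users", "status": "200"}, 475)],
)
print(body)
```

In a real service you would not hand-roll this; official client libraries (prometheus_client for Python, client_golang for Go, and so on) maintain the counters and serve the /metrics endpoint for you.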

Prometheus stores these samples as time series, indexed by metric name and labels. The core of Prometheus’s power lies in three mechanisms built on that data: Scrape Configuration, Recording Rules, and Alerting Rules.

Your prometheus.yml file is the central hub. The scrape_configs section tells Prometheus what to scrape and how.

scrape_configs:
  - job_name: 'my-web-app'
    static_configs:
      - targets: ['192.168.1.100:8080']

This is the most basic setup. Prometheus will try to pull from 192.168.1.100:8080 at the global scrape interval; if you never set one, the built-in default is 1m, though the sample configuration shipped with Prometheus sets 15s. You can override this per job:

scrape_configs:
  - job_name: 'my-web-app'
    scrape_interval: 30s  # Scrape every 30 seconds
    static_configs:
      - targets: ['192.168.1.100:8080']
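A few other per-job options come up often; here is a sketch (the /internal/metrics path is a hypothetical non-default endpoint, not something your service necessarily exposes):

```yaml
scrape_configs:
  - job_name: 'my-web-app'
    scrape_interval: 30s
    scrape_timeout: 10s               # must not exceed scrape_interval
    metrics_path: /internal/metrics   # hypothetical; default is /metrics
    scheme: https                     # default is http
    static_configs:
      - targets: ['192.168.1.100:8080']
```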

For dynamic environments, you’ll use service discovery (e.g., Consul, Kubernetes, EC2). Prometheus queries these sources to automatically discover targets.
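As a taste of the simplest dynamic mechanism, file-based service discovery watches files on disk and reloads targets when they change. A sketch, where the targets/ path is an assumption:

```yaml
scrape_configs:
  - job_name: 'discovered-services'
    file_sd_configs:
      - files:
          - 'targets/*.json'   # Prometheus re-reads matching files on change
```

A matching targets file would contain entries like `[{"targets": ["192.168.1.100:8080"], "labels": {"env": "prod"}}]`, which lets a deployment tool add and remove targets without touching prometheus.yml.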

Recording Rules precompute and store frequently needed or computationally expensive metric aggregations. Instead of calculating sum(rate(http_requests_total[5m])) every time you need it, you record it.

In a separate file, say recording_rules.yml, typically included via rule_files in prometheus.yml:

groups:
  - name: http_rules
    rules:
      - record: http_requests_per_minute
        # rate() is always per-second, so scale by 60 for a per-minute figure
        expr: sum(rate(http_requests_total[1m])) * 60

This rule, when evaluated, creates a new time series named http_requests_per_minute that holds the calculated value. Note that rate() yields a per-second rate regardless of the range window, hence the * 60; larger setups also tend to follow the level:metric:operations naming convention (e.g. job:http_requests:rate1m). Either way, precomputing the value significantly speeds up dashboards and alerts that rely on it.
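For the rules to load at all, prometheus.yml must reference the file; the filename here matches the example above:

```yaml
rule_files:
  - recording_rules.yml
```

Rules are evaluated at the global evaluation_interval (configurable under `global:`), independently of how often the underlying targets are scraped.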

Alerting Rules define conditions that, when met, trigger alerts. These are also defined in .yml files.

groups:
  - name: http_alerts
    rules:
      - alert: HighErrorRate
        expr: sum by (instance) (rate(http_requests_total{status=~"5.."}[5m])) / sum by (instance) (rate(http_requests_total[5m])) * 100 > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High HTTP error rate detected on {{ $labels.instance }}"
          description: "{{ $value | printf \"%.2f\" }}% of requests in the last 5 minutes resulted in a 5xx error."

Here:

  • alert: HighErrorRate: The name of the alert.

  • expr: The PromQL expression. Alerting rules don’t evaluate to true or false; any time series the expression returns becomes a candidate alert (the comparison operator filters out series that don’t exceed the threshold, so no results means no alert). Note the sum by (instance): a bare sum() would drop the instance label that the annotations below rely on. This expression checks whether the percentage of 5xx errors over the last 5 minutes exceeds 5%.

  • for: 5m: The condition must be true for at least 5 minutes before the alert fires. This prevents flapping alerts for transient issues.

  • labels: Additional labels attached to the alert, useful for routing and grouping.

  • annotations: Human-readable information about the alert. {{ $labels.instance }} and {{ $value }} are template variables populated with data from the expression.

Prometheus itself doesn’t deliver notifications; it evaluates alerting rules and pushes any firing alerts to Alertmanager. The Alertmanager component then deduplicates, groups, and silences these alerts, and routes them to notification channels like Slack, PagerDuty, or email.

The relationship between Prometheus and Alertmanager is crucial: Prometheus generates alerts based on rules, and Alertmanager handles the delivery and management of those alerts.

When Prometheus evaluates an alert rule, it checks whether the expr returns any time series. If it does, the alert becomes "pending." If the condition remains true for the for duration, the alert transitions to "firing." Prometheus shows both states on its /alerts page and actively pushes firing alerts to Alertmanager’s HTTP API; Alertmanager does not scrape Prometheus. If Alertmanager receives an alert that isn’t silenced or inhibited, it groups it according to its configuration and sends notifications.

A common pitfall is forgetting to configure Alertmanager in prometheus.yml or misconfiguring the Alertmanager’s own configuration file, leading to alerts being generated but never reaching humans.
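A minimal sketch of that wiring in prometheus.yml, where the alertmanager:9093 address is an assumption about your deployment:

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```

Running `promtool check config prometheus.yml` before reloading catches most misconfigurations of this kind before they silently eat your alerts.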

The next thing you’ll typically wrestle with is understanding PromQL’s vector matching rules when combining metrics with different label sets.
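As a preview, dividing two aggregated series only works when their label sets line up, and on()/ignoring() control how left- and right-hand series are paired. A sketch using the metric from earlier (the by (path) grouping is illustrative):

```promql
# Error ratio per path: both sides are aggregated to the same label set,
# so on (path) matches the series pairwise.
  sum by (path) (rate(http_requests_total{status=~"5.."}[5m]))
/ on (path)
  sum by (path) (rate(http_requests_total[5m]))
```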
