Vector’s statsd source, which also accepts DogStatsD-style tagged metrics, lets you collect application metrics, but it is often mistaken for a simple UDP listener.

Here’s how you can use them to collect metrics from an application that emits them using the StatsD protocol. We’ll use a simple Python script to generate some metrics and then configure Vector to collect and process them.

First, let’s create a small Python script that will act as our application emitting metrics. Save this as generate_metrics.py:

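import random
import time

import statsd

# UDP client for the local StatsD listener; every metric name gets the my_app prefix.
client = statsd.StatsClient('localhost', 8125, prefix='my_app')

while True:
    # Count one simulated request.
    client.incr('requests.total')
    # Record a simulated latency in milliseconds.
    client.timing('request.latency', random.randint(50, 200))
    # Roughly 10% of requests are counted as errors.
    if random.random() < 0.1:
        client.incr('errors.count')
    time.sleep(0.1)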

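This script sends UDP datagrams to the default StatsD address (localhost:8125): a counter for total requests, a timing metric for request latency, and a counter for errors, all prefixed with my_app. Because UDP is connectionless, the script runs without error even when nothing is listening on that port.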
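To make the traffic concrete, here is a minimal sketch of the text lines such a client puts on the wire. The helper format_statsd_line is a hypothetical name used only for illustration, not part of the statsd library, and the library’s exact number formatting may differ (for example, it may send timings with decimal places).

```python
def format_statsd_line(name, value, metric_type, prefix=None):
    """Build one basic StatsD line: <name>:<value>|<type>.

    Illustrative helper only; not the statsd library's own code.
    """
    full_name = f"{prefix}.{name}" if prefix else name
    return f"{full_name}:{value}|{metric_type}"


# The three metrics emitted by the script, as they would roughly appear on the wire.
print(format_statsd_line("requests.total", 1, "c", prefix="my_app"))
print(format_statsd_line("request.latency", 123, "ms", prefix="my_app"))
print(format_statsd_line("errors.count", 1, "c", prefix="my_app"))
```

Each line is sent as the payload of a UDP datagram, which is why a StatsD listener needs nothing more than a socket and a line parser.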

Now, let’s configure Vector to receive these metrics. Create a vector.toml file with the following content:

[sources.statsd_server]
type = "statsd"
address = "0.0.0.0:8125"
mode = "udp"

[transforms.remap_metric_name]
type = "remap"
inputs = ["statsd_server"]
source = '''
# The statsd source already emits metric events; the metric name lives in .name.
.name = "app_metrics." + string!(.name)
'''

[sinks.console_output]
type = "console"
inputs = ["remap_metric_name"]
encoding.codec = "json"

This configuration does the following:

  • [sources.statsd_server]: Defines a statsd source named statsd_server. It listens on all interfaces (0.0.0.0) on port 8125 over UDP, selected with the mode option. The source parses each datagram itself and emits structured metric events, so no separate parsing transform is needed. It understands the DogStatsD tag extension as well as plain StatsD.
  • [transforms.remap_metric_name]: This remap transform demonstrates how you can manipulate the parsed metric data. Here, we’re prepending app_metrics. to every incoming metric name, which VRL exposes as .name. This is useful for organizing metrics from different sources.
  • [sinks.console_output]: A simple console sink that prints the processed metrics to standard output as JSON, allowing us to see them in action.

To run this:

  1. Save the Python script as generate_metrics.py.
  2. Save the Vector configuration as vector.toml.
  3. Install the statsd Python library: pip install statsd.
  4. Start the Vector agent. If you have Vector installed, you can run: vector --config vector.toml.
  5. In a separate terminal, run the Python script: python generate_metrics.py.

With the console sink’s JSON encoding, you should see one JSON object per metric in the Vector terminal, roughly like this (illustrative only; timestamps, field order, and some fields vary by Vector version):

{"name":"app_metrics.my_app.requests.total","timestamp":"2023-10-27T10:30:00.123Z","kind":"incremental","counter":{"value":1.0}}
{"name":"app_metrics.my_app.request.latency","timestamp":"2023-10-27T10:30:00.234Z","kind":"incremental","distribution":{"samples":[{"value":0.123,"rate":1}],"statistic":"histogram"}}
{"name":"app_metrics.my_app.errors.count","timestamp":"2023-10-27T10:30:01.567Z","kind":"incremental","counter":{"value":1.0}}

Notice that each name carries both the my_app prefix set by the client and the app_metrics. prefix added by the remap transform, and that Vector has identified the metric type: counters stay counters, while StatsD timers are typically represented as distributions, with millisecond values converted to seconds.

The surprising truth about Vector’s StatsD source is that it’s not just a passive listener; it actively parses the StatsD protocol into structured events, enabling rich transformations.

The statsd source in Vector is designed to be flexible. It listens on a specified address in a specified mode (UDP is most common for StatsD, but TCP and Unix sockets are also supported). When data arrives, the source itself parses it; there is no separate parsing transform to configure. The parser understands the StatsD format: metric_name:value|type|@sample_rate|#tag_key:tag_value. It distinguishes between counters, gauges, timers, histograms, and sets, and extracts any tags that are present.
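The format is simple enough to parse by hand. The following is a minimal sketch of that parsing step, written only to illustrate the line format; it is not Vector’s implementation, and the function name, the returned dictionary layout, and the example line are assumptions made for this example.

```python
# Map StatsD type codes to readable names (illustrative naming, not Vector's).
TYPE_NAMES = {
    "c": "counter",
    "g": "gauge",
    "ms": "timer",
    "h": "histogram",
    "d": "distribution",
    "s": "set",
}


def parse_statsd_line(line):
    """Parse name:value|type[|@rate][|#k:v,...] into a dict."""
    name, sep, rest = line.partition(":")
    parts = rest.split("|")
    if not name or not sep or len(parts) < 2:
        raise ValueError(f"not a StatsD line: {line!r}")

    raw_value, type_code = parts[0], parts[1]
    metric = {
        "name": name,
        # Set members are arbitrary strings; every other type is numeric.
        "value": raw_value if type_code == "s" else float(raw_value),
        "type": TYPE_NAMES.get(type_code, type_code),
        "sample_rate": 1.0,
        "tags": {},
    }

    # Optional sections: "@<rate>" is the sample rate, "#k:v,..." holds tags.
    for extra in parts[2:]:
        if extra.startswith("@"):
            metric["sample_rate"] = float(extra[1:])
        elif extra.startswith("#"):
            for tag in extra[1:].split(","):
                if tag:
                    key, _, val = tag.partition(":")
                    metric["tags"][key] = val
    return metric


print(parse_statsd_line("my_app.request.latency:123|ms|@0.5|#env:dev"))
```

A real parser also has to handle details this sketch ignores, such as signed gauge deltas and multiple metrics packed into one datagram.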

Consider this slightly more complex StatsD line sent from an application: user.login.failed:1|c|#service:auth,region:us-east-1.

When Vector receives this line, the statsd source emits a metric event with fields like:

  • name: user.login.failed
  • kind: incremental
  • counter.value: 1
  • tags: {"service": "auth", "region": "us-east-1"}
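To try that tagged line without any client library, you can send it as a raw UDP datagram. This is a small sketch that assumes a StatsD listener such as Vector is running on 127.0.0.1:8125; the helper name send_statsd_line is made up for this example.

```python
import socket


def send_statsd_line(line, host="127.0.0.1", port=8125):
    """Send a single StatsD line as one UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # The tagged counter from the example above.
    send_statsd_line("user.login.failed:1|c|#service:auth,region:us-east-1")
```

Because UDP gives no delivery feedback, the call succeeds even if nothing is listening, so check the listener’s output to confirm the metric arrived.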

This structured data is what makes Vector powerful. You can then route these metrics, aggregate them, filter them, or send them to various backends like Prometheus, InfluxDB, or even another StatsD server.

The statsd source itself needs little configuration beyond address and mode. The real power comes from the structured metric events it emits, which you can shape with subsequent transforms like remap, filter, or aggregate before they reach your chosen sink. For example, you might want to filter out metrics with a specific tag, or aggregate counters over a fixed interval to reduce volume.

Most people don’t realize that the statsd source can also ingest DogStatsD metrics, which extend the StatsD protocol with tagging support. The source handles this without extra configuration, extracting the tags provided by DogStatsD and exposing them in the tags field of each metric event.

The next step is usually to send these processed metrics to a dedicated time-series database or monitoring system.
