The most surprising thing about vector disk buffers is that they don’t actually guarantee event persistence across restarts in the way most people assume.

Let’s see it in action. Imagine you’ve got a vector agent running and shipping logs to a downstream service.

[sources.my_logs]
type = "file"
include = ["/var/log/my_app.log"]

[sinks.my_es]
type = "elasticsearch"
endpoint = "http://localhost:9200"
index_prefix = "my-app"
# This is the crucial part for disk buffering
[sinks.my_es.buffer]
type = "disk"
max_size = "1GB"
when_full = "block"

In this setup, vector will try to write events from my_logs to my_es. If my_es (Elasticsearch, in this case) is slow or unavailable, vector won’t just drop the events. Instead, it will spill them to disk. This is managed by the buffer.disk configuration.

Here’s the mental model: vector is a pipeline. Data flows from sources, through transforms (if any), to sinks. When a sink can’t keep up, or is temporarily down, the buffer acts as a temporary holding pen. The disk buffer specifically writes these events to a directory on your filesystem.

The type = "disk" tells vector to use the filesystem for buffering. max_size = "1GB" sets the maximum amount of disk space the buffer can consume. when_full = "block" means that if the buffer fills up, vector will pause ingesting new data from the source until space becomes available in the buffer. Other options for when_full include drop_newest, drop_oldest, and fail.

Now, about that "persistence across restarts" claim. When vector shuts down gracefully, it tries to flush its in-memory buffers to the disk buffer. Similarly, when it starts up, it tries to read from the disk buffer before processing new data. The key word here is "tries."

The disk buffer is structured as a series of files, often with a journal-like append-only mechanism. When vector starts, it scans this directory. It reconstructs its state based on the files it finds. If a file is corrupted, or if the shutdown wasn’t clean (e.g., a hard power off), vector might not be able to recover all the data. It might start with an empty buffer, or a partially recovered one, effectively losing events that were in the buffer at the time of the crash. The disk buffer is more about handling backpressure during operation than about guaranteeing state recovery post-crash.

If you need true, guaranteed event persistence across restarts, especially in the face of unexpected shutdowns, you’re going to need a more robust solution. This might involve using a dedicated message queue like Kafka or Pulsar as a sink, or carefully configuring vector’s internal state management with options that prioritize durability over performance, though even those have failure modes.

The next thing you’ll likely run into is how to monitor the health and fullness of this disk buffer.

Want structured learning?

Take the full Vector course →