Vector’s end-to-end acknowledgements are the system’s way of proving to you that your data made it where it was supposed to go, no matter how many hops it took.
Let’s watch Vector process a simple flow: a file source writing to a console sink.
# vector.toml
[sources.my_file]
type = "file"
include = ["/tmp/input.log"]

[sinks.my_console]
type = "console"
inputs = ["my_file"]
encoding.codec = "text"
acknowledgements.enabled = true  # opt in to end-to-end acknowledgements
When Vector reads a line from /tmp/input.log, it wraps it in an event and attaches delivery-tracking metadata. The event flows to the console sink, which acknowledges it only after successfully writing it to stdout. If the sink failed before acknowledging, Vector would know the data might be lost and would retry. Once the acknowledgement makes it back to the source, Vector considers the event delivered: it can safely drop it from its internal buffer, and the file source can advance its checkpoint past that line. Note that end-to-end acknowledgements are opt-in, enabled with acknowledgements.enabled = true on the sink.
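This retry behavior can be sketched in a few lines of Python. It is a toy model of the idea (an unacknowledged write is retried until it succeeds), not Vector's actual implementation; `flaky_write` is a stand-in for a sink that fails once and then recovers.

```python
# Toy model of "unacknowledged means retried", not Vector internals.
# A line counts as acknowledged only when the sink's write succeeds.

def deliver_with_retry(lines, write, max_attempts=3):
    """Keep retrying each line until the sink acknowledges (write returns True)."""
    delivered = []
    for line in lines:
        for _attempt in range(max_attempts):
            if write(line):            # ack = successful write
                delivered.append(line)
                break
        else:
            raise RuntimeError(f"gave up on {line!r}")
    return delivered

# A flaky sink that fails its first write, then recovers.
calls = {"n": 0}
def flaky_write(line):
    calls["n"] += 1
    return calls["n"] > 1              # first call fails, later calls succeed

delivered = deliver_with_retry(["a", "b"], flaky_write)
print(delivered)  # ['a', 'b']
```

Nothing is dropped: the failed first attempt simply leaves the line unacknowledged, so it is sent again.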
This sounds simple, but the real value shows up when you introduce intermediate components, like a remap transform in front of a kafka sink.
Imagine this:
# vector.toml
[sources.my_file]
type = "file"
include = ["/tmp/input.log"]

[transforms.my_transform]
type = "remap"
inputs = ["my_file"]
source = '''
.message = string!(.message) + " - transformed"
'''

[sinks.my_kafka]
type = "kafka"
inputs = ["my_transform"]
acknowledgements.enabled = true
# ... kafka sink configuration ...
Here, my_file reads a line and sends it to my_transform, which appends " - transformed" to the message and passes the modified event along to my_kafka. The kafka sink writes the event to Kafka and waits for the broker to confirm the write. Only after that confirmation does the event count as delivered, and the acknowledgement propagates back up the pipeline to my_file. Each stage's delivery is confirmed, building a chain of trust. If any link in the chain breaks (e.g., Kafka is down, or Vector crashes mid-pipeline), the acknowledgement never arrives, and Vector retries the unacknowledged events.
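The shape of this acknowledgement chain can be sketched as follows. It is a toy model, not Vector's code: here an acknowledgement callback rides along with each event, transforms pass it through untouched, and only the final sink invokes it, which is roughly how delivery confirmation finds its way back to the source.

```python
# A toy sketch of end-to-end acknowledgement, NOT Vector's actual code.
# The on_ack callback travels with the event; only the final sink calls it.

class Source:
    def __init__(self):
        self.pending = {}          # event_id -> raw line, awaiting ack
        self.next_id = 0

    def emit(self, line, downstream):
        event_id = self.next_id
        self.next_id += 1
        self.pending[event_id] = line
        # The ack callback rides along with the event.
        downstream.handle({"id": event_id, "message": line},
                          on_ack=lambda eid=event_id: self.pending.pop(eid))

class Transform:
    def __init__(self, downstream):
        self.downstream = downstream

    def handle(self, event, on_ack):
        event["message"] += " - transformed"
        # Transforms modify the event but pass the callback through untouched.
        self.downstream.handle(event, on_ack)

class Sink:
    def handle(self, event, on_ack):
        print(event["message"])    # "deliver" the event
        on_ack()                   # delivery confirmed -> ack reaches the source

source = Source()
source.emit("hello", Transform(Sink()))
assert source.pending == {}        # ack arrived; the buffer slot is freed
```

If the sink never calls `on_ack`, the event stays in `source.pending`, which is exactly the state that triggers a retry.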
The core mechanism is the acknowledgement metadata Vector attaches to each event, combined with its buffer management. When a source generates an event, the event is placed into a buffer and only removed once an acknowledgement "bubbles up" the pipeline from the final sink back to the source. That acknowledgement signifies that the data has been successfully processed and durably accepted by the downstream system. If a component crashes or becomes unavailable before acknowledging, the event remains unacknowledged, and Vector will attempt to resend it upon recovery.
This end-to-end guarantee is crucial for scenarios where data loss is unacceptable, such as financial transactions or critical system logs. Two caveats matter, though. First, the source must be able to replay data; the file source can, because it only advances its checkpoint past lines that have been acknowledged. Second, Vector's buffers are in-memory by default; to survive a process restart you must configure a disk buffer, which keeps sent-but-unacknowledged events on disk as a safety net. With a disk buffer, a restarted Vector reloads the unacknowledged events and resumes sending them.
The most surprising thing about Vector's acknowledgements is how granular they are. It's not a "batch acknowledged" system: each individual event is tracked. This means if you send a batch of 1000 events to a Kafka sink and only the 501st event fails to write, only that failed event needs to be re-sent; the other 999 remain acknowledged. (In practice a sink may retry at the granularity of the failed request, but acknowledgement status is still tracked per event.) This fine-grained tracking prevents unnecessary re-processing of already-safe data.
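Per-event granularity can be illustrated with a small sketch. It assumes a sink that can report success or failure for each event in a batch individually, which is a simplification of real sink behavior:

```python
# Sketch of per-event acknowledgement within a batch: only failed events
# go back on the retry queue; the rest are considered acknowledged.

def flush_batch(batch, write_one):
    """Return the events that must be retried, i.e. those whose write failed."""
    return [event for event in batch if not write_one(event)]

batch = list(range(1, 1001))                    # events 1..1000
failed_ids = {501}                              # suppose only event 501 fails
retry_queue = flush_batch(batch, lambda e: e not in failed_ids)
print(retry_queue)                              # [501] -- the other 999 stay acknowledged
```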
Now consider buffers. Vector has no standalone buffer transform; instead, every sink has its own buffer, configured in the sink's buffer table. A disk buffer is a powerful tool for managing backpressure and ensuring durability within Vector itself, even before data is accepted by the external system.
# vector.toml
[sources.my_file]
type = "file"
include = ["/tmp/input.log"]

[sinks.my_kafka]
type = "kafka"
inputs = ["my_file"]
acknowledgements.enabled = true
# ... kafka sink configuration ...

[sinks.my_kafka.buffer]
type = "disk"
max_size = 268435488      # bytes; disk buffers are sized in bytes, not events
when_full = "block"       # exert backpressure instead of dropping events
When my_file sends events to my_kafka, they first land in the sink's buffer. The sink drains the buffer, accumulates events into batches (governed by its batch settings), writes them to Kafka, and waits for the broker's confirmation; only then are those events acknowledged back to my_file. With a disk buffer, the buffer itself becomes an intermediate durability layer: events are written to disk before the sink attempts delivery, so if Vector crashes after events reach the buffer but before Kafka acknowledges them, Vector re-reads them from disk on restart and sends them again. This prevents data loss between the source and the sink.
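The accumulate-then-flush behavior (flush when the batch is full, or when it has been sitting too long) can be sketched like this; the names and thresholds are illustrative, not Vector's:

```python
# Toy sketch of size/time-based batch flushing, as a sink might do before
# writing to Kafka. Not Vector's implementation.
import time

class Batcher:
    def __init__(self, max_events, max_age, flush):
        self.max_events = max_events   # flush when this many events accumulate
        self.max_age = max_age         # ... or when the batch is this old (seconds)
        self.flush = flush             # callback receiving the full batch
        self.batch = []
        self.oldest = None

    def push(self, event, now=None):
        now = time.monotonic() if now is None else now
        if not self.batch:
            self.oldest = now
        self.batch.append(event)
        if len(self.batch) >= self.max_events or now - self.oldest >= self.max_age:
            self.flush(self.batch)
            self.batch = []

flushed = []
b = Batcher(max_events=3, max_age=5.0, flush=flushed.append)
for event, t in [(1, 0.0), (2, 1.0), (3, 2.0), (4, 3.0)]:
    b.push(event, now=t)
print(flushed)  # [[1, 2, 3]] -- event 4 is still accumulating
```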
The persistence mechanism behind disk buffers is an append-style on-disk queue, similar in spirit to a WAL (Write-Ahead Log): events are recorded on disk before delivery is attempted and removed only once acknowledged. This ensures that even if the Vector process crashes and restarts, whatever was sitting in the buffer is not lost. It can be replayed from disk.
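The general write-ahead idea can be sketched in a few lines. This illustrates the concept only, not Vector's actual on-disk buffer format:

```python
# Minimal write-ahead-log sketch of how a disk buffer can survive a restart.
import json
import os
import tempfile

class DiskBuffer:
    def __init__(self, path):
        self.path = path

    def append(self, event):
        # Append-only write: record the event before trying to deliver it.
        with open(self.path, "a") as wal:
            wal.write(json.dumps(event) + "\n")

    def replay(self):
        # After a crash, everything still in the log is re-delivered.
        if not os.path.exists(self.path):
            return []
        with open(self.path) as wal:
            return [json.loads(line) for line in wal]

    def truncate(self):
        # Once every event is acknowledged, the log can be dropped.
        os.remove(self.path)

path = os.path.join(tempfile.mkdtemp(), "buffer.wal")
buf = DiskBuffer(path)
buf.append({"message": "hello"})
buf.append({"message": "world"})
# Simulated restart: a fresh DiskBuffer over the same file sees both events.
recovered = DiskBuffer(path).replay()
print([e["message"] for e in recovered])  # ['hello', 'world']
```

Real implementations add checksums, segment files, and compaction, but the contract is the same: persist first, deliver second, delete only on acknowledgement.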
The next hurdle you’ll face is understanding how Vector handles acknowledgement timeouts and retries when downstream components are slow or unresponsive.