Valkey Streams are designed to be a durable, append-only log, and the real magic happens when you use them with consumer groups to build reliable, parallel processing systems.
Let’s see it in action. Imagine we have a Valkey instance running and we want to send some messages to a stream named sensor_readings. The * tells Valkey to auto-generate each entry’s ID, which is a millisecond timestamp plus a sequence number.
valkey-cli XADD sensor_readings * temperature 25 humidity 60
# Output: 1678886400000-0
valkey-cli XADD sensor_readings * temperature 26 humidity 62
# Output: 1678886405000-0
valkey-cli XADD sensor_readings * temperature 24 humidity 59
# Output: 1678886410000-0
Now, let’s create a consumer group called processing_group for this stream. The 0 tells Valkey that the group should start from the very beginning of the stream, so the three readings we just added will be delivered to it. (Passing $ instead would mean the group only sees messages added after it was created, which would skip our existing entries.)
valkey-cli XGROUP CREATE sensor_readings processing_group 0
# Output: OK
A consumer, let’s call it worker-1, can then read messages through this group. The special > ID means "give me messages that have never been delivered to any consumer in this group." (A concrete ID like 0 would instead replay worker-1’s own pending, unacknowledged messages.)
valkey-cli XREADGROUP GROUP processing_group worker-1 COUNT 2 STREAMS sensor_readings '>'
Here’s what the output might look like:
1) 1) "sensor_readings"
   2) 1) 1) "1678886400000-0"
         2) 1) "temperature"
            2) "25"
            3) "humidity"
            4) "60"
      2) 1) "1678886405000-0"
         2) 1) "temperature"
            2) "26"
            3) "humidity"
            4) "62"
Notice that worker-1 received two messages. These messages are now "pending" for worker-1 within processing_group until worker-1 acknowledges them. Another consumer, worker-2, can then ask for the next undelivered messages:
valkey-cli XREADGROUP GROUP processing_group worker-2 COUNT 2 STREAMS sensor_readings '>'
Even though worker-2 asked for two messages, only the third remains undelivered:
1) 1) "sensor_readings"
   2) 1) 1) "1678886410000-0"
         2) 1) "temperature"
            2) "24"
            3) "humidity"
            4) "59"
The core problem Valkey Streams solve is building a robust, fault-tolerant message processing pipeline. Traditional message brokers often force a trade-off between high throughput and guaranteed delivery. Streams offer both by combining an append-only log with a consumer group mechanism: the log ensures messages are never lost once written, and consumer groups let multiple consumers share the processing load, with each message delivered to exactly one consumer within a group. If a consumer fails, its pending messages can be reclaimed and delivered to another consumer.
Internally, a stream is not a sorted set: entries are stored in a radix tree of compact listpack nodes, indexed by entry ID, where each ID is a millisecond timestamp plus a sequence number (e.g. 1678886400000-0). Each stream carries a collection of consumer groups, and each group maintains a single last-delivered ID for the group as a whole, plus per-consumer state. This is what lets different groups make independent progress over the same log. XADD appends a new entry, XREADGROUP is the workhorse for consumers, and XACK acknowledges that a message has been successfully processed, removing it from the pending list.
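The interplay between the group’s delivery pointer and per-consumer pending lists can be sketched in a few lines of Python. This is a toy model, not the real implementation: StreamGroup, read_new, and ack are invented names that mimic what XREADGROUP with > and XACK do.

```python
# Toy model of consumer-group bookkeeping (illustrative only; these names
# are inventions for this sketch, not the Valkey API). One group-wide
# last-delivered pointer plus a per-consumer pending list gives
# exactly-one-consumer delivery within the group.

class StreamGroup:
    def __init__(self, entries):
        self.entries = entries       # list of (id, payload), ID-ordered
        self.last_delivered = -1     # index of the last entry handed out
        self.pel = {}                # consumer name -> {id: payload}

    def read_new(self, consumer, count):
        """Like XREADGROUP ... '>': hand out never-delivered entries."""
        start = self.last_delivered + 1
        batch = self.entries[start:start + count]
        self.last_delivered += len(batch)
        self.pel.setdefault(consumer, {}).update(dict(batch))
        return batch

    def ack(self, consumer, entry_id):
        """Like XACK: drop a processed entry from the consumer's PEL."""
        self.pel.get(consumer, {}).pop(entry_id, None)

g = StreamGroup([("1-0", "t=25"), ("2-0", "t=26"), ("3-0", "t=24")])
print(g.read_new("worker-1", 2))   # worker-1 gets the first two entries
print(g.read_new("worker-2", 2))   # worker-2 gets only the remaining one
g.ack("worker-1", "1-0")
print(g.pel["worker-1"])           # "2-0" is still pending
```

As in the model, a real consumer should call XACK only after it has durably processed the entry; anything read but not yet acked stays in its PEL.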
The pending entries list (PEL) is a critical internal data structure for each consumer group. When a consumer reads messages using XREADGROUP, those messages are added to the PEL for that specific consumer within the group. They are not yet acknowledged. This PEL serves as a record of messages that have been delivered but not yet confirmed as processed. If a consumer crashes before acknowledging its messages, an administrator or another process can use the XPENDING command to inspect the PEL and then use XCLAIM to reassign those pending messages to a different consumer, ensuring no data is lost and processing continues.
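The recovery path can be sketched the same way. Again a toy model: pending() and claim() are hypothetical stand-ins for XPENDING and XCLAIM, showing how a crashed consumer’s unacknowledged entries survive in the PEL and can be handed to a healthy consumer.

```python
# Toy model of crash recovery via the PEL (illustrative; claim() mimics
# what XCLAIM does, not the real implementation). Entries pending for a
# dead consumer are moved to a live one, so nothing is lost.

pel = {
    "worker-1": {"1-0": "t=25", "2-0": "t=26"},  # crashed before XACK
    "worker-2": {},
}

def pending(pel):
    """Like XPENDING: list (consumer, entry id) for every unacked entry."""
    return [(c, eid) for c, entries in pel.items() for eid in entries]

def claim(pel, new_owner, entry_id):
    """Like XCLAIM: move a pending entry to a different consumer."""
    for entries in pel.values():
        if entry_id in entries:
            pel[new_owner][entry_id] = entries.pop(entry_id)
            return

print(pending(pel))          # both of worker-1's entries are unacked
for _, eid in pending(pel):
    claim(pel, "worker-2", eid)
print(pel["worker-2"])       # worker-2 now owns both pending entries
```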
The MAXLEN option in XADD is a double-edged sword. It limits memory usage by trimming old messages, but it offers no durability guarantee for messages your consumer groups haven’t processed yet, because trimming operates entirely independently of consumer groups. If you set MAXLEN 1000 and the stream grows to 1001 entries, the oldest entry is discarded regardless of whether any group has processed it. The ~ modifier (e.g. MAXLEN ~ 1000) does not change this; it only makes trimming approximate, letting Valkey defer deletion until a whole internal node can be dropped, which is faster but can leave the stream slightly longer than the limit. If you need guaranteed processing of every message, either don’t trim at all, or trim explicitly with XTRIM using MINID, passing an ID that every group has already acknowledged.
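A quick sketch of why MAXLEN is dangerous for unprocessed data: modeling the capped stream as a bounded deque makes it obvious that trimming consults only the stream’s length, never any group’s progress.

```python
# Toy model of MAXLEN trimming (illustrative). The cap is enforced purely
# by stream length; consumer-group progress is never consulted, so an
# entry that no one has processed can still be discarded.

from collections import deque

MAXLEN = 2
stream = deque(maxlen=MAXLEN)   # deque drops the oldest item when full
group_last_delivered = None     # the group has processed nothing yet

for entry_id in ["1-0", "2-0", "3-0"]:
    stream.append(entry_id)     # like XADD ... MAXLEN 2

print(list(stream))             # "1-0" is gone, though never delivered
```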
The next logical step is understanding how to monitor and manage these pending messages, especially in the face of consumer failures, using commands like XPENDING and XCLAIM.