The Vector Filter Transform (VFT) does more than drop unwanted messages: it lets you change both the destination and the content of each message dynamically, based on the message’s attributes.
Imagine a stream of incoming sensor data, each message tagged with a sensor_id, location, and reading_type. We want to route temperature readings from warehouse-A to a dedicated monitoring service, while humidity readings from the same location go to a different analytics platform.
Here’s a sample VRL snippet that accomplishes this:
# Define a rule to route temperature readings from warehouse-A
rule "route-warehouse-a-temp" {
    when {
        # Check if the reading_type is 'temperature' AND location is 'warehouse-A'
        eq(reading_type, "temperature")
        and
        eq(location, "warehouse-A")
    }
    then {
        # Set the target to the temperature monitoring service
        set_target("temperature-monitor");
        # Optionally, add a new field or modify existing ones
        set_field("processed_at", now());
    }
}

# Define a rule to route humidity readings from warehouse-A
rule "route-warehouse-a-humidity" {
    when {
        eq(reading_type, "humidity")
        and
        eq(location, "warehouse-A")
    }
    then {
        set_target("humidity-analytics");
        set_field("processed_at", now());
    }
}

# Default rule for any other messages from warehouse-A
rule "default-warehouse-a" {
    when {
        eq(location, "warehouse-A")
    }
    then {
        set_target("default-warehouse-a-sink");
        set_field("processed_at", now());
    }
}
In this configuration, we define three distinct rules. The `rule` keyword introduces a named rule. The `when` block holds the condition that must be true for the rule to apply; here we use `eq()` for equality checks and `and` to combine them. The `then` block lists the actions to take when that condition is met: `set_target()` tells the system where to send the message next, and `set_field()` adds or modifies a message attribute.
The Vector Filter Transform is built around the concept of a "pipeline." A message enters the pipeline and is evaluated against each rule in order. The first rule whose `when` condition evaluates to true has its `then` block executed, and the message is routed according to that rule's `set_target()` action; evaluation stops there. If no rule matches, the message might be dropped or routed to a default sink, depending on your overall pipeline configuration.
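To make the first-match behavior concrete, here is a minimal Python sketch that models it. It is a simulation under stated assumptions, not Vector's implementation: the `Rule` class, the `RULES` list, and the `route` function are hypothetical names introduced only for illustration, and returning `None` stands in for whatever your pipeline does with unmatched messages.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    when: Callable[[dict], bool]  # condition, playing the role of the when block
    target: str                   # destination, playing the role of set_target()


# Hypothetical stand-ins for the three rules above, kept in the same order.
RULES = [
    Rule("route-warehouse-a-temp",
         lambda m: m.get("reading_type") == "temperature"
         and m.get("location") == "warehouse-A",
         "temperature-monitor"),
    Rule("route-warehouse-a-humidity",
         lambda m: m.get("reading_type") == "humidity"
         and m.get("location") == "warehouse-A",
         "humidity-analytics"),
    Rule("default-warehouse-a",
         lambda m: m.get("location") == "warehouse-A",
         "default-warehouse-a-sink"),
]


def route(message: dict, rules: list = RULES) -> Optional[str]:
    """Return the target of the first matching rule, or None if none match."""
    for rule in rules:
        if rule.when(message):
            return rule.target  # first match wins; later rules are skipped
    return None  # unmatched: the surrounding pipeline decides what happens


if __name__ == "__main__":
    print(route({"location": "warehouse-A", "reading_type": "humidity"}))
    print(route({"location": "warehouse-B", "reading_type": "humidity"}))
```

The early `return` inside the loop is what models "first match wins": once a condition holds, no later rule is consulted.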
Let’s see how this plays out with actual data. Suppose a message arrives:
{
  "sensor_id": "temp-sensor-123",
  "location": "warehouse-A",
  "reading_type": "temperature",
  "value": 22.5
}
- The message is evaluated against `route-warehouse-a-temp`.
- `eq(reading_type, "temperature")` is `true`.
- `eq(location, "warehouse-A")` is `true`.
- The `and` condition is `true`.
- The `then` block executes.
- `set_target("temperature-monitor")` is called.
- `set_field("processed_at", now())` adds a timestamp.
- The message, now with an added `processed_at` field, is sent to the `temperature-monitor` target. The subsequent rules are not evaluated for this message.
Now consider this message:
{
  "sensor_id": "hum-sensor-456",
  "location": "warehouse-A",
  "reading_type": "humidity",
  "value": 65.1
}
- The message is evaluated against `route-warehouse-a-temp`.
- `eq(reading_type, "temperature")` is `false`. The rule doesn’t match.
- The message moves to `route-warehouse-a-humidity`.
- `eq(reading_type, "humidity")` is `true`.
- `eq(location, "warehouse-A")` is `true`.
- The `and` condition is `true`.
- The `then` block executes.
- `set_target("humidity-analytics")` is called.
- `set_field("processed_at", now())` adds a timestamp.
- The message is sent to the `humidity-analytics` target.
If a message like this arrives:
{
  "sensor_id": "vibration-sensor-789",
  "location": "warehouse-A",
  "reading_type": "vibration",
  "value": 0.1
}
- `route-warehouse-a-temp` and `route-warehouse-a-humidity` both fail because `reading_type` is not `temperature` or `humidity`.
- The message moves to `default-warehouse-a`.
- `eq(location, "warehouse-A")` is `true`.
- The `then` block executes.
- `set_target("default-warehouse-a-sink")` is called.
- `set_field("processed_at", now())` adds a timestamp.
- The message is sent to the `default-warehouse-a-sink` target.
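The three walkthroughs above can be reproduced with a small Python trace. This is only a sketch: it assumes every rule is a plain set of field equalities (which is all the example rules need), and `RULES`, `MESSAGES`, and `trace` are hypothetical names, not part of any Vector API.

```python
# Hypothetical equality tables: a rule matches when every listed field
# equals the given value. Order mirrors the rule definitions above.
RULES = [
    ("route-warehouse-a-temp",
     {"reading_type": "temperature", "location": "warehouse-A"},
     "temperature-monitor"),
    ("route-warehouse-a-humidity",
     {"reading_type": "humidity", "location": "warehouse-A"},
     "humidity-analytics"),
    ("default-warehouse-a",
     {"location": "warehouse-A"},
     "default-warehouse-a-sink"),
]


def trace(message):
    """Return (names of the rules evaluated, chosen target or None)."""
    evaluated = []
    for name, required, target in RULES:
        evaluated.append(name)
        if all(message.get(field) == value for field, value in required.items()):
            return evaluated, target  # stop at the first match
    return evaluated, None


# The three sample messages from the walkthroughs.
MESSAGES = [
    {"sensor_id": "temp-sensor-123", "location": "warehouse-A",
     "reading_type": "temperature", "value": 22.5},
    {"sensor_id": "hum-sensor-456", "location": "warehouse-A",
     "reading_type": "humidity", "value": 65.1},
    {"sensor_id": "vibration-sensor-789", "location": "warehouse-A",
     "reading_type": "vibration", "value": 0.1},
]

if __name__ == "__main__":
    for message in MESSAGES:
        evaluated, target = trace(message)
        print(message["sensor_id"], "->", target, "after checking", evaluated)
```

The list of evaluated rules grows from one entry for the temperature message to three for the vibration message, matching the step counts in the walkthroughs.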
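The order of your rules is critical. Vector processes them top-to-bottom. If a message matches a rule, it’s routed, and no further rules are evaluated for that message. Place specific rules before general ones: in the example above, default-warehouse-a must come last, because it matches every warehouse-A message and would otherwise capture the temperature and humidity readings before their dedicated rules were reached.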
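A short Python sketch shows how a general rule placed first shadows a more specific one. The helper names (`first_match`, `is_warehouse_a`, `is_warehouse_a_temp`) are hypothetical and exist only to model first-match evaluation; they are not Vector functions.

```python
def first_match(rules, message):
    """Return the target of the first (condition, target) pair that matches."""
    for condition, target in rules:
        if condition(message):
            return target
    return None


def is_warehouse_a(m):
    return m.get("location") == "warehouse-A"


def is_warehouse_a_temp(m):
    return is_warehouse_a(m) and m.get("reading_type") == "temperature"


# Specific rule first, general fallback last: the intended ordering.
specific_first = [
    (is_warehouse_a_temp, "temperature-monitor"),
    (is_warehouse_a, "default-warehouse-a-sink"),
]

# Same two rules in the opposite order: the fallback now shadows the
# specific rule, which can never be reached.
general_first = list(reversed(specific_first))

reading = {"location": "warehouse-A", "reading_type": "temperature"}

if __name__ == "__main__":
    print(first_match(specific_first, reading))
    print(first_match(general_first, reading))
```

With `specific_first` the reading reaches `temperature-monitor`; with `general_first` the same reading lands in `default-warehouse-a-sink`, even though neither rule changed.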
The `then` block can also modify the message before it’s routed. You can drop fields, add new ones, or transform existing ones using VRL’s function library. The VFT therefore acts as a lightweight data transformation step as well as a router, letting you clean, enrich, or reshape data as it is being routed.
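As a rough illustration of the three kinds of modification (add, drop, transform), here is a Python sketch of what a set of actions might do to a message before routing. Everything beyond the `processed_at` field is an assumption made for the example: dropping `sensor_id`, the `value_fahrenheit` field, and the premise that `value` is in degrees Celsius are all hypothetical, as is the `apply_then_actions` name.

```python
from datetime import datetime, timezone


def apply_then_actions(message: dict) -> dict:
    """Return a modified copy of the message, leaving the input untouched."""
    out = dict(message)

    # Add a field, as set_field("processed_at", now()) does in the rules.
    out["processed_at"] = datetime.now(timezone.utc).isoformat()

    # Drop a field (hypothetical: assume the downstream service ignores it).
    out.pop("sensor_id", None)

    # Transform a field (hypothetical: assumes value is in degrees Celsius).
    if isinstance(out.get("value"), (int, float)):
        out["value_fahrenheit"] = out["value"] * 9 / 5 + 32

    return out


if __name__ == "__main__":
    original = {
        "sensor_id": "temp-sensor-123",
        "location": "warehouse-A",
        "reading_type": "temperature",
        "value": 22.5,
    }
    print(apply_then_actions(original))
```

Working on a copy keeps the original message intact, which is convenient if the same message also needs to be inspected or logged unmodified.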
Understanding the sequential evaluation of rules and the ability to modify message content within the then block is key to mastering complex routing scenarios.
The next step after mastering dynamic routing with VFT is to explore how to handle message ordering and batching for downstream consumers.