Vector Enrichment Tables let you inject contextual data into your network traffic logs, turning raw IP addresses and ports into human-readable information.

Let’s see how this works with a concrete example. Imagine you’re monitoring network traffic, and you see an IP address like 192.0.2.100 hitting your server. Without enrichment, that’s just a number. With an enrichment table, a lookup on 192.0.2.100 can attach a record such as {"country": "US", "city": "New York", "isp": "Example ISP"} to the event as it passes through Vector. This makes analysis much easier.

Here’s a simplified Vector configuration snippet that uses two enrichment tables:

[sources.http_source]
  type = "http_server"
  address = "0.0.0.0:8080"
  # Decode JSON request bodies so that source_ip and destination_port become event fields.
  decoding.codec = "json"

[transforms.enrich_geoip]
  type = "remap"
  inputs = ["http_source"]
  # Enrichment tables are queried from VRL in a remap transform.
  # GeoIP tables are searched with the "ip" key; the event field to read is chosen here.
  source = '''
    geo, err = get_enrichment_table_record("geoip_table", { "ip": .source_ip })
    if err == null {
      .geoip = geo
    }
  '''

[transforms.enrich_csv]
  type = "remap"
  inputs = ["enrich_geoip"] # Chain enrichment transforms
  # Match the event's destination_port against the 'port' column of csv_table.
  source = '''
    svc, err = get_enrichment_table_record("csv_table", { "port": .destination_port })
    if err == null {
      .service_name = svc.service_name
      .protocol = svc.protocol
    }
  '''

[sinks.console_sink]
  type = "console"
  inputs = ["enrich_csv"]
  # This sink outputs the enriched events to the console as JSON.
  encoding.codec = "json"

And here are the corresponding enrichment table definitions:

[enrichment_tables.geoip_table]
  type = "geoip"
  # This table uses a MaxMind GeoIP2 database.
  # The 'path' field points to the actual .mmdb file.
  path = "/opt/vector/geoip/GeoLite2-City.mmdb"
  # The lookup key is not configured on the table. It is passed at lookup time
  # as { "ip": <value> }, so any event field holding an IP address can be used.
  # A City database returns flat fields such as country_code and city_name.
  # The autonomous system organization comes from a separate ISP or ASN database.

[enrichment_tables.csv_table]
  type = "file"
  # This table reads data from a CSV file; column names come from its header row.
  file.path = "/opt/vector/csv/port_mapping.csv"
  file.encoding.type = "csv"
  # The file is expected to have the columns port, service_name and protocol.
  # Columns are read as strings unless typed here. 'port' is typed as an integer
  # so that it can match an integer destination_port from the event.
  schema.port = "integer"

The geoip_table uses a GeoIP database (typically a .mmdb file from MaxMind) to translate IP addresses into geographical and network information. The csv_table reads a standard CSV file, allowing you to map arbitrary fields (like port numbers, user IDs, or internal service names) to descriptive values.

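When an event flows through the enrich_geoip transform, Vector takes the value of the event’s source_ip field and looks it up in the geoip_table. GeoIP tables are searched by an ip key, so the event field to use is chosen in the lookup itself rather than by a default. If a match is found, the returned record is added to the event. With a City database the record carries flat fields such as country_code and city_name; the autonomous system organization comes from a separate ISP or ASN database.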

The enrich_csv transform then takes these already enriched events and performs another lookup, this time using the destination_port field from the incoming event and matching it against the port column in port_mapping.csv. If a match occurs, it adds service_name and protocol to the event.

This chaining is powerful. You can enrich once with GeoIP, then again with a custom CSV for internal mapping, and so on. The key is that each lookup names its own table and its own search condition, so you are never tied to a particular event field such as source_ip or destination_ip.

Enrichment in Vector doesn’t just add new fields; it can also overwrite existing ones. If a looked-up value is written to a field name that already exists on the event, the old value is replaced. This means you can use enrichment to normalize or standardize data points across different sources. For example, if you have a country field coming from two different sources, you could use a GeoIP enrichment table to ensure that both sources populate the country field with the same ISO code format, effectively overriding any inconsistencies.

This capability is particularly useful when dealing with data from diverse systems that might use different naming conventions or formats for the same conceptual data. By carefully defining your enrichment tables and the fields you extract, you can create a unified, clean dataset ready for analysis, logging, or alerting.
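
As a rough sketch of that normalization, assuming lookups are written in VRL inside a remap transform (as in current Vector releases) and that geoip_table is backed by a City database returning a country_code field. The transform name normalize_country is hypothetical, and the event is assumed to carry the address in source_ip:

```toml
[transforms.normalize_country]
  type = "remap"
  inputs = ["http_source"]
  source = '''
    geo, err = get_enrichment_table_record("geoip_table", { "ip": .source_ip })
    if err == null {
      # Assigning to an existing field replaces whatever value the source sent,
      # so every event ends up with the same ISO country code format.
      .country = geo.country_code
    }
    # On a failed lookup the original .country value is left untouched.
  '''
```

Whether to keep the original value on a failed lookup, or to clear it, is a design choice; the sketch keeps it so that no information is lost.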

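The next concept you’ll likely run into is handling multiple matches within a single enrichment table, especially with CSVs, and how the two VRL lookup functions differ here: get_enrichment_table_record expects exactly one matching row and fails otherwise, while find_enrichment_table_records returns every matching row.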
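
As a brief preview, here is a minimal sketch of both behaviors, assuming lookups are written in VRL inside a remap transform and that csv_table has port and service_name columns. The transform name lookup_port_services, the port_services field, and the "unknown" fallback value are hypothetical:

```toml
[transforms.lookup_port_services]
  type = "remap"
  inputs = ["http_source"]
  source = '''
    # find_enrichment_table_records returns an array holding every matching row.
    rows, err = find_enrichment_table_records("csv_table", { "port": .destination_port })
    if err == null {
      .port_services = rows
    }

    # get_enrichment_table_record fails unless exactly one row matches,
    # so the ?? operator supplies a fallback record when the lookup errors.
    row = get_enrichment_table_record("csv_table", { "port": .destination_port }) ?? { "service_name": "unknown" }
    .service_name = row.service_name
  '''
```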
