The most surprising thing about migrating to a new version of Vector is how often the core problem isn’t the new features, but the subtle deprecations and the way existing components interact with them.
Let’s say you’re upgrading from Vector 0.25.1 to 0.26.0. You’ve got a standard pipeline: a `file` source, a `filter` transform, and an `http` sink.
```toml
# vector.toml (v0.25.1)
[sources.my_logs]
type = "file"
include = ["/var/log/app.log"]

[transforms.filter_errors]
type = "filter"
inputs = ["my_logs"]
condition = 'true' # Example: pass everything through for now

[sinks.my_http_sink]
type = "http"
inputs = ["filter_errors"]
endpoint = "http://localhost:8080/logs"
method = "POST"
```
When you fire up Vector 0.26.0 with this config, you might hit a wall. The `http` sink might start failing with a cryptic "connection refused" or, more subtly, just silently drop events.

The core issue here isn’t that the `http` sink is broken, but that Vector 0.26.0 introduced a default `tls.server_name` behavior that can conflict with older configurations or specific server setups.
## Common Causes and Fixes
- **TLS Server Name Mismatch / Implicit TLS:**
  - **Diagnosis:** Check your `http` sink configuration. If you don’t have `tls.enabled = true` explicitly set, but your `endpoint` is `https://...`, Vector 0.26.0 might be trying to negotiate TLS with a server that doesn’t expect it, or it might be using an incorrect `server_name` for SNI. Conversely, if you do have `tls.enabled = true` but no `tls.server_name` and the server requires SNI, it can fail.
  - **Fix:**
    - **Option A (no TLS needed):** If your endpoint is actually HTTP, change `endpoint = "https://..."` to `endpoint = "http://..."`.
    - **Option B (TLS needed, no SNI):** If your endpoint is HTTPS and the server doesn’t require SNI, explicitly disable SNI by adding `tls.server_name = ""` to your `http` sink configuration:

      ```toml
      [sinks.my_http_sink]
      type = "http"
      inputs = ["filter_errors"]
      endpoint = "https://localhost:8080/logs"
      method = "POST"
      tls.enabled = true
      tls.server_name = "" # Explicitly disable SNI
      ```

    - **Option C (TLS needed, specific SNI):** If your endpoint is HTTPS and requires a specific SNI hostname (e.g., `api.example.com`), set it:

      ```toml
      [sinks.my_http_sink]
      type = "http"
      inputs = ["filter_errors"]
      endpoint = "https://localhost:8080/logs"
      method = "POST"
      tls.enabled = true
      tls.server_name = "api.example.com" # Specify the SNI hostname
      ```

  - **Why it works:** Vector 0.26.0 changed the default behavior for `tls.server_name` in the `http` sink. Previously, it might have inferred it from the `endpoint` or left it unset. Now, it defaults to using the hostname from the `endpoint` for SNI. Explicitly setting `tls.enabled` and `tls.server_name` (empty or a specific hostname) ensures the TLS handshake proceeds as expected by your server.
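To make the selection logic concrete, here is a minimal Python sketch of the SNI behavior described above. The `default_sni` helper is hypothetical, written only to model the assumed 0.26.0 defaults, not Vector’s actual implementation:

```python
from urllib.parse import urlparse

def default_sni(endpoint, server_name=None):
    """Hypothetical model of the assumed 0.26.0 SNI selection:
    an explicit empty string disables SNI (Option B), an explicit
    hostname overrides it (Option C), and otherwise the endpoint's
    host is used (the new default)."""
    if server_name == "":
        return None                        # Option B: SNI disabled
    if server_name is not None:
        return server_name                 # Option C: explicit override
    return urlparse(endpoint).hostname     # New default: inferred from endpoint

print(default_sni("https://localhost:8080/logs"))                     # localhost
print(default_sni("https://localhost:8080/logs", ""))                 # None
print(default_sni("https://localhost:8080/logs", "api.example.com"))  # api.example.com
```

If the server presents a certificate for `api.example.com` but Vector sends `localhost` as SNI, the handshake fails, which is why the explicit override matters.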
- **Deprecated `regex_replace` in `remap`:**
  - **Diagnosis:** If you’re using the `remap` transform and have `regex_replace` functions within it, they might be failing or producing unexpected results. You’ll see errors like "unknown function `regex_replace`" or malformed output data.
  - **Fix:** Replace `regex_replace` with the newer `regex::replace` syntax. The old syntax was `regex_replace("pattern", "replacement", input_field)`. The new syntax is `regex::replace(input_field, "pattern", "replacement")`. Note the field is the first argument now.

    ```toml
    # Old (v0.25.1 and earlier)
    # condition = 'regex_replace("ERROR", "WARN", .message)'

    # New (v0.26.0+)
    condition = 'regex::replace(.message, "ERROR", "WARN")'
    ```

  - **Why it works:** The `remap` language evolved. `regex_replace` was a standalone function, but it’s been integrated as a method on the `regex` object for better organization and consistency with other string manipulation functions.
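The argument-order change is easy to mis-apply during a bulk migration. This Python sketch uses `re.sub` behind a hypothetical wrapper that mirrors the new field-first signature, so you can sanity-check a pattern and replacement pair before editing your configs:

```python
import re

def vrl_style_replace(value, pattern, replacement):
    """Mirror the new field-first call shape:
    regex::replace(.message, "ERROR", "WARN")."""
    return re.sub(pattern, replacement, value)

print(vrl_style_replace("ERROR: disk almost full", "ERROR", "WARN"))
# WARN: disk almost full
```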
- **Source/sink configuration changes (e.g., `multiline` in the `file` source):**
  - **Diagnosis:** If your `file` source was previously configured for multiline logs (e.g., Java stack traces) and you’re seeing events being split incorrectly or missed entirely after the upgrade, check the `multiline` configuration.
  - **Fix:** In `file` sources, the `multiline` configuration has been refined. The `mode` parameter is now more explicit. If you were using `mode = "grok"` or `mode = "regex"` without specifying the `pattern`, you might need to. For example, to match lines that don’t start with a timestamp:

    ```toml
    [sources.my_logs]
    type = "file"
    include = ["/var/log/app.log"]
    multiline.enabled = true
    # Example: lines start with a timestamp like 2023-10-27T10:00:00.123Z
    multiline.pattern = "^(?![0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]{3}Z)"
    multiline.timeout_ms = 1000
    multiline.max_lines = 500
    ```

  - **Why it works:** The `multiline` configuration in the `file` source has been updated to be more robust and explicit, particularly around how patterns are defined and applied. Ensuring your `pattern` correctly identifies the start of a new log entry (or the continuation of an old one) is key.
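Before deploying a `multiline.pattern`, it is worth replaying sample lines through the same regex. This is a rough Python model of continuation-based aggregation (my own sketch, not Vector’s actual algorithm), using the negative-lookahead pattern from the config above:

```python
import re

# Lines matching this pattern do NOT start with a timestamp,
# so they are continuations of the previous event.
continuation = re.compile(
    r"^(?![0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z)"
)

def aggregate(lines):
    """Group raw lines into events: a line that starts with a
    timestamp opens a new event; anything else is appended."""
    events = []
    for line in lines:
        if continuation.match(line) and events:
            events[-1] += "\n" + line   # stack-trace continuation
        else:
            events.append(line)         # new log entry
    return events

sample = [
    "2023-10-27T10:00:00.123Z ERROR boom",
    "java.lang.RuntimeException: boom",
    "\tat com.example.Main.main(Main.java:5)",
    "2023-10-27T10:00:01.456Z INFO recovered",
]
print(len(aggregate(sample)))  # 2
```

If the pattern is wrong (say, the fractional seconds don’t match your logs), every line becomes a continuation and events get glued together, which matches the "split incorrectly or missed entirely" symptom.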
- **`json_parser` transform defaults:**
  - **Diagnosis:** If you have a `json_parser` transform and your JSON is no longer being parsed correctly, or you’re getting unexpected `null` values, check the `max_depth` and `max_field_count` parameters.
  - **Fix:** Increase `max_depth` or `max_field_count` if your JSON is deeply nested or has many fields:

    ```toml
    [transforms.parse_json]
    type = "json_parser"
    inputs = ["my_logs"]
    max_depth = 30         # Increased from default 10
    max_field_count = 1000 # Increased from default 500
    ```

  - **Why it works:** To prevent denial-of-service attacks and excessive resource consumption from malformed or overly complex JSON, `json_parser` has default limits. If your legitimate logs exceed these, you’ll need to tune them upwards.
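If you’re unsure whether your payloads actually exceed the limits, a quick offline measurement tells you what to set. These helpers are my own, written only to illustrate how depth and field count can be measured on a sample document:

```python
import json

def json_depth(obj):
    """Nesting depth of a parsed JSON value (scalars count as 0)."""
    if isinstance(obj, dict):
        return 1 + max((json_depth(v) for v in obj.values()), default=0)
    if isinstance(obj, list):
        return 1 + max((json_depth(v) for v in obj), default=0)
    return 0

def field_count(obj):
    """Total number of object keys anywhere in the document."""
    if isinstance(obj, dict):
        return len(obj) + sum(field_count(v) for v in obj.values())
    if isinstance(obj, list):
        return sum(field_count(v) for v in obj)
    return 0

doc = json.loads('{"a": {"b": {"c": [1, 2, {"d": true}]}}}')
print(json_depth(doc), field_count(doc))  # 5 4
```

Run this over a representative sample of your logs and set `max_depth` and `max_field_count` comfortably above the maximum you observe.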
- **`timestamp` transform changes:**
  - **Diagnosis:** If events are being dropped or delayed because their timestamps aren’t being correctly parsed or set, investigate your `timestamp` transform. New versions might have stricter parsing or different default behaviors for `fallback_to_processing`.
  - **Fix:** Ensure your `format` string in the `timestamp` transform precisely matches your log’s timestamp format. If you previously relied on `fallback_to_processing` and it’s no longer behaving as expected, explicitly set `fallback_to_processing = true` and ensure your `target` field is correctly specified:

    ```toml
    [transforms.add_timestamp]
    type = "timestamp"
    inputs = ["filter_errors"]
    source = "timestamp_field"
    # Assuming your log has a timestamp like "2023-10-27T10:00:00Z"
    format = "%Y-%m-%dT%H:%M:%SZ"
    target = "event_timestamp"
    fallback_to_processing = true
    ```

  - **Why it works:** The `timestamp` transform’s parsing logic and fallback mechanisms can be updated for performance or correctness. Explicitly defining the `format` and `fallback_to_processing` ensures predictable behavior.
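You can verify a `format` string against a sample timestamp outside of Vector. Assuming the transform uses strftime-style directives (as the example above does), Python’s `datetime.strptime` accepts the same format string and fails loudly on a mismatch instead of silently dropping the event:

```python
from datetime import datetime

fmt = "%Y-%m-%dT%H:%M:%SZ"

# A matching sample parses cleanly.
parsed = datetime.strptime("2023-10-27T10:00:00Z", fmt)
print(parsed.year, parsed.hour)  # 2023 10

# A mismatched sample (space instead of 'T', no trailing 'Z') raises.
try:
    datetime.strptime("2023-10-27 10:00:00", fmt)
except ValueError:
    print("format mismatch")
```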
- **`logplex` sink behavior:**
  - **Diagnosis:** If you’re using the `logplex` sink and observe connection errors or malformed requests, it might be due to changes in how it handles TLS or authentication.
  - **Fix:** Ensure your `token` is correctly configured and that any TLS settings match the target endpoint’s requirements. If the endpoint requires a specific `server_name` for TLS, add it:

    ```toml
    [sinks.logplex_out]
    type = "logplex"
    inputs = ["filter_errors"]
    endpoint = "https://logs.example.com/logs/v1"
    token = "YOUR_LOGPLEX_TOKEN"
    # Add if required by the endpoint
    # tls.server_name = "logs.example.com"
    ```

  - **Why it works:** The `logplex` sink, like `http`, is subject to TLS and endpoint behavior changes. Explicitly configuring TLS parameters can resolve handshake issues.
After applying these fixes, you might then encounter issues with downstream systems that are now receiving data in a slightly different format due to the upstream changes, or you might find yourself needing to optimize performance on newly added transforms.