Vector’s HTTP source (type http_server) is how you get data into Vector from external systems that push over plain HTTP and can’t feed one of Vector’s other sources, such as Kafka topics or files.

{
  "sources": {
    "my_webhook_source": {
      "type": "http_server",
      "address": "0.0.0.0:9999",
      "decoding": { "codec": "json" }
    }
  },
  "transforms": {},
  "sinks": {
    "my_stdout_sink": {
      "type": "console",
      "inputs": ["my_webhook_source"],
      "encoding": { "codec": "json" }
    }
  }
}

This config sets up a source named my_webhook_source listening on all interfaces (0.0.0.0) on port 9999. The decoding.codec option tells Vector to parse each request body as JSON. Any JSON events received on this port enter the pipeline, and my_stdout_sink immediately prints them to Vector’s console, showing you what’s coming in.

To test this, you can use curl to send a JSON payload:

curl -X POST -H "Content-Type: application/json" -d '{"message": "hello from curl", "timestamp": 1678886400}' http://localhost:9999

You’ll see the JSON object appear in Vector’s output.

The HTTP source is fundamentally a web server that accepts POST requests. When a request comes in, Vector parses the request body according to the configured decoding.codec. json is the most common; bytes keeps the raw body as a single string field, and other codecs (syslog, native, and so on) handle other wire formats. A separate framing option covers cases like newline-delimited streams. The parsed data is then turned into Vector events and sent down the pipeline.

The address field is straightforward: it’s the network interface and port Vector will bind to. 0.0.0.0 means listen on all available network interfaces. If you only want it accessible from the machine Vector is running on, use 127.0.0.1.
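The practical difference is reachability: a socket bound to 127.0.0.1 only accepts connections from the same machine, while 0.0.0.0 accepts them from any interface. A quick Python sketch of the bind itself (port 0 is a stand-in that lets the OS pick a free port):

```python
import socket

# Bind to loopback only: reachable from this machine, invisible to the network.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
host, port = s.getsockname()
print(host)        # 127.0.0.1
print(port > 0)    # True
s.close()
```

Binding Vector's http_server source works the same way: the address string is just "interface:port" handed to the OS.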

The real power comes from how you can customize the ingestion. You can require credentials with the auth option (basic-auth username and password), restrict the accepted URL path with path, and, for transport security, configure TLS via the tls table by providing crt_file and key_file.

Here’s an example with TLS enabled:

{
  "sources": {
    "secure_webhook": {
      "type": "http_server",
      "address": "0.0.0.0:8443",
      "decoding": { "codec": "json" },
      "tls": {
        "enabled": true,
        "crt_file": "/path/to/your/cert.pem",
        "key_file": "/path/to/your/key.pem"
      }
    }
  },
  "transforms": {},
  "sinks": {
    "my_stdout_sink": {
      "type": "console",
      "inputs": ["secure_webhook"],
      "encoding": { "codec": "json" }
    }
  }
}

You’d then use curl with the --insecure flag (for testing against a self-signed certificate) or a properly configured client:

curl -X POST -H "Content-Type: application/json" --insecure -d '{"user_id": 123, "action": "login"}' https://localhost:8443

The decoding.codec option dictates how Vector interprets the incoming request body. For json, it expects a single JSON object per request or an array of JSON objects. bytes treats the entire body as a single string field, while a framing method such as newline_delimited lets you accept a stream of line-separated records.

One subtle but crucial aspect is how Vector handles JSON arrays versus single objects. If you send an array of JSON objects ([{}, {}]), Vector emits each object as a separate event. With the bytes codec, the entire body becomes a single string event.
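A rough Python sketch of these decode rules (illustrative only, not Vector's actual code):

```python
import json

def decode_body(body: bytes, codec: str):
    """Turn one HTTP request body into a list of events, mimicking
    the decode rules described above (illustrative sketch)."""
    if codec == "json":
        parsed = json.loads(body)
        # A JSON array fans out into one event per element.
        return parsed if isinstance(parsed, list) else [parsed]
    # bytes-style decoding: the whole body becomes one string event.
    return [{"message": body.decode("utf-8")}]

events = decode_body(b'[{"a": 1}, {"b": 2}]', "json")
print(len(events))  # 2
```

The fan-out is why a single webhook POST carrying an array can produce several events downstream.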

When you send data to the HTTP source, each request is processed independently. Vector doesn’t inherently batch requests arriving at the same time unless your upstream system does. However, Vector does batch events internally before sending them to sinks, which is a separate optimization. The http_server source itself is about receiving and parsing, not about the internal batching that happens later in the pipeline.
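To see that one-request-one-unit behavior concretely, here's a stand-in receiver (plain Python's http.server, not Vector) that records each POST body it receives as an independent unit:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

received = []  # one entry per incoming request

class StubHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), StubHandler)  # port 0: OS picks one
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# Two separate requests: the receiver sees two independent bodies.
for payload in ({"seq": 1}, {"seq": 2}):
    body = json.dumps(payload).encode()
    urlopen(Request(url, data=body,
                    headers={"Content-Type": "application/json"})).read()

server.shutdown()
print(len(received))  # 2
```

Swap the stub's URL for your Vector instance's address and the same client code exercises the real http_server source.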

The http_server source itself doesn’t expose a health path, but Vector’s top-level api option (disabled by default) serves a /health endpoint that responds 200 OK while Vector is running. This is useful for load balancers to check the health of your Vector instance.

{
  "api": {
    "enabled": true,
    "address": "127.0.0.1:8686"
  },
  "sources": {
    "health_checked_webhook": {
      "type": "http_server",
      "address": "0.0.0.0:9999",
      "decoding": { "codec": "json" }
    }
  },
  "transforms": {},
  "sinks": {
    "my_stdout_sink": {
      "type": "console",
      "inputs": ["health_checked_webhook"],
      "encoding": { "codec": "json" }
    }
  }
}

If you then curl http://127.0.0.1:8686/health, you’ll get a 200 OK response, while requests to port 9999 are still processed as events.

Notably, the HTTP source doesn’t just passively listen; it manages its own HTTP server lifecycle, including TLS negotiation and request parsing, and it does this with minimal configuration for basic use cases.

Consider a scenario where you have a SaaS product sending webhooks. You’d configure Vector with an http_server source on a public-facing IP and port, set decoding.codec to json, and point your SaaS product’s webhook URL to http://your-vector-host:port. Vector then becomes the ingress point for all those events, ready to be processed and forwarded.

If you need to know who called you, the headers option lets you copy request headers such as X-Forwarded-For onto each event. Absent a proxy or load balancer adding such headers, the connection comes directly from the client’s IP address. This can be useful for auditing or rate-limiting based on source IP.

When using reverse proxies or load balancers in front of Vector, ensure they are configured to pass through the original Host header and any relevant X-Forwarded-* headers. Vector can capture these via the headers option, and the captured values can influence how events are tagged by downstream transforms.
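The usual convention for recovering the original client behind one or more proxies is to take the first entry of X-Forwarded-For, falling back to the TCP peer address; a sketch of that convention (not Vector code):

```python
def client_ip(headers: dict, peer_addr: str) -> str:
    """Resolve the originating client IP using the common
    X-Forwarded-For convention: the first hop in the list wins."""
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return peer_addr  # no proxy involved: the peer IS the client

print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.2"))
# 203.0.113.7
```

In a Vector pipeline you'd do this extraction in a transform after capturing the header onto the event.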

The next concept to explore is how to handle different event formats beyond simple JSON, such as newline-delimited JSON (NDJSON) or custom-delimited text, and how Vector’s parsing modes and transforms can adapt to these variations.
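As a taste of what’s ahead, NDJSON is simply one JSON document per line, so the split-then-parse step looks like this (a sketch of the semantics, not Vector's implementation):

```python
import json

body = '{"a": 1}\n{"b": 2}\n{"c": 3}\n'
# NDJSON: split on newlines, parse each non-empty line independently.
events = [json.loads(line) for line in body.splitlines() if line.strip()]
print(len(events))  # 3
```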
