Splunk’s REST API lets you trigger and manage searches programmatically, but its real magic is in how it orchestrates distributed search execution across your indexers.

Let’s watch a search job come to life. Imagine you’ve got a Splunk cluster with a search head (SH) and a few indexers (IDX1, IDX2). You send a request to the SH’s /services/search/jobs endpoint to run index=_internal | head 10.

curl -k -u admin:changeme \
     -X POST \
     --data-urlencode "search=search index=_internal | head 10" \
     https://your_splunk_sh:8089/services/search/jobs

The SH receives this request, but it doesn’t run the search itself. Instead, it acts as a conductor: it breaks your search into parts that can run in parallel. For index=_internal | head 10, it might ask IDX1 for the first 10 matching events it holds and IDX2 for the same, then merge those partial results and keep only the global first 10. This is where the distributed magic happens; Splunk dispatches search commands to the indexers that hold the relevant data.
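That scatter-gather flow can be sketched in Python. Everything here is an illustrative stand-in, not a Splunk API: each "indexer" returns its own newest-first head-N slice, and the "search head" merges them and keeps the global head N.

```python
import heapq

def indexer_head(events, n):
    # Each indexer computes its own newest-first head-N slice (map phase).
    return sorted(events, key=lambda e: e["_time"], reverse=True)[:n]

def search_head_merge(partials, n):
    # The SH merges the already-sorted partials and keeps the global head N.
    merged = heapq.merge(*partials, key=lambda e: e["_time"], reverse=True)
    return list(merged)[:n]

# Fabricated events: _time is an epoch-style timestamp.
idx1 = [{"_time": t, "host": "idx1"} for t in (100, 90, 80)]
idx2 = [{"_time": t, "host": "idx2"} for t in (95, 85, 75)]

partials = [indexer_head(idx1, 2), indexer_head(idx2, 2)]
print(search_head_merge(partials, 2))  # the two newest events overall
```

The point of the sketch: no indexer needs to see the others’ data, and the SH only ever handles small partial results.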

The SH then waits for results to stream back from the indexers. It collects these partial results, merges them, and presents them as a single, coherent output. This merging is crucial. For a command like stats count by host, the SH might ask IDX1 for its local counts and IDX2 for its local counts, then combine those counts to give you the global result.
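A minimal sketch of that SH-side merge for stats count by host, with made-up per-indexer partial counts:

```python
from collections import Counter

def merge_host_counts(partials):
    # SH-side reduce: sum the per-host counts streamed back from each indexer.
    total = Counter()
    for counts in partials:
        total.update(counts)
    return dict(total)

# Fabricated partial results from two indexers.
idx1_counts = {"web-01": 40, "db-01": 5}
idx2_counts = {"web-01": 12, "app-02": 7}

print(merge_host_counts([idx1_counts, idx2_counts]))
# {'web-01': 52, 'db-01': 5, 'app-02': 7}
```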

The key levers you control are within the POST request itself:

  • search: The actual SPL query. It must begin with a generating command, so ad-hoc queries need a leading search keyword (search index=_internal | head 10).
  • earliest_time, latest_time: Define the time window for your search. You can use epoch timestamps (e.g., 1678886400) or relative time modifiers (e.g., -1h).
  • exec_mode: normal (asynchronous, the default), blocking (the POST doesn’t return until the job is done), or oneshot (returns results directly, with no job to poll).
  • app, owner: The namespace the job runs in. These are normally set through the URL path (/servicesNS/{owner}/{app}/search/jobs) rather than the request body.
  • adhoc_search_level: fast, smart, or verbose; controls how much field extraction and event data the job retains.
  • search_mode: normal (the default) or realtime.
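Putting the levers together, here’s a sketch of assembling the form body for that POST. The helper name is my own invention; the prefixing logic reflects Splunk’s requirement that the search parameter begin with a generating command.

```python
from urllib.parse import urlencode

def build_search_job_body(spl, **params):
    # Splunk's search parameter must start with a generating command,
    # so bare queries get a leading "search " keyword.
    if not spl.lstrip().startswith(("search ", "|")):
        spl = "search " + spl
    body = {"search": spl}
    body.update(params)           # earliest_time, exec_mode, etc.
    return urlencode(body)        # form-encode, as curl --data-urlencode would

body = build_search_job_body(
    "index=_internal | head 10",
    earliest_time="-1h",
    latest_time="now",
    output_mode="json",
)
print(body)
```

In real use you’d POST this body to https://your_splunk_sh:8089/services/search/jobs with your credentials.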

Consider a search that uses a lookup file. The SH packages the lookup, along with the other knowledge objects the search needs, into a knowledge bundle and replicates that bundle to the indexers. This ensures that even though the indexers don’t permanently store the lookup, they have access to it for the duration of the search job.

The response to that initial POST contains a search ID (sid), and the sid is your control handle. Poll /services/search/jobs/{sid} to check the job’s dispatchState (values include QUEUED, PARSING, RUNNING, FINALIZING, DONE, and FAILED). Once it’s DONE, retrieve the results from /services/search/jobs/{sid}/results. A DELETE request to the job endpoint cancels the job and removes its artifacts.
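A typical polling loop looks like this sketch. Here get_state stands in for an HTTP GET of the job endpoint (in real use it would return the job’s dispatchState field), and the simulated state sequence is fabricated:

```python
import time

def wait_for_job(get_state, poll_interval=0.0, timeout=60.0):
    # Poll a job-state callable until the job reaches a terminal state.
    terminal = {"DONE", "FAILED"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state()
        if state in terminal:
            return state
        time.sleep(poll_interval)
    raise TimeoutError("search job did not finish in time")

# Simulated job: two in-flight polls, then completion.
states = iter(["QUEUED", "RUNNING", "DONE"])
print(wait_for_job(lambda: next(states)))  # DONE
```

In production you’d use a non-zero poll_interval (a second or two) to avoid hammering the management port.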

When you use output_mode=json in your results request, you’re not just getting raw events. You get a structured JSON object that also carries the list of extracted fields, a preview flag, and any search messages. Job-level metadata such as resultCount and runDuration lives on the job status endpoint rather than in the results payload. This metadata is often overlooked, but it’s essential for understanding the performance and scope of your search.
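For example, a trimmed, hand-written payload in the JSON results shape can be picked apart like this (the hosts and counts are fabricated):

```python
import json

# A hand-written example in the shape of /results?output_mode=json.
payload = json.loads("""
{
  "preview": false,
  "fields": [{"name": "host"}, {"name": "count"}],
  "results": [
    {"host": "web-01", "count": "52"},
    {"host": "db-01", "count": "5"}
  ]
}
""")

field_names = [f["name"] for f in payload["fields"]]
print(field_names)               # which fields the search extracted
print(len(payload["results"]))   # rows returned in this page
```

Note that field values come back as strings; numeric comparisons need an explicit conversion on the client side.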

The most surprising thing is how Splunk handles complex, multi-stage searches. Take index=_internal | stats count by host | sort -count. The SH doesn’t ship the whole pipeline to the indexers. It breaks it down: the distributable stats count by host phase runs on each indexer, the SH collects and merges those partial counts, and sort -count runs centrally on the merged result. The dispatching logic is sophisticated enough to push as much work as possible to the indexers while respecting the dependencies between search stages.
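That two-phase split can be sketched as a map/reduce in Python; the event data and function names are illustrative, not Splunk internals:

```python
from collections import Counter

def indexer_prestats(events):
    # Map phase, run on each indexer: partial count-by-host over local events.
    return Counter(e["host"] for e in events)

def search_head_finalize(partials):
    # Reduce phase, run on the SH: merge the partials, then apply sort -count.
    total = Counter()
    for p in partials:
        total += p
    return sorted(total.items(), key=lambda kv: kv[1], reverse=True)

# Fabricated local event sets on two indexers.
idx1_events = [{"host": "web-01"}] * 3 + [{"host": "db-01"}]
idx2_events = [{"host": "web-01"}] * 2 + [{"host": "app-02"}] * 4

partials = [indexer_prestats(idx1_events), indexer_prestats(idx2_events)]
print(search_head_finalize(partials))
# [('web-01', 5), ('app-02', 4), ('db-01', 1)]
```

The sort has to wait for the merged totals, which is exactly the stage dependency that keeps it on the SH.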

The next rabbit hole is understanding how Splunk optimizes search dispatching for different SPL commands and how to tune those optimizations.

Want structured learning?

Take the full Splunk course →