Elasticsearch’s kNN search isn’t just about finding similar items; it maps discrete data points into a continuous vector space where geometric proximity stands in for relevance.
Let’s see this in action. Imagine you have a dataset of product descriptions and you want to find similar products.
First, we need to index our data with a dense_vector field.
PUT /products
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text"
      },
      "description_vector": {
        "type": "dense_vector",
        "dims": 768,             // The dimensionality of our vectors
        "index": true,
        "similarity": "cosine"   // Or "dot_product", "l2_norm"
      }
    }
  }
}
Now, let’s add some documents. We’ll use a hypothetical sentence embedding model to generate the description_vector.
POST /products/_doc
{
  "description": "A comfortable, ergonomic office chair with lumbar support.",
  "description_vector": [0.1, 0.5, -0.2, ..., 0.8] // 768 dimensions
}

POST /products/_doc
{
  "description": "This plush armchair is perfect for relaxing in the living room.",
  "description_vector": [0.3, 0.6, -0.1, ..., 0.7] // 768 dimensions
}

POST /products/_doc
{
  "description": "A high-performance gaming keyboard with mechanical switches.",
  "description_vector": [-0.7, 0.1, 0.9, ..., -0.3] // 768 dimensions
}
To perform a kNN search, we query for a target vector and ask Elasticsearch to find the k nearest neighbors.
GET /products/_search
{
  "knn": {
    "field": "description_vector",
    "query_vector": [0.2, 0.55, -0.15, ..., 0.75], // Vector representing "a comfortable seat"
    "k": 2,
    "num_candidates": 50 // Controls the trade-off between accuracy and performance
  },
  "_source": ["description"]
}
The knn block tells Elasticsearch to compare query_vector against the description_vector field and return the top k=2 most similar documents. num_candidates is a crucial parameter for approximate kNN (ANN) search, which Elasticsearch uses by default for performance: it sets how many candidate vectors are examined on each shard before the top k are selected, so a higher value means better accuracy but a slower search.
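To build intuition for what this query computes, here is a minimal pure-Python sketch of the *exact* version of the search: score every stored vector against the query with cosine similarity and keep the top k. The function names and the toy 3-dimensional vectors (standing in for 768 dimensions) are illustrative, not part of any Elasticsearch API; ANN search approximates this same ranking without scoring every vector.

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def exact_knn(query, docs, k):
    # Score every document vector, then keep the k highest-scoring IDs.
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy corpus mirroring the three indexed products, in 3 dimensions.
docs = {
    "office_chair": [0.1, 0.5, -0.2],
    "armchair": [0.3, 0.6, -0.1],
    "keyboard": [-0.7, 0.1, 0.9],
}

print(exact_knn([0.2, 0.55, -0.15], docs, k=2))  # → ['armchair', 'office_chair']
```

With the "comfortable seat" query vector, the two chair-like vectors rank first and the keyboard (which even scores negatively) is excluded, which is exactly the ranking the kNN query above is approximating.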
Elasticsearch’s kNN search is powered by Apache Lucene’s implementation of HNSW (Hierarchical Navigable Small World) graphs for efficient approximate nearest neighbor search. HNSW builds a multi-layered graph in which each node is one of your vectors. A search starts in the sparse top layer and greedily traverses toward the query vector; when it can no longer get closer, it drops down to the next, denser layer and continues. This structure yields sub-linear search times, making it feasible to search millions of vectors. The index: true setting in the mapping tells Elasticsearch to build this HNSW graph at index time.
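The greedy step at the heart of this traversal can be sketched in a few lines. This is a toy, single-layer version with a hypothetical four-node graph and 2-D vectors, not Lucene’s actual implementation: from the current node, move to whichever neighbor is closest to the query, and stop when no neighbor improves.

```python
import math

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_search(graph, vectors, entry, query):
    """Greedy traversal within one layer: repeatedly move to the
    neighbor closest to the query until no neighbor improves."""
    current = entry
    current_dist = l2_distance(vectors[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = l2_distance(vectors[neighbor], query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:
            return current  # local minimum: no neighbor is closer
        current, current_dist = best, best_dist

# Toy layer: each node's vector, and its adjacency list.
vectors = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.5), "d": (3.0, 1.0)}
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

print(greedy_search(graph, vectors, entry="a", query=(2.9, 0.9)))  # prints d
```

In real HNSW this greedy walk runs once per layer, with the result in one layer becoming the entry point for the layer below it.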
When you set index: true on a dense_vector field, Elasticsearch doesn’t just store the vectors; it builds an HNSW graph over them. Search over this graph is approximate: it doesn’t guarantee finding the absolute nearest neighbors, but it returns very close neighbors very quickly. The similarity parameter (cosine, dot_product, l2_norm) defines the metric used to calculate how "close" two vectors are. Cosine similarity, for instance, measures the angle between vectors while ignoring their magnitude, which is often what you want for text embeddings, where the direction of the vector carries the semantic meaning.
The num_candidates parameter directly influences the quality of the HNSW graph traversal during search. A higher num_candidates means the search algorithm explores more of the graph, increasing the likelihood of finding vectors that are truly closer to the query vector, but at the cost of increased latency. Conversely, a lower num_candidates speeds up the search by limiting the exploration, but might miss some of the closest neighbors. This is the primary knob for tuning the accuracy-performance trade-off in ANN search.
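When tuning this knob, it helps to put a number on the accuracy side. A standard way is recall@k: run an exact (brute-force) search as ground truth, then measure what fraction of the true top-k neighbors the ANN search actually returned. The helper below and the document IDs in it are hypothetical, purely to show the calculation.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k neighbors that the ANN search returned."""
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)

# Suppose an exact search returned these top-3 doc IDs as ground truth...
exact = ["doc_7", "doc_2", "doc_9"]
# ...and an ANN search with a low num_candidates missed one of them.
approx = ["doc_7", "doc_2", "doc_4"]

print(recall_at_k(exact, approx))  # → 0.666... (2 of 3 true neighbors found)
```

In practice you would sweep num_candidates over a sample of real queries and plot recall@k against latency, then pick the smallest value that meets your relevance bar.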
The internal representation of the HNSW graph involves nodes, edges, and multiple layers. Each node corresponds to a document’s vector, and edges connect nodes that are close under the chosen similarity metric. When a query arrives, Elasticsearch starts from a fixed entry point in the top layer of the graph and iteratively moves to whichever neighbor is closest to the query vector. When no neighbor in the current layer improves on the current node, the search descends to the next layer and repeats. This multi-layer, greedy traversal is what makes HNSW so efficient for large-scale vector similarity search.
If you notice that your kNN searches are slower or less relevant than expected, the segment layout of the index is a common culprit. Each Lucene segment maintains its own HNSW graph, so an index fragmented into many small segments (typical after heavy indexing or significant updates) must search many small graphs and combine the results, which hurts both latency and recall. Running a _forcemerge operation on the index consolidates the segments, and with them the HNSW graphs. For example, POST /products/_forcemerge?max_num_segments=1 merges all segments into one, so searches traverse a single, complete HNSW graph. Force merging is expensive, however, and is best reserved for indices that are no longer being written to.
As your dataset grows and query performance becomes critical, the next challenge you’ll face is managing this trade-off between search latency and recall.