Vector databases don’t actually store vectors for searching. They store metadata and a pointer to the vector, which usually lives in a separate, specialized vector index structure optimized for nearest-neighbor lookup.

Let’s see how this looks with a real-time upsert. Imagine we have a products collection in our vector database, where each record has an id (UUID), a description (string), and an image_vector (a 128-dimensional float vector).

// Incoming data for upsert
{
  "records": [
    {
      "id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "fields": {
        "description": "A bright red, ergonomic office chair.",
        "image_vector": [0.1, 0.2, ..., 0.9] // 128 floats
      }
    },
    {
      "id": "f0e9d8c7-b6a5-4321-fedc-ba9876543210",
      "fields": {
        "description": "A sleek, minimalist standing desk with adjustable height.",
        "image_vector": [0.9, 0.8, ..., 0.1] // 128 floats
      }
    }
  ]
}
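Before looking under the hood, here is a minimal sketch of how a client might construct such a payload. The function name and payload shape mirror the JSON above, not any particular vendor's API; the 128-dimension check is just an illustration of client-side validation.

```python
import random
import uuid

def build_upsert_payload(items):
    """Build a batch-upsert payload shaped like the JSON example above.

    `items` is a list of (description, vector) pairs. This is a hypothetical
    helper, not a real client library.
    """
    records = []
    for description, vector in items:
        if len(vector) != 128:
            raise ValueError("image_vector must have 128 dimensions")
        records.append({
            "id": str(uuid.uuid4()),
            "fields": {"description": description, "image_vector": vector},
        })
    return {"records": records}

payload = build_upsert_payload([
    ("A bright red, ergonomic office chair.",
     [random.random() for _ in range(128)]),
    ("A sleek, minimalist standing desk with adjustable height.",
     [random.random() for _ in range(128)]),
])
print(len(payload["records"]))  # 2
```

In practice this dict would be serialized to JSON and POSTed to the database's batch-upsert endpoint.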

When this JSON hits the database’s API endpoint for a batch upsert, here’s what’s happening under the hood:

  1. Metadata Ingestion: The id and description fields are written to the primary storage engine of the database. This is typically a key-value store or a document store, optimized for transactional consistency and quick retrieval of whole documents. Each record is assigned a unique internal ID.

  2. Vector Indexing Preparation: The image_vector is extracted. For each vector, the database prepares it for insertion into its specialized vector index. This index is not the primary storage. It’s a separate data structure, often an HNSW (Hierarchical Navigable Small World) graph or an IVF (Inverted File Index), built to make Approximate Nearest Neighbor (ANN) searches lightning fast.

  3. Index Population: The prepared vectors are then added to the vector index. This is the computationally intensive part. For HNSW, it involves adding nodes to the graph and establishing connections based on vector similarity. For IVF, it involves assigning vectors to clusters. This process doesn’t necessarily happen synchronously with the metadata write. Many systems perform this in background threads or asynchronous jobs to avoid blocking the initial upsert API call.

  4. Pointer Association: Crucially, the vector index stores a pointer or a reference back to the internal ID of the record in the primary storage. When you search the vector index, you get back these internal IDs. You then use these IDs to fetch the full metadata (like description) from the primary store.
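The four steps above can be sketched in a toy implementation. Here a plain dict stands in for the primary key-value store, and a brute-force list scan stands in for a real HNSW or IVF index; what matters is the shape of the data flow — the index holds (internal ID, vector) pairs, and search results are resolved back to full records via those internal IDs.

```python
import math

class ToyVectorDB:
    """Illustration of the upsert path: primary storage for metadata plus
    a separate vector index holding pointers (internal IDs) back to it.
    A brute-force scan stands in for a real ANN index."""

    def __init__(self):
        self._primary = {}   # internal_id -> full record  (step 1)
        self._index = []     # (internal_id, vector) pairs (steps 2-4)
        self._next_id = 0

    def upsert(self, record_id, fields):
        internal_id = self._next_id           # unique internal ID (step 1)
        self._next_id += 1
        self._primary[internal_id] = {"id": record_id, **fields}
        vector = fields["image_vector"]       # extract the vector (step 2)
        self._index.append((internal_id, vector))  # populate index (steps 3-4)
        return internal_id

    def search(self, query, k=1):
        def dist(v):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
        # The index returns internal IDs of the nearest vectors...
        hits = sorted(self._index, key=lambda pair: dist(pair[1]))[:k]
        # ...which we use to fetch full metadata from primary storage.
        return [self._primary[iid] for iid, _ in hits]

db = ToyVectorDB()
db.upsert("chair-1", {"description": "red chair", "image_vector": [0.1, 0.2]})
db.upsert("desk-1", {"description": "standing desk", "image_vector": [0.9, 0.8]})
print(db.search([0.15, 0.25])[0]["description"])  # red chair
```

The two-dimensional vectors keep the example readable; a real system would use the 128 dimensions from the payload and an ANN structure instead of an exhaustive scan.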

The "batch" aspect is a performance optimization. Instead of performing steps 1-3 for each record individually, the database buffers a certain number of records or a certain amount of data. Then, it processes them together. This reduces overhead from I/O, network calls, and the initialization costs of index building operations. It might pre-allocate memory for a batch of vectors, perform index updates in larger, more efficient chunks, and commit metadata writes in a single transaction.
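The buffering behavior described above can be sketched as a small wrapper: records accumulate until a threshold is reached, then the whole batch is committed in one operation. The `flush_fn` callback here is a hypothetical stand-in for the database's batched metadata write and index update.

```python
class BatchingUpserter:
    """Sketch of upsert buffering: records accumulate until `batch_size`
    is reached, then are flushed together so per-operation overhead is
    paid once per batch instead of once per record."""

    def __init__(self, batch_size, flush_fn):
        self.batch_size = batch_size
        self.flush_fn = flush_fn   # hypothetical batch-commit callback
        self._buffer = []

    def upsert(self, record):
        self._buffer.append(record)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self._buffer:
            self.flush_fn(self._buffer)  # one commit for the whole batch
            self._buffer = []

flushes = []
upserter = BatchingUpserter(batch_size=3, flush_fn=flushes.append)
for i in range(7):
    upserter.upsert({"id": i})
upserter.flush()  # drain the remainder
print([len(batch) for batch in flushes])  # [3, 3, 1]
```

Real systems typically also flush on a timer or a byte-size limit, so a trickle of small writes doesn't sit in the buffer indefinitely.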

The primary problem this design solves is scale. You don’t want to linearly scan every vector on every query; you want to find similar vectors quickly. The vector index is a specialized data structure for exactly that problem, decoupled from the general-purpose storage for your data.

The most surprising thing about high-throughput vector upserts is that the latency of the API call is often dominated by the metadata write, not the vector indexing. The vector index is usually built asynchronously in the background, allowing the upsert request to return quickly even though the vector is not immediately searchable. The database prioritizes getting the data safely stored first, then optimizing it for search.
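This write path can be sketched with a background thread: the upsert call returns as soon as metadata is stored, while indexing finishes later, so there is a brief window where a record is durable but not yet searchable. The class and the simulated index-build delay are illustrative assumptions, not any vendor's implementation.

```python
import threading
import time

class AsyncIndexingDB:
    """Sketch of an async-indexing write path: metadata is written
    synchronously, vector indexing runs on a background thread."""

    def __init__(self):
        self.metadata = {}
        self.index = {}
        self._lock = threading.Lock()

    def upsert(self, record_id, fields):
        self.metadata[record_id] = fields  # synchronous: data is safely stored
        worker = threading.Thread(
            target=self._index_vector,
            args=(record_id, fields["image_vector"]),
        )
        worker.start()                     # asynchronous: searchability lags
        return worker

    def _index_vector(self, record_id, vector):
        time.sleep(0.05)                   # simulate index-build cost
        with self._lock:
            self.index[record_id] = vector

db = AsyncIndexingDB()
worker = db.upsert("chair-1", {"description": "red chair",
                               "image_vector": [0.1, 0.2]})
print("chair-1" in db.metadata)  # True -- stored immediately
worker.join()
print("chair-1" in db.index)     # True -- searchable once indexing finishes
```

The gap between the two prints is the window in which a freshly upserted record exists but won't appear in search results.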

If you’re seeing slow upserts, it’s usually because your primary metadata store is overloaded, your network is saturated with small writes, or the batching mechanism isn’t configured optimally (e.g., the batch size is too small, leading to too many individual operations).
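Some rough arithmetic shows why a too-small batch size hurts: a fixed per-call overhead gets multiplied across every round trip. The 2 ms overhead figure below is an illustrative assumption, not a measured number.

```python
def total_overhead_ms(num_records, batch_size, per_call_overhead_ms=2.0):
    """Total fixed overhead paid across all batch-upsert calls.

    Assumes each API call costs `per_call_overhead_ms` regardless of
    batch size (an illustrative simplification)."""
    calls = -(-num_records // batch_size)  # ceiling division
    return calls * per_call_overhead_ms

print(total_overhead_ms(10_000, 10))   # 2000.0 -- 1000 calls
print(total_overhead_ms(10_000, 500))  # 40.0   -- 20 calls
```

Same data, same per-call cost: a 50x larger batch cuts the fixed overhead by 50x. The trade-off is memory pressure and a longer window before any given record is committed.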

Once your upserts are fast, the next challenge you’ll face is query latency under heavy load.

Want structured learning?

Take the full Vector-databases course →