Vitess’s read-write splitting automatically directs read traffic to replica instances, offloading your primary database.
Let’s see it in action. Imagine you have a Vitess cluster managing a commerce keyspace with a primary and two replicas. Your application is issuing SELECT queries. When vtgate is configured to serve reads from replicas (the serving tablet type is configurable), it routes these SELECT statements to the replica instances.
{
  "topology": {
    "cell1": {
      "keyspaces": {
        "commerce": {
          "shards": {
            "-80": {
              "primary": "zone1-vt-tablet-0000000100.zk.local:15999",
              "replicas": [
                "zone1-vt-tablet-0000000101.zk.local:15999",
                "zone1-vt-tablet-0000000102.zk.local:15999"
              ]
            }
          }
        }
      }
    }
  }
}
This JSON snippet, representing a simplified Vitess topology, shows a single shard -80 for the commerce keyspace. It clearly identifies the primary tablet and lists two replica tablets. When a SELECT query arrives at a vtgate process, vtgate consults this topology information. If the query is a read operation (like SELECT), it will choose one of the replica tablets (e.g., zone1-vt-tablet-0000000101.zk.local:15999) to execute it. Write operations (INSERT, UPDATE, DELETE) are always sent to the primary.
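To make the routing decision concrete, here is a minimal Python sketch of the logic just described. It is an illustrative model, not Vitess’s actual Go implementation, and classifying a query as a read by its SELECT prefix is a deliberate simplification.

```python
# Illustrative model of vtgate-style routing over the simplified topology above.
# NOT Vitess's real code: read detection and replica choice are simplified.

TOPOLOGY = {
    "commerce": {
        "-80": {
            "primary": "zone1-vt-tablet-0000000100.zk.local:15999",
            "replicas": [
                "zone1-vt-tablet-0000000101.zk.local:15999",
                "zone1-vt-tablet-0000000102.zk.local:15999",
            ],
        }
    }
}

def route(keyspace: str, shard: str, sql: str) -> str:
    """Send reads to a replica when one exists; everything else to the primary."""
    shard_info = TOPOLOGY[keyspace][shard]
    is_read = sql.lstrip().upper().startswith("SELECT")
    if is_read and shard_info["replicas"]:
        # Simplest possible choice: the first replica.
        return shard_info["replicas"][0]
    return shard_info["primary"]

print(route("commerce", "-80", "SELECT * FROM users WHERE user_id = 1"))
# -> zone1-vt-tablet-0000000101.zk.local:15999 (a replica)
print(route("commerce", "-80", "UPDATE users SET name = 'x' WHERE user_id = 1"))
# -> zone1-vt-tablet-0000000100.zk.local:15999 (the primary)
```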
The core problem Vitess solves here is database overload. In a traditional setup, all traffic hits the single primary instance, creating a performance bottleneck as read volume grows. By sending reads to replicas, you distribute the load. Replicas typically lag slightly behind the primary, which is acceptable for many read-heavy workloads where strict real-time consistency isn’t critical. The replication itself is standard MySQL binlog replication from primary to replicas; Vitess monitors each replica’s health and lag through vttablet health checks and stops routing reads to replicas that fall too far behind.
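A small sketch of that health filtering, under assumptions: the field names and the 30-second threshold here are hypothetical stand-ins for the lag reporting that vttablet health checks provide.

```python
# Illustrative sketch: exclude replicas whose reported replication lag
# exceeds a serving threshold, the way vtgate's health checks drop
# unhealthy tablets from the serving pool. Field names and the 30s
# threshold are hypothetical.

MAX_LAG_SECONDS = 30  # hypothetical serving threshold

def serving_replicas(replicas: list[dict]) -> list[str]:
    """Keep only replicas healthy enough to serve reads."""
    return [
        r["address"]
        for r in replicas
        if r["healthy"] and r["lag_seconds"] <= MAX_LAG_SECONDS
    ]

replicas = [
    {"address": "zone1-vt-tablet-0000000101.zk.local:15999", "healthy": True, "lag_seconds": 2},
    {"address": "zone1-vt-tablet-0000000102.zk.local:15999", "healthy": True, "lag_seconds": 95},
]
print(serving_replicas(replicas))  # only the tablet with 2s of lag remains
```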
The key levers you control are primarily your vschema (Vitess Schema) and the configuration of your vtgate processes. Read-write splitting works for both sharded and unsharded keyspaces; what the vschema adds is the sharding layout — how your data is partitioned and which vindexes map rows to shards. Vitess uses this to determine which shard, and therefore which tablets, should receive a query.
For example, consider a vschema.json snippet for our commerce keyspace:
{
  "sharded": true,
  "vindexes": {
    "user_id_vdx": {
      "type": "hash"
    }
  },
  "tables": {
    "users": {
      "column_vindexes": [
        {
          "column": "user_id",
          "name": "user_id_vdx"
        }
      ]
    },
    "orders": {
      "column_vindexes": [
        {
          "column": "user_id",
          "name": "user_id_vdx"
        }
      ]
    }
  }
}
This vschema indicates that commerce is sharded and that user_id, shared by the users and orders tables, is the sharding key: the vindex maps each user_id to a keyspace id, which determines the owning shard. Queries filtering on user_id are routed to that shard, and if the query is a read and the shard has healthy replicas, it is served from one of them.
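The mapping from column value to shard can be sketched as follows. Vitess’s hash vindex uses its own hash function; sha256 is used here only as a stand-in to show the shape of the computation: value → keyspace id → shard whose range covers that id.

```python
import hashlib

# Conceptual sketch of vindex-based shard selection. sha256 is a stand-in
# for Vitess's actual hash vindex function; the range logic is the real
# idea: shard "-80" owns keyspace ids whose first byte is below 0x80.

SHARDS = ["-80", "80-"]  # each shard owns a contiguous keyspace-id range

def keyspace_id(user_id: int) -> bytes:
    """Map a user_id to an 8-byte keyspace id (stand-in hash)."""
    return hashlib.sha256(str(user_id).encode()).digest()[:8]

def shard_for(user_id: int) -> str:
    """Pick the shard whose range covers this keyspace id."""
    return "-80" if keyspace_id(user_id)[0] < 0x80 else "80-"

print(shard_for(42))  # deterministically one of "-80" or "80-"
```

Because both users and orders hash the same user_id column, a user’s rows in both tables land on the same shard, which keeps single-user queries single-shard.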
The magic happens within vtgate. When vtgate receives a query, it first determines which shard(s) the query must touch, based on the vschema and any provided bind variables (e.g., a WHERE user_id = ? clause). Once the shard is identified, vtgate consults the topology for that shard. If the query is a SELECT and the shard has replica tablets registered, vtgate picks one of them, spreading read load across the healthy replicas for that shard (generally preferring tablets in the local cell).
The actual selection logic in vtgate looks at the tablet types registered for a given shard. When configured for replica reads, vtgate prefers REPLICA over PRIMARY for read operations. If no REPLICA tablets are available for a shard, or if the query must read from the primary (for example, reads inside a transaction, or queries that cannot tolerate replica lag), vtgate falls back to the PRIMARY. This fallback behavior is crucial for maintaining correctness even when replicas are unavailable.
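That preference-plus-fallback behavior can be sketched in a few lines. The round-robin cycle and the class itself are illustrative; vtgate’s real picker also weighs cell locality and live health-check state.

```python
from itertools import cycle

# Sketch of replica selection with primary fallback: rotate through the
# shard's replicas for reads, falling back to the primary when no replica
# is available. Illustrative only, not vtgate's actual picker.

class ShardRouter:
    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self.replicas = replicas
        self._rr = cycle(replicas) if replicas else None

    def pick_for_read(self) -> str:
        """Next replica in rotation, or the primary if none exist."""
        return next(self._rr) if self._rr else self.primary

router = ShardRouter(
    primary="zone1-vt-tablet-0000000100.zk.local:15999",
    replicas=[
        "zone1-vt-tablet-0000000101.zk.local:15999",
        "zone1-vt-tablet-0000000102.zk.local:15999",
    ],
)
print(router.pick_for_read())  # alternates between the two replicas
```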
What most people don’t realize is that the "read-write splitting" is not a separate service or a complex algorithmic decision; it’s a direct consequence of vtgate’s query routing logic combined with the TabletTypes it discovers in the topology. vtgate is simply instructed to send SELECT queries to REPLICA tablets if they exist and are healthy, and INSERT/UPDATE/DELETE queries to PRIMARY tablets. The system is designed so that if REPLICA tablets are available, they are preferred for reads.
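Applications can also steer this routing explicitly: Vitess lets a client pin a session to a tablet type by suffixing the keyspace name with the target type, such as `commerce@replica` or `commerce@primary`.

```sql
-- Serve this session's reads from replica tablets
USE `commerce@replica`;
SELECT * FROM users WHERE user_id = 1;

-- Switch back to the primary for reads that cannot tolerate lag
USE `commerce@primary`;
SELECT * FROM users WHERE user_id = 1;
```

This is useful when one application mixes lag-tolerant reads with reads that must see the latest writes.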
The next step is understanding how to configure and monitor replica lag.