The most surprising thing about Vitess cross-shard transactions is that they often don’t use the two-phase commit (2PC) protocol you’re probably thinking of.

Let’s see Vitess in action. Imagine you have two unsharded keyspaces, commerce and products. You want to update an order in commerce and decrement the stock in products atomically.

-- Transaction started on a VTGate
BEGIN;

-- First statement, targets commerce keyspace
UPDATE commerce.orders SET status = 'shipped' WHERE order_id = 12345;

-- Second statement, targets products keyspace
UPDATE products.inventory SET stock = stock - 1 WHERE product_id = 67890 AND stock > 0;

-- If both succeed, COMMIT;

Vitess intercepts this. The VTGate, acting as the transaction coordinator, doesn’t immediately send these statements to the VReplication workers. Instead, it first records the intent of the transaction. For each statement, it identifies the target keyspace and table, and the specific row(s) being modified.

Here’s the crucial part: Vitess doesn’t typically initiate a full-blown distributed 2PC across all involved shards. That would be incredibly slow and complex. Instead, it uses a technique often called "optimistic concurrency control" with a "commit-wait" mechanism.

When you issue COMMIT;, the VTGate does the following:

  1. Per-Shard Commits: It sends the COMMIT command to each involved shard (VReplication worker) independently. Each shard attempts to commit its portion of the transaction locally.
  2. Write-Ahead Log (WAL) / Transaction Log: Crucially, before any shard commits locally, it writes the transaction details to its own durable transaction log. This log acts as the source of truth for recovery.
  3. Coordinator Waits: The VTGate waits for confirmation from all involved shards that they have successfully committed their local changes and durably logged them.
  4. Global Commit Record: Once the VTGate receives positive acknowledgments from all shards, it marks the entire distributed transaction as globally committed. This global commit record is also durably stored.

If any shard fails after durably logging its changes but before acknowledging success to the VTGate, the VTGate will detect the failure. At this point, the transaction is considered globally aborted from the perspective of the application. However, the changes on the successful shards are not automatically rolled back.

This is where the "atomic commit" comes into play, but it’s achieved through a recovery process, not a pre-commit phase. If a VTGate or VReplication worker crashes during the commit process:

  • Recovery Process: Upon restart, each VReplication worker consults its local transaction log.
  • Global State Check: It then queries the VTGate (or a distributed consensus system like etcd/Zookeeper if configured for global transaction coordination) to determine the final global state of the transaction it was part of.
  • Apply or Rollback: If the global state is "committed," the worker applies its logged changes. If the global state is "aborted," the worker rolls back any uncommitted local changes.

This system ensures atomicity by making the final decision (commit or abort) only after all participants have durably recorded their intent, and then enforcing that decision during recovery. It’s "optimistic" because it assumes most transactions will succeed and avoids the blocking nature of traditional 2PC. The "commit-wait" is the mechanism that ensures all participants agree before the VTGate declares the transaction globally successful.

The most common point of confusion is how Vitess guarantees atomicity when shards commit independently. The answer lies in the persistent transaction log on each shard and the subsequent recovery mechanism that enforces a globally agreed-upon outcome.

The next concept you’ll likely encounter is how Vitess handles transaction conflicts and deadlocks in this optimistic environment, especially when multiple transactions attempt to modify the same data concurrently.

Want structured learning?

Take the full Vitess course →