Vitess online DDL lets you perform schema changes on your MySQL databases without taking locks that block writes.
This is what it looks like in practice. Imagine you have a users table and need to add an index.
-- Original table
CREATE TABLE users (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(255),
  email VARCHAR(255)
);
-- Let's add an index on the email column using Vitess online DDL
-- You'd typically run this via vtctlclient or the API
-- Example command (simplified for illustration; --ddl_strategy selects online DDL, and the keyspace is the last argument):
-- vtctlclient --server <vtctld_host>:<vtctld_port> ApplySchema -- --ddl_strategy "vitess" --sql "ALTER TABLE users ADD INDEX idx_email (email)" <your_keyspace>
When you issue an ALTER TABLE statement through Vitess with an online DDL strategy such as vitess, it doesn’t immediately run the SQL against your MySQL instances. (With the default direct strategy, the statement is simply passed to MySQL as-is.) Instead, Vitess queues a migration and orchestrates a multi-step process: it creates a shadow table with the desired schema, copies the existing rows from the original table into it, and finally swaps the two tables. During the copy, writes that land on the original table are also applied to the shadow table so the two converge. Vitess’s native strategy does this with VReplication, which reads changes from the binary log; triggers are how pt-online-schema-change solves the same problem, not how the native strategy works. The whole process runs in the background, and your application can keep reading and writing to the users table throughout. Once the shadow table has caught up, Vitess performs a brief atomic cut-over that renames it into place.
The core problem Vitess online DDL solves is the disruption traditionally associated with schema changes on large, high-traffic MySQL databases. Depending on the operation and the MySQL version, a standard ALTER TABLE may rebuild the entire table and block writes while it runs, and queries can pile up behind its metadata lock; on large datasets this can last hours. This is unacceptable for most modern applications. Vitess avoids the problem by doing the heavy lifting (table creation, data copy) on a separate shadow table, so the live table is only read from and is not structurally changed until the final, atomic cut-over.
Internally, online DDL is scheduled and executed by the primary tablet (vttablet) of each shard, which works against its own MySQL instance. With the native vitess strategy, the tablet first creates a shadow table with the new schema. It then starts a VReplication stream that copies the existing rows from the original table into the shadow table. To keep the two consistent during the copy, the stream follows the binary log and applies the INSERT, UPDATE, and DELETE operations made on the original table to the shadow table; no triggers are installed. Once the copy is complete and the shadow table has caught up, the tablet performs the cut-over: it briefly holds off writes and atomically renames the tables. The rename is a metadata-only operation and is nearly instantaneous, so the window in which queries wait is short.
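To make the copy-and-swap flow described above concrete, here is a hand-written MySQL sketch of its three phases. It is only an illustration and not what Vitess executes: the table names _users_shadow and _users_old and the id range are hypothetical, and the sketch leaves out change capture entirely, so running it against a table that is receiving writes would lose data.

```sql
-- 1. Create a shadow table with the same definition, then apply the change to it.
CREATE TABLE _users_shadow LIKE users;
ALTER TABLE _users_shadow ADD INDEX idx_email (email);

-- 2. Copy existing rows in primary-key ranges so no single statement runs for long.
INSERT INTO _users_shadow (id, name, email)
SELECT id, name, email FROM users WHERE id > 0 AND id <= 10000;
-- ... repeat for the following id ranges until every row is copied ...

-- 3. Cut over: MySQL swaps both names in one atomic statement.
RENAME TABLE users TO _users_old, _users_shadow TO users;
```

The part this sketch omits, keeping the shadow table in step with concurrent writes, is exactly what an online DDL system automates.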
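The key levers you control are the ALTER TABLE statements themselves and the DDL strategy under which they run. The strategy can be set per request (the ddl strategy flag of ApplySchema), per session (SET @@ddl_strategy), or as a vtgate-wide default (the --ddl_strategy flag, which defaults to direct). Vitess has also offered gh-ost and pt-online-schema-change (pt-osc) as underlying tools, though the native vitess strategy is preferred, and support for the external tools varies by release, so check what your version provides. The strategy string can carry additional options, for example --postpone-completion to hold a migration just before cut-over until you explicitly complete it.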
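One way to choose the strategy is per session. The following is a minimal sketch, assuming you are connected to vtgate with a MySQL client and that the vitess strategy is available in your release:

```sql
-- Route subsequent DDL in this session through the native online DDL strategy.
SET @@ddl_strategy = 'vitess';

-- Returns a migration UUID right away instead of blocking until the change is done.
ALTER TABLE users ADD INDEX idx_email (email);

-- Switch back so later DDL in this session runs directly against MySQL again.
SET @@ddl_strategy = 'direct';
```

Keep the returned UUID: it is the handle you use to monitor or control the migration afterwards.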
The ALTER TABLE statement itself is the primary interface. For example, to add a TIMESTAMP column:
-- Applying a new column
-- vtctlclient --server <vtctld_host>:<vtctld_port> ApplySchema -- --ddl_strategy "vitess" --sql "ALTER TABLE users ADD COLUMN created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP" <your_keyspace>
Or to change a column’s type, which is a more complex operation that Vitess handles gracefully:
-- Changing column type (Vitess handles the data migration)
-- vtctlclient --server <vtctld_host>:<vtctld_port> ApplySchema -- --ddl_strategy "vitess" --sql "ALTER TABLE users MODIFY COLUMN name VARCHAR(512)" <your_keyspace>
Vitess’s online DDL system is not merely a wrapper around existing tools; it’s a deeply integrated component that understands Vitess’s sharding and replication topology. A migration is submitted once, at the keyspace level, and Vitess dispatches it to every shard under the same migration UUID. Each shard’s primary then runs the migration on its own schedule, so shards may finish at different times, but you track and control the change as a single unit rather than instance by instance. This is crucial for keeping the schema consistent and the application available in a distributed database system.
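One critical aspect often overlooked is how Vitess manages the lifecycle of these online DDL operations. If a DDL job fails mid-way, Vitess doesn’t leave your schema in an inconsistent state: until cut-over all the work happens on the shadow table, so a failed or cancelled migration leaves the original table untouched and the shadow table is simply cleaned up. Each migration’s state (queued, ready, running, complete, failed, or cancelled) is tracked on the shard primaries in the _vt.schema_migrations table and is visible through SHOW VITESS_MIGRATIONS. A failed migration can be retried, a pending or running one can be cancelled, and a completed one can be undone with REVERT VITESS_MIGRATION, which for the native strategy reuses the same VReplication and shadow table mechanism to restore the previous schema.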
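The lifecycle operations above map onto a small set of SQL statements you can issue through vtgate. This is a sketch rather than a complete reference; <migration_uuid> is a placeholder for the UUID returned when the migration was submitted.

```sql
-- List migrations; through vtgate you get one row per shard.
SHOW VITESS_MIGRATIONS;

-- Inspect a single migration by its UUID.
SHOW VITESS_MIGRATIONS LIKE '<migration_uuid>';

-- Cancel a migration that is still pending or running.
ALTER VITESS_MIGRATION '<migration_uuid>' CANCEL;

-- Retry a migration that failed or was cancelled.
ALTER VITESS_MIGRATION '<migration_uuid>' RETRY;

-- Undo a completed migration; this is itself a new migration with its own UUID.
REVERT VITESS_MIGRATION '<migration_uuid>';
```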
The next concept to explore is how Vitess handles more complex schema evolutions, such as dropping columns or changing primary keys, and the implications of these on application code.