Vitess Aggregation Pushdown is a feature that can dramatically speed up queries involving aggregate functions (like COUNT, SUM, AVG) across multiple shards by pushing those calculations down to the individual shards before combining the results.
Let’s see Vitess aggregation pushdown in action. Imagine you have a users table sharded by user_id, and you want to count how many users are in each country. Without pushdown, Vitess would fetch all user records from all shards to the vtgateserver, and then perform the COUNT(DISTINCT country) operation. This is incredibly inefficient for large datasets.
Here’s a simplified vtgate log snippet illustrating what happens without aggregation pushdown for SELECT country, COUNT(*) FROM users GROUP BY country:
vtgate(1) ... query="SELECT country, COUNT(*) FROM users GROUP BY country"
vtgate(1) ... dispatching to shard shard1: SELECT country, COUNT(*) FROM users GROUP BY country
vtgate(1) ... dispatching to shard shard2: SELECT country, COUNT(*) FROM users GROUP BY country
vtgate(1) ... received results from shard1
vtgate(1) ... received results from shard2
vtgate(1) ... merging results: COUNT(*) from shard1 + COUNT(*) from shard2
Notice how the GROUP BY and COUNT(*) are executed on each shard. This is the core idea of pushdown. With aggregation pushdown enabled, Vitess is smart enough to realize that COUNT(*) can be computed independently on each shard, and only the counts need to be sent back to vtgate for a final summation.
The transformed vtgate log with pushdown enabled would look more like this:
vtgate(1) ... query="SELECT country, COUNT(*) FROM users GROUP BY country"
vtgate(1) ... dispatching to shard1: SELECT country, COUNT(*) FROM users GROUP BY country
vtgate(1) ... dispatching to shard2: SELECT country, COUNT(*) FROM users GROUP BY country
vtgate(1) ... received partial counts from shard1: { 'USA': 1000, 'CAN': 500 }
vtgate(1) ... received partial counts from shard2: { 'USA': 1500, 'MEX': 300 }
vtgate(1) ... merging partial counts: { 'USA': 1000 + 1500, 'CAN': 500, 'MEX': 300 }
vtgate(1) ... final result: { 'USA': 2500, 'CAN': 500, 'MEX': 300 }
This is a massive win. Instead of transferring millions of rows, Vitess only transfers a few aggregated rows per shard.
The problem Vitess aggregation pushdown solves is the scalability bottleneck of performing operations on a centralized server when the data is distributed. Traditional sharded databases often struggle with cross-shard aggregations because the coordinator node becomes a hot spot for data processing and network traffic. Vitess’s approach distributes this processing burden.
Internally, Vitess’s query planner identifies aggregate queries where the GROUP BY keys are consistent across shards or the aggregate function itself is commutative and associative (like SUM, COUNT, AVG). When such a query is executed, vtgate rewrites the query. It transforms the original query into a form that can be executed on each shard, and then it defines a "merge query" that combines the results from the shards. This merge query is what vtgate executes locally after receiving partial results.
The exact levers you control are primarily through the vtgate configuration. Aggregation pushdown is generally enabled by default in modern Vitess versions, but you can influence its behavior. The key setting is vtgate.enable_aggregation_pushdown, which defaults to true. You can explicitly set this to false in your vtgate configuration file if you need to disable it for specific troubleshooting or if you encounter unexpected behavior.
# Example vtgate.yaml snippet
enable_aggregation_pushdown: true
Another crucial aspect is the vtgate version. Older versions might not support pushdown for all aggregate functions or GROUP BY clauses. Always ensure you are on a recent, stable Vitess release to benefit from the latest query optimizations.
The magic behind aggregation pushdown isn’t just about sending fewer rows; it’s about the query planner’s ability to decompose a complex query. For example, a query like SELECT COUNT(DISTINCT user_id), AVG(age) FROM users GROUP BY country might be decomposed. The COUNT(DISTINCT user_id) might be pushed down as COUNT(user_id) on each shard, and then the vtgate performs a COUNT(DISTINCT ...) on the partial counts. Similarly, AVG(age) would be calculated as SUM(age) / COUNT(*) on each shard, and then vtgate would combine SUM(total_age) and SUM(total_count) to compute the final AVG. This ability to break down and rebuild complex aggregates is what makes it so powerful.
While Vitess tries its best to push down aggregates, there are limitations. Queries involving complex subqueries, window functions, or certain types of non-deterministic functions might not be eligible for pushdown. The GROUP BY clause needs to be compatible with the sharding key, or the aggregate function must be one that Vitess knows how to merge correctly. If Vitess cannot push down an aggregation, it will fall back to the less efficient method of pulling all data to vtgate for processing.
The next concept you’ll likely encounter is how Vitess handles JOIN operations across shards, especially when combined with aggregations.