
Dynamic Partition Pruning

New in 3.0.0

Dynamic Partition Pruning (DPP) is an optimization of JOIN queries over partitioned tables whose join condition uses partition columns. The idea is to push the filter conditions from the other side of the join down to the large fact table at run time, so that partitions that cannot match are skipped and fewer rows are scanned.

The best results are expected in JOIN queries between a large fact table and a much smaller dimension table (star-schema queries).
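The idea can be illustrated with a toy model in plain Python. This is only a sketch of the concept, not how Spark implements it: the tables, the column names (`date_id`, `year`, `amount`) and the `join_with_pruning` function are all hypothetical. The selective filter is evaluated on the small dimension table first, and the surviving join keys are then used to decide which fact-table partitions are read at all.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical fact table, stored as one list of rows per partition value
# (partitioned by date_id). Each row is (date_id, amount).
FACT_PARTITIONS: Dict[int, List[Tuple[int, float]]] = {
    1: [(1, 10.0), (1, 20.0)],
    2: [(2, 5.0)],
    3: [(3, 7.5), (3, 2.5)],
}

# Hypothetical small dimension table: (date_id, year).
DIM_DATES: List[Tuple[int, int]] = [(1, 2019), (2, 2020), (3, 2020)]


def join_with_pruning(year: int) -> Tuple[List[Tuple[int, float, int]], int]:
    """Join fact and dimension rows for `year`, reading only matching partitions.

    Returns the joined rows and the number of fact partitions scanned.
    """
    # Step 1: evaluate the selective filter on the small dimension table.
    years_by_key = {date_id: y for date_id, y in DIM_DATES if y == year}
    # Step 2: reuse the surviving join keys as a filter on the partition column.
    wanted: Set[int] = set(years_by_key)

    scanned = 0
    joined: List[Tuple[int, float, int]] = []
    for key, rows in FACT_PARTITIONS.items():
        if key not in wanted:
            continue  # partition pruned: its rows are never read
        scanned += 1
        joined.extend(
            (date_id, amount, years_by_key[date_id]) for date_id, amount in rows
        )
    return joined, scanned
```

In this model, the more selective the dimension filter is, the fewer fact partitions are scanned, which mirrors why star-schema queries are expected to benefit the most.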

Dynamic Partition Pruning is applied to a query in the logical optimization phase using the PartitionPruning and CleanupDynamicPruningFilters logical optimization rules.

Dynamic Partition Pruning optimization is controlled by the spark.sql.optimizer.dynamicPartitionPruning.enabled configuration property.
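As a minimal sketch, the property could be toggled per session from PySpark through the session's runtime configuration (`spark.conf`). The two helper functions below are hypothetical conveniences, not part of any Spark API, and `spark` is assumed to be a `pyspark.sql.SparkSession` created elsewhere.

```python
# Name of the configuration property, as given in the text above.
DPP_ENABLED = "spark.sql.optimizer.dynamicPartitionPruning.enabled"


def set_dynamic_partition_pruning(spark, enabled: bool) -> None:
    """Turn DPP on or off for a session (hypothetical helper, not a Spark API).

    `spark` is assumed to be a pyspark.sql.SparkSession, whose runtime
    configuration is exposed as `spark.conf`.
    """
    spark.conf.set(DPP_ENABLED, "true" if enabled else "false")


def is_dynamic_partition_pruning_enabled(spark) -> bool:
    """Read the current value of the property back from the session."""
    return spark.conf.get(DPP_ENABLED).lower() == "true"
```

With a live session, calling `set_dynamic_partition_pruning(spark, False)` before running a join and comparing the output of `explain()` with the property on and off is one way to see whether the optimization was applied to a given query.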




Last update: 2020-11-07