Adaptive Query Execution (AQE)¶
Adaptive Query Execution (aka Adaptive Query Optimisation, Adaptive Optimisation, or AQE for short) re-optimises the physical plan of a query while the query is executing, so that alternative execution plans can be chosen at runtime.
The re-optimisation is based on runtime statistics collected as the query runs.
Quoting the description of a talk by the authors of Adaptive Query Execution:
At runtime, the adaptive execution mode can change shuffle join to broadcast join if it finds the size of one table is less than the broadcast threshold. It can also handle skewed input data for join and change the partition number of the next stage to better fit the data scale. In general, adaptive execution decreases the effort involved in tuning SQL query parameters and improves the execution performance by choosing a better execution plan and parallelism at runtime.
Adaptive Query Execution is controlled by the spark.sql.adaptive.enabled configuration property. It was disabled by default in Spark 3.0 and 3.1 and has been enabled by default since Spark 3.2.0.
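The following is a minimal sketch of setting spark.sql.adaptive.enabled explicitly, both when the session is built and later at runtime. The object name, application name and local master are illustrative assumptions, not part of the original text. Setting the property explicitly keeps the behaviour independent of the default of a particular Spark version.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; only the configuration key comes from the text above.
object EnableAqe {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                           // assumption: local run for demonstration
      .appName("enable-aqe")                        // assumption: arbitrary application name
      .config("spark.sql.adaptive.enabled", "true") // turn AQE on when the session is created
      .getOrCreate()

    // The property is a runtime SQL configuration, so it can also be changed per session.
    spark.conf.set("spark.sql.adaptive.enabled", "true")

    // Read the effective value back to confirm what the session will use.
    println(spark.conf.get("spark.sql.adaptive.enabled"))

    spark.stop()
  }
}
```

The same key can be passed on the command line, e.g. with `--conf spark.sql.adaptive.enabled=true` for spark-submit or spark-shell.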
InsertAdaptiveSparkPlan Physical Optimization¶
Adaptive Query Execution is applied to a physical query plan using the InsertAdaptiveSparkPlan physical optimization.
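One way to observe the effect of this optimization is to inspect the executed plan of a query that needs a shuffle. The sketch below assumes a local session with AQE switched on; the object and method names are hypothetical. With AQE applied, the root of the executed plan should be an AdaptiveSparkPlanExec operator.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec
import org.apache.spark.sql.functions.col

// Hypothetical helper object used only for this illustration.
object AqePlanCheck {
  // An aggregation requires a shuffle, so the physical plan qualifies for AQE.
  def sampleQuery(spark: SparkSession): DataFrame =
    spark.range(1000).groupBy((col("id") % 10).as("bucket")).count()

  // True when the planner wrapped the physical plan in AdaptiveSparkPlanExec.
  def rootIsAdaptive(spark: SparkSession): Boolean =
    sampleQuery(spark).queryExecution.executedPlan.isInstanceOf[AdaptiveSparkPlanExec]

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                           // assumption: local run for demonstration
      .appName("aqe-plan-check")
      .config("spark.sql.adaptive.enabled", "true") // make sure AQE is on for this session
      .getOrCreate()

    println(s"Root operator is AdaptiveSparkPlanExec: ${rootIsAdaptive(spark)}")

    // explain() prints the physical plan; with AQE it is rooted at the adaptive operator.
    sampleQuery(spark).explain()

    spark.stop()
  }
}
```

Queries without a shuffle or subquery may not be wrapped, so a negative result for a trivial query does not by itself mean AQE is disabled.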
AdaptiveSparkPlanExec Physical Operator¶
Structured Streaming Not Supported¶
Adaptive Query Execution is not supported for streaming queries (Spark Structured Streaming).
Adaptive Query Execution notifies Spark listeners about a physical plan change using
Adaptive Query Execution uses logOnLevel to print out diagnostic messages to the log.
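As a hedged illustration, the level used for these messages is assumed here to come from the spark.sql.adaptive.logLevel configuration property (an internal property in Spark 3.x). The sketch raises it to INFO so that the messages are easier to spot; the object name and local master are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; the property name is an assumption about the Spark version in use.
object AqeLogLevel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")        // assumption: local run for demonstration
      .appName("aqe-log-level")
      .getOrCreate()

    // Accepted values are TRACE, DEBUG, INFO, WARN and ERROR.
    spark.conf.set("spark.sql.adaptive.logLevel", "INFO")

    // The logging backend must still allow INFO for the AQE loggers,
    // otherwise the messages are filtered out before they reach the log.
    println(spark.conf.get("spark.sql.adaptive.logLevel"))

    spark.stop()
  }
}
```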
- An adaptive execution mode for Spark SQL by Carson Wang (Intel), Yucai Yu (Intel) at Strata Data Conference in Singapore, December 7, 2017