Skip to content

CoalesceShufflePartitions Adaptive Physical Optimization

CoalesceShufflePartitions is a physical optimization in Adaptive Query Execution.

CoalesceShufflePartitions can be turned on and off using spark.sql.adaptive.coalescePartitions.enabled configuration property.

Creating Instance

CoalesceShufflePartitions takes the following to be created:

CoalesceShufflePartitions is created when:

Executing Rule

apply(
  plan: SparkPlan): SparkPlan

spark.sql.adaptive.coalescePartitions.enabled Configuration Property

With spark.sql.adaptive.coalescePartitions.enabled configuration property disabled (false), apply does nothing and simply gives the input SparkPlan back unmodified.

apply makes sure that one of the following holds or does nothing (and simply gives the input SparkPlan back unmodified):

  1. All the leaves in the query plan are QueryStageExec physical operators
  2. No CustomShuffleReaderExec physical operator in the given query plan

apply collects all ShuffleQueryStageExec leaf physical operators in the input physical query plan.

apply makes sure that the following holds or does nothing (and simply gives the input SparkPlan back unmodified):

  1. All ShuffleQueryStageExec leaf physical operators use ShuffleExchangeExec unary physical operators that have canChangeNumPartitions flag enabled

apply...FIXME

apply is part of the Rule abstraction.

updateShuffleReads

updateShuffleReads(
  plan: SparkPlan,
  specsMap: Map[Int, Seq[ShufflePartitionSpec]]): SparkPlan

updateShuffleReads...FIXME

updateShuffleReads is used when:

  • FIXME
Back to top