DemoteBroadcastHashJoin Logical Optimization
Quoting the article "What's new in Apache Spark 3.0 - demote broadcast hash join":
This rule checks whether the nodes involved in the join have a lot of empty partitions. If it's the case, the rule adds a no broadcast hash join hint to prevent the broadcast strategy to be applied.
DemoteBroadcastHashJoin uses the spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin configuration property as the threshold for demoting a broadcast hash join.
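As a rough illustration of how the threshold could be set, the following sketch configures a local session with adaptive query execution enabled. It assumes Spark SQL 3.0 is on the classpath; the `local[*]` master and the value `0.2` are illustrative choices, not recommendations.

```scala
import org.apache.spark.sql.SparkSession

// A minimal sketch, assuming Spark SQL 3.0 on the classpath.
// The master URL and the threshold value are illustrative assumptions.
val spark = SparkSession.builder()
  .master("local[*]")
  // The rule belongs to adaptive query execution, so AQE is switched on here.
  .config("spark.sql.adaptive.enabled", "true")
  // Threshold named on this page: ratio of non-empty partitions to all partitions.
  .config("spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin", "0.2")
  .getOrCreate()

// Read the value back to confirm what the session sees.
println(spark.conf.get("spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin"))
```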
DemoteBroadcastHashJoin takes the following to be created:
apply(plan: LogicalPlan): LogicalPlan
apply is part of the Rule abstraction.
shouldDemote Internal Method
shouldDemote(plan: LogicalPlan): Boolean
shouldDemote is true when the ratio of non-empty partitions to all partitions (based on the MapOutputStatistics) is below the value of the spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin configuration property.
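The ratio check described above can be sketched in plain Scala without any Spark dependency. This is an illustration only, not the rule's source code: `nonEmptyRatio` and `shouldDemoteSketch` are hypothetical names, the `Array[Long]` of per-partition byte sizes stands in for what MapOutputStatistics reports, and the empty-input guard is an assumption. The real rule works on a LogicalPlan and may apply further guards.

```scala
// Hypothetical helper: fraction of partitions that hold at least one byte.
// `bytesByPartitionId` stands in for the per-partition sizes from MapOutputStatistics.
def nonEmptyRatio(bytesByPartitionId: Array[Long]): Double =
  if (bytesByPartitionId.isEmpty) 0.0
  else bytesByPartitionId.count(_ > 0).toDouble / bytesByPartitionId.length

// Hypothetical sketch of the decision: demote when the non-empty ratio is below the threshold.
// Assumption: with no partitions at all there is nothing to judge, so do not demote.
def shouldDemoteSketch(bytesByPartitionId: Array[Long], threshold: Double): Boolean =
  bytesByPartitionId.nonEmpty && nonEmptyRatio(bytesByPartitionId) < threshold

// 1 of 10 partitions is non-empty, so the ratio is 0.1, which is below 0.2.
val mostlyEmpty = Array(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 128L)
println(shouldDemoteSketch(mostlyEmpty, 0.2)) // prints "true"

// 3 of 4 partitions are non-empty, so the ratio is 0.75, which is not below 0.2.
val mostlyFull = Array(10L, 20L, 0L, 5L)
println(shouldDemoteSketch(mostlyFull, 0.2)) // prints "false"
```

With many empty partitions, a side that looks small enough to broadcast may still be cheaper to join with a shuffle-based strategy, which is the situation the threshold is meant to detect.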
shouldDemote is used when the DemoteBroadcastHashJoin optimization is executed.