ExtractJoinWithBuckets Scala Extractor

Destructuring BaseJoinExec

unapply(
  plan: SparkPlan): Option[(BaseJoinExec, Int, Int)]

unapply makes sure that the given SparkPlan is a BaseJoinExec for which isApplicable holds.

If so, unapply requests the bucket specs (getBucketSpec) of the left and right join child operators.

With both bucket specs available, unapply returns the join operator alongside the numbers of buckets of the left and right sides. Otherwise, unapply returns None.

unapply is used when:

  * CoalesceBucketsInJoin physical optimization is executed

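The destructuring above can be sketched with simplified stand-in types (BucketSpec, Scan, Join and the describe helper are toy names for this illustration, not Spark's classes, and the isApplicable check is omitted):

```scala
// Toy model of the extractor: unapply succeeds only when both join
// children have a bucket spec, and then yields the join operator with
// the numbers of buckets of the left and right sides.
case class BucketSpec(numBuckets: Int)
case class Scan(bucketSpec: Option[BucketSpec])
case class Join(left: Scan, right: Scan)

object ExtractJoinWithBuckets {
  def unapply(j: Join): Option[(Join, Int, Int)] =
    for {
      l <- j.left.bucketSpec   // getBucketSpec on the left child
      r <- j.right.bucketSpec  // getBucketSpec on the right child
    } yield (j, l.numBuckets, r.numBuckets)
}

// Pattern matching drives the extractor.
def describe(j: Join): String = j match {
  case ExtractJoinWithBuckets(_, leftBuckets, rightBuckets) =>
    s"buckets: $leftBuckets and $rightBuckets"
  case _ => "not applicable"
}
```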
isApplicable

isApplicable(
  j: BaseJoinExec): Boolean

isApplicable is true when the following all hold:

  1. The given BaseJoinExec physical operator is either a SortMergeJoinExec or a ShuffledHashJoinExec

  2. The left side of the join has a FileSourceScanExec operator (hasScanOperation)

  3. The right side of the join has a FileSourceScanExec operator (hasScanOperation)

  4. satisfiesOutputPartitioning holds for the leftKeys and the outputPartitioning of the left join operator

  5. satisfiesOutputPartitioning holds for the rightKeys and the outputPartitioning of the right join operator
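
The join-type part of the check (condition 1) can be illustrated with a minimal sketch (the JoinExec hierarchy below is a toy stand-in for Spark's physical operators):

```scala
// Toy stand-ins for join operators; only sort-merge and shuffled-hash
// joins pass the first condition.
sealed trait JoinExec
case object SortMergeJoin extends JoinExec
case object ShuffledHashJoin extends JoinExec
case object BroadcastHashJoin extends JoinExec

def isSupportedJoinType(j: JoinExec): Boolean = j match {
  case SortMergeJoin | ShuffledHashJoin => true
  case _ => false
}
```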

hasScanOperation

hasScanOperation(
  plan: SparkPlan): Boolean

hasScanOperation holds true for SparkPlan physical operators that are FileSourceScanExecs (possibly as the children of FilterExecs and ProjectExecs).
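A minimal sketch of the recursive walk, assuming toy Plan node types rather than Spark's:

```scala
// Filter and Project nodes are looked through; any other operator
// (e.g. an exchange) stops the search.
sealed trait Plan
case object FileSourceScan extends Plan
case class Filter(child: Plan) extends Plan
case class Project(child: Plan) extends Plan
case class Exchange(child: Plan) extends Plan

def hasScanOperation(plan: Plan): Boolean = plan match {
  case FileSourceScan => true
  case Filter(child)  => hasScanOperation(child)
  case Project(child) => hasScanOperation(child)
  case _              => false
}
```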

satisfiesOutputPartitioning

satisfiesOutputPartitioning(
  keys: Seq[Expression],
  partitioning: Partitioning): Boolean

satisfiesOutputPartitioning holds true for HashPartitioning partitionings that match the given join keys (their number and equivalence).
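The check can be sketched with simplified types: plain String expressions stand in for Catalyst expressions, and plain equality for semantic equality.

```scala
sealed trait Partitioning
case class HashPartitioning(expressions: Seq[String]) extends Partitioning
case object UnknownPartitioning extends Partitioning

// True only for hash partitioning whose expressions match the join keys
// in number and (here, simplified) equality.
def satisfiesOutputPartitioning(
    keys: Seq[String],
    partitioning: Partitioning): Boolean = partitioning match {
  case HashPartitioning(exprs) =>
    exprs.length == keys.length && exprs.forall(keys.contains)
  case _ => false
}
```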

Bucket Spec of FileSourceScanExec Operator

getBucketSpec(
  plan: SparkPlan): Option[BucketSpec]

getBucketSpec finds the FileSourceScanExec operator (in the given SparkPlan) with a non-empty bucket spec but an empty optionalNumCoalescedBuckets. When found, getBucketSpec returns the non-empty bucket spec.
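The lookup can be sketched with toy node types, where Option[Int] (a number of buckets) stands in for Option[BucketSpec]:

```scala
sealed trait Node
case class FileScan(
    bucketSpec: Option[Int],                      // number of buckets, if bucketed
    optionalNumCoalescedBuckets: Option[Int]) extends Node
case class Unary(child: Node) extends Node

// Returns the bucket spec of a scan that is bucketed and whose buckets
// have not already been coalesced.
def getBucketSpec(plan: Node): Option[Int] = plan match {
  case FileScan(spec @ Some(_), None) => spec
  case Unary(child)                   => getBucketSpec(child)
  case _                              => None
}
```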


Last update: 2021-05-08