Skip to content

ShuffleQueryStageExec Adaptive Leaf Physical Operator

ShuffleQueryStageExec is a QueryStageExec with either a ShuffleExchangeExec or a ReusedExchangeExec child operators.

Creating Instance

ShuffleQueryStageExec takes the following to be created:

ShuffleQueryStageExec is created when:

ShuffleExchangeLike

shuffle: ShuffleExchangeLike

ShuffleQueryStageExec initializes the shuffle internal registry when created.

ShuffleQueryStageExec assumes that the given physical operator is either a ShuffleExchangeLike or a ReusedExchangeExec and extracts the ShuffleExchangeLike.

If not, ShuffleQueryStageExec throws an IllegalStateException:

wrong plan for shuffle stage:
[tree]

shuffle is used when:

Shuffle MapOutputStatistics Future

shuffleFuture: Future[MapOutputStatistics]

shuffleFuture requests the ShuffleExchangeLike to submit a shuffle job

Lazy Value

shuffleFuture is a Scala lazy value to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.

Learn more in the Scala Language Specification.

shuffleFuture is used when:

Materializing

doMaterialize(): Future[Any]

doMaterialize returns the Shuffle MapOutputStatistics Future.

doMaterialize is part of the QueryStageExec abstraction.

Cancelling

cancel(): Unit

cancel cancels the Shuffle MapOutputStatistics Future (unless already completed).

cancel is part of the QueryStageExec abstraction.

newReuseInstance

newReuseInstance(
  newStageId: Int,
  newOutput: Seq[Attribute]): QueryStageExec

newReuseInstance is...FIXME

newReuseInstance is part of the QueryStageExec abstraction.

MapOutputStatistics

mapStats: Option[MapOutputStatistics]

mapStats assumes that the resultOption (with the MapOutputStatistics) is already available or throws an AssertionError:

assertion failed: ShuffleQueryStageExec should already be ready

mapStats takes a MapOutputStatistics from the resultOption.

mapStats is used when:

Back to top