CollectMetricsExec Physical Operator¶
CollectMetricsExec
is a unary physical operator.
Creating Instance¶
CollectMetricsExec
takes the following to be created:
- Name
- Metric NamedExpressions
- Child physical operator
CollectMetricsExec
is created when BasicOperators execution planning strategy is executed (and plans a CollectMetrics logical operator).
Collected metrics Accumulator¶
CollectMetricsExec
registers an AggregatingAccumulator
accumulator under the name Collected metrics.
AggregatingAccumulator
is created for the metric expressions and the child physical operator's output attributes.
Executing Physical Operator¶
doExecute(): RDD[InternalRow]
doExecute
resets the Collected metrics Accumulator.
doExecute
requests the child physical operator to execute and RDD.mapPartitions
so that:
- A new per-partition
AggregatingAccumulator
(calledupdater
) is requested tocopyAndReset
- The value of the accumulator is published only when a task is completed
- For every row, the per-partition
AggregatingAccumulator
is requested to add it (that updates ImperativeAggregates and TypedImperativeAggregates)
doExecute
is part of the SparkPlan abstraction.
Last update: 2020-09-03