ColumnarToRowExec Physical Operator¶
ColumnarToRowExec is a unary physical operator for Columnar Processing.
ColumnarToRowExec supports Whole-Stage Java Code Generation.
Creating Instance¶
ColumnarToRowExec takes the following to be created:
- Child physical operator
ColumnarToRowExec requires that the child physical operator supportsColumnar.
ColumnarToRowExec is created when the ApplyColumnarRulesAndInsertTransitions physical optimization is executed.
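One way to observe the operator is to look for it in an executed physical plan. The following is a minimal sketch, not a definitive recipe: it assumes a local Spark 3.x installation and that the Parquet vectorized reader is enabled (so the file scan is expected to be columnar). The object name ColumnarToRowDemo, the transitions helper and the temporary path are made up for illustration.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.ColumnarToRowExec

// Hypothetical demo object (not part of Spark)
object ColumnarToRowDemo {

  // Collects every ColumnarToRowExec node found in the executed physical plan
  def transitions(df: DataFrame): Seq[ColumnarToRowExec] =
    df.queryExecution.executedPlan.collect { case c: ColumnarToRowExec => c }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ColumnarToRowDemo")
      .getOrCreate()

    // Assumption: a scan of a Parquet file with a single bigint column is columnar
    val path = Files.createTempDirectory("columnar-to-row").resolve("ids").toString
    spark.range(10).write.parquet(path)
    val df = spark.read.parquet(path)

    // Every transition found should sit on top of a child that supports columnar execution
    transitions(df).foreach { c2r =>
      println(s"${c2r.nodeName} over ${c2r.child.nodeName}: " +
        s"child.supportsColumnar=${c2r.child.supportsColumnar}")
    }

    spark.stop()
  }
}
```

If the scan is not columnar in a given setup (for example, with the vectorized reader disabled), transitions is expected to be empty and nothing is printed.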
Performance Metrics¶
Key | Name (in web UI) | Description |
---|---|---|
numInputBatches | number of input batches | Number of input batches |
numOutputRows | number of output rows | Number of output rows (across all input batches) |
Executing Physical Operator¶
doExecute(): RDD[InternalRow]
doExecute is part of the SparkPlan abstraction.
doExecute requests the child physical operator to executeColumnar and uses RDD.mapPartitionsInternal over the batches of every partition (Iterator[ColumnarBatch]) to "unpack" them to rows.
doExecute counts the number of batches and rows (as the numInputBatches and numOutputRows metrics).
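The "unpacking" step can be pictured without any Spark dependency. The following is a simplified sketch in plain Scala, not Spark's actual code: Batch stands in for ColumnarBatch, Int values stand in for InternalRow, and Counter stands in for a SQL metric. All three names, as well as unpack, are hypothetical.

```scala
// Hypothetical sketch object (none of these types are Spark classes)
object ColumnarToRowSketch {

  // Stand-in for ColumnarBatch: a group of rows that travel together
  final case class Batch(rows: Vector[Int])

  // Stand-in for a metric such as numInputBatches or numOutputRows
  final class Counter {
    private var n = 0L
    def add(delta: Long): Unit = n += delta
    def value: Long = n
  }

  // What the function passed to mapPartitionsInternal could look like:
  // flatten the batches of one partition into rows, counting as they go.
  def unpack(
      batches: Iterator[Batch],
      numInputBatches: Counter,
      numOutputRows: Counter): Iterator[Int] =
    batches.flatMap { batch =>
      numInputBatches.add(1)
      numOutputRows.add(batch.rows.size.toLong)
      batch.rows.iterator
    }

  def main(args: Array[String]): Unit = {
    val numInputBatches = new Counter
    val numOutputRows = new Counter
    val partition = Iterator(Batch(Vector(1, 2, 3)), Batch(Vector(4, 5)))

    // The iterator is lazy, so the counters only advance as rows are consumed
    val rows = unpack(partition, numInputBatches, numOutputRows).toList

    println(s"rows=$rows")
    println(s"numInputBatches=${numInputBatches.value}, numOutputRows=${numOutputRows.value}")
  }
}
```

Because the result is a lazy iterator, the counters reflect only the batches that a downstream consumer has actually pulled, which is why the metrics are updated inside the flatMap rather than up front.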
Generating Java Source Code for Produce Path¶
doProduce(
ctx: CodegenContext): String
doProduce is part of the CodegenSupport abstraction.
doProduce ...FIXME
Input RDDs¶
inputRDDs(): Seq[RDD[InternalRow]]
inputRDDs is a single RDD[ColumnarBatch] that the child physical operator gives when requested to executeColumnar.
inputRDDs is part of the CodegenSupport abstraction.
canCheckLimitNotReached Flag¶
canCheckLimitNotReached: Boolean
canCheckLimitNotReached is always true.
canCheckLimitNotReached is part of the CodegenSupport abstraction.
genCodeColumnVector Internal Method¶
genCodeColumnVector(
ctx: CodegenContext,
columnVar: String,
ordinal: String,
dataType: DataType,
nullable: Boolean): ExprCode
genCodeColumnVector ...FIXME
genCodeColumnVector is used when the ColumnarToRowExec physical operator is requested to generate Java source code for produce path.