

[[creating-instance]] CodegenContext takes no input parameters.

[source, scala]
----
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
val ctx = new CodegenContext
----

CodegenContext is <<creating-instance, created>> when:

  • WholeStageCodegenExec physical operator is requested to generate Java source code for the child operator (when WholeStageCodegenExec is executed)

  • CodeGenerator is requested for a new CodegenContext

  • GenerateUnsafeRowJoiner is requested for an UnsafeRowJoiner

CodegenContext stores expressions that don't support codegen.

.Example of CodegenContext.subexpressionElimination (through CodegenContext.generateExpressions)
[source, scala]
----
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
val ctx = new CodegenContext

// Use Catalyst DSL
import org.apache.spark.sql.catalyst.dsl.expressions._
val expressions = "hello".expr.as("world") :: "hello".expr.as("world") :: Nil

// FIXME Use a real-life query to extract the expressions

// CodegenContext.subexpressionElimination (where the elimination all happens) is a private method
// It is used exclusively in CodegenContext.generateExpressions which is public
// and does the elimination when it is enabled

// Note the doSubexpressionElimination flag is on
// Triggers the subexpressionElimination private method
ctx.generateExpressions(expressions, doSubexpressionElimination = true)

// subexpressionElimination private method uses ctx.equivalentExpressions
val commonExprs = ctx.equivalentExpressions.getAllEquivalentExprs

assert(commonExprs.length > 0, "No common expressions found")
----

[[internal-registries]]
.CodegenContext's Internal Properties (e.g. Registries, Counters and Flags)
[cols="1,2",options="header",width="100%"]
|===
| Name
| Description

| classFunctions | [[classFunctions]] Mutable Scala Map with function names, their Java source code and a class name

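New entries are added when CodegenContext is requested to <<addClass, addClass>> and <<addNewFunctionToClass, addNewFunctionToClass>>

Used when CodegenContext is requested to <<declareAddedFunctions, declareAddedFunctions>>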

| equivalentExpressions a| [[equivalentExpressions]] EquivalentExpressions

Expressions are added and then fetched as equivalent sets when CodegenContext is requested to <<subexpressionElimination, subexpressionElimination>> (for <<generateExpressions, generateExpressions>> with subexpression elimination enabled)

| currentVars | [[currentVars]] The list of generated columns as input of current operator

| INPUT_ROW | [[INPUT_ROW]] The variable name of the input row of the current operator

| placeHolderToComments | [[placeHolderToComments]][[getPlaceHolderToComments]]

Placeholders and their comments

Used when...FIXME

| references a| [[references]] References that are used to generate classes in the following code generators:

| subExprEliminationExprs | [[subExprEliminationExprs]] SubExprEliminationStates by Expression

Used when...FIXME

| subexprFunctions | [[subexprFunctions]] Names of the functions that...FIXME
|===

=== [[generateExpressions]] Generating Java Source Code For Code-Generated Evaluation of Multiple Expressions (With Optional Subexpression Elimination) -- generateExpressions Method

[source, scala]
----
generateExpressions(
  expressions: Seq[Expression],
  doSubexpressionElimination: Boolean = false): Seq[ExprCode]
----

(Only with subexpression elimination enabled) generateExpressions does <<subexpressionElimination, subexpressionElimination>> of the input expressions.

In the end, generateExpressions requests every expression to generate the Java source code for code-generated (non-interpreted) expression evaluation.

generateExpressions is used when:

=== [[addReferenceObj]] addReferenceObj Method

[source, scala]
----
addReferenceObj(objName: String, obj: Any, className: String = null): String
----


NOTE: addReferenceObj is used when...FIXME

=== [[subexpressionEliminationForWholeStageCodegen]] subexpressionEliminationForWholeStageCodegen Method

[source, scala]
----
subexpressionEliminationForWholeStageCodegen(expressions: Seq[Expression]): SubExprCodes
----


subexpressionEliminationForWholeStageCodegen is used when HashAggregateExec is requested to generate the Java source code for the whole-stage consume path (with or without grouping keys).

=== [[addNewFunction]] Adding Function to Generated Class -- addNewFunction Method

[source, scala]
----
addNewFunction(
  funcName: String,
  funcCode: String,
  inlineToOuterClass: Boolean = false): String
----


NOTE: addNewFunction is used when...FIXME

=== [[subexpressionElimination]] subexpressionElimination Internal Method

[source, scala]
----
subexpressionElimination(expressions: Seq[Expression]): Unit
----

subexpressionElimination requests <<equivalentExpressions, EquivalentExpressions>> to addExprTree for every expression (in the input expressions).

subexpressionElimination requests <<equivalentExpressions, EquivalentExpressions>> for the equivalent sets of expressions with at least two equivalent expressions (aka common expressions).

For every equivalent expression set, subexpressionElimination does the following:

. Takes the first expression and requests it to generate Java source code for the expression tree

. <<addNewFunction, Adds a new function>> and adds it to <<subexprFunctions, subexprFunctions>>

. Creates a SubExprEliminationState and adds it with every common expression in the equivalent expression set to <<subExprEliminationExprs, subExprEliminationExprs>>

NOTE: subexpressionElimination is used exclusively when CodegenContext is requested to <<generateExpressions, generateExpressions>> (with subexpression elimination enabled).
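The idea of collecting "common expressions" can be pictured with a toy model that does not use Spark at all. Everything in the sketch below (the `Expr` tree, `nonLeafSubtrees`, `commonSubexpressions`) is hypothetical and only stands in for Catalyst expressions and the equivalent-expression registry. It assumes that two expressions are equivalent when they are structurally equal and that leaf nodes are not worth sharing. It is a simplified illustration, not how CodegenContext is implemented.

```scala
// Toy model only: none of these types come from Spark.
object SubexprSketch {
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Col(name: String) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr
  final case class Mul(left: Expr, right: Expr) extends Expr

  // Collects the expression itself and all nested subtrees, skipping leaves
  // (an assumption of this sketch: leaves are cheap to re-evaluate).
  def nonLeafSubtrees(e: Expr): List[Expr] = e match {
    case Add(l, r) => e :: (nonLeafSubtrees(l) ++ nonLeafSubtrees(r))
    case Mul(l, r) => e :: (nonLeafSubtrees(l) ++ nonLeafSubtrees(r))
    case Lit(_) | Col(_) => Nil
  }

  // Groups structurally equal subtrees and keeps only the groups with
  // at least two members, i.e. the "common expressions".
  def commonSubexpressions(exprs: Seq[Expr]): Map[Expr, Int] =
    exprs
      .flatMap(nonLeafSubtrees)
      .groupBy(identity)
      .collect { case (e, occurrences) if occurrences.size >= 2 => (e, occurrences.size) }
}
```

For the two toy expressions `(a + b) * 2` and `(a + b) + 1`, `commonSubexpressions` returns a single entry for `a + b` with a count of 2, which is the kind of set that would then get one generated function evaluated once and reused.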

=== [[addMutableState]] Adding Mutable State -- addMutableState Method

[source, scala]
----
addMutableState(
  javaType: String,
  variableName: String,
  initFunc: String => String = _ => "",
  forceInline: Boolean = false,
  useFreshName: Boolean = true): String
----

[source, scala]
----
val input = ctx.addMutableState("scala.collection.Iterator", "input", v => s"$v = inputs[0];")
----

NOTE: addMutableState is used when...FIXME

=== [[addImmutableStateIfNotExists]] Adding Immutable State (Unless Exists Already) -- addImmutableStateIfNotExists Method

[source, scala]
----
addImmutableStateIfNotExists(
  javaType: String,
  variableName: String,
  initFunc: String => String = _ => ""): Unit
----

[source, scala]
----
val ctx: CodegenContext = ???
val partitionMaskTerm = "partitionMask"
ctx.addImmutableStateIfNotExists(ctx.JAVA_LONG, partitionMaskTerm)
----

NOTE: addImmutableStateIfNotExists is used when...FIXME

=== [[freshName]] freshName Method

[source, scala]
----
freshName(name: String): String
----


NOTE: freshName is used when...FIXME
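This page does not describe how freshName makes names unique. The sketch below is a Spark-free illustration of one plausible scheme: a counter per requested name, appended as a suffix. `NameGenerator` is a hypothetical class, and the suffix format is an assumption of the sketch rather than a statement about what CodegenContext produces.

```scala
import scala.collection.mutable

// Toy model only: NameGenerator is hypothetical and not part of Spark.
object FreshNameSketch {
  final class NameGenerator {
    // Next free id for every name requested so far
    private val nextId = mutable.Map.empty[String, Int]

    // Returns a name that this generator has not returned before
    def freshName(name: String): String = synchronized {
      val id = nextId.getOrElse(name, 0)
      nextId(name) = id + 1
      s"${name}_$id"
    }
  }
}
```

With this scheme, asking twice for `value` gives `value_0` and then `value_1`, so variables generated for different expressions cannot collide inside one generated class.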

=== [[addNewFunctionToClass]] addNewFunctionToClass Internal Method

[source, scala]
----
addNewFunctionToClass(
  funcName: String,
  funcCode: String,
  className: String): mutable.Map[String, mutable.Map[String, String]]
----


NOTE: addNewFunctionToClass is used when...FIXME
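The return type above is a nested mutable map. A minimal sketch of such a registry, assuming the outer key is a class name and the inner map goes from function name to function source code, could look as follows. `ClassFunctionsSketch`, `Registry`, `newRegistry` and `declare` are hypothetical names used for illustration only. The sketch passes the registry explicitly, while CodegenContext keeps its own internal state.

```scala
import scala.collection.mutable

// Toy model only: a stand-alone registry, not Spark's CodegenContext.
object ClassFunctionsSketch {
  // class name -> (function name -> function source code)
  type Registry = mutable.Map[String, mutable.Map[String, String]]

  def newRegistry(): Registry =
    mutable.Map.empty[String, mutable.Map[String, String]]

  // Registers the function under the given class, creating the class entry on first use
  def addNewFunctionToClass(
      registry: Registry,
      funcName: String,
      funcCode: String,
      className: String): Registry = {
    val functions = registry.getOrElseUpdate(className, mutable.Map.empty[String, String])
    functions(funcName) = funcCode
    registry
  }

  // Renders all functions of one class, sorted by name to keep the output stable
  def declare(registry: Registry, className: String): String =
    registry
      .get(className)
      .map(_.toSeq.sortBy(_._1).map(_._2).mkString("\n"))
      .getOrElse("")
}
```

Keeping functions grouped by class name is what lets a later declaration step emit each group into its own (possibly nested) class body.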

=== [[addClass]] addClass Internal Method

[source, scala]
----
addClass(className: String, classInstance: String): Unit
----


NOTE: addClass is used when...FIXME

=== [[declareAddedFunctions]] declareAddedFunctions Method

[source, scala]
----
declareAddedFunctions(): String
----


NOTE: declareAddedFunctions is used when...FIXME

=== [[declareMutableStates]] declareMutableStates Method

[source, scala]
----
declareMutableStates(): String
----


NOTE: declareMutableStates is used when...FIXME

=== [[initMutableStates]] initMutableStates Method

[source, scala]
----
initMutableStates(): String
----


NOTE: initMutableStates is used when...FIXME

=== [[initPartition]] initPartition Method

[source, scala]
----
initPartition(): String
----


NOTE: initPartition is used when...FIXME

=== [[emitExtraCode]] emitExtraCode Method

[source, scala]
----
emitExtraCode(): String
----


NOTE: emitExtraCode is used when...FIXME

=== [[addPartitionInitializationStatement]] addPartitionInitializationStatement Method

[source, scala]
----
addPartitionInitializationStatement(statement: String): Unit
----


NOTE: addPartitionInitializationStatement is used when...FIXME

Last update: 2020-11-15