
Generate Unary Logical Operator for Lateral Views

Generate is a [unary logical operator] that is created to represent the following (after a logical plan is [analyzed]):

  • [Generator] or GeneratorOuter expressions (by the ExtractGenerator logical evaluation rule)

  • SQL's [LATERAL VIEW] clause (in SELECT or FROM clauses)
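Both routes can be observed in an analyzed plan. The following is a minimal sketch for a Spark shell or a Scala REPL with Spark SQL on the classpath; the `nums` view, the column names and the `generatesIn` helper are hypothetical names chosen for illustration only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.catalyst.plans.logical.Generate
import org.apache.spark.sql.functions.{col, explode}

// Local session for exploration only (reuses an existing session if there is one)
val spark = SparkSession.builder().master("local[*]").appName("generate-demo").getOrCreate()

// Hypothetical input: two rows, each with a two-element array column
val nums = spark.range(2).selectExpr("id", "array(id, id + 10) AS ns")
nums.createOrReplaceTempView("nums")

// Route 1: a Generator expression (explode) in a select
val viaDataset = nums.select(col("id"), explode(col("ns")) as "n")

// Route 2: SQL's LATERAL VIEW clause
val viaSql = spark.sql("SELECT id, n FROM nums LATERAL VIEW explode(ns) v AS n")

// Hypothetical helper: collects the Generate operators of an analyzed plan
def generatesIn(df: DataFrame): Seq[Generate] =
  df.queryExecution.analyzed.collect { case g: Generate => g }

println(viaDataset.queryExecution.analyzed.numberedTreeString)
println(viaSql.queryExecution.analyzed.numberedTreeString)
```

In both printed trees a Generate node should appear above the relation that supplies the `ns` column.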

[[resolved]] resolved flag is...FIXME

NOTE: resolved is part of the [LogicalPlan Contract] to...FIXME.

[[producedAttributes]] producedAttributes...FIXME

[[output]] The [output schema] of a Generate is...FIXME


The Generate logical operator is resolved to the [GenerateExec] unary physical operator in the BasicOperators execution planning strategy.
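The planning step can be checked by looking for GenerateExec in the physical plan of a query that uses a generator. This is a sketch under the assumption of a local session; the query and the value names (`exploded`, `physical`, `generateExecs`) are illustrative and not part of Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.GenerateExec
import org.apache.spark.sql.functions.{col, explode}

// Local session for exploration only (reuses an existing session if there is one)
val spark = SparkSession.builder().master("local[*]").appName("generate-exec-demo").getOrCreate()

// Hypothetical query with an explode generator
val exploded = spark.range(2)
  .selectExpr("id", "array(id, id + 10) AS ns")
  .select(col("id"), explode(col("ns")) as "n")

// sparkPlan is the physical plan right after the planning strategies have run
val physical = exploded.queryExecution.sparkPlan
val generateExecs = physical.collect { case g: GenerateExec => g }

println(physical.numberedTreeString)
```

`sparkPlan` is used rather than `executedPlan` so that later preparation rules do not wrap or rearrange the operators being inspected.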


Use the generate operator from the Catalyst DSL to create a Generate logical operator, e.g. for testing or Spark SQL internals exploration.

[source, scala]
----
// Needed up front: turns symbols into attributes and strings into literals
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.types._
// The first column was garbled in the source: the name `key` comes from the
// plan output below, while the `int` type is an assumption.
val lr = LocalRelation('key.int, 'values.array(StringType))

// JsonTuple generator
import org.apache.spark.sql.catalyst.expressions.JsonTuple
import org.apache.spark.sql.catalyst.expressions.Expression
val children: Seq[Expression] = Seq("e")
val json_tuple = JsonTuple(children)

import org.apache.spark.sql.catalyst.dsl.plans._ // <-- gives generate
val plan = lr.generate(
  generator = json_tuple,
  join = true,
  outer = true,
  alias = Some("alias"),
  outputNames = Seq.empty)
scala> println(plan.numberedTreeString)
00 'Generate json_tuple(e), true, true, alias
01 +- LocalRelation <empty>, [key#0, values#1]
----


=== [[creating-instance]] Creating Generate Instance

Generate takes the following when created:

  • [[generator]] [Generator] expression
  • [[join]] join flag...FIXME
  • [[outer]] outer flag...FIXME
  • [[qualifier]] Optional qualifier
  • [[generatorOutput]] Output [attributes]
  • [[child]] Child [logical plan]
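Most of these properties can be read back from a Generate found in an analyzed plan. The sketch below assumes a local session and a hypothetical `nums` view; the join flag is deliberately not read, since it may not exist in the Spark version at hand.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Generate

// Local session for exploration only (reuses an existing session if there is one)
val spark = SparkSession.builder().master("local[*]").appName("generate-props-demo").getOrCreate()

// Hypothetical input view with an array column
spark.range(2).selectExpr("id", "array(id, id + 10) AS ns").createOrReplaceTempView("nums")

// OUTER is used so that the outer flag is set
val q = spark.sql("SELECT id, n FROM nums LATERAL VIEW OUTER explode(ns) v AS n")

// First Generate operator of the analyzed plan
val generate = q.queryExecution.analyzed.collectFirst { case g: Generate => g }.get

println(generate.generator)                   // the Generator expression
println(generate.outer)                       // the outer flag
println(generate.qualifier)                   // the optional qualifier
println(generate.generatorOutput.map(_.name)) // names of the output attributes
println(generate.child.nodeName)              // the child logical plan
```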

Generate initializes the <>.

Last update: 2021-02-18