
LocalRelation Leaf Logical Operator

LocalRelation is a leaf logical operator that represents a scan over local collections. It allows for optimizations so that functions like collect or take can be executed locally on the driver, with no executors involved.

LocalRelation is created using the apply, fromExternalRows, and fromProduct factory methods.

NOTE: A Dataset is local (isLocal) when its logical plan is exactly an instance of LocalRelation.
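The note above can be checked with a short sketch that compares a Dataset built from a local collection with one built from spark.range. It assumes a throwaway local SparkSession (in spark-shell, `spark` already exists) and an arbitrary application name.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session created just for this check; skip this in spark-shell.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("local-relation-demo") // arbitrary name
  .getOrCreate()
import spark.implicits._

// Built from a local Seq, so the logical plan is a LocalRelation
val localDs = Seq(1, 3, 4, 7).toDS()
assert(localDs.isLocal)

// Built from a Range logical operator, so it is not local
val rangeDs = spark.range(4)
assert(!rangeDs.isLocal)
```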

[source, scala]
----
val data = Seq(1, 3, 4, 7)
val nums = data.toDF

scala> :type nums
org.apache.spark.sql.DataFrame

val plan = nums.queryExecution.analyzed
scala> println(plan.numberedTreeString)
00 LocalRelation [value#1]

import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
val relation = plan.collect { case r: LocalRelation => r }.head
assert(relation.isInstanceOf[LocalRelation])

val sql = relation.toSQL(inlineTableName = "demo")
assert(sql == "VALUES (1), (3), (4), (7) AS demo(value)")

val stats = relation.computeStats
scala> println(stats)
Statistics(sizeInBytes=48.0 B, hints=none)
----

LocalRelation is resolved to the LocalTableScanExec leaf physical operator when the BasicOperators execution planning strategy is executed (i.e. it plans a LocalRelation to a LocalTableScanExec).

[source, scala]
----
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
assert(relation.isInstanceOf[LocalRelation])

scala> :type spark
org.apache.spark.sql.SparkSession

import spark.sessionState.planner.BasicOperators
val localScan = BasicOperators(relation).head

import org.apache.spark.sql.execution.LocalTableScanExec
assert(localScan.isInstanceOf[LocalTableScanExec])
----

[[computeStats]] When requested for statistics (computeStats), LocalRelation takes the size of the objects in a single row (per the output schema) and multiplies it by the number of rows (in the data).
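The arithmetic can be illustrated without Spark. The sketch below is not Spark's code: LocalRelationSizeSketch, estimateSizeInBytes, and rowOverheadBytes are hypothetical names, and the fixed 8-byte per-row overhead is an assumption chosen to be consistent with the 48.0 B reported for four Int rows in the example above (an Int column has a default size of 4 bytes).

```scala
object LocalRelationSizeSketch {
  // Assumption: every row carries a fixed overhead of 8 bytes.
  val rowOverheadBytes: Long = 8L

  /** Size of a single row (overhead plus column default sizes) times the number of rows. */
  def estimateSizeInBytes(columnDefaultSizes: Seq[Int], numRows: Int): BigInt = {
    val sizePerRow = rowOverheadBytes + columnDefaultSizes.sum
    BigInt(sizePerRow) * numRows
  }
}

// One Int column (4 bytes), four rows: (8 + 4) * 4 = 48
val estimated = LocalRelationSizeSketch.estimateSizeInBytes(Seq(4), numRows = 4)
assert(estimated == BigInt(48))
```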

=== [[creating-instance]] Creating LocalRelation Instance

LocalRelation takes the following to be created:

  • [[output]] Output schema attributes
  • [[data]] InternalRows
  • [[isStreaming]] isStreaming flag that indicates whether the data comes from a streaming source (default: false)

While being created, LocalRelation makes sure that the output attributes are all resolved or throws an IllegalArgumentException:

Unresolved attributes found when constructing LocalRelation.
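A minimal sketch of that check follows, assuming the Spark Catalyst classes are on the classpath (as in spark-shell). The attribute name and the row values are made up for illustration.

```scala
import scala.util.Try

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.IntegerType

val rows = Seq(InternalRow(1), InternalRow(3))

// A resolved attribute is accepted
val value = AttributeReference("value", IntegerType, nullable = false)()
val resolvedRelation = new LocalRelation(Seq(value), rows)
assert(resolvedRelation.output.forall(_.resolved))

// An unresolved attribute makes the constructor fail
val attempt = Try(new LocalRelation(Seq(UnresolvedAttribute("value")), rows))
assert(attempt.isFailure)
assert(attempt.failed.get.isInstanceOf[IllegalArgumentException])
```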

=== [[apply]] Creating LocalRelation -- apply Object Method

[source, scala]
----
apply(output: Attribute*): LocalRelation
apply(
  output1: StructField,
  output: StructField*): LocalRelation
----


NOTE: apply is used when...FIXME
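Both apply variants accept schema information only, so a relation created this way is expected to carry no rows. A short sketch, assuming the Spark Catalyst classes are on the classpath and using made-up column names:

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType, StructField}

// apply(output: Attribute*)
val id = AttributeReference("id", IntegerType)()
val name = AttributeReference("name", StringType)()
val fromAttributes = LocalRelation(id, name)
assert(fromAttributes.output == Seq(id, name))
assert(fromAttributes.data.isEmpty)

// apply(output1: StructField, output: StructField*)
val fromFields = LocalRelation(
  StructField("id", IntegerType),
  StructField("name", StringType))
assert(fromFields.output.map(_.name) == Seq("id", "name"))
assert(fromFields.data.isEmpty)
```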

=== [[fromExternalRows]] Creating LocalRelation -- fromExternalRows Object Method

[source, scala]
----
fromExternalRows(
  output: Seq[Attribute],
  data: Seq[Row]): LocalRelation
----


NOTE: fromExternalRows is used when...FIXME
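As an illustration of the signature, the sketch below builds a LocalRelation from external Row objects. It assumes the Spark SQL classes are on the classpath; the column names and values are made up.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType}

val rowOutput = Seq(
  AttributeReference("id", IntegerType)(),
  AttributeReference("name", StringType)())

// External Rows are converted to InternalRows that match the output schema
val fromRows = LocalRelation.fromExternalRows(
  rowOutput,
  Seq(Row(1, "one"), Row(2, "two")))

assert(fromRows.data.length == 2)
assert(fromRows.data.head.getInt(0) == 1)
assert(fromRows.data.head.getUTF8String(1).toString == "one")
```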

=== [[fromProduct]] Creating LocalRelation -- fromProduct Object Method

[source, scala]
----
fromProduct(
  output: Seq[Attribute],
  data: Seq[Product]): LocalRelation
----


NOTE: fromProduct is used when...FIXME
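The same kind of relation can be sketched from Scala tuples, which are Products. This assumes the Spark Catalyst classes are on the classpath; the column names and values are made up.

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType}

val productOutput = Seq(
  AttributeReference("id", IntegerType)(),
  AttributeReference("name", StringType)())

// Each tuple becomes one row; its fields map to the output attributes by position
val fromTuples = LocalRelation.fromProduct(
  productOutput,
  Seq((1, "one"), (2, "two")))

assert(fromTuples.data.length == 2)
assert(fromTuples.data.last.getInt(0) == 2)
```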

=== [[toSQL]] Generating SQL Statement -- toSQL Method

[source, scala]
----
toSQL(inlineTableName: String): String
----

toSQL generates a SQL statement of the format:

VALUES [data] AS [inlineTableName]([names])

toSQL throws an AssertionError when the data is empty.

NOTE: toSQL does not seem to be used at all.
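To make the output format concrete, here is a plain-Scala sketch that renders rows in the same shape. valuesClause is a hypothetical helper, not part of Spark, and it handles Int values only (it does not quote or escape anything, unlike a real SQL literal renderer would have to).

```scala
// Hypothetical helper: renders Int rows as VALUES [data] AS [inlineTableName]([names])
def valuesClause(
    inlineTableName: String,
    names: Seq[String],
    rows: Seq[Seq[Int]]): String = {
  // Mirrors the description above: empty data is rejected
  assert(rows.nonEmpty, "cannot render an inline table without rows")
  val data = rows.map(_.mkString("(", ", ", ")")).mkString(", ")
  val columns = names.mkString("(", ", ", ")")
  s"VALUES ${data} AS ${inlineTableName}${columns}"
}

val demoSql = valuesClause("demo", Seq("value"), Seq(Seq(1), Seq(3), Seq(4), Seq(7)))
assert(demoSql == "VALUES (1), (3), (4), (7) AS demo(value)")
```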

Last update: 2020-11-07