LocalRelation Leaf Logical Operator
LocalRelation is a leaf logical operator that represents a scan over a local collection (and allows for optimizations so that operators like collect or take can be executed locally on the driver, with no executors).
LocalRelation is created when:

- ResolveInlineTables logical resolution rule is executed (and converts an UnresolvedInlineTable)
- PruneFilters, ConvertToLocalRelation, PropagateEmptyRelation and OptimizeMetadataOnlyQuery logical optimization rules are executed (applied to an analyzed logical plan), as illustrated right after this list
- Operators that create a Dataset from a local collection are used (e.g. SparkSession.emptyDataFrame and SparkSession.createDataset)
- CatalogImpl is requested for a Dataset from DefinedByConstructorParams data
- Dataset is requested for the logical plan (and executes Command logical operators eagerly)
- StatFunctions is requested to crossTabulate and summary
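As a quick illustration of the optimization rules above, ConvertToLocalRelation can collapse a simple projection over a LocalRelation into another LocalRelation, so the optimized plan needs no executors at all. The following spark-shell sketch uses made-up column names and values:

[source, scala]
----
val doubled = Seq(1, 2, 3).toDF("n").select(($"n" * 2) as "double")

// The analyzed plan is a Project over a LocalRelation...
scala> println(doubled.queryExecution.analyzed.numberedTreeString)

// ...while the optimized plan should be a single LocalRelation
// (ConvertToLocalRelation evaluates the deterministic projection locally)
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
assert(doubled.queryExecution.optimizedPlan.isInstanceOf[LocalRelation])
----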
NOTE: A Dataset is local (Dataset.isLocal) when it is built directly on a LocalRelation, so collect and take can run without executors.
[source, scala]
----
val data = Seq(1, 3, 4, 7)
val nums = data.toDF

scala> :type nums
org.apache.spark.sql.DataFrame

val plan = nums.queryExecution.analyzed
scala> println(plan.numberedTreeString)
00 LocalRelation [value#1]

import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
val relation = plan.collect { case r: LocalRelation => r }.head
assert(relation.isInstanceOf[LocalRelation])

val sql = relation.toSQL(inlineTableName = "demo")
assert(sql == "VALUES (1), (3), (4), (7) AS demo(value)")

val stats = relation.computeStats
scala> println(stats)
Statistics(sizeInBytes=48.0 B, hints=none)
----
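As a follow-up to the note above, Dataset.isLocal makes the check explicit. The sketch below reuses the nums DataFrame from the example (spark.range is only there as a contrasting, non-local source):

[source, scala]
----
// nums is built on a LocalRelation, so collect/take need no executors
assert(nums.isLocal)

// A Dataset over a non-local source (e.g. spark.range) is not local
assert(!spark.range(1).isLocal)
----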
LocalRelation is resolved to a LocalTableScanExec leaf physical operator when the BasicOperators execution planning strategy is executed.
[source, scala]
----
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
assert(relation.isInstanceOf[LocalRelation])

scala> :type spark
org.apache.spark.sql.SparkSession

import spark.sessionState.planner.BasicOperators
val localScan = BasicOperators(relation).head

import org.apache.spark.sql.execution.LocalTableScanExec
assert(localScan.isInstanceOf[LocalTableScanExec])
----
[[computeStats]] When requested for statistics (computeStats), LocalRelation takes the size of the objects in a single row (per the output schema) and multiplies it by the number of rows (in the data).
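As a sanity check of the 48.0 B figure in the example above, the arithmetic works out if the per-row estimate is assumed to be an 8-byte row overhead plus the default size of every output attribute (the overhead is an assumption, not something stated here):

[source, scala]
----
val defaultSizeOfInt = 4       // IntegerType.defaultSize
val assumedRowOverhead = 8     // assumption: fixed per-row overhead in the estimate
val numRows = 4                // Seq(1, 3, 4, 7)

assert((assumedRowOverhead + defaultSizeOfInt) * numRows == 48)
----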
=== [[creating-instance]] Creating LocalRelation Instance
LocalRelation takes the following to be created:

- [[output]] Output schema spark-sql-Expression-Attribute.md[attributes]
- [[data]] Collection of InternalRows (default: empty)
- [[isStreaming]] isStreaming flag that indicates whether the data comes from a streaming source (default: false)

While being created, LocalRelation makes sure that the output attributes are all resolved or throws an IllegalArgumentException:

Unresolved attributes found when constructing LocalRelation.
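Below is a minimal sketch of creating a LocalRelation directly; the attribute and the rows are made up for illustration, and the commented-out line shows what trips the requirement described above:

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.IntegerType

// AttributeReference is already resolved, so the requirement holds
val id = AttributeReference("id", IntegerType, nullable = false)()
val relation = LocalRelation(Seq(id), Seq(InternalRow(1), InternalRow(2)))
assert(relation.isStreaming == false)

// An unresolved attribute would fail the constructor with the message above, e.g.
// import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
// LocalRelation(Seq(UnresolvedAttribute("id")))
----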
=== [[apply]] Creating LocalRelation -- apply Object Method

[source, scala]
----
apply(output: Attribute*): LocalRelation
apply(
  output1: StructField,
  output: StructField*): LocalRelation
----
apply creates a LocalRelation with the given output attributes (converting the StructFields to attributes in the second variant) and no rows.

NOTE: apply is used when...FIXME
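A small sketch of the StructField variant (field names and types are arbitrary); it is assumed to build a schema-only LocalRelation with no rows:

[source, scala]
----
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType, StructField}

val relation = LocalRelation(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType))

assert(relation.output.map(_.name) == Seq("id", "name"))
assert(relation.data.isEmpty)
----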
=== [[fromExternalRows]] Creating LocalRelation -- fromExternalRows Object Method

[source, scala]
----
fromExternalRows(
  output: Seq[Attribute],
  data: Seq[Row]): LocalRelation
----
fromExternalRows creates a LocalRelation with the given output attributes, converting the (external) Rows to InternalRows.

NOTE: fromExternalRows is used when...FIXME
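A small sketch of fromExternalRows; the attribute and Row values are made up, and the point is that the external Rows end up converted to InternalRows:

[source, scala]
----
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.IntegerType

val id = AttributeReference("id", IntegerType, nullable = false)()
val relation = LocalRelation.fromExternalRows(Seq(id), Seq(Row(1), Row(2), Row(3)))

assert(relation.data.length == 3)
----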
=== [[fromProduct]] Creating LocalRelation -- fromProduct Object Method

[source, scala]
----
fromProduct(
  output: Seq[Attribute],
  data: Seq[Product]): LocalRelation
----
fromProduct creates a LocalRelation with the given output attributes, converting the Products (e.g. case class instances or tuples) to InternalRows.

NOTE: fromProduct is used when...FIXME
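A small sketch of fromProduct with a made-up case class; Products (case class instances or tuples) are expected to be converted to InternalRows:

[source, scala]
----
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType}

case class Person(id: Int, name: String)

val output = Seq(
  AttributeReference("id", IntegerType, nullable = false)(),
  AttributeReference("name", StringType)())
val relation = LocalRelation.fromProduct(output, Seq(Person(0, "zero"), Person(1, "one")))

assert(relation.data.length == 2)
----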
=== [[toSQL]] Generating SQL Statement -- toSQL Method

[source, scala]
----
toSQL(inlineTableName: String): String
----
toSQL generates a SQL statement of the format:

VALUES [data] AS [inlineTableName]([names])
toSQL throws an AssertionError when the data is empty.
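For a relation with more than one column, the format expands as sketched below (the relation and the expected output are illustrative, not verified here):

[source, scala]
----
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
import org.apache.spark.sql.types.{IntegerType, StringType}

val output = Seq(
  AttributeReference("id", IntegerType, nullable = false)(),
  AttributeReference("name", StringType)())
val people = LocalRelation.fromExternalRows(output, Seq(Row(1, "a"), Row(2, "b")))

// Expected to follow VALUES [data] AS [inlineTableName]([names]), e.g.
// VALUES (1, 'a'), (2, 'b') AS people(id, name)
println(people.toSQL(inlineTableName = "people"))
----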
NOTE: toSQL does not seem to be used at all.