LogicalRelation Leaf Logical Operator -- Representing BaseRelations in Logical Plan
LogicalRelation is a LeafNode.md[leaf logical operator] that represents a BaseRelation in a logical query plan.
```scala
val q1 = spark.read.option("header", true).csv("../datasets/people.csv")
scala> println(q1.queryExecution.logical.numberedTreeString)
00 Relation[id#72,name#73,age#74] csv

val q2 = sql("select * from csv.`../datasets/people.csv`")
scala> println(q2.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#175,_c1#176,_c2#177] csv
```
LogicalRelation is created when:

- DataFrameReader loads data from a data source that supports multiple paths (through SparkSession.md#baseRelationToDataFrame[SparkSession.baseRelationToDataFrame])
- DataFrameReader is requested to load data from an external table using JDBC (through SparkSession.md#baseRelationToDataFrame[SparkSession.baseRelationToDataFrame])
- TextInputCSVDataSource and TextInputJsonDataSource are requested to infer schema
- ResolveSQLOnFile converts a logical plan
- FindDataSourceTable logical evaluation rule is executed
- hive/RelationConversions.md[RelationConversions] logical evaluation rule is executed
- CreateTempViewUsing logical command is executed
- Structured Streaming's FileStreamSource creates batches of records
[[simpleString]] The catalyst/QueryPlan.md#simpleString[simple text representation] of a LogicalRelation (aka simpleString) is `Relation[output] [relation]` (that uses the <<output, output>> and <<relation, relation>>).
```scala
val q = spark.read.text("README.md")
val logicalPlan = q.queryExecution.logical

scala> println(logicalPlan.simpleString)
Relation[value#2] text
```
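The rendering rule above can be sketched without Spark. The following is a minimal sketch with toy classes (`AttributeRef` and the standalone `simpleString` function are illustrative stand-ins, not Spark's real types), assuming only what this page states: the output attributes are printed as `name#exprId` and joined inside `Relation[...]`, followed by the relation's text representation.

```scala
object SimpleStringSketch {
  // Toy stand-in for Spark's AttributeReference: prints as name#exprId
  final case class AttributeRef(name: String, exprId: Int) {
    override def toString: String = s"$name#$exprId"
  }

  // relationName stands in for the BaseRelation's text representation (e.g. "text", "csv")
  def simpleString(output: Seq[AttributeRef], relationName: String): String =
    s"Relation[${output.mkString(",")}] $relationName"

  def main(args: Array[String]): Unit =
    println(simpleString(Seq(AttributeRef("value", 2)), "text")) // Relation[value#2] text
}
```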
Creating Instance
LogicalRelation takes the following when created:

- [[relation]] BaseRelation
- [[output]] Output schema (AttributeReferences)
- [[catalogTable]] Optional CatalogTable
=== [[apply]] apply Factory Utility

```scala
apply(
  relation: BaseRelation,
  isStreaming: Boolean = false): LogicalRelation
apply(
  relation: BaseRelation,
  table: CatalogTable): LogicalRelation
```
apply creates a LogicalRelation for the input BaseRelation (and the CatalogTable or the optional isStreaming flag).
apply is used when:

- SparkSession is requested for a DataFrame for a BaseRelation
- CreateTempViewUsing command is executed
- ResolveSQLOnFile and FindDataSourceTable logical evaluation rules are executed
- HiveMetastoreCatalog is requested to convert a HiveTableRelation to a LogicalRelation over a HadoopFsRelation
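What the two overloads do can be sketched with toy stand-in types (none of the classes below are Spark's real ones; `StructField`, `toAttributes` and the field names are illustrative): both derive the output attributes from the relation's schema and wrap everything in a LogicalRelation, the second fixing isStreaming to false and carrying the catalog table.

```scala
object ApplySketch {
  // Toy stand-ins for Spark's classes, just enough to model the factory methods
  final case class StructField(name: String)
  final case class BaseRelation(name: String, schema: Seq[StructField])
  final case class CatalogTable(identifier: String)
  final case class AttributeReference(name: String)

  final case class LogicalRelation(
      relation: BaseRelation,
      output: Seq[AttributeReference],
      catalogTable: Option[CatalogTable],
      isStreaming: Boolean)

  object LogicalRelation {
    // apply(relation, isStreaming = false): no catalog table
    def apply(relation: BaseRelation, isStreaming: Boolean = false): LogicalRelation =
      LogicalRelation(relation, toAttributes(relation), None, isStreaming)

    // apply(relation, table): a batch relation backed by a catalog table
    def apply(relation: BaseRelation, table: CatalogTable): LogicalRelation =
      LogicalRelation(relation, toAttributes(relation), Some(table), isStreaming = false)

    // Derive the output attributes from the relation's schema
    private def toAttributes(relation: BaseRelation): Seq[AttributeReference] =
      relation.schema.map(f => AttributeReference(f.name))
  }

  def main(args: Array[String]): Unit = {
    val rel = BaseRelation("csv", Seq(StructField("id"), StructField("name")))
    val lr  = LogicalRelation(rel) // isStreaming defaults to false
    println(lr.output.map(_.name)) // List(id, name)
  }
}
```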
=== [[refresh]] refresh Method

```scala
refresh(): Unit
```
NOTE: refresh is part of spark-sql-LogicalPlan.md#refresh[LogicalPlan Contract] to refresh itself.
refresh requests the FileIndex of a HadoopFsRelation to refresh itself.
NOTE: refresh does the work for HadoopFsRelation relations only.
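The dispatch described above can be sketched with toy types (the classes below, including `JdbcRelation`, are illustrative stand-ins, not Spark's own), assuming only what this page states: refresh delegates to the FileIndex of a HadoopFsRelation and is a no-op for any other kind of relation.

```scala
object RefreshSketch {
  // Toy FileIndex: records whether its file listing was refreshed
  final class FileIndex {
    var refreshed = false
    def refresh(): Unit = refreshed = true
  }

  sealed trait BaseRelation
  final case class HadoopFsRelation(location: FileIndex) extends BaseRelation
  final case class JdbcRelation(url: String) extends BaseRelation // hypothetical non-file relation

  def refresh(relation: BaseRelation): Unit = relation match {
    case HadoopFsRelation(location) => location.refresh() // file-based: refresh the index
    case _                          => ()                 // anything else: nothing to do
  }

  def main(args: Array[String]): Unit = {
    val index = new FileIndex
    refresh(HadoopFsRelation(index))
    println(index.refreshed) // true
  }
}
```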