LogicalRelation Leaf Logical Operator

LogicalRelation is a leaf logical operator that represents a BaseRelation in a logical query plan.

LogicalRelation is a MultiInstanceRelation.

Creating Instance

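LogicalRelation takes the following to be created: a BaseRelation, the output schema (AttributeReferences), an optional CatalogTable, and the isStreaming flag.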

LogicalRelation is created using the apply factory.

apply Utility

apply(
  relation: BaseRelation,
  isStreaming: Boolean = false): LogicalRelation
apply(
  relation: BaseRelation,
  table: CatalogTable): LogicalRelation

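apply wraps the given BaseRelation into a LogicalRelation (so that it can be used in a logical query plan). The output schema is derived from the schema of the BaseRelation.

The first variant creates a LogicalRelation with no CatalogTable and the given isStreaming flag. The second variant creates a LogicalRelation with the given CatalogTable and the isStreaming flag off.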

import org.apache.spark.sql.sources.BaseRelation
val baseRelation: BaseRelation = ???

// baseRelationToDataFrame wraps the relation using LogicalRelation.apply
val data = spark.baseRelationToDataFrame(baseRelation)
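
To make the defaults of the first apply variant concrete, here is a minimal sketch for spark-shell. DemoRelation is a hypothetical relation introduced only for illustration (it declares a schema and nothing else), and the application name is arbitrary.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// spark-shell already provides `spark`; getOrCreate returns the existing session there
val spark = SparkSession.builder().master("local[*]").appName("LogicalRelationDemo").getOrCreate()

// Hypothetical relation (not part of Spark): it only declares a one-column schema
class DemoRelation(val sqlContext: SQLContext) extends BaseRelation {
  override def schema: StructType = StructType(Seq(StructField("id", LongType)))
}

val relation = new DemoRelation(spark.sqlContext)

// Resolves to the apply(relation, isStreaming = false) variant
val plan = LogicalRelation(relation)

println(plan.catalogTable)        // no CatalogTable is attached by this variant
println(plan.isStreaming)         // the default of the isStreaming flag
println(plan.output.map(_.name))  // attribute names derived from the relation's schema
```

The sketch never scans the relation, so DemoRelation does not need to mix in any of the scan traits.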

apply is used when:

refresh

refresh(): Unit

refresh is part of the LogicalPlan abstraction.

refresh requests the FileIndex (of the HadoopFsRelation) to refresh.

Note

refresh does the work for HadoopFsRelation relations only.
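
The following spark-shell sketch shows which relations refresh would act on. It assumes the text format is loaded through the file-based (V1) data source path, which is the default configuration in the Spark versions I am aware of; the temporary file is a made-up input.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

// spark-shell already provides `spark`; getOrCreate returns the existing session there
val spark = SparkSession.builder().master("local[*]").appName("RefreshDemo").getOrCreate()

// Hypothetical input: a small temporary text file
val file = Files.createTempFile("logical-relation-demo", ".txt")
Files.write(file, "hello\nworld\n".getBytes("UTF-8"))

// The analyzed plan is used so the relation is resolved regardless of Spark version
val plan = spark.read.text(file.toString).queryExecution.analyzed

// LogicalRelations backed by a HadoopFsRelation are the ones refresh does any work for
val fileRelations = plan.collect {
  case lr: LogicalRelation if lr.relation.isInstanceOf[HadoopFsRelation] => lr
}
println(fileRelations.size)

// Requests the FileIndex of every such relation to refresh (a no-op for other relations)
plan.refresh()
```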

Simple Text Representation

simpleString(
  maxFields: Int): String

simpleString is part of the QueryPlan abstraction.

simpleString is made up of the output schema (truncated to maxFields) and the relation:

Relation[[output]] [relation]

Demo

val q = spark.read.text("README.md")
val logicalPlan = q.queryExecution.logical

scala> println(logicalPlan.simpleString(maxFields = 10))
Relation[value#2] text
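
The effect of maxFields can be observed with a relation that has more columns than the limit. This is a sketch for spark-shell; the temporary CSV file and its column names are made up, and the exact text of the output is not shown because it varies across Spark versions.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// spark-shell already provides `spark`; getOrCreate returns the existing session there
val spark = SparkSession.builder().master("local[*]").appName("SimpleStringDemo").getOrCreate()

// Hypothetical input: a three-column CSV file with a header line
val csv = Files.createTempFile("simple-string-demo", ".csv")
Files.write(csv, "id,name,city\n0,someone,somewhere\n".getBytes("UTF-8"))

val plan = spark.read.option("header", true).csv(csv.toString).queryExecution.analyzed

// All three output attributes fit within the limit
val full = plan.simpleString(maxFields = 25)

// The limit is lower than the number of attributes, so the output schema is truncated
val truncated = plan.simpleString(maxFields = 2)

println(full)
println(truncated)
```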

Demo

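The following are two batch queries over the same CSV file described using different Spark APIs: Scala and SQL. Both produce a single Relation operator. The SQL variant does not set the header option, hence the default column names (_c0, _c1) in its plan.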

val format = "csv"
val path = "../datasets/people.csv"
val q = spark
  .read
  .option("header", true)
  .format(format)
  .load(path)
scala> println(q.queryExecution.logical.numberedTreeString)
00 Relation[id#16,name#17] csv

// The parsed plan of a SQL query is unresolved, so print the optimized plan instead
val q = sql(s"select * from `$format`.`$path`")
scala> println(q.queryExecution.optimizedPlan.numberedTreeString)
00 Relation[_c0#74,_c1#75] csv

Last update: 2021-03-11