StreamingRelation Leaf Logical Operator

StreamingRelation is a leaf logical operator (i.e. LogicalPlan) that represents a streaming source in a logical plan.

StreamingRelation is created when DataStreamReader is requested to load data from a streaming source (and so create a streaming query).

StreamingRelation Represents Streaming Source

val rate = spark.
  readStream.     // <-- creates a DataStreamReader
  format("rate"). // <-- selects the rate source shown in the plan below
  load("hello")   // <-- creates a StreamingRelation
scala> println(rate.queryExecution.logical.numberedTreeString)
00 StreamingRelation DataSource(org.apache.spark.sql.SparkSession@4e5dcc50,rate,List(),None,List(),None,Map(path -> hello),None), rate, [timestamp#0, value#1L]

[[isStreaming]] isStreaming flag is always enabled (i.e. true).

import org.apache.spark.sql.execution.streaming.StreamingRelation
val relation = rate.queryExecution.logical.asInstanceOf[StreamingRelation]
scala> relation.isStreaming
res1: Boolean = true

[[toString]] toString gives the <<sourceName, source name>>.

scala> println(relation)
rate

NOTE: StreamingRelation is resolved (aka planned) to StreamingExecutionRelation (right after StreamExecution starts running batches).

=== [[apply]] Creating StreamingRelation for DataSource -- apply Object Method

[source, scala]
apply(dataSource: DataSource): StreamingRelation

apply creates a StreamingRelation for the given DataSource (that represents a streaming source).

apply is used when DataStreamReader is requested to load data from a streaming source (and so create a streaming query).
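
The following is a minimal sketch of calling apply directly, roughly the way DataStreamReader would. It relies on Spark internal classes (DataSource and StreamingRelation) that can change between Spark versions. The text file source and the temporary directory `inputDir` are assumptions made for illustration only, as is the application name.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.DataSource
import org.apache.spark.sql.execution.streaming.StreamingRelation

// Reuses the active session in spark-shell, or starts a local one otherwise.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("streaming-relation-apply") // hypothetical name
  .getOrCreate()

// Hypothetical input directory; a file-based source expects the path to exist.
val inputDir = Files.createTempDirectory("streaming-relation-apply").toString

// Describe the streaming source by its format and options.
val dataSource = DataSource(
  sparkSession = spark,
  className = "text",
  options = Map("path" -> inputDir))

// The one-argument apply fills in the source name and the output
// attributes from the data source itself.
val relation = StreamingRelation(dataSource)

println(relation.sourceName)
println(relation.output.map(_.name))
```

In regular applications there should be no need to call apply yourself, since DataStreamReader does it as part of load.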

=== Creating Instance

StreamingRelation takes the following when created:

  • [[dataSource]] DataSource
  • [[sourceName]] Short name of the streaming source
  • [[output]] Output attributes of the schema of the streaming source
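
As an illustration of the three properties above, the sketch below loads a streaming Dataset through the public API and pattern-matches the logical plan to read them back. It assumes a Spark version in which the text file source is represented by StreamingRelation with exactly these three constructor arguments; the directory `inputDir` and the application name are hypothetical.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.streaming.StreamingRelation

// Reuses the active session in spark-shell, or starts a local one otherwise.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("streaming-relation-fields") // hypothetical name
  .getOrCreate()

// Hypothetical input directory; a file-based source expects the path to exist.
val inputDir = Files.createTempDirectory("streaming-relation-fields").toString
val lines = spark.readStream.format("text").load(inputDir)

// Collect every StreamingRelation in the logical plan and destructure it
// into the data source, the source name, and the output attributes.
val parts = lines.queryExecution.logical.collect {
  case StreamingRelation(dataSource, sourceName, output) =>
    (dataSource.className, sourceName, output.map(_.name))
}

parts.foreach(println)
```

Using `collect` rather than a cast keeps the sketch from failing outright when the plan holds a different operator; `parts` is then simply empty.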