
== StreamingGlobalLimitExec Unary Physical Operator

StreamingGlobalLimitExec is a unary physical operator that represents a Limit logical operator of a streaming query at execution time.

[NOTE]
====
A unary physical operator (UnaryExecNode) is a physical operator with a single <<child, child>> physical operator.

Read up on https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-SparkPlan.html[UnaryExecNode] (and physical operators in general) in https://bit.ly/spark-sql-internals[The Internals of Spark SQL] book.
====

[NOTE]
====
Limit logical operator represents the Dataset.limit operator in a logical query plan.

Read up on https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-LogicalPlan-Limit.html[Limit Logical Operator] in https://bit.ly/spark-sql-internals[The Internals of Spark SQL] book.
====

StreamingGlobalLimitExec is a <<StateStoreWriter, StateStoreWriter>>, i.e. a stateful physical operator that can write to a state store.

StreamingGlobalLimitExec supports Append output mode only.

The optional properties, i.e. the <<stateInfo, StatefulOperatorStateInfo>> and the <<outputMode, OutputMode>>, are initially undefined when StreamingGlobalLimitExec is created. StreamingGlobalLimitExec is updated to hold execution-specific configuration when IncrementalExecution is requested to prepare the logical plan (of a streaming query) for execution (when the state preparation rule is executed).
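For illustration, the following is a minimal sketch of a streaming query that uses Dataset.limit with Append output mode, so that the Limit logical operator is planned with StreamingGlobalLimitExec. The rate source, the row rate, the console sink and the timeout are arbitrary choices for the demo.

[source, scala]
----
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("streaming-global-limit-demo")
  .getOrCreate()

// A streaming Dataset with a global limit; with Append output mode
// the Limit logical operator is planned as StreamingGlobalLimitExec.
val limited = spark.readStream
  .format("rate")
  .option("rowsPerSecond", "10")
  .load()
  .limit(5)

val query = limited.writeStream
  .format("console")
  .outputMode("append")
  .start()

query.awaitTermination(10000)
query.explain()  // look for a StreamingGlobalLimit operator in the physical plan
query.stop()
----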

=== [[creating-instance]] Creating Instance

StreamingGlobalLimitExec takes the following to be created:

  • [[streamLimit]] Streaming Limit
  • [[child]] Child physical operator (SparkPlan)
  • [[stateInfo]] StatefulOperatorStateInfo (default: None)
  • [[outputMode]] OutputMode (default: None)

StreamingGlobalLimitExec is created when StreamingGlobalLimitStrategy execution planning strategy is requested to plan a Limit logical operator (in the logical plan of a streaming query) for execution.
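Based solely on the creation properties listed above, the operator's constructor shape can be sketched as follows. This is an illustrative rendering, not the authoritative definition; the parameter names and defaults mirror the list, and the type of streamLimit (a row count) is an assumption.

[source, scala]
----
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.streaming.StatefulOperatorStateInfo
import org.apache.spark.sql.streaming.OutputMode

// Sketch of the creation-time shape implied by the properties above:
// only stateInfo and outputMode are optional and default to None.
case class StreamingGlobalLimitExecSketch(
    streamLimit: Long,  // streaming limit (assumed to be a row count)
    child: SparkPlan,   // child physical operator
    stateInfo: Option[StatefulOperatorStateInfo] = None,
    outputMode: Option[OutputMode] = None)
----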

=== [[StateStoreWriter]] StreamingGlobalLimitExec as StateStoreWriter

StreamingGlobalLimitExec is a stateful physical operator that can write to a state store.

=== [[metrics]] Performance Metrics

StreamingGlobalLimitExec uses the performance metrics of the parent StateStoreWriter.

=== [[doExecute]] Executing Physical Operator (Generating RDD[InternalRow]) -- doExecute Method

[source, scala]
----
doExecute(): RDD[InternalRow]
----

NOTE: doExecute is part of the SparkPlan Contract to generate the runtime representation of a physical operator as a recipe for distributed computation over internal binary rows on Apache Spark (RDD[InternalRow]).

doExecute...FIXME
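Until the description above is filled in, the following standalone sketch illustrates the general idea behind a stateful global limit (it is not Spark's actual doExecute): a running row count survives across micro-batches, and rows pass through only while the count stays below the limit. A plain variable stands in for the state store here.

[source, scala]
----
// Illustrative only: the idea of a streaming global limit.
class RunningLimit(limit: Long) {
  // In StreamingGlobalLimitExec this running count lives in a state store
  // so it survives across micro-batches and restarts.
  private var rowsSoFar: Long = 0L

  def filterBatch[T](batch: Iterator[T]): Iterator[T] =
    batch.filter { _ =>
      if (rowsSoFar < limit) { rowsSoFar += 1; true } else false
    }
}

// Usage: two "micro-batches" against a limit of 3 emit 3 rows in total.
val limiter = new RunningLimit(3)
val first  = limiter.filterBatch(Iterator(1, 2)).toList      // List(1, 2)
val second = limiter.filterBatch(Iterator(3, 4, 5)).toList   // List(3)
----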

=== [[internal-properties]] Internal Properties

[cols="30m,70",options="header",width="100%"]
|===
| Name
| Description

| keySchema a| [[keySchema]] FIXME

Used when...FIXME

| valueSchema a| [[valueSchema]] FIXME

Used when...FIXME

|===