FlatMapGroupsWithState Unary Logical Operator¶
FlatMapGroupsWithState
is a unary logical operator that represents the following operators in a logical query plan of a streaming query:
Note
A unary logical operator (UnaryNode
) is a logical operator with a single <
Read up on UnaryNode (and logical operators in general) in The Internals of Spark SQL online book.
Execution Planning¶
FlatMapGroupsWithState
is resolved (planned) to:
-
FlatMapGroupsWithStateExec unary physical operator for streaming datasets (in FlatMapGroupsWithStateStrategy execution planning strategy)
-
MapGroupsExec
physical operator for batch datasets (inBasicOperators
execution planning strategy)
Creating Instance¶
FlatMapGroupsWithState
takes the following to be created:
- State function (
(Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any]
) - Catalyst Expression for keys
- Catalyst Expression for values
- Grouping Attributes
- Data Attributes
- Output Object Attribute
- State
ExpressionEncoder
- OutputMode
-
isMapGroupsWithState
flag (default:false
) - GroupStateTimeout
- Child logical operator
FlatMapGroupsWithState
is created (using apply factory method) for KeyValueGroupedDataset.mapGroupsWithState and KeyValueGroupedDataset.flatMapGroupsWithState operators.
Creating SerializeFromObject with FlatMapGroupsWithState¶
apply[K: Encoder, V: Encoder, S: Encoder, U: Encoder](
func: (Any, Iterator[Any], LogicalGroupState[Any]) => Iterator[Any],
groupingAttributes: Seq[Attribute],
dataAttributes: Seq[Attribute],
outputMode: OutputMode,
isMapGroupsWithState: Boolean,
timeout: GroupStateTimeout,
child: LogicalPlan): LogicalPlan
apply
creates a SerializeFromObject
logical operator with a FlatMapGroupsWithState
as its child logical operator.
Internally, apply
creates SerializeFromObject
object consumer (aka unary logical operator) with FlatMapGroupsWithState
logical plan.
Internally, apply
finds ExpressionEncoder
for the type S
and creates a FlatMapGroupsWithState
with UnresolvedDeserializer
for the types K
and V
.
In the end, apply
creates a SerializeFromObject
object consumer with the FlatMapGroupsWithState
.
apply
is used for KeyValueGroupedDataset.mapGroupsWithState and KeyValueGroupedDataset.flatMapGroupsWithState operators.