
StateManager

StateManager is the <<contract, abstraction>> of <<implementations, state managers>> that act as middlemen between state stores and the FlatMapGroupsWithStateExec physical operator used in Arbitrary Stateful Streaming Aggregation.

[[contract]]
.StateManager Contract
[cols="30m,70",options="header",width="100%"]
|===
| Method
| Description

| getAllState a| [[getAllState]]

[source, scala]
----
getAllState(store: StateStore): Iterator[StateData]
----

Retrieves all state data (for all keys) from the StateStore

Used when InputProcessor is requested to processTimedOutState

| getState a| [[getState]]

[source, scala]
----
getState(
  store: StateStore,
  keyRow: UnsafeRow): StateData
----

Gets the state data for the key from the StateStore

Used exclusively when InputProcessor is requested to processNewData

| putState a| [[putState]]

[source, scala]
----
putState(
  store: StateStore,
  keyRow: UnsafeRow,
  state: Any,
  timeoutTimestamp: Long): Unit
----

Persists (puts) the state value for the key in the StateStore

Used exclusively when InputProcessor is requested to callFunctionAndUpdateState (right after all rows have been processed)

| removeState a| [[removeState]]

[source, scala]
----
removeState(
  store: StateStore,
  keyRow: UnsafeRow): Unit
----

Removes the state for the key from the StateStore

Used exclusively when InputProcessor is requested to callFunctionAndUpdateState (right after all rows have been processed)

| stateSchema a| [[stateSchema]]

[source, scala]
----
stateSchema: StructType
----

State schema

NOTE: It looks like stateSchema (in the StateManager of the FlatMapGroupsWithStateExec physical operator) is the schema of state value objects, not of state keys, which are described by the grouping attributes instead.

Used when:

|===
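
To make the contract more concrete, the following is a minimal, self-contained sketch that mirrors the shape of the methods above using simplified stand-in types. `ToyStore`, `ToyStateData`, `ToyStateManager` and `InMemoryStateManager` are hypothetical names made up for this illustration (plain `String` keys and an in-memory map in place of `StateStore`, `UnsafeRow` and `StateData`), and the `-1L` "no timeout" marker is an assumption of the sketch. It is not how Spark implements state managers.

```scala
import scala.collection.mutable

object StateManagerSketch {
  // Hypothetical stand-in for StateStore: key -> (state value, timeout timestamp).
  final class ToyStore {
    val rows: mutable.Map[String, (Any, Long)] = mutable.Map.empty
  }

  // Hypothetical stand-in for StateData: what a manager hands back to the operator.
  final case class ToyStateData(key: String, state: Option[Any], timeoutTimestamp: Long)

  // Assumption of this sketch: -1 marks "no timeout set".
  val NoTimestamp: Long = -1L

  // Same method shape as the contract above, with simplified types.
  trait ToyStateManager {
    def getAllState(store: ToyStore): Iterator[ToyStateData]
    def getState(store: ToyStore, key: String): ToyStateData
    def putState(store: ToyStore, key: String, state: Any, timeoutTimestamp: Long): Unit
    def removeState(store: ToyStore, key: String): Unit
  }

  object InMemoryStateManager extends ToyStateManager {
    // Returns the state data of every key currently in the store.
    def getAllState(store: ToyStore): Iterator[ToyStateData] =
      store.rows.iterator.map { case (k, (s, ts)) => ToyStateData(k, Some(s), ts) }

    // Returns the state data of a single key (empty state if the key is unknown).
    def getState(store: ToyStore, key: String): ToyStateData =
      store.rows.get(key) match {
        case Some((s, ts)) => ToyStateData(key, Some(s), ts)
        case None          => ToyStateData(key, None, NoTimestamp)
      }

    // Persists (or overwrites) the state value and timeout of a key.
    def putState(store: ToyStore, key: String, state: Any, timeoutTimestamp: Long): Unit =
      store.rows.update(key, (state, timeoutTimestamp))

    // Drops the state of a key, if any.
    def removeState(store: ToyStore, key: String): Unit = {
      store.rows.remove(key)
      ()
    }
  }
}
```

In this sketch a caller would read a key with `getState`, run its own logic, and then call either `putState` or `removeState`, while `getAllState` would be the natural entry point for scanning every key (for example, to look for timed-out state).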

[[implementations]]
NOTE: StateManagerImplBase is the one and only known direct implementation of the <<contract, StateManager Contract>> in Spark Structured Streaming.

NOTE: StateManager is a Scala sealed trait, which means that all the direct <<implementations, implementations>> are in the same compilation unit (a single file).
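
The effect of `sealed` can be illustrated with a small standalone sketch. The names below (`Manager`, `ManagerV1`, `ManagerV2`, `describe`) are hypothetical and unrelated to Spark's classes. The only point is that direct subtypes of a sealed trait must be declared in the same source file, which lets the compiler check pattern matches for exhaustiveness.

```scala
object SealedSketch {
  // A sealed trait can only be extended directly within this source file.
  sealed trait Manager
  final class ManagerV1 extends Manager
  final class ManagerV2 extends Manager

  // All direct subtypes are known here, so the compiler can warn
  // if one of the cases below were missing.
  def describe(m: Manager): String = m match {
    case _: ManagerV1 => "version 1"
    case _: ManagerV2 => "version 2"
  }
}
```

Declaring another direct subtype of `Manager` in a different file would be rejected by the compiler.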