StateStoreOps is a Scala implicit class that wraps a data RDD (of type
RDD[T]) and adds a method to create a StateStoreRDD for the following physical operators:
NOTE: Implicit classes are a Scala language feature that provides implicit conversions, so that extension methods can be added to existing types.
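As a minimal sketch of the implicit-class mechanism itself, independent of Spark: the names `IndexedMapOps` and `mapWithIndex` below are hypothetical and only illustrate how a wrapper class can add a method to an existing collection type, in the same way StateStoreOps adds a method to RDD[T].

```scala
object ImplicitClassSketch {
  // Hypothetical extension class (not part of Spark): wraps any Seq[T]
  // and makes mapWithIndex available on it as if it were a Seq method.
  implicit class IndexedMapOps[T](val data: Seq[T]) {
    def mapWithIndex[U](f: (Int, T) => U): Seq[U] =
      data.zipWithIndex.map { case (elem, idx) => f(idx, elem) }
  }

  def main(args: Array[String]): Unit = {
    // The compiler rewrites this call to
    // new IndexedMapOps(Seq("a", "b")).mapWithIndex(...)
    val labelled = Seq("a", "b").mapWithIndex((i, s) => s"$i:$s")
    println(labelled)
  }
}
```

The conversion only applies where the implicit class is in scope, which is why such extensions are usually brought in with an explicit import.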
Creating StateStoreRDD (with storeUpdateFunction Aborting StateStore When Task Fails)
mapPartitionsWithStateStore[U](
  stateInfo: StatefulOperatorStateInfo,
  keySchema: StructType,
  valueSchema: StructType,
  indexOrdinal: Option[Int],
  sessionState: SessionState,
  storeCoordinator: Option[StateStoreCoordinatorRef])(
  storeUpdateFunction: (StateStore, Iterator[T]) => Iterator[U]): StateStoreRDD[T, U]
mapPartitionsWithStateStore uses the SparkContext to clean the storeUpdateFunction closure.
NOTE: Function cleaning removes unreferenced variables from a closure before it is serialized and sent to tasks.
SparkContext reports a SparkException when the closure is not serializable.
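The serializability requirement can be illustrated without Spark. The sketch below is not Spark's closure cleaner; it only assumes plain Java serialization and a hypothetical helper `isSerializable` to show why a closure that captures a non-serializable object would be rejected.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureSerializationSketch {
  // Hypothetical helper: tries to Java-serialize a value and reports
  // whether that succeeded.
  def isSerializable(value: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(value) finally out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val factor = 2
    // Captures only an Int, so the closure can be serialized.
    val plain: Int => Int = x => x * factor
    // java.lang.Object does not implement Serializable, so capturing
    // it makes the whole closure non-serializable.
    val handle = new Object
    val leaky: Int => Int = x => x + handle.hashCode
    println(s"plain=${isSerializable(plain)} leaky=${isSerializable(leaky)}")
  }
}
```

Removing unreferenced variables before this kind of check matters because a closure can otherwise drag along enclosing objects it never uses.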
mapPartitionsWithStateStore then creates a wrapper function that aborts the
StateStore if state updates have not been committed by the time a task finishes. This ensures that the
StateStore is either committed or aborted in the end, as the StateStore contract requires.
NOTE: mapPartitionsWithStateStore uses a TaskCompletionListener to be notified when a task has finished.
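The abort-on-completion wrapper can be sketched in plain Scala. Everything here is an assumption made for illustration: `SimpleStore` and `SimpleTaskContext` are hypothetical, heavily simplified stand-ins for StateStore and Spark's task context, and `wrap` is not the actual Spark implementation.

```scala
import scala.collection.mutable.ArrayBuffer

object AbortOnTaskCompletionSketch {
  // Hypothetical stand-in for StateStore: tracks only commit/abort state.
  final class SimpleStore {
    private var committed = false
    private var aborted = false
    def commit(): Unit = { committed = true }
    def abort(): Unit = { aborted = true }
    def hasCommitted: Boolean = committed
    def wasAborted: Boolean = aborted
  }

  // Hypothetical stand-in for a task context that notifies listeners
  // once the task has finished.
  final class SimpleTaskContext {
    private val listeners = ArrayBuffer.empty[() => Unit]
    def addCompletionListener(listener: () => Unit): Unit = { listeners += listener }
    def markTaskCompleted(): Unit = listeners.foreach(listener => listener())
  }

  // Wraps an update function so that a store left uncommitted is
  // aborted when the task completes.
  def wrap[T, U](ctx: SimpleTaskContext)(
      update: (SimpleStore, Iterator[T]) => Iterator[U]): (SimpleStore, Iterator[T]) => Iterator[U] =
    (store, rows) => {
      ctx.addCompletionListener(() => if (!store.hasCommitted) store.abort())
      update(store, rows)
    }

  def main(args: Array[String]): Unit = {
    val ctx = new SimpleTaskContext
    val store = new SimpleStore
    // This update function never commits, so the listener aborts the store.
    val wrapped = wrap[Int, Int](ctx)((_, rows) => rows.map(_ * 2))
    val doubled = wrapped(store, Iterator(1, 2, 3)).toList
    ctx.markTaskCompleted()
    println(s"doubled=$doubled aborted=${store.wasAborted}")
  }
}
```

Registering the listener before calling the update function means the abort check still runs if the update function throws partway through a partition.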
mapPartitionsWithStateStore is used when the following physical operators are executed: