
ContinuousReader — Data Source Readers in Continuous Stream Processing

ContinuousReader is the <> of Spark SQL's DataSourceReader abstraction for <> in Continuous Stream Processing.

ContinuousReader is part of the novel Data Source API V2 in Spark SQL.

TIP: Read up on https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-data-source-api-v2.html[Data Source API V2] in https://bit.ly/spark-sql-internals[The Internals of Spark SQL] book.

[[contract]]
.ContinuousReader Contract
[cols="1m,3",options="header",width="100%"]
|===
| Method
| Description

| commit a| [[commit]]

[source, java]
----
void commit(Offset end)
----

Commits the specified offset

Used exclusively when ContinuousExecution is requested to <>

| deserializeOffset a| [[deserializeOffset]]

[source, java]
----
Offset deserializeOffset(String json)
----

Deserializes an offset from its JSON representation

Used when ContinuousExecution is requested to <> and <>

| getStartOffset a| [[getStartOffset]]

[source, java]
----
Offset getStartOffset()
----

Gets the current start offset (as specified or deserialized earlier)

NOTE: Used exclusively in tests.

| mergeOffsets a| [[mergeOffsets]]

[source, java]
----
Offset mergeOffsets(PartitionOffset[] offsets)
----

Merges the given per-partition offsets into a single global offset

Used exclusively when ContinuousExecution is requested to <>

| needsReconfiguration a| [[needsReconfiguration]]

[source, java]
----
boolean needsReconfiguration()
----

Indicates whether the reader needs reconfiguration (e.g. to generate new input partitions)

Used exclusively when ContinuousExecution is requested to <>

| setStartOffset a| [[setStartOffset]]

[source, java]
----
void setStartOffset(Optional<Offset> start)
----

Sets the desired start offset (or defaults to the earliest available offset when not specified)

Used exclusively when ContinuousExecution is requested to <>.

|===
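To illustrate how the contract's methods fit together, the following is a minimal, self-contained sketch (not Spark's actual classes) of a `ContinuousReader`-like reader backed by a simple counter. The `Offset` and `PartitionOffset` stand-ins, the `CounterOffset` type, and its numeric JSON format are assumptions for illustration only; the method signatures mirror the contract above.

[source, java]
----
import java.util.Optional;

// Toy stand-ins for the offset abstractions used by the contract.
interface Offset { String json(); }
interface PartitionOffset {}

// A global offset that is just a counter (hypothetical, for illustration).
class CounterOffset implements Offset {
    final long value;
    CounterOffset(long value) { this.value = value; }
    public String json() { return String.valueOf(value); }
}

// A per-partition offset carrying the partition's local counter.
class CounterPartitionOffset implements PartitionOffset {
    final long value;
    CounterPartitionOffset(long value) { this.value = value; }
}

// Minimal reader sketch following the ContinuousReader contract.
class CounterContinuousReader {
    private Offset startOffset;
    private Offset committed;

    // Start from the given offset, or from 0 when none is specified.
    void setStartOffset(Optional<Offset> start) {
        startOffset = start.orElse(new CounterOffset(0));
    }

    Offset getStartOffset() { return startOffset; }

    // Rebuild an offset from its JSON representation (here, a bare number).
    Offset deserializeOffset(String json) {
        return new CounterOffset(Long.parseLong(json));
    }

    // Merge per-partition offsets into a single global offset
    // (here, the maximum of the partition counters).
    Offset mergeOffsets(PartitionOffset[] offsets) {
        long max = 0;
        for (PartitionOffset po : offsets) {
            max = Math.max(max, ((CounterPartitionOffset) po).value);
        }
        return new CounterOffset(max);
    }

    // Record the offset up to which processing is complete.
    void commit(Offset end) { committed = end; }

    // This toy reader never needs new input partitions.
    boolean needsReconfiguration() { return false; }
}

public class ContinuousReaderSketch {
    public static void main(String[] args) {
        CounterContinuousReader reader = new CounterContinuousReader();
        reader.setStartOffset(Optional.empty());
        System.out.println(reader.getStartOffset().json()); // prints 0

        PartitionOffset[] parts = {
            new CounterPartitionOffset(3), new CounterPartitionOffset(7)
        };
        Offset merged = reader.mergeOffsets(parts);
        System.out.println(merged.json()); // prints 7

        reader.commit(reader.deserializeOffset(merged.json()));
    }
}
----

The flow mirrors what ContinuousExecution does with a real reader: set (or restore) the start offset, merge per-partition progress into a global offset, serialize it for the offset log, and commit it once processed.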

[[implementations]]
.ContinuousReaders
[cols="1,2",options="header",width="100%"]
|===
| ContinuousReader
| Description

| ContinuousMemoryStream | [[ContinuousMemoryStream]]

| KafkaContinuousReader | [[KafkaContinuousReader]]

| RateStreamContinuousReader | [[RateStreamContinuousReader]]

| TextSocketContinuousReader | [[TextSocketContinuousReader]]

|===