Skip to content

OffsetSeq

OffsetSeq is the metadata managed by Hadoop DFS-based metadata storage.

OffsetSeq is <> (possibly using the <> factory methods) when:

Creating Instance

OffsetSeq takes the following when created:

=== [[toStreamProgress]] Converting to StreamProgress -- toStreamProgress Method

[source, scala]

toStreamProgress( sources: Seq[BaseStreamingSource]): StreamProgress


toStreamProgress creates a new StreamProgress and adds the streaming sources for which there are new offsets available.

NOTE: <> is a collection with holes (empty elements) for streaming sources with no new data available.

toStreamProgress throws an AssertionError if the number of the input sources does not match the <>:

There are [[offsets.size]] sources in the checkpoint offsets and now there are [[sources.size]] sources requested by the query. Cannot continue.

toStreamProgress is used when:

  • MicroBatchExecution is requested to <> and <>

  • ContinuousExecution is requested for <>

=== [[toString]] Textual Representation -- toString Method

[source, scala]

toString: String

NOTE: toString is part of the ++https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#toString--++[java.lang.Object] contract for the string representation of the object.

toString simply converts the <> to JSON (if an offset is available) or - (a dash if an offset is not available for a streaming source at that position).

=== [[fill]] Creating OffsetSeq Instance -- fill Factory Methods

[source, scala]

fill( offsets: Offset*): OffsetSeq // <1> fill( metadata: Option[String], offsets: Offset*): OffsetSeq


<1> Uses no metadata (None)

fill simply creates an <> for the given variable sequence of Offsets and the optional OffsetSeqMetadata (in JSON format).

fill is used when:


Last update: 2021-02-07