High-Level Streams DSL

@jaceklaskowski / StackOverflow / GitHub / LinkedIn

The "Internals" Books: books.japila.pl
## Agenda

1. [High-Level Streams DSL](#/intro)
1. [StreamsBuilder](#/streamsbuilder)
1. [KStream](#/kstream)
1. [KTable](#/ktable)
1. [GlobalKTable](#/globalktable)
1. [Consumed](#/consumed)
1. [Produced](#/produced)
1. [KafkaStreams](#/kafkastreams)
1. [Kafka Streams DSL for Scala](#/kafka-streams-dsl-for-scala)
1. [Demo: Processing Record Stream with KStream](#/demo-processing-record-stream-kstream)
1. [Demo: Kafka Streams DSL for Scala](#/demo-kafka-streams-dsl-for-scala)
## High-Level Streams DSL
(1 of 2)
1. **Streams DSL** is a high-level API built on top of the Processor API
1. Built-in abstractions for streams and tables
   * KStream
   * KTable
   * GlobalKTable
1. Declarative, functional programming style (see the sketch below)
   * Stateless transformations (e.g. **map**, **flatMap**, **filter**)
   * Stateful transformations (e.g. windowed aggregations)
1. Recommended for most developers, especially beginners
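A minimal sketch of the two transformation flavors using the Scala DSL; the `page-views` topic and String-keyed records are assumptions for illustration.

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.{KStream, KTable}

val builder = new StreamsBuilder

// Record stream of page-view events keyed by user id (hypothetical topic)
val views: KStream[String, String] = builder.stream[String, String]("page-views")

// Stateless transformations: each record is processed independently
val normalized: KStream[String, String] =
  views
    .filter((_, page) => page.nonEmpty)
    .mapValues(_.toLowerCase)

// Stateful transformation: counting per key requires a local state store
val viewsPerUser: KTable[String, Long] = normalized.groupByKey.count()
```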
## High-Level Streams DSL
(2 of 2)
1. Typical development flow (sketched below)
   1. Read input streams (from Kafka topics)
   1. Apply transformations
   1. Write output streams (to Kafka topics or console)
1. Learn more in [The Internals of Kafka Streams](https://books.japila.pl/kafka-streams-internals/kstream)
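The read-transform-write flow, sketched with the Scala DSL (the topic names are hypothetical).

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder

val builder = new StreamsBuilder

builder
  .stream[String, String]("input-topic")  // 1. read an input stream
  .mapValues(_.toUpperCase)               // 2. apply transformations
  .to("output-topic")                     // 3. write the output stream

val topology = builder.build()            // topology to run with KafkaStreams
```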
## StreamsBuilder

1. **StreamsBuilder** is used to define a (processor) topology (see the sketch below)
1. Loads a record stream into a KStream
1. Loads a changelog stream into a KTable
1. Loads a changelog stream into a GlobalKTable
1. Each is built from records in one or more Kafka topics
1. [StreamsBuilder API](http://kafka.apache.org/33/javadoc/org/apache/kafka/streams/StreamsBuilder.html)
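A sketch of the three loading operators, assuming String-keyed topics named `clicks`, `users` and `countries`.

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.{KStream, KTable}
import org.apache.kafka.streams.kstream.GlobalKTable

val builder = new StreamsBuilder

// Record stream from a topic
val clicks: KStream[String, String] = builder.stream[String, String]("clicks")

// Changelog stream (local, partitioned) from a topic
val users: KTable[String, String] = builder.table[String, String]("users")

// Changelog stream fully replicated on every instance
val countries: GlobalKTable[String, String] = builder.globalTable[String, String]("countries")

// The resulting processor topology
val topology = builder.build()
```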
## KStream

1. **KStream** is an abstraction of a **record stream** (of key-value pairs)
1. A data record represents a self-contained datum in the unbounded data set
1. Data records (in a record stream) are always interpreted as an "INSERT"
   * No record replaces an existing row with the same key
1. Examples: credit card transactions, page view events, server log entries (see the sketch below)
1. [KStream API](http://kafka.apache.org/33/javadoc/org/apache/kafka/streams/kstream/KStream.html)
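A sketch of the "INSERT" interpretation with a hypothetical `card-transactions` topic: two records for the same card key are two independent facts, and neither replaces the other.

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.KStream

val builder = new StreamsBuilder

// Every record is an independent event keyed by card number
val transactions: KStream[String, Double] = builder.stream[String, Double]("card-transactions")

transactions
  .peek((card, amount) => println(s"card=$card amount=$amount"))  // each record flows through
  .filter((_, amount) => amount > 1000.0)
  .to("large-transactions")
```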
## KTable

1. **KTable** is an abstraction of a **changelog stream**
1. Data records (in a changelog stream) are interpreted as an "UPSERT" (see the sketch below)
   * "UPDATE" of the last value for the same record key
   * "INSERT" if a given key doesn't exist yet
1. A record with a **null** value represents a "DELETE" or tombstone for the record's key
1. [KTable API](http://kafka.apache.org/33/javadoc/org/apache/kafka/streams/kstream/KTable.html)
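A sketch of the UPSERT interpretation, assuming a hypothetical `user-emails` topic keyed by user id.

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.KTable

val builder = new StreamsBuilder

// Each record upserts the row for its key; a record with a null value deletes it (tombstone)
val userEmails: KTable[String, String] = builder.table[String, String]("user-emails")

// Observe the changelog as a stream of (key, latest value) updates
userEmails.toStream.to("user-email-updates")
```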
## GlobalKTable

1. **GlobalKTable** is an abstraction of a **changelog stream**
   * Similar to KTable
1. Fully replicated per KafkaStreams instance
   * The local GlobalKTable instance of each application instance is populated with data from all partitions of the topic
1. Provides the ability to look up current values of data records by key
1. Can only be used as the right-hand side input for stream-table joins (see the join sketch below)
1. [GlobalKTable API](http://kafka.apache.org/33/javadoc/org/apache/kafka/streams/kstream/GlobalKTable.html)
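A sketch of a stream-to-GlobalKTable join, assuming hypothetical `orders` and `customers` topics; the key-mapping and joining functions are placeholders.

```scala
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.KStream
import org.apache.kafka.streams.kstream.GlobalKTable

val builder = new StreamsBuilder

val orders: KStream[String, String] = builder.stream[String, String]("orders")

// Fully replicated lookup table, usable only as the right-hand side of the join
val customers: GlobalKTable[String, String] = builder.globalTable[String, String]("customers")

// Enrich each order with the current customer record looked up by key
val enriched: KStream[String, String] =
  orders.join(customers)(
    (_, order) => order,                          // derive the lookup key (placeholder: value holds the customer id)
    (order, customer) => s"$order -> $customer"   // combine the order with the matching customer
  )

enriched.to("enriched-orders")
```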
## Consumed

1. **Consumed** defines the optional parameters (metadata) that describe how to consume a record stream (see the sketch below)
   * Used in the **stream**, **table**, **globalTable**, and **addGlobalStore** operators
1. Learn more in [The Internals of Kafka Streams](https://books.japila.pl/kafka-streams-internals/kstream/Consumed/)
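A sketch of building a Consumed explicitly (serdes plus an offset reset policy); the `events` topic is hypothetical.

```scala
import org.apache.kafka.streams.Topology
import org.apache.kafka.streams.kstream.Consumed
import org.apache.kafka.streams.scala.serialization.Serdes.stringSerde
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.kstream.KStream

val builder = new StreamsBuilder

// Consumed bundles the serdes plus optional extras (timestamp extractor, offset reset policy)
implicit val consumed: Consumed[String, String] =
  Consumed.`with`(stringSerde, stringSerde)
    .withOffsetResetPolicy(Topology.AutoOffsetReset.EARLIEST)

// Picked up implicitly by stream (also used by table, globalTable and addGlobalStore)
val events: KStream[String, String] = builder.stream[String, String]("events")
```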
## Produced

1. **Produced** defines the optional parameters (metadata) that describe how to produce a record stream (see the sketch below)
   * Used in the **KStream.to** operator
1. Learn more in [The Internals of Kafka Streams](https://books.japila.pl/kafka-streams-internals/kstream/Produced/)
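A sketch of building a Produced explicitly and passing it to KStream.to; the topic names are hypothetical.

```scala
import org.apache.kafka.streams.kstream.Produced
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder

val builder = new StreamsBuilder

// Produced bundles the serdes (and optionally a custom stream partitioner) for writing
val produced: Produced[String, String] = Produced.`with`(stringSerde, stringSerde)

builder
  .stream[String, String]("words")   // Consumed comes from the implicits
  .to("words-copy")(produced)        // Produced given explicitly to KStream.to
```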
## KafkaStreams

1. **KafkaStreams** is a Kafka client
1. Performs continuous computation on data from one or more input topics
1. Sends output to zero, one, or more output topics
1. Executes a DAG topology of Processors (see the sketch below)
1. [KafkaStreams API](http://kafka.apache.org/33/javadoc/org/apache/kafka/streams/KafkaStreams.html)
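A minimal sketch of running a topology with the KafkaStreams client; the application id, bootstrap servers and topic names are assumptions.

```scala
import java.util.Properties
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

val builder = new StreamsBuilder
builder.stream[String, String]("input-topic").to("output-topic")

// Minimal required configuration (values are assumptions)
val props = new Properties()
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app")
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

// The client that executes the topology as a DAG of processors
val streams = new KafkaStreams(builder.build(), props)
streams.start()

// Close the client on shutdown
sys.addShutdownHook(streams.close())
```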
## Kafka Streams DSL for Scala

1. **Kafka Streams DSL for Scala** is a wrapper over the existing Java APIs for the Kafka Streams DSL
1. Makes the Java APIs more usable in Scala (see the sketch below)
   * Better type inference
   * Enhanced expressiveness
   * Less boilerplate
1. Learn more
   * [Official documentation](https://kafka.apache.org/33/documentation/streams/developer-guide/dsl-api.html#scala-dsl)
   * [The Internals of Kafka Streams](https://books.japila.pl/kafka-streams-internals/scala)
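A sketch of what the Scala DSL buys: implicit serdes and plain Scala functions instead of explicit Serdes, Consumed and Produced. The sbt coordinates and topic names are assumptions.

```scala
// build.sbt (assumed coordinates for the Scala DSL artifact)
// libraryDependencies += "org.apache.kafka" %% "kafka-streams-scala" % "3.3.2"

import org.apache.kafka.streams.scala.ImplicitConversions._   // Consumed/Produced/Grouped derived from implicit serdes
import org.apache.kafka.streams.scala.serialization.Serdes._  // implicit serdes for common types
import org.apache.kafka.streams.scala.StreamsBuilder

val builder = new StreamsBuilder

// No explicit serde wiring and plain Scala functions instead of Java lambdas
builder
  .stream[String, String]("sentences")
  .flatMapValues(_.toLowerCase.split("\\W+").toList)
  .to("words")
```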
# Demo
## Processing Record Stream with KStream

[Demo: Developing Kafka Streams Application](https://books.japila.pl/kafka-streams-internals/demo/developing-kafka-streams-application/)
# Demo
## Kafka Streams DSL for Scala

[Scala API for Kafka Streams](https://books.japila.pl/kafka-streams-internals/scala/)
## Recap

1. [High-Level Streams DSL](#/intro)
1. [StreamsBuilder](#/streamsbuilder)
1. [KStream](#/kstream)
1. [KTable](#/ktable)
1. [GlobalKTable](#/globalktable)
1. [Consumed](#/consumed)
1. [Produced](#/produced)
1. [KafkaStreams](#/kafkastreams)
1. [Kafka Streams DSL for Scala](#/kafka-streams-dsl-for-scala)
1. [Demo: Processing Record Stream with KStream](#/demo-processing-record-stream-kstream)
1. [Demo: Kafka Streams DSL for Scala](#/demo-kafka-streams-dsl-for-scala)
# Questions?

* Read [The Internals of Apache Kafka](https://books.japila.pl/kafka-internals/)
* Read [The Internals of Kafka Streams](https://books.japila.pl/kafka-streams-internals)
* Read [The Internals of ksqlDB](https://books.japila.pl/ksqldb-internals/)
* Follow [@jaceklaskowski](https://twitter.com/jaceklaskowski) on Twitter (DMs open)
* Upvote [my questions and answers on StackOverflow](http://stackoverflow.com/users/1305344/jacek-laskowski)
* Contact me at **jacek@japila.pl**