Level Up Your Kafka Skills in Just 5 Days | Join Season of Streaming On-Demand

Presentation

Using Modular Topologies in Kafka Streams to scale ksqlDB’s persistent queries

« Kafka Summit London 2022

ksqlDB is a streaming database that uses Kafka Streams to execute queries against data in Apache Kafka®. Historically, each query was compiled into its own Kafka Streams program to be executed inside the ksqlDB servers. As ksqlDB moved to support broader and more complex use cases, this query execution strategy became the bottleneck for scaling up the number of persistent queries. This talk will examine the problems faced and how we addressed them.

Using too many Kafka Streams instances requires too many resources in both threads and consumers. One way to avoid this is using Modular Topologies, which are coming to Kafka Streams in KIP-809. Modular Topologies allow us to dynamically change the workload of a Kafka Streams application while it’s running and share resources such as consumer/producer clients and processing threads. This makes it possible to use a single Kafka Streams runtime for multiple topologies that share consumers and threads across them. We will see in detail how this makes it possible for ksqlDB to consolidate queries into a shared Kafka Streams runtime.

Kafka Streams developers will take away from this talk an understanding of how to utilize ModularTopologies, and dynamically upgrade their Kafka Streams workload effectively.

Related Links

How Confluent Completes Apache Kafka eBook

Leverage a cloud-native service 10x better than Apache Kafka

Confluent Developer Center

Spend less on Kafka with Confluent, come see how