Consumer scaling is a crucial element for many Apache Kafka users. Who doesn't want to save money by managing resources efficiently: shutting down unnecessary instances when there is no traffic, scaling up quickly during peak hours, and, while doing all of that, avoiding annoying and often unnecessary rebalancing?
To achieve all of this, you need to understand how consumer assignment works, how nodes are affected by data load, and what the common causes of rebalancing are. Most importantly, you need to know which assignors to choose for your use case and which metrics to use to measure your data load.
Wondering how we know which practices are good and which are bad? At Aiven, we've seen firsthand both successful and not-so-great approaches to consumer scaling and rebalancing. The insights we're sharing come directly from our experience working on many Apache Kafka projects.
We'll discuss the metrics that are essential for understanding data load and deciding when to scale. We'll cover a variety of approaches you can take: commonly used lag exporters, Knative scalers based on concurrent requests, and finally insights from our own experience developing a speed lag predictor that goes beyond the basics by calculating the velocity of data load changes. We'll highlight the advantages and disadvantages of each approach and when you should use it.
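To make the "velocity" idea concrete, here is a minimal sketch of a lag-velocity tracker. This is an illustration only, not Aiven's actual predictor: the class name, window size, and thresholds (`lag_threshold`, `growth_threshold`) are all hypothetical. The point is that scaling on the *rate of change* of lag lets you react before absolute lag becomes a problem.

```python
from collections import deque


class LagVelocityTracker:
    """Keeps a sliding window of (timestamp, total_lag) samples and
    estimates how fast consumer lag is growing or shrinking.

    Hypothetical sketch; names and defaults are illustrative.
    """

    def __init__(self, window: int = 5):
        self.samples = deque(maxlen=window)  # (timestamp_seconds, total_lag)

    def record(self, timestamp: float, total_lag: int) -> None:
        self.samples.append((timestamp, total_lag))

    def velocity(self) -> float:
        """Lag gained (positive) or shed (negative) per second over the window."""
        if len(self.samples) < 2:
            return 0.0
        (t0, lag0), (t1, lag1) = self.samples[0], self.samples[-1]
        return (lag1 - lag0) / (t1 - t0)


def scaling_hint(current_lag: int, velocity: float,
                 lag_threshold: int = 10_000,
                 growth_threshold: float = 50.0) -> str:
    # Scale up when lag is already high, or growing fast enough that it soon will be.
    if current_lag > lag_threshold or velocity > growth_threshold:
        return "scale-up"
    # Scale down only when there is no lag and it is not growing.
    if current_lag == 0 and velocity <= 0:
        return "scale-down"
    return "hold"
```

For example, going from 1,000 to 2,000 messages of lag in 10 seconds gives a velocity of 100 messages/second, which exceeds the (illustrative) growth threshold and suggests scaling up even though absolute lag is still low.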
Next, we'll look at the assignors that are available and guide you in choosing the most suitable one for your scenario. We'll pay special attention to the challenges faced by stateful applications and the potential pitfalls of frequent scaling, such as overloaded brokers.
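To illustrate why the choice of assignor matters, here is a small, self-contained sketch of two classic strategies for a single topic: range assignment (contiguous blocks per consumer) and round-robin assignment (partitions dealt out one at a time). This is a simplified model for intuition, not Kafka's actual assignor code, which also handles multiple topics, subscriptions, and cooperative rebalancing.

```python
def range_assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Range-style assignment: sorted consumers get contiguous partition blocks;
    the first (partitions % len(consumers)) consumers get one extra partition."""
    consumers = sorted(consumers)
    per, extra = divmod(partitions, len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[c] = list(range(start, start + count))
        start += count
    return assignment


def round_robin_assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin assignment: partition p goes to consumer p % len(consumers)."""
    consumers = sorted(consumers)
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment
```

With 7 partitions and 3 consumers, range gives `c0` partitions `[0, 1, 2]` while round-robin gives it `[0, 3, 6]`; both leave one consumer carrying an extra partition, but they differ in *which* partitions move when the group changes size, and that difference drives how disruptive a rebalance is.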
Armed with this knowledge, you'll have what you need to build scalable systems, minimize downtime, and save costs when working with Apache Kafka. Let's make your Kafka experience as smooth and efficient as possible!