[Webinar] How to Protect Sensitive Data with CSFLE | Register Today

What’s New in Apache Kafka 2.7.0

Verfasst von

I’m proud to announce the release of Apache Kafka 2.7.0 on behalf of the Apache Kafka® community. The 2.7.0 release contains many new features and improvements. This blog post highlights some of the more prominent ones. Be sure to see the release notes for the full list of changes. You can also watch the release video for a summary of what’s new:

In this release, we’ve continued steady progress toward the task of replacing ZooKeeper in Kafka with KIP-497, which adds a new inter-broker API for altering the in-sync replica (ISR). This release also provides the addition of the Core Raft Implementation as part of KIP-595. Now there is a separate “Raft” module containing the core consensus protocol. Until integration with the controller (the broker in the Kafka cluster responsible for managing the state of partitions and replicas) is complete, there is a standalone server that you can use for testing the performance of the Raft implementation.

Of course, there are additional efforts underway toward replacing Zookeeper, with seven KIPs in active development to provide support for more partitions per cluster, simpler operation, and tighter security.

Tiered Storage work continues and unlocks infinite scaling and faster rebalance times via KIP-405.

Kafka broker, producer, and consumer

KIP-654: Aborted transaction with non-flushed data should throw a non-fatal exception

When a Java client producer aborts a transaction with any non-flushed (pending) data, a fatal exception is thrown. But aborting a transaction with pending data is in fact considered a normal situation. The thrown exception should be to notify you that records aren’t being sent, not that the application is in an unrecoverable state. KIP-654 introduces a new exception TransactionAbortedException, allowing you to retry if desired.

KIP-651: Support PEM format for private keys and SSL certificates and private key

Currently, Kafka only supports JKS or PKCS12 file-based key and trust stores when using SSL. While it’s no longer a standard for email, the Privacy-Enhanced Mail (PEM) is a standard format for storing and distributing cryptographic keys and certificates. KIP-651 adds support for PEM files for key and trust stores, allowing the use of third party providers relying on the PEM format.

KIP-612: Ability to limit connection creation rate on brokers

Creating connections adds CPU overhead to the broker. Connection storms can come from seemingly well-behaved clients and can stop the broker from performing other useful work. But now there is now a way of enforcing broker-wide and per-listener connection creation rates. The 2.7.0 release contains the first part of KIP-612, with per-IP connections rate limits expected to come in the 2.8.0 release.

KIP-599: Throttle create topic, create partition, and delete topic operations

The APIs to create topics, create partitions, and delete topics are operations that have a direct impact on the overall load in the Kafka controller. To prevent a cluster from being overwhelmed due to high concurrent topic and partition creations or topic deletions, there is a new quota limiting these operations. See KIP-599 for more details.

KIP-584: Versioning scheme for features

Apart from broker-client compatibility (for which Kafka has a strong record to ensure they do remain compatible), there are two main questions when new features become available in Kafka:

  1. How do Kafka clients become aware of broker capabilities?
  2. How does the broker decide which features to enable?

KIP-584 provides a flexible and operationally friendly solution for client discovery, feature gating, and rolling upgrades using a single restart.

KIP-554: Add broker-side SCRAM configuration API

With KIP-554, SCRAM credentials can be managed via the Kafka protocol and the kafka-configs tool was updated to use the newly introduced protocol APIs. This is another important step towards KIP-500 where ZooKeeper is replaced by a built in quorum..

KIP-497: Add inter-broker API to alter ISR

Currently, Kafka partition leader and ISR information is stored in ZooKeeper. Either the controller or a partition leader may update this information. Because either can update this state, there needs to be a mechanism for sharing this information, which can cause delays in reflecting ISR changes. The impact of these delays means that metadata requests may receive stale information.

In the 2.7.0 release, there is a new AlterIsr API, which gives the controller the exclusive ability to update the state of partition leaders and ISR. The chief benefit of this new API is that metadata requests will always reflect the latest state.

The addition of this API is a significant step forward in the process of removing ZooKeeper and the completion of KIP-500. For more information, see KIP-497.

KIP-431: Print additional fields from records with the ConsoleConsumer

Now you can print the headers on a ConsumerRecord with the ConsoleConsumer. See KIP-431 for more details.

Kafka Connect

KIP-632: Add DirectoryConfigProvider

KIP-632 adds a DirectoryConfigProvider class to support users needing to provide secrets for keys stored in a container filesystem, such as a Kubernetes environment.

Kafka Streams

KIP-662: Throw exception when source topics of Kafka Streams application is deleted

Today, if a user deletes the source topic of a running Kafka Streams application, the embedded consumer clients gracefully shut down. This client shutdown triggers rebalancing until all StreamThreads of the Streams application gracefully exit, leaving the application completely shut down without any chance to respond to the error. With the addition of KIP-662, when a user deletes a source topic from a running Streams application, the app throws a MissingSourceTopicException, allowing for you to react to the error.

KIP-648: Renaming getter method for interactive queries

KIP-648 changes the getter methods for interactive query objects to follow the Kafka format of not using the get prefix.

KIP-617: Allow Kafka Streams state stores to be iterated backwards

Currently, when using an iterator over a Kafka Streams state store, you can only traverse elements from oldest to newest. When iterating over a windowed state store and the user desires to return the latest N records, there is no choice but to use the inefficient approach of traversing all the oldest records before getting to the desired newer records. KIP-617 adds support for iteration over a state store in reverse. Iterating in reverse makes a latest N records retrieval much more efficient.

KIP-616: Rename implicit SerDes instances in kafka-streams-scala

Kafka Streams now how better Scala implicit Serdes support with KIP-616.

KIP-613: Add end-to-end latency metrics to Kafka Streams

Currently, the actual end-to-end latency of a record flowing through Kafka Streams is difficult to gauge at best. Kafka Streams now exposes end-to-end metrics, which will be a great help for enabling users to make design choices. See KIP-613 for more information.

KIP-607: Add metrics to Kafka Streams to report properties of RocksDB

The current metrics exposed by Kafka Streams for RocksDB do not include information on memory or disk usage. Now in 2.7.0, Kafka Streams reports properties RocksDB exposes by default. See KIP-607 for more details.

KIP-450: Sliding window aggregations in the DSL

Kafka Streams implements session windows, tumbling windows, and hopping windows as windowed aggregation methods. While hopping windows with a small advance time can imitate the behavior of a sliding window, this implementation’s performance is poor because it results in many overlapping and often redundant windows that require expensive calculations. With the addition of sliding windows via KIP-450, Kafka Streams now provides an efficient way to perform sliding aggregations.

Conclusion

To learn more about what’s new in Apache Kafka 2.7 and to see all the KIPs included in this release, be sure to check out the release notes and highlights in the release video.

To download Apache Kafka 2.7.0, visit the project’s download page.

Of course, this release would not have been possible without a huge effort from the community. A big thank you to everyone involved in this release, including the following 116 people (according to the git shortlog) who contributed either code or documentation:

A. Sophie Blee-Goldman, Chia-Ping Tsai, John Roesler, David Jacot, Jason Gustafson, Matthias J. Sax, Bruno Cadonna, Ismael Juma, Guozhang Wang, Rajini Sivaram, Luke Chen, Boyang Chen, Tom Bentley, showuon, leah, Bill Bejeck, Chris Egerton, Ron Dagostino, Randall Hauch, Xavier Léauté, Kowshik Prakasam, Konstantine Karantasis, David Arthur, Mickael Maison, Colin Patrick McCabe, huxi, Nikolay, Manikumar Reddy, Jorge Esteban Quilcate Otoya, Vito Jeng, bill, Bob Barrett, vinoth chandar, feyman2016, Jim Galasyn, Greg Harris, khairy, Sanjana Kaundinya, Ning Zhang, Aakash Shah, Andras Katona, Andre Araujo, Andy Coates, Anna Povzner, Badai Aqrandista, Brian Byrne, Dima Reznik, Jeff Kim, John Thomas, Justine Olshan, Lee Dongjin, Leonard Ge, Lucas Bradstreet, Mario Molina, Michael Bingham, Rens Groothuijsen, Stanislav Kozlovski, Yuriy Badalyantc, Levani Kokhreidze, Lucent-Wong, Gokul Srinivas, Mandar Tillu, Gal Margalit, tswstarplanet, Evelyn Bayes, Micah Paul Ramos, vamossagar12, Ego, Navina Ramesh, Nikhil Bhatia, Edoardo Comar, Nikolay Izhikov, Dhruvil Shah, Nitesh Mor, Noa Resare, David Mao, Raman Verma, Cheng Tan, Adam Bellemare, Richard Fussenegger, Rob Meng, Rohan, Can Cecen, Benoit Maggi, Sasaki Toru, Shaik Zakir Hussain, Shailesh Panwar, Sharath Bhat, voffcheg109, Thorsten Hake, Auston, Vikas Singh, Ashish Roy, Arjun Satish, xakassi, Zach Zhang, albert02lowis, Antony Stubbs, Ankit Kumar, gnkoshelev, high.lee, huangyiming, Andrew Egelhofer, jeff kim, jiameixie, Andrew Choi, JoelWee, Jesse Gorzinski, Alex Diachenko, Ivan Yurchenko, manijndl7, Igor Soarez, Gonzalo Muñoz, sbellapu, serjchebotarev, and Adem Efe Gencer

This post was originally published by Bill Bejeck on The Apache Software Foundation blog.

  • Bill has been a software engineer for over 18 years. Currently, he is working at Confluent on the DevX team. Previously, Bill was an engineer on the Kafka Streams team for three-plus years. Before Confluent, he worked on various ingest applications as a U.S. Government contractor using distributed software such as Apache Kafka, Spark, and Hadoop. Bill has also written a book about Kafka Streams titled "Kafka Streams in Action" and is working on a 2nd edition that should be available Spring 2024.

Ist dieser Blog-Beitrag interessant? Jetzt teilen