Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | April 2016

Written by

Gwen Shapira, Engineering Manager, Confluent

Apr 1, 2016 | Reading time: 3 min

The Apache Kafka community was crazy-busy last month. We released a technical preview of Kafka Streams and then voted on a release plan for Kafka 0.10.0. We accelerated the discussion of a few key proposals in order to make the release, rolled out two release candidates, and then decided to put the release on hold in order to get a few more changes in.

Kafka Streams tech preview! If you are interested in a new, lightweight, easy-to-use way to process streams of data, I highly recommend you take a look.
If you are interested in the theory of stream processing, check out Making Sense of Stream Processing – download the eBook while it’s still available. The book is written by Martin Kleppmann and if you’ve been interested in Kafka and stream processing for a while, you know his work is always worth reading.
Wondering what will be included in the 0.10.0 release? Worried that there are critical issues left? Take a look at our release plan.
The pull request implementing KIP-36 was merged. KIP-36 adds rack awareness to Kafka: brokers can now be assigned to specific racks, and when topics and partitions are created, the replicas are assigned to brokers based on their rack placement.
The pull request implementing KIP-51 was merged. KIP-51 is a very small change to the Connect REST API, allowing users to ask for a list of available connectors.
The pull request implementing KIP-45 was merged. KIP-45 is a small change to the new consumer API that standardizes the types of containers accepted by the various consumer API calls.
KIP-43, which adds support for standard SASL mechanisms in addition to Kerberos, was voted in. We will try to get this merged into Kafka in release 0.10.0.
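To make the rack-awareness idea from KIP-36 concrete, here is a minimal sketch of how replicas could be spread across racks. It is not Kafka's actual assignment algorithm; the function names and the sample broker-to-rack mapping are hypothetical, and the only goal is to show why knowing each broker's rack lets replicas of a partition land on different racks.

```python
def rack_alternating_brokers(broker_racks):
    """Order brokers so neighbours in the list sit on different racks where possible.

    broker_racks: dict of broker id -> rack name (hypothetical sample input).
    """
    by_rack = {}
    for broker, rack in sorted(broker_racks.items()):
        by_rack.setdefault(rack, []).append(broker)

    # Take one broker from each rack in turn until every rack is exhausted.
    queues = [list(brokers) for _, brokers in sorted(by_rack.items())]
    ordered = []
    while any(queues):
        for queue in queues:
            if queue:
                ordered.append(queue.pop(0))
    return ordered


def assign_replicas(broker_racks, num_partitions, replication_factor):
    """Simplified round-robin assignment over the rack-alternating broker list."""
    brokers = rack_alternating_brokers(broker_racks)
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds the number of brokers")
    return {
        partition: [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        for partition in range(num_partitions)
    }


# Hypothetical cluster: four brokers split evenly over two racks.
racks = {0: "rack-a", 1: "rack-a", 2: "rack-b", 3: "rack-b"}
assignment = assign_replicas(racks, num_partitions=4, replication_factor=2)
```

With evenly sized racks, as in this sample, each partition's two replicas end up on different racks, so losing a single rack does not take out every copy of a partition. With uneven racks this simplified version gives no such guarantee.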
There are quite a few KIPs under very active discussions:
- KIP-4, adding an API for administrative actions such as creating new topics, requires some modifications to MetadataRequest.
- KIP-35 adds a new protocol for getting the current version of all requests supported by a Kafka broker. This protocol improvement will make it possible to write Kafka clients that will work with brokers of different versions.
- KIP-33 adds time-based indexes to Kafka, supporting both time-based log purging and time-based data lookup.
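As an illustration of why the KIP-35 protocol helps clients work with brokers of different versions, here is a rough sketch of the client-side logic it enables. It assumes the broker reports a minimum and maximum supported version per request type; the function names, request names, and version numbers below are made up for the example and are not taken from any real client.

```python
def pick_version(client_range, broker_range):
    """Return the highest version both sides support, or None if ranges do not overlap."""
    lowest = max(client_range[0], broker_range[0])
    highest = min(client_range[1], broker_range[1])
    return highest if lowest <= highest else None


def negotiate(client_supported, broker_supported):
    """Map each request type to the version the client should use with this broker.

    Both arguments are dicts of request name -> (min_version, max_version).
    Request types the broker does not report are left out of the result.
    """
    usable = {}
    for api, client_range in client_supported.items():
        broker_range = broker_supported.get(api)
        if broker_range is None:
            continue
        version = pick_version(client_range, broker_range)
        if version is not None:
            usable[api] = version
    return usable


# Hypothetical version ranges, for illustration only.
client = {"Produce": (0, 2), "Fetch": (0, 2), "CreateTopics": (0, 0)}
broker = {"Produce": (0, 1), "Fetch": (1, 3)}
usable = negotiate(client, broker)
```

In this sample the client would fall back to an older Produce version and skip CreateTopics entirely, since the broker does not report support for it.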

The Databaseline blog compared stream processing technologies that are Apache projects. But there is more than one way to compare stream processing technologies: Google compared the various implementations of the Apache Beam API, using a somewhat different set of criteria.
Congratulations to our friends at Data Artisans, the company behind Apache Flink, on raising their Series A.
Kostas Pardalis blogged about his experiences and provided a practical guide for developing a Kafka connector. DataMountaineers released a command line interface for Kafka Connect.
Rajini, a very active Kafka contributor and the brains behind KIP-43, did a short write-up on the benefits of running Kafka in a cloud service.
"MicroServices and Kafka play together", based on a QCon talk by our own Ben Stopford, is an excellent read.
Confluent announced their first set of Apache Kafka training courses available to the public. More classes will be added soon in other cities.

That’s all for now! Got a newsworthy item? Let us know. If you are interested in contributing to Apache Kafka, check out the contributor guide to help you get started.

Gwen Shapira is a Software Engineer at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specialises in building real-time, reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of books including “Kafka: The Definitive Guide”, and a frequent presenter at data-related conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
