I’m brand new to writing KIPs (Kafka Improvement Proposals). I’ve written two so far, and my hands sweat every time I hit send on an email with ‘[DISCUSS] KIP’ in the title. But I’ve also learned a lot from the process: about Apache Kafka® internals, the process of writing KIPs, the Kafka community, and the most important motivation for developing software: our end users.
What did I actually write? Let’s review KIP-941 and KIP-1020. They’re byte-size 😉, but they’ll help put what I’m saying into context.
I wrote my first KIP after being approached by a community member who had identified this issue and was wondering if I could pitch in. Check out this block of user code:
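A minimal sketch of what that kind of user code tends to look like, not the exact snippet from the KIP. To keep it runnable without Kafka on the classpath, it uses RangeQuerySketch, a hypothetical stand-in that only mimics the pre-KIP null handling of RangeQuery; the handler class and the buildQuery method are likewise invented for illustration.

```java
import java.util.Optional;

public class RangeRequestHandler {

    // Hypothetical stand-in for the pre-KIP-941 RangeQuery:
    // Optional.of rejects null bounds with a NullPointerException.
    static final class RangeQuerySketch<K> {
        final Optional<K> lower;
        final Optional<K> upper;

        private RangeQuerySketch(Optional<K> lower, Optional<K> upper) {
            this.lower = lower;
            this.upper = upper;
        }

        static <K> RangeQuerySketch<K> withRange(K lower, K upper) {
            // Throws NullPointerException if either bound is null.
            return new RangeQuerySketch<>(Optional.of(lower), Optional.of(upper));
        }

        static <K> RangeQuerySketch<K> withLowerBound(K lower) {
            return new RangeQuerySketch<>(Optional.of(lower), Optional.empty());
        }

        static <K> RangeQuerySketch<K> withUpperBound(K upper) {
            return new RangeQuerySketch<>(Optional.empty(), Optional.of(upper));
        }

        static <K> RangeQuerySketch<K> withNoBounds() {
            return new RangeQuerySketch<>(Optional.empty(), Optional.empty());
        }
    }

    // The chain a caller needs when either request parameter may be null.
    static RangeQuerySketch<Integer> buildQuery(Integer lower, Integer upper) {
        if (lower != null && upper != null) {
            return RangeQuerySketch.withRange(lower, upper);
        } else if (lower != null) {
            return RangeQuerySketch.withLowerBound(lower);
        } else if (upper != null) {
            return RangeQuerySketch.withUpperBound(upper);
        } else {
            return RangeQuerySketch.withNoBounds();
        }
    }
}
```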
You’re probably getting itchy looking at the repeated else if’s. This is an example of what you’d have to write before this KIP was implemented. This RangeQuery class is part of the Interactive Query API, and it provides methods for setting the upper and lower bounds of the range.
Web client requests have query parameters that often define the bounds of a range query. Those query parameters are sometimes null, meaning you want all the data from either or both ends. Originally you’d have to write the above if-else chain of logic to avoid getting a NullPointerException, since the RangeQuery class used the Optional.of method, which does not allow null parameters.
The proposed change to the Kafka Streams codebase was a one-liner: where RangeQuery wrapped the bounds with Optional.of, it now wraps them with Optional.ofNullable.
Now, if the developer passed in null values to the RangeQuery, it would perform a full scan, and the extra logic could be avoided. Even if this is only a small behavioral improvement, it’s a change in the public API contract and thus requires a KIP.
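The difference between the two JDK methods can be seen in isolation. This is a small sketch using only java.util.Optional; the class and the boundBefore and boundAfter helpers are hypothetical names for illustration, not Kafka Streams code.

```java
import java.util.Optional;

public class NullableBoundDemo {

    // Before: Optional.of throws NullPointerException on a null bound.
    static <K> Optional<K> boundBefore(K bound) {
        return Optional.of(bound);
    }

    // After: Optional.ofNullable maps a null bound to Optional.empty(),
    // which can then be treated as "no bound on this side".
    static <K> Optional<K> boundAfter(K bound) {
        return Optional.ofNullable(bound);
    }

    public static void main(String[] args) {
        String missing = null;

        // An absent bound is now simply an empty Optional.
        System.out.println(boundAfter(missing).isPresent()); // prints false

        try {
            boundBefore(missing);
        } catch (NullPointerException e) {
            System.out.println("Optional.of rejected the null bound");
        }
    }
}
```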
In 2023, I decided to do a little bit of docs cleanup in Kafka Streams. Some new pieces of configuration needed to be added, some pieces that had been removed from the code needed to be dropped, and some deprecated pieces needed to be marked as deprecated in the docs.
When I made a pull request to get this done, the resulting discussion revealed that a piece of configuration in Kafka Streams, window.size.ms, actually belonged to the client.
Although this would involve a small change, deprecation affects user-facing behavior, so a KIP was required.
I placed the KIP under discussion using the Apache Kafka mailing list, which is the first step of the process after writing a KIP draft. During the discussion, it came up that windowed.inner.serde.class was similar to window.size.ms, and that it should eventually be moved to a TimeWindowedDe/Serializer class as well, after deprecation. I edited the KIP in response. The next step in KIP writing is to submit it to a vote on the mailing list. After the KIP receives three binding votes, it becomes “Accepted” and can be worked on.
This is why I love writing KIPs: it forces me to think through the changes I’m proposing. There’s a template for writing KIPs, and it includes several headings:
The “Status” heading is where the KIP is marked “Under discussion”, “Accepted”, or “Rejected”. You’ll also put the pertinent links to the JIRA ticket and discussion thread here. As I write this, KIP-941 has been accepted and implemented, and KIP-1020 has been accepted, with implementation to begin soon.
The “Motivation” heading asks: What’s the problem? What’s bothering people? What stands to be improved? This is where you lay out the issue. If you write a KIP, don’t shortcut this section; spend enough time on it to make it good. Your explanation does not need to be long, but it should be clear and thorough, because this is the best way to convince others upfront that the problem matters and to get their buy-in early on.
The “Public Interfaces” heading, along with “Compatibility, Deprecation, and Migration Plan”, makes you think about how your change affects Kafka developers. When a developer uses Kafka, what will be different for them after your proposed change? Will it break the way they use it after an upgrade?
Neither of the KIPs I wrote introduced an immediate breaking change, and, in fact, KIPs should not propose immediate breaking changes. KIP-941 had none to begin with. KIP-1020 was different: moving or removing a piece of configuration is a breaking change, so it proposed deprecation first, along with checking for the new pieces of configuration and falling back on the old ones.
In KIP-1020, marking a piece of configuration as deprecated generates a warning but does not break the experience for the Kafka developer. With the deprecation comes the implication that the configuration should be removed in some future release, but the point is to avoid a breaking change and protect Kafka Streams users.
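The fallback idea can be sketched as follows. This is an illustration of the general pattern under stated assumptions, not the code from KIP-1020: the replacement key name, the class, and the resolveWindowSize helper are all hypothetical, and only window.size.ms comes from the KIP discussion.

```java
import java.util.Map;
import java.util.logging.Logger;

public class ConfigFallbackSketch {
    private static final Logger LOG =
        Logger.getLogger(ConfigFallbackSketch.class.getName());

    // Deprecated key named in this post.
    static final String DEPRECATED_KEY = "window.size.ms";
    // Hypothetical replacement key, invented for this sketch only.
    static final String NEW_KEY = "example.new.window.size.ms";

    // Prefer the new key; fall back on the deprecated one with a warning
    // so existing configurations keep working after an upgrade.
    static Long resolveWindowSize(Map<String, ?> configs) {
        Object value = configs.get(NEW_KEY);
        if (value == null) {
            value = configs.get(DEPRECATED_KEY);
            if (value != null) {
                LOG.warning(DEPRECATED_KEY + " is deprecated; use " + NEW_KEY + " instead");
            }
        }
        // Returns null when neither key is set.
        return value == null ? null : Long.valueOf(value.toString());
    }
}
```

A user who has only set the deprecated key still gets a value plus a warning, which is what keeps the change from being a breaking one.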
With KIP-941, the developers who had written the unwieldy but necessary code block would still see it functioning. The KIP just introduced new functionality for developers using RangeQuery from then on.
The “Proposed Changes” heading is where you outline the changes to the codebase that will be required. Of course, it’s more of a high-level overview than a pull request. This section explains how the feature works in detail and how it fits into the current architecture of the code. For more difficult KIPs, this is the main section.
“Compatibility, Deprecation, and Migration Plan” is another heading that requires reflection on the effect your change has on the end user. If the developer upgrades, will there be a breaking change? Is the improvement the type that requires a deprecation to warn of a future breaking change?
Under “Test Plan”, the community is asking about your plan for testing, especially regarding end-to-end tests. I find this section to be a helpful reminder to think about testing. Often, this brings up new aspects of the problem I wouldn’t have thought about if I hadn’t considered testing at the outset.
“Rejected Alternatives” asks: Were there other ways of solving this problem you considered? Are there links to rejected alternatives you can provide?
I mentioned I’m nervous when I send KIP emails. Let’s clarify: It’s because I’m new to the process, not because they’re received badly! In my experience, developers in the Kafka open source community are professional and serious about their work, which means if you write a KIP, the resulting discussion in the mailing list will be respectful and well-thought-out. Don’t be afraid of bringing your ideas to the Kafka community!
I was hasty when I wrote my first draft of KIP-1020. I thought, why not just remove this piece of configuration from StreamsConfig and add it to ClientConfig? Well, imagine you’ve been trying to implement that piece of configuration in StreamsConfig, you update Kafka to a minor version, and all of a sudden window.size.ms is… gone. What a terrible user experience that would be! A Kafka committer pointed this out to me as I was writing this KIP, and it made me think about how careful software developers need to be when introducing changes.
Now, if window.size.ms were deprecated, it should eventually be removed. In fact, there is a one-year deprecation period before removal. That means if it is marked deprecated in a minor release in the 3.7+ range, then removal should happen in the 5.0 major release and not before.
This experience has taught me that consideration for the developer is the most important principle in writing software. After all, we’re not out here making changes purely for the sake of enjoying Java logic (although I’m sure some of us are, in part). Software is for end users! And therefore it’s orchestrated around what is good for them.
If you’ve been considering contributing to Apache Kafka, I’d like to encourage you to get out there and do it. Whether it’s because you’ve always felt some part of the user experience could be improved, or because you’d like to learn more about Kafka internals, the community could use your contributions. I’ll be honest: it’s a large codebase, and it can feel a bit uphill at first. (I’ve got a talk about navigating large repos if you’re interested.) But it’s worth the investment. Come help!