
Confluent Contributions to the Apache Kafka™ Client Ecosystem


If you are using Apache Kafka from a language other than Java, one of the first questions you probably have is something like, “Why are there two (or five!) clients in my language, and which one should I use?” To answer that, it’s worth backing up to explain how Kafka handles multi-language support and how Kafka clients are developed.

Kafka unfortunately has a reputation for being a bit Java-centric. This dates from its origin at LinkedIn, which was primarily a JVM shop. When we were at LinkedIn it really wasn’t practical for us to focus on building clients for languages that we wouldn’t ourselves use. But we did focus on designing for multiple languages from the beginning. As a result, Kafka has a well-specified, versioned, binary protocol that allows us to maintain backwards compatibility as the protocol evolves. This allows clients to be maintained separately from the main code base, as they can be released independently. It also means that you don’t have to upgrade every app that embeds a client at the moment you upgrade Kafka: you can upgrade the Kafka cluster in place and then upgrade the client apps at your leisure.
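To make "versioned binary protocol" a little more concrete, here is a minimal sketch of how a client might encode the header that precedes every Kafka request. It assumes the classic request header layout from the Kafka protocol guide (a 16-bit API key, a 16-bit API version, a 32-bit correlation id, and a length-prefixed client id, all big-endian, wrapped in a 4-byte size prefix). The function names are made up for illustration, and newer protocol versions add further fields, so treat this as a sketch rather than a complete encoder.

```python
import struct


def encode_request_header(api_key: int, api_version: int,
                          correlation_id: int, client_id: str) -> bytes:
    """Encode a request header (illustrative helper, not a library API).

    Assumed layout: int16 api_key, int16 api_version, int32 correlation_id,
    then client_id as an int16 length followed by UTF-8 bytes.
    """
    client = client_id.encode("utf-8")
    # ">" selects big-endian (network byte order) with no padding.
    return struct.pack(">hhih", api_key, api_version,
                       correlation_id, len(client)) + client


def frame(payload: bytes) -> bytes:
    """Prefix a request with its size as a 4-byte big-endian integer."""
    return struct.pack(">i", len(payload)) + payload


# The api_version field is what lets the broker keep serving old clients:
# each request states which version of that API it speaks.
API_VERSIONS = 18  # API key of the ApiVersions request in the protocol guide
request = frame(encode_request_header(API_VERSIONS, 0, 7, "demo"))
```

Because the version travels with every request, a broker can be upgraded first and continue to answer clients that still send older versions.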

This allowed a flourishing ecosystem of clients in different languages, maintained as independent open source projects. So the good news is that there is likely a client for whatever programming language you are using. The bad news is that not all these clients were of the same quality. Some were closer to experiments than to production code. As a result, the experience of someone adopting Kafka from outside the Java ecosystem was often poor. After all, if the client to a system is slow or unreliable or incomplete, then the system itself will appear slow or unreliable or incomplete.

One aspect of Kafka that makes building clients harder is the use of TCP and the fact that the client establishes a direct connection to multiple brokers in the Kafka cluster. Both of these features are essential for making Kafka the kind of ultra-fast low-overhead primitive you can dump massive amounts of data on, but both make the development of a really fast, high-quality client more involved.

When we started Confluent we knew that making Kafka a true first-class citizen in all languages meant doing something about the mixed quality of the clients. Here is the approach we’re taking in focusing our efforts.

The first thing we did was add a convenient open source REST layer for Kafka. This gives a simple HTTP interface for any language that doesn’t yet have a really high-quality client. It is one of a number of components that are open source, Apache licensed, and freely available as part of Confluent Platform.
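As a rough illustration of what "a simple HTTP interface" means in practice, the sketch below builds (but does not send) a produce request for the REST layer using only the Python standard library. The base URL, topic name, and helper name are placeholders chosen for this example, and the `v2` content type is an assumption about which REST API version your deployment speaks, so check it against the version you run.

```python
import json
import urllib.request


def build_produce_request(base_url: str, topic: str, values: list) -> urllib.request.Request:
    """Build an HTTP POST that asks the REST layer to produce JSON records.

    Hypothetical helper for illustration; it only constructs the request.
    """
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={
            # Assumed content type for JSON-embedded records (REST API v2).
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


# Placeholder address and topic; nothing is sent over the network here.
req = build_produce_request("http://localhost:8082", "demo-topic", [{"user": "alice"}])
# To actually send it against a running proxy:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

Any language with an HTTP library and a JSON encoder can do the equivalent, which is the point of the REST layer.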

Next we worked on simplifying the Kafka consumer and the protocol that supported it. Kafka consumers are actually quite sophisticated: they let a pool of processes jointly divide up the work of consuming a topic. This pool of processes is called a consumer group. These groups are dynamic. New processes can join the group at any time, and the failure of any process in the group is detected automatically, at which point the load is rebalanced among the remaining consumers.
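The idea of dividing up a topic and rebalancing on failure can be sketched in a few lines. The following is a toy, in-memory illustration that spreads partitions round-robin across group members; it is not Kafka's actual assignment code, and the function and member names are invented for the example.

```python
def assign(partitions, members):
    """Spread partitions across group members round-robin (toy example).

    Returns a dict mapping each member to the partitions it should consume.
    """
    members = sorted(members)
    if not members:
        return {}
    assignment = {m: [] for m in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


partitions = range(6)
group = {"consumer-a", "consumer-b", "consumer-c"}

before = assign(partitions, group)
# Simulate consumer-b failing: the group shrinks and the work is re-divided.
after = assign(partitions, group - {"consumer-b"})

print(before)
print(after)
```

Every partition always has exactly one owner within the group, so adding or removing a member only changes who owns what, not what gets consumed.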

Earlier Kafka versions required complex interaction with ZooKeeper directly from the client to implement consumer groups. This made the consumer quite complex, since each consumer had to both interact with Kafka and negotiate a multi-step group protocol with ZooKeeper. In the 0.9 release we made this part of the core Kafka protocol, allowing us to encapsulate all the complexities of group management in a new consumer protocol provided by the server. This new protocol makes it far easier to build consumers, and it also makes the consumer faster and more scalable. Using this protocol we were able to redesign the Java consumer, unifying two prior clients into a single, more powerful interface.

In parallel we started working directly on getting high-quality native clients in major languages. Rather than reinvent the wheel, the approach we decided to take was to work with the C client, librdkafka, which was already extremely high quality and broadly adopted. We were lucky enough to have the author of the C client, the incomparable Magnus Edenhill, join us at Confluent. Magnus has helped to move this C library to use the new consumer protocol, and add full support for the security features Kafka itself added.

For other languages we knew we had two options: either write full, from-scratch clients in each language, or wrap the C client. We thought long and hard about each option. The challenge is that a really fast producer or consumer requires very careful, low-level management of memory, buffering, and TCP interaction. The Java and C clients are both capable of producing hundreds of thousands or even millions of records per second. Doing this requires careful buffering, batching, memory management, and good non-blocking network code directly in the client. Without this, the client ends up naively sending a single request for each message, which is hopelessly slow. Not only would duplicating this work in every language be very hard, but in many languages that kind of low-level programming either isn’t possible or is simply dog slow. Worse, as we’d seen from the large ecosystem of semi-reliable clients, the detailed, transparent error handling a good client should do would always be a work in progress. Finally, keeping feature parity across so many clients, especially as we added features to Kafka itself, would be an ongoing problem.
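To show why batching matters, here is a deliberately simplified sketch of a sender that accumulates records and issues one request per batch instead of one per message. The class name, the `send_request` callback, and the size threshold are all assumptions made for this example. Real clients such as the Java client and librdkafka do considerably more (per-partition batches, time-based flushing, compression, non-blocking I/O), none of which is modeled here.

```python
from typing import Callable, List


class BatchingSender:
    """Accumulate records and send them in batches (illustrative only)."""

    def __init__(self, send_request: Callable[[List[bytes]], None],
                 max_batch_bytes: int = 16384) -> None:
        self._send_request = send_request      # stands in for a network call
        self._max_batch_bytes = max_batch_bytes
        self._buffer: List[bytes] = []
        self._buffered_bytes = 0

    def send(self, record: bytes) -> None:
        # Flush first if adding this record would overflow the batch.
        if self._buffer and self._buffered_bytes + len(record) > self._max_batch_bytes:
            self.flush()
        self._buffer.append(record)
        self._buffered_bytes += len(record)

    def flush(self) -> None:
        # One request carries every buffered record.
        if self._buffer:
            self._send_request(self._buffer)
            self._buffer = []
            self._buffered_bytes = 0


requests_sent: List[List[bytes]] = []
sender = BatchingSender(requests_sent.append, max_batch_bytes=100)
for _ in range(25):
    sender.send(b"x" * 10)   # 25 records of 10 bytes each
sender.flush()

# A naive client would have issued 25 requests for the same records.
print(len(requests_sent), "requests for 25 records")
```

Even this crude version cuts the number of round trips by roughly the batch size, which is where much of the throughput difference between a naive and a tuned client comes from.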

We decided to put our energy into the other approach and build support for other languages using the existing C client. This approach has a number of advantages:

  • The C client is rock solid and used in production at thousands of companies, and any client built on it inherits this.
  • It’s disgustingly fast, and this translates to clients built using it.
  • It has support for security.
  • It’s quite feature complete, including things like compression, batching, offset commit, the ability to seek to different positions in the log, etc.

The challenge with wrapping C code, of course, is packaging it well for the major platforms so that users don’t end up needing to fiddle with things to get everything working. Looking at this, we realized that solving this packaging problem was a far better route to a robust ecosystem than trying to rebuild from scratch in each language.

Using this approach we’ve released an open source Python client and recently added a Go client, both based on the existing C client. They’re both quite fast. We fully support these clients and make sure they are kept in feature parity and tested with each release of Confluent Platform and Kafka. We look forward to adding more clients that we help support this way, either ones we develop or ones that come out of the wider open source community. We’d also love any feedback on how to make these clients more complete, easier to get started with, better documented, or more idiomatic for their language.

We’ve seen this approach to Kafka client development gain popularity with others as well: recently a Node.js client and a .NET client, both based on the same C client, have been released as open source. We haven’t yet fully evaluated these, but based on our experience so far we think they likely represent the best way to get a robust client for those languages. As always, a huge thanks to the companies and individuals who pitched in to contribute those, to those who developed the rest of the Kafka clients, and to those who helped us find problems in our own. This is open source at its best.

This client effort is important to us because we want Kafka to be a general platform for streams across a company, and few companies use only one programming language. We’d love to hear what languages you think are most important to support well so we can help put our resources where they matter the most.

  • Jay Kreps is the CEO and co-founder of Confluent, the foundational platform for data in motion built on Apache Kafka. As a pioneer in a new category of data infrastructure, Confluent’s significant growth underscores the importance of data in motion across all industries. Prior to Confluent he was the lead architect for data and infrastructure at LinkedIn. He is the initial developer of several open source projects, including Apache Kafka.
