
Apache Kafka and Kafka Streams at Berlin Buzzwords


At the beginning of June, several Confluent team members attended Berlin Buzzwords 2016, where we gave three talks focused on stream processing and distributed computing. These talks, which we summarize below, fit right into the general excitement and interest in stream processing at Buzzwords and beyond. In fact, many of the sessions at Berlin Buzzwords were about Kafka or stream processing.

Neha Narkhede, co-founder and CTO of Confluent, gave the keynote Application Development and Data in the Emerging World of Stream Processing (video, slides). In her talk, Neha explained how the fundamental nature of application development will change as stream processing goes mainstream. Over the past few years, a strong shift towards stream processing has driven the popularity of Apache Kafka. Making all the data of an organization available centrally as free-flowing data streams enables a company’s business logic to be represented as stream processing operations. Essentially, applications are stateful stream processors in this new world of stream processing. To help application developers make this important shift successfully, the Kafka community and Confluent created Kafka Streams, a powerful yet easy-to-use stream processing library that has been part of the open source Apache Kafka project since the recently released Kafka version 0.10.
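To make the idea of an application as a stateful stream processor a bit more concrete, here is a minimal sketch in plain Python. It does not use Kafka or the Kafka Streams API; the function name `running_word_count` and the in-memory list standing in for a stream are assumptions made purely for illustration. Each incoming record updates local state (a table of counts) and immediately produces an updated result downstream.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def running_word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Consume a stream of text lines and emit an updated count per word.

    Hypothetical illustration only: `counts` plays the role of the
    processor's local state, and every input record yields an output record.
    """
    counts: Dict[str, int] = defaultdict(int)
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1          # update the state
            yield word, counts[word]   # emit the new count downstream


if __name__ == "__main__":
    # An in-memory list stands in for an unbounded stream of records.
    stream = ["Kafka Streams", "streams of data", "kafka"]
    for word, count in running_word_count(stream):
        print(f"{word}\t{count}")
```

In a real deployment the state would need to be fault-tolerant and the input partitioned across instances, which is the kind of concern a library like Kafka Streams is meant to take off the developer's hands.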


Neha Narkhede starting day two of Berlin Buzzwords with her keynote on Applications in the Emerging World of Stream Processing

Michael Noll, product manager for Kafka Streams at Confluent, introduced Kafka Streams in more detail (video, slides). Michael covered the motivation and design of Kafka Streams and walked the audience through its concepts and key features. Notably, Kafka Streams was purposefully built to have a very low barrier to entry and to be easy to operate (no cluster needed). It comes with an expressive API that allows developers to quickly write stream processing applications on top of Kafka that are highly scalable, fault-tolerant, and elastic out of the box. So how can you get started with Kafka Streams? We recommend taking a look at our Kafka Streams demo applications and browsing through the Kafka Streams documentation (e.g. our quickstart). If you want to take it a step further, you might want to download Confluent Platform 3.0, which includes Apache Kafka 0.10 with Kafka Streams alongside further components such as the management application Confluent Control Center, Kafka clients for C/C++ and Python, and connectors to exchange data between Kafka and other systems such as databases or Hadoop.

Flavio Junqueira, co-creator of Apache ZooKeeper and infrastructure engineer in Confluent’s Kafka team, gave the talk Towards consensus on Distributed Consensus (video, slides). While keeping the discussion away from pure theory, Flavio revisited the distributed consensus problem in light of fundamental academic results such as the relationship between state-machine replication and atomic broadcast, the equivalence between atomic broadcast and consensus, and the impossibility of consensus in asynchronous systems. Flavio discussed these primitives in the context of projects like Apache Kafka and Apache BookKeeper, highlighting that the core operations such systems use for replication are closely related to consensus, even though they are not directly perceived as consensus. Although it might be possible to reduce the reliance on such primitives, distributed consensus is certainly not going away because it is fundamental to many practical problems in the domain of distributed computing.
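The link between state-machine replication and atomic broadcast can be illustrated with a small sketch in plain Python. This is not taken from the talk or from any of the projects mentioned; the command format and the function name `apply_log` are assumptions for illustration. The point it shows is only this: replicas that apply the same commands in the same order end up in the same state, while a different delivery order can make them diverge, which is why agreeing on the order matters.

```python
from typing import Dict, List, Tuple

# Hypothetical command format: (operation, key, value).
Command = Tuple[str, str, int]


def apply_log(log: List[Command]) -> Dict[str, int]:
    """Apply an ordered log of commands to an empty deterministic state machine."""
    state: Dict[str, int] = {}
    for op, key, value in log:
        if op == "set":
            state[key] = value
        elif op == "add":
            state[key] = state.get(key, 0) + value
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


if __name__ == "__main__":
    log = [("set", "x", 1), ("add", "x", 5), ("set", "y", 2)]

    # Two replicas that receive the same log in the same order agree.
    replica_a = apply_log(log)
    replica_b = apply_log(log)
    print(replica_a == replica_b)

    # A replica that sees the first two commands swapped ends up elsewhere.
    reordered = [("add", "x", 5), ("set", "x", 1), ("set", "y", 2)]
    print(apply_log(reordered) == replica_a)
```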

We hope you’ll enjoy these talks! If we raised your interest in stream processing and Kafka Streams, you may want to join our bi-weekly Ask Me Anything sessions on Kafka Streams and Kafka Connect. Simply drop us a note so that we can send you an invite. Of course, you can also reach out to us if you have further questions or want to follow up.

  • Michael is a former principal technologist in the Office of the CTO at Confluent, the company founded by the original creators of Apache Kafka®. He focuses on longer-term product and technology strategy. Previously, Michael was the lead product manager for stream processing at Confluent, where his team created Kafka Streams and the streaming database ksqlDB. He is a well-known technology blogger in the big data community (www.michael-noll.com) and a committer/contributor to open source projects such as Apache Storm and Apache Kafka.
