Kafka organizes data as immutable append-only logs at its core, and has relied on an external consensus service (ZooKeeper) to manage the metadata of these logs, such as topic-level configs, leader replica and ISR information, and received admin requests. In this talk, I will discuss a recent core initiative that migrates the management of such metadata from external services into Kafka itself, as its own special logs. More specifically, I will cover the following:
- Why we believe an internal consensus protocol benefits Kafka more than an external consensus service.
- Why we chose to build this internal "metadata log" on the Raft protocol instead of Kafka's current leader-follower replication mechanism.
- What key design decisions we made in its implementation, and how it differs from the standard Raft algorithm (KIP-595).
- How this Raft-based metadata log is leveraged by the new Quorum Controller (KIP-500).
Presenter
Guozhang Wang
Confluent
Guozhang Wang is a PMC member of Apache Kafka, and also a tech lead at Confluent leading the Kafka Streams team. He received his Ph.D. from Cornell University where he worked on scaling data-driven applications. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza.