BLOG

Apache Kafka

Data Products, Data Contracts, and Change Data Capture

Change data capture is a popular method to connect database tables to data streams, but it comes with drawbacks. The next evolution of the CDC pattern, first-class data products, provides resilient pipelines that support both real-time and batch processing while isolating upstream systems...

Adam Bellemare

Unlock Cost Savings with Freight Clusters–Now in General Availability

Confluent Cloud Freight clusters are now Generally Available on AWS. In this blog, learn how Freight clusters can save you up to 90% at GBps+ scale.

Rachel Groberman

Contributing to Apache Kafka®: How to Write a KIP

Learn how to contribute to open source Apache Kafka by writing Kafka Improvement Proposals (KIPs) that solve problems and add features! Read on for real examples.

Lucia Cerchie

Apache Kafka, Purgatory, and Hierarchical Timing Wheels

Oct 28, 2015

Apache Kafka has a data structure called the “request purgatory”. The purgatory holds any request that hasn’t yet met its criteria to succeed but also hasn’t yet resulted in an […]

Yasuhiro Matsuda

Log Compaction | Highlights in the Kafka and Stream Processing Community | October 2015

Oct 12, 2015

The amount of work that got done by the community in the last month is truly impressive, especially considering how many conferences took place in September. Let’s take a look at the highlights: The […]

Gwen Shapira

Apache Kafka Hits 1.1 Trillion Messages Per Day – Joins the 4 Comma Club

Sep 1, 2015

I am very excited that LinkedIn’s deployment of Apache Kafka has surpassed 1.1 trillion (yes, trillion with a “t”, and 4 commas) messages per day. This is the largest deployment of Apache […]

Neha Narkhede

Log Compaction | Highlights in the Kafka and Stream Processing Community | September 2015

Sep 1, 2015

September is the start of the fall conference season. Between Strata + Hadoop World New York and ApacheCon: Big Data Europe, there is plenty to keep us busy learning.

Gwen Shapira

Distributed Consensus Reloaded: Apache ZooKeeper and Replication in Apache Kafka

Aug 27, 2015

This post was jointly written by Neha Narkhede, original co-creator of Apache Kafka, and Flavio Junqueira, co-creator of Apache ZooKeeper. Many distributed systems that we build and use currently rely on dependencies like […]

Flavio Junqueira

Log Compaction | Highlights in the Kafka and Stream Processing Community | August 2015

Aug 12, 2015

Welcome to the first edition of Log Compaction, a monthly digest of highlights in the Apache Kafka and stream processing community. Today’s edition covers the highlights from July and early […]

Gwen Shapira

Apache Kafka, Samza, and the Unix Philosophy of Distributed Data

Aug 1, 2015

One of the things I realised while doing research for my book is that contemporary software engineering still has a lot to learn from the 1970s. As we’re in such […]

Martin Kleppmann

Compression in Apache Kafka is now 34% faster

Jul 30, 2015

Apache Kafka is widely used to enable a number of data-intensive operations, from collecting log data for analysis to acting as a storage layer for large-scale real-time stream […]

Yasuhiro Matsuda

Making Apache Kafka Elastic With Apache Mesos

Jul 16, 2015

This post has been written in collaboration with Derrick Harris from Mesosphere and Joe Stein, a Kafka committer. For an updated version of this article, please see Apache Mesos, Apache Kafka and […]

Neha Narkhede

Hands-Free Kafka Replication: A Lesson in Operational Simplicity

Jul 1, 2015

Building operational simplicity into distributed systems, especially for nuanced behaviors, is somewhat of an art and often best achieved after gathering production experience. Apache Kafka’s popularity can be attributed in […]

Neha Narkhede

Using logs to build a solid data infrastructure (or: why dual writes are a bad idea)

May 29, 2015

This is an edited transcript of a talk I gave at the Craft Conference 2015. The video and slides are also available.

Martin Kleppmann

Compatibility Testing For Apache Kafka

May 1, 2015

Testing is one of the hardest parts of building reliable distributed systems. Kafka has long had a set of system tests that cover distributed operation, but this is an area […]

Jay Kreps

Bottled Water: Real-time integration of PostgreSQL and Kafka

Apr 23, 2015

Summary: Confluent is starting to explore the integration of databases with event streams. As part of the first step in this exploration, Martin Kleppmann has made a new open source […]

Martin Kleppmann

A Comprehensive REST Proxy for Kafka

Mar 25, 2015

As part of Confluent Platform 1.0 released about a month ago, we included a new Kafka REST Proxy to allow more flexibility for developers and to significantly broaden the number […]

Ewen Cheslack-Postava
