Why Wait? Realtime Ingestion

Current 2022

Historically, Pinterest's data warehouse ingestion and indexing services were built on batch ETL and Kafka streaming, respectively. As the product side leans more toward real-time and near-real-time data to innovate and compete, teams are working together to revamp the ingestion and processing stack at Pinterest.

In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We pick ANSI SQL as the common currency to minimize the "lambda architecture" learning curve for teams adopting fresh, near-real-time data.

Presenter

Heng Zhang

Heng is a software engineer at Pinterest building large-scale database and log ingestion and stream processing platforms around Apache Kafka and Apache Flink.

Presenter

Chen Qin
