Historically, Pinterest's data warehouse ingestion and indexing services were built on batch ETL and Kafka streaming, respectively. As the product side leans more toward real-time and near-real-time data to innovate and compete, teams are working together to revamp the ingestion and processing stack at Pinterest.
In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We pick ANSI SQL as the common currency to minimize the "lambda architecture" learning curve for teams adopting fresh, near-real-time data.