Confluent Announces Infinite Retention for Apache Kafka in Confluent Cloud, Enabling More Accurate Insights and Richer Digital Experiences

The original creators of Apache Kafka launch the only fully managed event streaming service to provide unlimited storage of real-time and historic events

Mountain View, Calif. – July 1, 2020 – Confluent, Inc., the event streaming pioneer, today announced the next stage of its Project Metamorphosis initiative, which aims to build the next-generation event streaming platform any organization can put at the heart of their business. As part of the Infinite release, Confluent announces infinite retention, a new capability in Confluent Cloud that creates a centralized platform for all current and historic event streams with limitless storage and retention. Organizations can now democratize access to event data and ensure all events are stored and accessible as long as needed. By combining all relevant past and current event data, organizations can build richer application experiences and make more informed data-driven decisions.

“Without the context of historical data, it’s difficult to take action on real-time events in an accurate, meaningful way,” said Jay Kreps, co-founder and CEO, Confluent. “We’ve removed the limitations of storing and retaining events in Apache Kafka with infinite retention in Confluent Cloud. With event streaming as a business’s central nervous system, applications can pull from an unlimited source of past and present data to quickly become smarter, faster, and more precise.”

Contextually rich, personalized applications are in especially high demand as digital experiences have replaced in-person interactions during the pandemic. In order to build these sophisticated touch points, applications need input data on what is happening right now and how that relates to what happened in the past. This is incredibly challenging and prohibitively costly for existing data architectures, especially when new events move through organizations at gigabyte-per-second scale. And due to high storage costs and complexities in data balancing, events are typically retained in Apache Kafka® for only seven days. This limits event streaming use cases, like year-over-year analysis and predictive machine learning, and is often not long enough to meet compliance requirements.
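For readers who configure Kafka directly, the seven-day figure corresponds to the default of the topic-level `retention.ms` setting in Apache Kafka, and a value of -1 removes the time limit. The following is a minimal sketch of that mapping; the helper name `retention_ms` is hypothetical, and this announcement does not describe how Confluent Cloud exposes infinite retention.

```python
from typing import Optional

MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def retention_ms(days: Optional[float]) -> int:
    """Return a value for Kafka's topic-level `retention.ms` setting.

    `retention_ms` is a hypothetical helper for illustration only.
    Passing None means "no time-based limit", which Kafka expresses as -1.
    """
    if days is None:
        return -1
    if days < 0:
        raise ValueError("days must be non-negative")
    return int(days * MS_PER_DAY)


seven_days = retention_ms(7)    # 604800000, the Apache Kafka default
unlimited = retention_ms(None)  # -1, no time-based deletion
```

The resulting value would be applied as a topic configuration; size-based retention (`retention.bytes`) is a separate setting and is not covered by this sketch.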

“Although Apache Kafka is widely used for event streaming, many limitations still exist because of high infrastructure costs associated with storing data for longer periods of time,” said Dave Menninger, SVP and Research Director, Ventana Research. “Being able to extend from days or weeks of retention to several years with less operational overhead greatly increases the value event streaming brings to any organization.”

Introducing the Only Fully Managed Event Streaming Service with Unlimited Storage and Infinite Retention

With the new infinite retention capability in Confluent Cloud, Confluent solves the technical and economic strains put on organizations by the rapidly growing volume of real-time event streams. Organizations can now quickly and cost-effectively establish a central source of truth for all events across their entire ecosystem, unlocking more use cases for pervasive event streaming and mitigating the rising cost of Kafka storage.

Implement Event Streaming as the Central Nervous System for All Real-Time Data

In traditional data architectures, silos exist between storage systems that record past data and messaging services that process future events. On top of that, there are hundreds of in-house systems, SaaS applications, and microservices linked together by point-to-point connections that create huge operational burdens. With infinite retention in Confluent Cloud, organizations can easily build one central nervous system where all events flow through and can be stored. Event streaming can become a single source of truth for all other systems, making it easy to scale and ensure data integrity across an entire business.

Do More with Streaming Applications

Within Kafka, compute and storage are tightly interlocked, making it difficult to retain high volumes of data while efficiently scaling storage as traffic grows. Infinite retention decouples compute and storage and also automates scaling so storage instantly grows based on traffic. Without storage limitations, organizations are able to leverage event streaming for more use cases, such as providing a persistent log of all events for compliance audits that require several years of data. Infinite retention also makes it possible to train machine learning models to make real-time predictions based on a historical stream of data. With more data to draw from, infinite retention can also improve the accuracy and intelligence of existing event streaming use cases like recommendation engines and customer 360 analytics.
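As a rough illustration of why retention is hard to extend when data lives on broker disks, the sketch below estimates the disk footprint of a retained stream. The function name is hypothetical, and the inputs are assumptions chosen for illustration rather than Confluent figures: a steady 1 GB/s ingest rate (echoing the gigabyte-per-second scale mentioned earlier) and a replication factor of 3.

```python
GB = 10**9   # bytes in a gigabyte (decimal)
TB = 10**12  # bytes in a terabyte (decimal)
SECONDS_PER_DAY = 86_400


def retained_bytes(bytes_per_second, retention_days, replication_factor=3):
    """Estimate total bytes held on disk for a steadily written stream.

    Hypothetical helper: ignores compression, compaction, and index overhead.
    """
    seconds = retention_days * SECONDS_PER_DAY
    return bytes_per_second * seconds * replication_factor


# Assumed workload: 1 GB/s sustained, replication factor 3.
one_week_tb = retained_bytes(1 * GB, 7) / TB    # about 1,814 TB
one_year_tb = retained_bytes(1 * GB, 365) / TB  # about 94,608 TB
```

Under these assumptions, moving from one week to one year of retention multiplies the required storage by roughly 52, which is the kind of growth that is difficult to absorb when storage can only be added together with compute.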

Reduce Storage Costs and Billing Complexities

To meet storage demands and avoid downtime, businesses often overprovision clusters and end up overpaying for more infrastructure and compute than is needed. Infinite retention provides elastically scalable storage that automatically grows with your traffic, and with the benefits of Confluent Cloud, organizations only pay for data that is retained rather than what is pre-provisioned. The high storage costs that traditionally come with retaining massive amounts of data are no longer a barrier to achieving pervasive event streaming.

Infinite retention is available in July for Confluent Cloud customers using AWS, with rollout to additional cloud providers planned for this year.

Store Infinite Amounts of Data on Self-Managed Kafka with Tiered Storage in Confluent Platform

Tiered Storage in Confluent Platform was built upon innovations in Confluent Cloud. Released as a preview earlier this year, Tiered Storage has the potential to reduce storage costs by up to 70% and enable use cases across financial services and retail that improve customer experience, meet regulatory requirements for data retention, and improve machine learning models.

The community has proposed KIP-405 to bring tiered storage support to Apache Kafka, and Confluent engineers are helping shape the design proposal based on their experience with the Tiered Storage preview in Confluent Platform.

Additional Resources

Read Apache Kafka co-creator Jun Rao’s blog on the Project Metamorphosis Infinite release: https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/
Learn more about Project Metamorphosis: https://www.confluent.io/project-metamorphosis/
See how Confluent is helping its customers transform their businesses: https://www.confluent.io/customers/
Learn more about Confluent: https://www.confluent.io/

About Confluent

Confluent, founded by the original creators of Apache Kafka®, pioneered the enterprise-ready event streaming platform. With Confluent, organizations benefit from the first event streaming platform built for the enterprise with the ease of use, scalability, security, and flexibility required by the most discerning global companies to run their business in real time. Companies leading their respective industries have realized success with this new platform paradigm to transform their architectures to streaming from batch processing, spanning on-premises and multi-cloud environments. Confluent is headquartered in Mountain View and London, with offices globally. To learn more, please visit www.confluent.io. Download Confluent Platform and Confluent Cloud at www.confluent.io/download.

The preceding outlines our general product direction and is not a commitment to deliver any material, code, or functionality. The development, release, timing, and pricing of any features or functionality described may change.

Confluent and associated marks are trademarks or registered trademarks of Confluent, Inc.

Apache® and Apache Kafka® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by the Apache Software Foundation is implied by the use of these marks. All other trademarks are the property of their respective owners.