Introducing Confluent Cloud Freight Clusters

We’re excited to introduce Freight clusters, a new type of Confluent Cloud cluster designed for high-throughput, relaxed-latency workloads. Freight clusters are up to 90% cheaper than self-managing open source Apache Kafka®.

Freight clusters use the latest innovations in Confluent Cloud’s cloud-native engine, Kora, to deliver low-cost networking in exchange for giving up ultra-low latency. This makes Freight clusters a perfect fit for high-volume use cases such as logging, telemetry, or feeding data lakes. For example, Freight clusters work well in tandem with our recently announced Tableflow, since feeding a data lake is typically a high-throughput workload that can tolerate higher latency.

Why the name Freight clusters? Much like a freight ship, the data gets to its destination more slowly than it would by, say, air freight, but the marginal cost to move a byte of data is much lower. Put simply, if your workload has a second or two to spare, you can save up to 90% by not paying for low-latency performance you don’t need.

In the rest of this blog post, we’ll cover the key insights and technologies that led us to build this new cluster type. 

Cloud replication can be cheap! (If you do it right)

By now, it’s (hopefully) no secret that the most costly part of running Apache Kafka at scale is networking, specifically inter-availability zone (inter-AZ) bandwidth. When data is produced to a Kafka broker, the broker turns around and replicates that data to other brokers in different AZs, ensuring that the data stays durable and available even if an AZ fails. The unfortunate trade-off for this durability is cost: inter-AZ data transfer from replication can balloon for high-throughput workloads, accounting for up to 90% of infrastructure costs when self-managing Apache Kafka.

At high throughputs, networking comprises the bulk of Kafka infrastructure costs
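
To make the replication overhead concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a replication factor of 3 with each replica in a different AZ, and it counts only broker-to-broker replication traffic (producer and consumer traffic may also cross AZs). The function names and the per-GB price are illustrative placeholders, not Confluent or cloud provider figures.

```python
def inter_az_replication_gb(produced_gb, replication_factor=3):
    """GB copied between AZs by replication alone.

    Assumption: every replica lives in a different AZ, so each produced
    byte is copied to (replication_factor - 1) other AZs.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return produced_gb * (replication_factor - 1)


def replication_transfer_cost(produced_gb, price_per_gb, replication_factor=3):
    """Transfer charge for that replication traffic at a given per-GB price."""
    return inter_az_replication_gb(produced_gb, replication_factor) * price_per_gb


# A sustained 1 GB/s workload produces 86,400 GB per day.
produced_per_day_gb = 86_400
replicated_gb = inter_az_replication_gb(produced_per_day_gb)  # 172,800 GB

# Placeholder price; substitute your provider's actual inter-AZ rate.
ASSUMED_PRICE_PER_GB = 0.02
daily_cost = replication_transfer_cost(produced_per_day_gb, ASSUMED_PRICE_PER_GB)
print(f"{replicated_gb} GB replicated across AZs, about ${daily_cost:,.0f} per day")
```

Under these assumptions, every byte produced crosses an AZ boundary twice for replication alone, so the transfer charge grows linearly with throughput.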

While Kora already uses object storage today, what if we used Amazon S3 (or Azure Blob Storage / Google Cloud Storage) directly as our primary storage and replication layer to eliminate inter-AZ replication of data (and its associated costs) between brokers? This is the essence of the next-generation “direct write” mode now in Kora that is at the heart of Freight clusters.

Freight clusters provide a more cost-effective option to write directly to cloud object storage for relaxed latency workloads

Evolving Kora’s modular storage architecture

So how do Freight clusters work, and how are they architected to save you money?

To support Freight, we introduced a next-generation “direct write” mode to Kora where data is written directly to object storage services like S3, bypassing local storage and avoiding replication on the Kora brokers. These clusters replace expensive inter-AZ replication with inexpensive direct-to-object storage writes – trading latency (from sub-100ms in our other clusters to up to a second or two with Freight) for significantly reduced network costs.

Instead of replicating data across brokers, Kora has the ability to write directly to object storage with Freight clusters to avoid costs associated with inter-AZ replication

How do Freight clusters save you money?

In the direct write mode, the broker creates batches of produce requests and writes them directly to object storage before acknowledging back to the client. Brokers are stateless and effectively leaderless: any broker can serve a produce or fetch request for any partition (though we still route specific partitions to designated brokers for better batch and fetch performance). 
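
As a rough mental model of the produce path described above, the toy Python sketch below buffers records, writes each batch to an object store as a single object, and only then returns acknowledgments. Everything in it is hypothetical: the class names, the in-memory dictionary standing in for a service like S3, and the count-based batching rule. It is not Kora's implementation, which this post does not describe in detail.

```python
from dataclasses import dataclass, field


class InMemoryObjectStore:
    """Stand-in for a cloud object store; keeps objects in a dict."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data


@dataclass
class DirectWriteBroker:
    """Toy broker: holds no durable state, acks only after the store write."""

    store: InMemoryObjectStore
    max_batch_records: int = 3
    _pending: list = field(default_factory=list)
    _next_object_id: int = 0

    def produce(self, topic, partition, value):
        """Buffer one record; return acks once the batch has been written."""
        self._pending.append((topic, partition, value))
        if len(self._pending) >= self.max_batch_records:
            return self.flush()
        return []  # nothing is acknowledged until the batch is durable

    def flush(self):
        """Write the pending batch as one object, then acknowledge its records."""
        if not self._pending:
            return []
        key = f"batch-{self._next_object_id:08d}"
        self.store.put(key, list(self._pending))  # durable write happens first
        acks = [(topic, partition, key) for topic, partition, _ in self._pending]
        self._pending.clear()
        self._next_object_id += 1
        return acks


store = InMemoryObjectStore()
broker = DirectWriteBroker(store)
broker.produce("logs", 0, b"a")             # buffered, no ack yet
broker.produce("logs", 1, b"b")             # buffered, no ack yet
acks = broker.produce("metrics", 0, b"c")   # batch is full: one write, three acks
```

Note that the batch mixes records from different topics and partitions, and that no broker-to-broker copy appears anywhere in the path. The wait for the object store write is where the added latency comes from.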

The direct write mode has no inter-AZ replication traffic between brokers, while retaining (and improving) the durability and availability our customers have come to expect from Confluent Cloud. 

These savings (as with most engineering trade-offs) don’t come for free. Because produce requests now wait for acknowledgments from object storage, you trade some added latency for cost savings of up to 90%.

There’s a lot more to say about the technology behind Kora’s new direct write architecture—stay tuned for a more detailed technical blog post in the near future. 

Auto-scaling clusters offer even more cost savings

Sizing and capacity planning are among the most difficult (and expensive) elements of self-managing Apache Kafka. Users often have to over-provision to accommodate peak workloads, paying for resources that are often (heavily) underutilized. And improving the utilization of your Kafka clusters on your own comes with its own operational overhead. With Kora, capacity management and provisioning are completely taken care of for you; more to come on this in the future.

Capacity planning is difficult and results in excessive spend for underutilized resources

Freight clusters utilize the same Elastic CKUs (eCKUs) as Basic, Standard, and Enterprise clusters, allowing Freight clusters to auto-scale to the shape of your workload and lower your costs by not paying for more capacity than you need. This means Freight clusters are always right-sized for your workload (i.e., without user intervention), and you save money by only paying for the resources you use when you actually need them. And just like the rest of our serverless offerings, Freight is a truly fully managed service with no agents or brokers to manage, upgrade, or manually scale.
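
The gap between sizing for peak and paying for what you use can be illustrated with a small Python sketch. The demand numbers are made up, and "capacity units" is a generic stand-in; this does not model actual eCKU pricing or billing granularity.

```python
def peak_provisioned_capacity_hours(hourly_demand):
    """Capacity-hours paid for when the cluster is sized for its busiest hour."""
    return max(hourly_demand) * len(hourly_demand)


def elastic_capacity_hours(hourly_demand):
    """Capacity-hours paid for when capacity tracks demand hour by hour."""
    return sum(hourly_demand)


# Hypothetical demand, in capacity units, for six consecutive hours.
demand = [2, 2, 3, 10, 4, 2]

provisioned = peak_provisioned_capacity_hours(demand)  # 10 * 6 = 60
elastic = elastic_capacity_hours(demand)               # 23

print(f"utilization of the peak-sized cluster: {elastic / provisioned:.0%}")
```

In this made-up example, a single busy hour forces the statically sized cluster to carry more than twice the capacity the workload consumes overall.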

A peek into the future of Kora with a modular storage architecture

Hopefully, our vision for Kora’s modular storage architecture is becoming clear—the Kora engine now speaks many different kinds of storage to fit our users’ needs:

  • Replication-between-brokers with aggressive tiering to object storage (for low-latency workloads or “typical” throughput). We architected Kora to use cloud object storage, such as Amazon S3, as a major storage layer because nothing beats object stores on durability and cost. But most workloads need lower latency than object stores can provide, so the Kora architecture incorporates a replicated, fault-tolerant read-write cache in front of the object stores. Clients interact directly with the stateful Kora brokers, which replicate the data with low millisecond latency and aggressively write the data to object storage asynchronously.

  • Apache Iceberg tables. Kora materializes topics as Iceberg tables over object storage (with Tableflow). Tableflow allows Kora to move data seamlessly from the operational to the analytical estates of an organization (and vice versa). Iceberg has emerged as a market leader in the object store table formats, with broad adoption across the analytics tooling landscape.

  • Object storage direct-write. High throughput topics, moving huge amounts of data with Freight! Kora writes directly to object stores, avoiding all the cross-AZ data transfers that can escalate the costs of high throughput workloads. Freight is all about big workloads that can tolerate higher latencies.

Optimize your topics across clusters based on latency vs. cost and easily convert them to Iceberg tables with Tableflow
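
One way to read the latency-versus-cost choice is as a simple decision rule on how much latency a workload can tolerate. The Python sketch below is only an illustration: the enum, the function, and the 2,000 ms default (taken from the "second or two" figure earlier in this post) are assumptions, not a Confluent API or an official sizing guideline.

```python
from enum import Enum


class StorageMode(Enum):
    REPLICATED = "replication between brokers, tiered to object storage"
    DIRECT_WRITE = "direct write to object storage (Freight)"


def choose_storage_mode(latency_budget_ms, direct_write_latency_ms=2000.0):
    """Pick the cheaper direct-write mode only when the workload can wait.

    direct_write_latency_ms is an assumed upper bound on produce latency
    in direct-write mode; adjust it to what you observe in practice.
    """
    if latency_budget_ms >= direct_write_latency_ms:
        return StorageMode.DIRECT_WRITE
    return StorageMode.REPLICATED


print(choose_storage_mode(100).name)     # REPLICATED
print(choose_storage_mode(5000).name)    # DIRECT_WRITE
```

Materializing a topic as an Iceberg table with Tableflow is a separate choice that can be layered on top of either mode.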

Freight clusters are a game changer, but they’re only one piece of the puzzle. The real benefit comes from combining all these capabilities into a single cloud service. Tableflow was the first big piece, bringing together the operational estate of transactional systems and the analytical estate of big data. With the multi-modal Kora storage engine, we are building a unified data streaming platform that can cater to all workload types, with not only the Kafka API as the standard for streaming, but also Apache Iceberg as the emerging standard to feed your analytics systems.

Save up to 90% at GBps+ scale with Freight clusters

To summarize, Freight clusters take advantage of an innovative “direct write” architecture in Kora where:

  • Brokers are stateless and any broker can serve produce or consume requests for any partition

  • Produce requests are batched at the broker and sent directly to object storage before acknowledgment, eliminating the need to replicate data between brokers and incur inter-AZ network charges

  • Freight clusters use Elastic CKUs, automatically scaling capacity up or down with your workload(s), so that you pay only for the resources you use when you actually need them

The cost efficiencies from these improvements—designed for high throughput use cases with relaxed latency requirements—are passed to our customers, resulting in up to 90% lower costs than self-managing Apache Kafka. 

Sign up for Early Access today

We’re excited for our users to get their hands on Freight clusters. Powered by the significant innovations in Kora’s architecture, Freight clusters are available in Early Access in select AWS regions—sign up here to learn more.

  • Marc Selwan is the staff product manager for the Kora Storage team at Confluent. Prior to Confluent, Marc held product and customer engineering roles at DataStax, working on storage and indexing engines for Apache Cassandra.
