Introducing Confluent Platform 7.4 - enabling Apache Kafka® to scale to millions of partitions, simplifying architecture, accelerating time to market with self-service tooling and codified best practices for developers, and ensuring consistent and accurate data.
Building on the innovative feature set delivered in previous releases, Confluent Platform 7.4 makes enhancements to three categories of features:
Enhance scalability and simplify your architecture with production-ready KRaft support for new clusters.
Achieve faster time to value by providing a self-service control plane for developers with Confluent for Kubernetes Blueprints.
Ensure trusted, high-quality data streams by leveraging Data Quality Rules for Schema Registry.
This release blog post explores each of these enhancements in detail, taking a deeper dive into the major feature updates and benefits. As with previous Confluent Platform releases, you can always find more details about the features in the release notes.
Keep reading to get an overview of what’s included in Confluent Platform 7.4, or download Confluent Platform now if you’re ready to get started.
Confluent Platform 7.4 comes with several enhancements to existing features that increase scalability, simplify the architecture, expand DevOps automation, and improve data quality. Here are some highlights:
With Confluent Platform 7.4, KRaft is production-ready and generally available for new deployments. Specifically, support has been added for greenfield deployments with Confluent for Kubernetes, Ansible, and Docker. Confluent Platform 7.0, built on Apache Kafka 3.0, first introduced Apache Kafka Raft Metadata mode (KRaft) in preview, as part of the work to remove ZooKeeper and to make it easier to monitor and scale Kafka clusters to millions of partitions.
Part of KIP-500, KRaft was introduced to remove Kafka’s dependency on ZooKeeper for metadata management. Replacing external metadata management with KRaft greatly simplifies Kafka’s architecture by consolidating responsibility for metadata into Kafka itself, rather than splitting it between two different systems: ZooKeeper and Kafka.
This improves stability, simplifies the software, and makes it easier to monitor, administer, and support Kafka. It also allows Kafka to have a single security model for the whole system, along with enabling Kafka clusters to scale to millions of partitions through improved control plane performance with the new metadata management. The enhanced scalability also enables up to 10x improvement in recovery time for controlled shutdowns.
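To make the consolidation concrete, here is a minimal sketch of what a KRaft-mode node configuration looks like compared with a ZooKeeper-based one: there is no ZooKeeper connection setting, and each node declares its role and the controller quorum instead. The property keys are standard Apache Kafka settings, but the host names, ports, node IDs, and the `kraft_properties` helper are assumptions for illustration, and the output is a fragment rather than a complete production configuration.

```python
def kraft_properties(node_id, roles, quorum_voters):
    """Render a minimal KRaft-mode server.properties fragment as text."""
    props = {
        "process.roles": ",".join(roles),            # "broker", "controller", or both
        "node.id": str(node_id),
        "controller.quorum.voters": ",".join(quorum_voters),
        "controller.listener.names": "CONTROLLER",
    }
    listeners = []
    if "broker" in roles:
        listeners.append("PLAINTEXT://:9092")        # hypothetical client port
    if "controller" in roles:
        listeners.append("CONTROLLER://:9093")       # hypothetical quorum port
    props["listeners"] = ",".join(listeners)
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical three-node controller quorum, written as id@host:port.
voters = ["1@controller-1:9093", "2@controller-2:9093", "3@controller-3:9093"]

controller_config = kraft_properties(1, ["controller"], voters)
broker_config = kraft_properties(101, ["broker"], voters)

print(controller_config)
print()
print(broker_config)
```

Note that neither fragment contains a ZooKeeper setting: metadata lives in the controller quorum that the brokers point at.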
Support for Multi-Region Clusters/Observers and for upgrades from existing deployments is expected to be part of a future release. KIP-866 also adds the ability to migrate a Kafka cluster from ZooKeeper to KRaft mode as an early access feature. The migration copies the cluster metadata from ZooKeeper to the KRaft metadata log. During the migration, brokers are restarted in KRaft mode one at a time, allowing the whole migration process to happen without cluster downtime. As mentioned, this feature is early access in Apache Kafka 3.4 and should not be used in production yet.
Confluent for Kubernetes (CFK) is the platform of choice for deploying and managing Confluent Platform on Kubernetes infrastructures in a cloud-native, declarative manner. Confluent Platform component resources are managed by CFK using Custom Resource Definition (CRD) constructs. CRDs span both infrastructure and application-specific resources.
Today, while the central platform team typically has the technical knowledge to manage Confluent Platform, many application teams lack the resources to interact with the platform effectively. The central platform team must integrate with various enterprise systems, which makes it responsible for infrastructure setup, best-practice development, and intuitive self-service tooling. Application teams, in turn, need robust self-service tooling to streamline their usage of Confluent.
That's why we've introduced Blueprints with Confluent for Kubernetes 2.6, a new set of higher-level abstractions that allow platform teams to define a set of standardized ways (prod, staging, dev, qa, 0-RPO-availability, etc.) to deploy Confluent Platform. Our single control plane architecture built into CFK allows you to manage deployments in a single Kubernetes cluster or across multiple Kubernetes clusters. This also enables a self-service developer interface that allows application teams to create deployments and app resources through a simplified API. Blueprints also provide simplified APIs to handle complex networking and security environments.
The platform team utilizes Blueprints to deploy Confluent Platform, enabling the application team to interact solely with the application resources.
Additional benefits such as automation of credential management can be enabled using Blueprints. Automation around discovery of Confluent Platform components, updating certificates, and rolling clusters can be achieved via Blueprints.
To learn more, see our 5-minute video on Blueprints.
With CFK 2.6, customers can deploy and manage Confluent Platform on Kubernetes infrastructures with ease, while also benefiting from enhanced security, performance, and proactive support capabilities.
As organizations deliver against an expectation for “real-time everything,” it becomes important for engineering teams to trust the consistency, reliability, and quality of their data in motion. This drives the need for clear Data Quality Rules—a formal agreement between upstream and downstream components on the structure, semantics, and quality of data in motion. The upstream component enforces the Data Quality Rules, while the downstream component can assume that the data it receives conforms to the Data Quality Rules.
To support the usage and management of Data Quality Rules in Confluent Platform, we’ve added data quality rules—alongside tags and metadata—to Schema Registry. With this release, we’ve added data quality rules for domain validation and schema migration, allowing developers and architects to easily validate important, sensitive data and/or easily move from old data formats to new ones.
Domain validation rules are used to validate the value of a single field based on a boolean predicate. Plus, domain validation rules can have actions triggered on failure or on success of the rule.
Confluent Schema Registry executes domain validation rules using Google Common Expression Language (CEL), which implements common, simple semantics for expression evaluation.
For example, let's say we want to validate that the "ssn" field is only 9 characters long before we serialize a message. We can use a schema with the rule below, where the message is passed to the rule as a variable with the name "message".
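A rule of this kind can be sketched as follows. The rule is built here as a Python dictionary and printed as JSON; the field names ("ruleSet", "domainRules", "kind", "type", "mode", "expr") and the rule name "checkSsnLen" reflect our understanding of the Schema Registry rule format and should be treated as assumptions rather than a verbatim copy of the product's API. The `ssn_is_valid` function is only a local Python stand-in for the CEL expression, since CEL itself is evaluated by the serializer.

```python
import json

# Rule set that would accompany the schema when it is registered (assumed layout).
rule_set = {
    "domainRules": [
        {
            "name": "checkSsnLen",             # hypothetical rule name
            "kind": "CONDITION",               # a boolean predicate
            "type": "CEL",                     # evaluated with Common Expression Language
            "mode": "WRITE",                   # checked when a message is serialized
            "expr": "size(message.ssn) == 9",  # "message" is the record being written
        }
    ]
}


def ssn_is_valid(message):
    """Local Python equivalent of the CEL predicate above, for illustration only."""
    return len(message["ssn"]) == 9


print(json.dumps({"ruleSet": rule_set}, indent=2))
print(ssn_is_valid({"ssn": "123456789"}))  # nine characters
print(ssn_is_valid({"ssn": "1234"}))       # too short
```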
Now that we've set up our domain validation rule, whenever a message is sent that has a "ssn" field that is not 9 characters in length, the serializer will throw an exception. Alternatively, we can have the message sent to a dead-letter queue topic via an action:
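As a rough sketch of the dead-letter queue variant, the same rule can carry an "onFailure" action of DLQ plus a parameter naming the target topic. The "dlq.topic" parameter key and the topic names "bad-data" and "customers" are assumptions for illustration, and the `destination` function is a simplified model of the routing outcome, not the client's actual implementation.

```python
import json

# The SSN length rule with a DLQ action on failure (assumed layout).
dlq_rule = {
    "name": "checkSsnLen",                 # hypothetical rule name
    "kind": "CONDITION",
    "type": "CEL",
    "mode": "WRITE",
    "expr": "size(message.ssn) == 9",
    "params": {"dlq.topic": "bad-data"},   # hypothetical dead-letter topic
    "onFailure": "DLQ",
}


def destination(message, rule, topic):
    """Pick the topic a message would end up in under this rule (simplified)."""
    if len(message["ssn"]) == 9:           # Python stand-in for the CEL expression
        return topic
    if rule.get("onFailure", "ERROR") == "DLQ":
        return rule["params"]["dlq.topic"]
    raise ValueError("rule failed: " + rule["name"])


print(json.dumps(dlq_rule, indent=2))
print(destination({"ssn": "123456789"}, dlq_rule, "customers"))
print(destination({"ssn": "12"}, dlq_rule, "customers"))
```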
When using conditions, one can model arbitrary Event-Condition-Action (ECA) rules by specifying an action for "onSuccess" to use when the condition is true, and an action for "onFailure" to use when the condition is false. The action type for "onSuccess" defaults to the built-in action type NONE, and the action type for "onFailure" defaults to the built-in action type ERROR. One can explicitly set these values to NONE, ERROR, DLQ, or a custom action type. For example, one might want to implement an action to send an email whenever an event indicates a credit card is about to expire.
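The defaulting behavior described above can be pictured with a small sketch. The `resolve_action` helper, the "EMAIL" custom action type, and the example rules and expression are all hypothetical; the sketch only models which action type is selected, not how an action is executed.

```python
def resolve_action(rule, condition_result):
    """Return the action type that applies for a given condition outcome."""
    if condition_result:
        return rule.get("onSuccess", "NONE")   # default when the condition is true
    return rule.get("onFailure", "ERROR")      # default when the condition is false


# A rule with no explicit actions falls back to the built-in defaults.
plain_rule = {"name": "checkSsnLen", "kind": "CONDITION"}

# A hypothetical rule that triggers a custom action when its condition holds.
expiry_rule = {
    "name": "cardAboutToExpire",   # hypothetical rule name
    "kind": "CONDITION",
    "onSuccess": "EMAIL",          # hypothetical custom action type
    "onFailure": "NONE",           # a card that is not expiring is not an error
}

print(resolve_action(plain_rule, True))
print(resolve_action(plain_rule, False))
print(resolve_action(expiry_rule, True))
print(resolve_action(expiry_rule, False))
```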
Complex schema migration rules are used to evolve a schema in an incompatible manner by applying transformations when consuming from a topic to translate the topic data from the old format to the new format. With this approach, the client does not need to be switched over to a different topic (the current alternative), but instead continues to read from the same topic, and a set of declarative migration rules will massage the data into the form that the consumer expects.
With the use of declarative migration rules, we can support a system in which producers separately use versions 1, 2, and 3 of a schema, which are all incompatible, and consumers that expect versions 1, 2, or 3 each see a message transformed to the desired version, regardless of what version the producer sent.
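One way to picture this is as a chain of per-version transforms: a record written at one version is upgraded or downgraded step by step until it matches the version the consumer expects. The sketch below is a conceptual model only. The field names for version 3 ("taxId"), the `rename` and `migrate` helpers, and the version numbering are assumptions for illustration, and the real rules are declared in Schema Registry rather than written as Python functions.

```python
def rename(old, new):
    """Build a transform that renames one field of a record."""
    def transform(record):
        out = {key: value for key, value in record.items() if key != old}
        out[new] = record[old]
        return out
    return transform


# UPGRADES[n] turns a version n record into version n + 1;
# DOWNGRADES[n] turns a version n record into version n - 1.
UPGRADES = {
    1: rename("ssn", "socialSecurityNumber"),
    2: rename("socialSecurityNumber", "taxId"),   # hypothetical version 3 field
}
DOWNGRADES = {
    2: rename("socialSecurityNumber", "ssn"),
    3: rename("taxId", "socialSecurityNumber"),
}


def migrate(record, written_version, wanted_version):
    """Apply upgrade or downgrade steps until the record matches wanted_version."""
    version = written_version
    while version < wanted_version:
        record = UPGRADES[version](record)
        version += 1
    while version > wanted_version:
        record = DOWNGRADES[version](record)
        version -= 1
    return record


print(migrate({"name": "Ada", "ssn": "123456789"}, 1, 3))
print(migrate({"name": "Ada", "taxId": "123456789"}, 3, 1))
```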
For example, let’s say we want to rename the field "ssn" to "socialSecurityNumber", which is an incompatible schema change in every schema language we support. Below is a set of migration rules to achieve this. Note that we use the JSONata function "$sift" to remove the field with the old name, and then use a JSON property to specify the new field.
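A pair of rules along these lines might look like the sketch below: one UPGRADE rule for consumers that expect the new name and one DOWNGRADE rule for consumers still on the old name. The rule names and the exact field layout ("migrationRules", "kind", "type", "mode", "expr") are assumptions based on our understanding of the rule format, and the `upgrade` and `downgrade` functions are plain Python equivalents of the JSONata expressions, included so the effect can be checked locally.

```python
import json

# Migration rules for renaming "ssn" to "socialSecurityNumber" (assumed layout).
migration_rules = [
    {
        "name": "changeSsnToSocialSecurityNumber",   # hypothetical rule name
        "kind": "TRANSFORM",
        "type": "JSONATA",
        "mode": "UPGRADE",
        # $sift drops the old field, $merge adds it back under the new name.
        "expr": "$merge([$sift($, function($v, $k) {$k != 'ssn'}), "
                "{'socialSecurityNumber': $.ssn}])",
    },
    {
        "name": "changeSocialSecurityNumberToSsn",   # hypothetical rule name
        "kind": "TRANSFORM",
        "type": "JSONATA",
        "mode": "DOWNGRADE",
        "expr": "$merge([$sift($, function($v, $k) {$k != 'socialSecurityNumber'}), "
                "{'ssn': $.socialSecurityNumber}])",
    },
]


def upgrade(record):
    """Python equivalent of the UPGRADE expression."""
    out = {key: value for key, value in record.items() if key != "ssn"}
    out["socialSecurityNumber"] = record["ssn"]
    return out


def downgrade(record):
    """Python equivalent of the DOWNGRADE expression."""
    out = {key: value for key, value in record.items() if key != "socialSecurityNumber"}
    out["ssn"] = record["socialSecurityNumber"]
    return out


print(json.dumps({"ruleSet": {"migrationRules": migration_rules}}, indent=2))
print(upgrade({"name": "Ada", "ssn": "123456789"}))
```

Applying `downgrade` to the result of `upgrade` returns the original record, which is the property a consumer on the old schema version relies on.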
For further technical information, be on the lookout for an upcoming blog series on Data Quality Rules.
Confluent Platform 7.4 is built on the most recent version of Apache Kafka, in this case, version 3.4. For more details about Apache Kafka 3.4, please read the blog post by Sophie Blee-Goldman or check out the video by Danica Fine below.
Download Confluent Platform 7.4 today to get started with the only cloud-native and complete platform for data in motion, built by the original creators of Apache Kafka.