Apache Kafka with Control and Data Planes

Your mission-critical clusters

Apache Kafka® and Confluent Platform are commonly used to put mission-critical business data in motion to gain timely critical insights and make proactive business decisions.

For a mission-critical Kafka deployment, it's imperative to have security and governance controls in place. These controls require infrastructure and metadata. A good architecture separates the infrastructure that supports the business application from the infrastructure that supports security and governance controls.

Confluent Cloud is architected this way in order to provide secure, reliable, and cost-effective Kafka-as-a-Service. This blog post demonstrates how to architect Confluent Platform deployments in this manner.

Accordingly, you want to eliminate as much noise from neighboring applications as possible in order to run your cluster at maximum efficiency. Typically, you end up deploying your cluster in a dedicated environment or network, and expect that it will serve only your client application data, with none of its bandwidth consumed by noisy neighbors.

Normally, your Kafka or Confluent Platform-based cluster looks like this:

Whereas you want it to look like this:

Minimizing internal noise

With your cluster hosted in a dedicated environment, the next goal is to reduce the noise generated within the cluster itself. You want to minimize the internal messages stored on Kafka internal topics, such as messages for monitoring, security, and cluster administration. Here’s an example of the internal traffic and resource usage that Control Center generates:

  • Control Center state store size ~50 MB/hr

  • Kafka log size ~500 MB/hr (per broker)

  • Average CPU load ~7 %

  • Allocated Java on-heap memory ~580 MB and off-heap ~100 MB

  • Total allocated memory including page cache ~3.6 GB

  • Network read utilization ~150 KB/sec

  • Network write utilization ~170 KB/sec

Likewise, Confluent Metadata Service also creates internal topics, exposes additional listener ports (for JWT token validation and for administration), and makes search queries to LDAP which can return a lot of records.

Can you also eliminate this noise? Not completely, because the brokers themselves rely on some internal topics. With Confluent Platform, however, you can minimize it to a large extent, so that most of your cluster bandwidth goes to processing your business data while auxiliary data and message exchange are kept to a minimum.
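To put the Control Center figures above in perspective, here is a rough back-of-the-envelope sketch that converts the quoted hourly and per-second rates into daily volumes. The rates come from the list above; the broker count of three, the decimal unit conversion, and the function name are assumptions made for illustration. The log figure ignores retention and compaction, so treat it as gross write volume rather than steady-state disk usage.

```python
# Rough daily overhead from the Control Center figures quoted above.
# Assumptions (not from the original post): 3 brokers, decimal units
# (1 GB = 1000 MB = 1,000,000 KB), and no retention or compaction applied.

KAFKA_LOG_MB_PER_HOUR_PER_BROKER = 500  # ~500 MB/hr (per broker)
STATE_STORE_MB_PER_HOUR = 50            # ~50 MB/hr
NETWORK_READ_KB_PER_SEC = 150           # ~150 KB/sec
NETWORK_WRITE_KB_PER_SEC = 170          # ~170 KB/sec

HOURS_PER_DAY = 24
SECONDS_PER_DAY = 86_400


def daily_overhead_gb(broker_count: int) -> dict:
    """Return approximate daily overhead, in GB, for the given broker count."""
    return {
        "kafka_log_gb": KAFKA_LOG_MB_PER_HOUR_PER_BROKER * HOURS_PER_DAY * broker_count / 1000,
        "state_store_gb": STATE_STORE_MB_PER_HOUR * HOURS_PER_DAY / 1000,
        "network_read_gb": NETWORK_READ_KB_PER_SEC * SECONDS_PER_DAY / 1_000_000,
        "network_write_gb": NETWORK_WRITE_KB_PER_SEC * SECONDS_PER_DAY / 1_000_000,
    }


if __name__ == "__main__":
    for name, value in daily_overhead_gb(broker_count=3).items():
        print(f"{name}: {value:.2f} GB/day")
```

Under these assumptions, a three-broker cluster writes roughly 36 GB of monitoring-related log data per day, which is the kind of traffic the rest of this post moves off the data cluster.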

Adding control and data planes to Kafka

With the advent of service mesh and containerized applications, the idea of the control and data plane has become popular. A part of your application infrastructure, such as a proxy or sidecar, is dedicated to aspects such as controlling traffic, access, governance, security, and monitoring, and is referred to as the control plane. Another part of your application infrastructure, used purely for processing your business transactions, is referred to as the data plane.

Can you do this for your Kafka cluster? For example:

Yes, of course (as shown above). With very little effort and configuration this setup can be achieved for your Kafka clusters. The complete picture looks like the following:

The primary benefit of this pattern is a clear segregation between clusters that process your business data and a cluster that is responsible for administration and security. You can also have multiple data clusters which can be managed by a single cluster in your control plane, centralizing administration and control. Further, you can free up your data plane cluster's bandwidth for processing your business data.

Setting up a control plane for a single data cluster can be of value if your data cluster is mission critical, for example one that handles financial messages or transactions. Otherwise, the preferred approach is to use a control plane cluster to manage multiple data plane clusters.

The following details each aspect of this setup.

Configure external monitoring 

Confluent Metrics Reporter is a component that is responsible for gathering JMX metrics and feeding them to internal topics on your Kafka brokers. You can have the metrics reporter feed these metrics to a remote Kafka cluster instead of the local Kafka brokers, which frees your data plane brokers from handling any metrics-related message traffic. The configuration snippets below show how this is done.

Data plane

A configuration snippet from the Kafka broker configuration file - server.properties

confluent.metrics.reporter.bootstrap.servers=<Control Plane Bootstrap Server URLs>
confluent.metrics.reporter.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret";
confluent.metrics.reporter.sasl.mechanism=PLAIN
confluent.metrics.reporter.security.protocol=SASL_PLAINTEXT
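The `confluent.metrics.reporter.` prefix in the snippet above scopes what look like ordinary Kafka client settings (`bootstrap.servers`, `sasl.mechanism`, and so on) to the reporter's own connection. The sketch below illustrates that idea by parsing the properties and stripping the prefix. It is a simplified illustration, not the reporter's actual implementation: the parser handles only plain `key=value` lines, and the function names and `control-plane-*` host names are hypothetical.

```python
# Simplified properties parsing; does not cover the full Java .properties
# syntax (no ':' separators, escapes, or line continuations).

SAMPLE = """\
# Hypothetical control plane hosts, for illustration only.
confluent.metrics.reporter.bootstrap.servers=control-plane-1:9092,control-plane-2:9092
confluent.metrics.reporter.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret";
confluent.metrics.reporter.sasl.mechanism=PLAIN
confluent.metrics.reporter.security.protocol=SASL_PLAINTEXT
"""


def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only: the JAAS value contains '=' itself.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def with_prefix_stripped(props: dict, prefix: str) -> dict:
    """Return only the entries under `prefix`, with the prefix removed."""
    return {k[len(prefix):]: v for k, v in props.items() if k.startswith(prefix)}


if __name__ == "__main__":
    reporter = with_prefix_stripped(parse_properties(SAMPLE), "confluent.metrics.reporter.")
    for key in sorted(reporter):
        print(f"{key} = {reporter[key]}")
```

Splitting on the first `=` only matters here because the `sasl.jaas.config` value contains `username="admin"`, which a naive split would truncate.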

Control plane

Confluent Control Center can monitor multiple clusters as described in the documentation.

On your control plane cluster, you need to let Confluent Control Center discover the cluster(s) in your data plane so that it can retrieve their metadata, such as the cluster ID and the name you give to the data cluster. In this example, the data cluster is named “data-plane.”

A configuration snippet from the Confluent Control Center configuration file - control-center-production.properties

confluent.controlcenter.kafka.data-plane.bootstrap.servers=<Data Plane Bootstrap Server URLs>
confluent.controlcenter.kafka.data-plane.cprest.url=http://<IP address>:8090,http://<IP address>:8090,http://<IP address>:8090
confluent.controlcenter.kafka.data-plane.metadata.basic.auth.user.info=kafka-broker:kb-secret
confluent.controlcenter.kafka.data-plane.metadata.bootstrap.server.urls=http://<IP address>:8090,http://<IP address>:8090,http://<IP address>:8090
confluent.controlcenter.kafka.data-plane.metadata.http.auth.credentials.provider=BASIC
confluent.controlcenter.kafka.data-plane.security.protocol=SASL_PLAINTEXT
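In the snippet above, the cluster name ("data-plane") is part of each property key, so monitoring more data clusters from the same Control Center means repeating the block under a different name. The sketch below generates such blocks from a dictionary. It only reuses property suffixes that appear above; the helper name, the second cluster name, and all host names are hypothetical.

```python
# Generate per-cluster Control Center properties following the
# confluent.controlcenter.kafka.<name>.<setting> pattern shown above.


def control_center_cluster_properties(clusters: dict) -> str:
    """Render {cluster_name: {setting: value}} as prefixed property lines."""
    lines = []
    for name, settings in clusters.items():
        prefix = f"confluent.controlcenter.kafka.{name}."
        for key in sorted(settings):
            lines.append(f"{prefix}{key}={settings[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Cluster "data-plane-2" and every host name here are made up.
    clusters = {
        "data-plane": {
            "bootstrap.servers": "dp1-broker-1:9092,dp1-broker-2:9092",
            "security.protocol": "SASL_PLAINTEXT",
        },
        "data-plane-2": {
            "bootstrap.servers": "dp2-broker-1:9092,dp2-broker-2:9092",
            "security.protocol": "SASL_PLAINTEXT",
        },
    }
    print(control_center_cluster_properties(clusters))
```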

Confluent Control Center is used here for demonstration purposes only. Alternatively, you could use the Prometheus JMX agent to export these metrics to your favorite monitoring platform (e.g., Prometheus/Grafana or Datadog). The point is that you have moved the metrics traffic away from your data plane Kafka brokers.

Configure external access control

The Confluent Metadata Service (MDS) is a component which is responsible for authorization across all client applications as well as Confluent Platform internal components. It is also responsible for authentication for your platform components such as Schema Registry or ksqlDB. However, it needs an identity provider (LDAP-based directory service) to refer to identity records. Now this seems like a lot of work, so why not delegate it to an external cluster (control plane) and free up your (data plane) cluster for solely processing your business data?

A single Confluent Metadata Service can manage multiple clusters. This example shows just one cluster in the data plane.

Data plane

There are two sections below that provide an excerpt from the Kafka broker configuration file on the data plane. In the first section you can see how the broker is made to connect to a remote Metadata Service.

confluent.metadata.bootstrap.servers=<Control Plane Bootstrap Server URLs>
confluent.metadata.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin-secret";
confluent.metadata.sasl.mechanism=PLAIN
confluent.metadata.security.protocol=SASL_PLAINTEXT
confluent.metadata.topic.replication.factor=3
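Because the whole point of this setup is that the data plane broker sends its metrics and metadata traffic elsewhere, it can be worth checking that the two bootstrap settings really do name control plane hosts. The following is a minimal sketch of such a check, assuming the broker properties have already been loaded into a dictionary; the function names and host names are hypothetical, and the host parsing does not handle IPv6 literals.

```python
# Check that the remote bootstrap settings on a data plane broker point only
# at hosts that belong to the control plane.

REMOTE_BOOTSTRAP_KEYS = (
    "confluent.metrics.reporter.bootstrap.servers",
    "confluent.metadata.bootstrap.servers",
)


def hosts_of(bootstrap: str) -> set:
    """Extract host names from a 'host:port,host:port' list."""
    hosts = set()
    for entry in bootstrap.split(","):
        entry = entry.strip()
        if entry:
            hosts.add(entry.rsplit(":", 1)[0])
    return hosts


def misdirected_keys(props: dict, control_plane_hosts: set) -> list:
    """Return keys that are missing or name a host outside the control plane."""
    problems = []
    for key in REMOTE_BOOTSTRAP_KEYS:
        hosts = hosts_of(props.get(key, ""))
        if not hosts or not hosts <= control_plane_hosts:
            problems.append(key)
    return problems


if __name__ == "__main__":
    control_plane = {"cp-broker-1", "cp-broker-2"}  # made-up host names
    broker_props = {
        "confluent.metrics.reporter.bootstrap.servers": "cp-broker-1:9092,cp-broker-2:9092",
        "confluent.metadata.bootstrap.servers": "localhost:9092",  # deliberately wrong
    }
    print(misdirected_keys(broker_props, control_plane))
```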

The configuration snippet below does two things. First, a JWT token is obtained by authenticating with the remote (control plane) broker using HTTP basic authentication. Second, the token is submitted to the token listener port (9092) to authenticate and gain the required access to the topic on the remote broker.

kafka.rest.bootstrap.servers=<IP address>:9092,<IP address>:9092,<IP address>:9092
kafka.rest.client.security.protocol=SASL_PLAINTEXT
kafka.rest.confluent.metadata.basic.auth.user.info=kafka-broker:kb-secret
kafka.rest.confluent.metadata.bootstrap.server.urls=http://<IP address>:8090,http://<IP address>:8090,http://<IP address>:8090
kafka.rest.confluent.metadata.http.auth.credentials.provider=BASIC
kafka.rest.enable=true
kafka.rest.kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension
kafka.rest.public.key.path=/var/ssl/private/public.pem
kafka.rest.rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
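The first step described above relies on HTTP basic authentication, with the credentials given as a `user:password` pair in `kafka.rest.confluent.metadata.basic.auth.user.info`. Under the Basic scheme (RFC 7617), such a pair is base64-encoded and sent in the `Authorization` header. The sketch below shows only that encoding step; it is a generic illustration rather than Confluent's client code, and the function name is hypothetical.

```python
import base64


def basic_auth_header(user_info: str) -> str:
    """Build an HTTP Basic Authorization header value from 'user:password'."""
    if ":" not in user_info:
        raise ValueError("expected credentials in 'user:password' form")
    token = base64.b64encode(user_info.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Same placeholder credentials as in the configuration above.
    print("Authorization:", basic_auth_header("kafka-broker:kb-secret"))
```

Base64 is an encoding, not encryption, so with `http://` URLs and `SASL_PLAINTEXT` as used in this demonstration the credentials travel in a form that is trivially decoded.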

Additional resources

In addition to externally controlling access, it is also possible to centralize audit logging on another Confluent Platform cluster, as described in the documentation on centralized audit logs. You can also use Confluent Cluster Registry to centrally manage your clusters.

The configuration files for this example can be found at https://github.com/sanjaygarde/cp-kafka-with-planes.

Control and data planes in Confluent Cloud

As you can see, it's simple to externalize monitoring, authorization, and administration for Confluent Platform-based clusters. However, if you are considering or already using Confluent Cloud, you are in luck: this separation of control and data planes is inherent in Confluent Cloud.

Each of your Kafka clusters is used purely for processing your business events (messages). Separate Kafka clusters are used for metrics, security (authentication, identity federation/SSO, authorization, audit logs, etc.), monitoring, and administration.

Moreover, with Confluent Cloud, Schema Registry, Kafka Streams, and ksqlDB are also hosted on separate clusters, leaving your Kafka cluster to process only your business data and assuring the full bandwidth of your cluster for your business.

If you’d like to try Confluent Cloud, new sign-ups receive $400 in free usage! And be sure to use the code CL60BLOG for an additional $60 of free usage.

  • Sanjay Garde is a senior customer success technical architect who has worked in the field with Confluent customers for more than three years. He brings a couple of decades of industry experience doing system design, system integration, and data engineering.
