Apache Kafka® cluster administrators often need to solve problems such as how to onboard new teams, manage resources like topics or connectors, and maintain permission control over these resources. In this post, we will demonstrate how to use Confluent for Kubernetes (CfK) to enable GitOps with a CI/CD pipeline and delegate resource creation to groups of people without sharing administrator credentials with others in the organization.
Confluent for Kubernetes is well known as a cloud-native control plane for deploying and managing Confluent Platform in your private cloud environment. CfK can deploy Confluent components (brokers, Schema Registry, etc.) as well as resources on these components like schemas, topics, or RBAC bindings.
What is less well known is that CfK can also manage resources (schemas, topics, or RBAC bindings) in a Confluent cluster deployed outside of Kubernetes. This functionality allows operators to create efficient CI/CD pipelines and delegate the approval of new resources to other people in the organization.
In a few steps, we will set up a new Confluent cluster (via cp-demo) and then configure CfK to be able to deploy our first set of resources.
It is not the main goal of this article to discuss networking in Kubernetes, so some shortcuts will be taken to simplify the communication from Kubernetes to servers outside of Kubernetes. Notably, the networking is managed in our shell script starter file.
An overview of the architecture for this demo can be seen in the diagram below.
The main goal of using CfK to control the Confluent Platform deployment is to easily put a CI/CD platform in place. The resource definitions can be stored in your version control system and you can use an automation process (Jenkins, GitHub Actions, etc.) to ensure the definitions are deployed to Kubernetes. As a result, the changes in each resource are versioned, controlled, and approved before being deployed. This allows Confluent Platform administrators to:
Easily delegate the approval and merging of pull requests, reducing the process and manual tasks they need to execute
Use tools like Kustomize, Argo CD, or Flux CD to simplify the creation of resources by putting in place a topic-as-a-service system, where each topic owner manages access to their own topic
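As a rough illustration of such an automation process, the sketch below writes a GitHub Actions workflow that applies the resource definitions whenever a pull request is merged into main. Everything specific in it is an assumption made for illustration: the workflow file name, the overlays/prod directory, and the KUBECONFIG_B64 repository secret (a base64-encoded kubeconfig). It also assumes that kubectl is available on the runner.

```shell
# Sketch only: file name, directory layout, and secret name are hypothetical.
mkdir -p .github/workflows

# The quoted 'EOF' keeps ${{ ... }} literal so GitHub Actions expands it, not the shell.
cat > .github/workflows/deploy-confluent-resources.yaml <<'EOF'
name: deploy-confluent-resources
on:
  push:
    branches: [main]        # runs after a pull request is merged into main
jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Write kubeconfig from a repository secret
        run: |
          mkdir -p "$HOME/.kube"
          echo "${{ secrets.KUBECONFIG_B64 }}" | base64 -d > "$HOME/.kube/config"
      - name: Apply resource definitions
        run: kubectl apply -k overlays/prod
EOF

echo "Wrote .github/workflows/deploy-confluent-resources.yaml"
```

With a workflow along these lines, the only credential the pipeline holds is access to the Kubernetes cluster; the Confluent Platform credentials stay in the secrets created by the administrators.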
Our documentation covers how to configure your Confluent Platform cluster. We will use the cp-demo Docker environment to start a complete cluster. The cp-demo environment contains brokers, Schema Registry, Control Center, and permissions that are preconfigured and ready to use.
Our Kubernetes cluster will run the CfK operator and hold the resources. For the demo, we used a multi-platform tool called Kind, which has all we need to start a Kubernetes cluster on our machine.
There are several components and resources in this example:
CfK operator: This commercial Confluent component controls the lifecycle of the custom resources. More details are available in the Confluent for Kubernetes documentation. This essential component lets you define Kubernetes resources that CfK turns into topics, schemas, RBAC bindings, or even connectors.
Secrets: These native Kubernetes resources store the Confluent cluster password and TLS certificate data. They need to be set only once and will not be stored in our version control system. Confluent Platform administrators create these resources, eliminating the need to share credentials with others.
KafkaRestClass: This resource stores the connection data for the Kafka cluster and is referenced by other resources. Using this custom resource keeps the other resources less verbose, so we do not have to repeat ourselves.
Topics, schemas, and rolebindings: These resources are created on demand and are the key point of this demo. Allowing topics, schemas, and rolebindings to be created as resources serves as the base for offering topics as a service.
Furthermore, connector-as-a-service is also a common request and can be achieved using this approach.
1. Get the repo and go to the directory.
2. To make this example easier to begin, please use the shell script to quickly start cp-demo. This script will adapt the hosts file, if needed, to be able to use some Docker capabilities:
3. Launch the Kubernetes cluster.
4. Install CfK operator.
5. Create the bearer secret. In our example, we are using the superUser from the cp-demo example. This secret needs to be created once and the user set here will be the one used by CfK to log in to the Confluent Platform cluster.
6. Create the TLS configuration. This configuration is needed only if your cluster uses TLS, as cp-demo does.
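Step 5 can be sketched as follows. CfK reads bearer credentials from a bearer.txt key inside a Kubernetes secret, holding a username line and a password line. The secret name cp-demo-credential, the confluent namespace, and the placeholder password are assumptions for illustration, not values taken from the demo repository.

```shell
# Assumed values: override MDS_USER / MDS_PASSWORD with the real credentials.
MDS_USER="${MDS_USER:-superUser}"
MDS_PASSWORD="${MDS_PASSWORD:-change-me}"

# Write the credentials file so that only the current user can read it.
( umask 077; printf 'username=%s\npassword=%s\n' "$MDS_USER" "$MDS_PASSWORD" > bearer.txt )

echo "Wrote bearer.txt for user ${MDS_USER}"

# Run once against the Kubernetes cluster (secret name and namespace are hypothetical):
#   kubectl create secret generic cp-demo-credential \
#     --from-file=bearer.txt=./bearer.txt --namespace confluent
```

Because the secret is created by hand and never committed, the file can be deleted as soon as the secret exists.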
1. Create the KafkaRestClass resource.
In this resource, we are using the authentication and TLS secrets created in the previous steps. Also, the endpoint to the MDS is defined here.
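A minimal sketch of such a resource is shown below, written to a file so it can be reviewed before being applied. The resource name cp-demo-access comes from this post; the namespace, the two secret names, and the MDS endpoint are assumptions and should be replaced with the values from your own environment.

```shell
# Sketch: write a KafkaRestClass manifest.
# Assumed (not from the demo repository): namespace, secret names, MDS endpoint.
cat > kafkarestclass.yaml <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: KafkaRestClass
metadata:
  name: cp-demo-access
  namespace: confluent
spec:
  kafkaRest:
    endpoint: https://kafka1:8091      # MDS endpoint (assumed)
    authentication:
      type: bearer
      bearer:
        secretRef: cp-demo-credential  # bearer secret (assumed name)
    tls:
      secretRef: cp-demo-tls           # TLS secret (assumed name)
EOF

echo "Wrote kafkarestclass.yaml"
# Apply with: kubectl apply -f kafkarestclass.yaml
```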
2. Create the topics and schema resources.
This file has a lot of important facets, so let's check each resource separately:
We create a topic named demo-topic-1 with four partitions and two replicas for each partition. We also define this resource to be created through the KafkaRestClassRef cp-demo-access, the same one we defined in the step above. It is important to mention that CfK allows you to update resources too. For example, you can change dynamic config properties on an existing topic.
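The topic described here could look roughly like the manifest below. The topic name, partition count, replica count, and KafkaRestClass reference come from the text; the namespace and the retention.ms value are assumptions added to show where a dynamic config property would go.

```shell
# Sketch: write a KafkaTopic manifest for demo-topic-1.
cat > kafkatopic.yaml <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: KafkaTopic
metadata:
  name: demo-topic-1
  namespace: confluent          # assumed namespace
spec:
  partitionCount: 4
  replicas: 2
  kafkaRestClassRef:
    name: cp-demo-access
  configs:
    retention.ms: "86400000"    # illustrative value: one day in milliseconds
EOF

echo "Wrote kafkatopic.yaml"
# Apply with: kubectl apply -f kafkatopic.yaml
# To update the topic later, edit a value under configs and apply the file again.
```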
To create a schema, we need to create a config map defining the schema, and then we use the config map to create the schema resource. The schema is created in Schema Registry, so we need to provide the connection details to access Schema Registry (endpoint, authentication details, and TLS configuration).
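The two-part structure (a config map holding the schema, and a Schema resource pointing at it) can be sketched as below. The Avro record, the resource names, the Schema Registry endpoint, the basic authentication type, and the secret names are all assumptions for illustration; the demo repository may use different values.

```shell
# Sketch: a ConfigMap carrying an Avro schema plus the Schema resource that uses it.
cat > schema.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-topic-1-value-schema    # hypothetical name
  namespace: confluent
data:
  schema: |
    {
      "type": "record",
      "namespace": "io.example.demo",
      "name": "DemoEvent",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "amount", "type": "double"}
      ]
    }
---
apiVersion: platform.confluent.io/v1beta1
kind: Schema
metadata:
  name: demo-topic-1-value           # assumed subject name for the topic's value
  namespace: confluent
spec:
  data:
    configRef: demo-topic-1-value-schema
    format: avro
  schemaRegistryRest:
    endpoint: https://schemaregistry:8085   # assumed endpoint
    authentication:
      type: basic                           # assumed authentication type
      basic:
        secretRef: cp-demo-sr-credential    # hypothetical secret
    tls:
      secretRef: cp-demo-tls                # hypothetical secret
EOF

echo "Wrote schema.yaml"
# Apply with: kubectl apply -f schema.yaml
```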
3. Create a connector resource.
This connector resource has everything we need to create a connector:
As part of the connector creation, we create a resource that provides the data needed by Connect; this data goes under the configs section (for the sake of simplicity, we did not provide all the data here). An important point to highlight: since the connector is created in the Connect worker, we need to provide the Connect REST details (endpoint, authentication details, and TLS configuration).
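A connector resource with these parts might look like the sketch below. The connector class, its configs, the Connect endpoint, the authentication type, and the secret names are assumptions chosen for illustration; in particular, the Datagen connector is assumed to be installed on the Connect worker.

```shell
# Sketch: a Connector manifest with connector settings and Connect REST details.
cat > connector.yaml <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: Connector
metadata:
  name: demo-connector-1
  namespace: confluent
spec:
  class: "io.confluent.kafka.connect.datagen.DatagenConnector"   # assumed connector
  taskMax: 1
  connectRest:
    endpoint: https://connect:8083              # assumed endpoint
    authentication:
      type: basic                               # assumed authentication type
      basic:
        secretRef: cp-demo-connect-credential   # hypothetical secret
    tls:
      secretRef: cp-demo-tls                    # hypothetical secret
  configs:
    kafka.topic: "demo-topic-1"
    quickstart: "users"
    max.interval: "1000"
EOF

echo "Wrote connector.yaml"
# Apply with: kubectl apply -f connector.yaml
```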
4. Create the role-binding resource.
Let's dig into the details of this file:
We are creating two bindings here. The first gives the ResourceOwner role to the group KafkaDevelopers on topics prefixed by demo-topic. The second gives the ResourceOwner role to the group KafkaDevelopers on a connector called demo-connector-1.
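The two bindings can be sketched as follows. The role, group, topic prefix, and connector name come from the text; the resource names, the namespace, and the Connect cluster ID connect-cluster are assumptions that need to match your environment.

```shell
# Sketch: two ConfluentRolebinding manifests in one file.
cat > rolebindings.yaml <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
  name: kafka-developers-topics        # hypothetical name
  namespace: confluent
spec:
  principal:
    type: group
    name: KafkaDevelopers
  role: ResourceOwner
  resourcePatterns:
    - name: demo-topic
      patternType: PREFIXED
      resourceType: Topic
  kafkaRestClassRef:
    name: cp-demo-access
---
apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
  name: kafka-developers-connector     # hypothetical name
  namespace: confluent
spec:
  principal:
    type: group
    name: KafkaDevelopers
  role: ResourceOwner
  clustersScopeByIds:
    connectClusterId: connect-cluster  # assumed Connect cluster ID
  resourcePatterns:
    - name: demo-connector-1
      patternType: LITERAL
      resourceType: Connector
  kafkaRestClassRef:
    name: cp-demo-access
EOF

echo "Wrote rolebindings.yaml"
# Apply with: kubectl apply -f rolebindings.yaml
```

Using a prefixed pattern for topics means that any later topic whose name starts with demo-topic is covered by the same binding, without a new rolebinding per topic.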
We created a topic with a schema and gave the ResourceOwner role to a specific group, so if we log in with a user in this group, we should see only the topic that this user has permissions on.
As a first step, let's check that the resources are correctly created:
All resources should be in the state “Created” or “Succeeded.”
Go to http://localhost:9021/
User: alice / password: alice-secret
Go to cluster > Topics > demo-topic-1
Only this topic is visible
Check the schema and it should be there
We checked all the permissions and they were correctly applied. So in short:
The topic was created
The schema was created
The connector was created
Permissions were correctly applied
It is time to clean up our environments:
Once we started our Confluent cluster and configured our Kubernetes cluster, it was fairly simple to create resources on Kubernetes and see CfK take on the job of creating these resources on our Confluent cluster. How can we extend this demo to the real world?
Using Kustomize (or any other similar tool) and different pipelines, we can create the resource definitions once and apply them in different Kubernetes clusters, ultimately deploying the resources in different environments.
Any Confluent cluster administrator can delegate the PR merge to another team or person and reduce their daily tasks.
Storing the resource definitions in a repository allows us to audit who created, deleted, or modified any resource, when that action was done, and who approved it.
CfK is a supported Confluent feature, so users can rely on Confluent support and maintenance of this feature.
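The Kustomize approach mentioned above can be sketched as a shared base plus one overlay per environment. The directory layout, the prod overlay, and the patched replica count are assumptions for illustration; the point is that the topic is defined once and each environment only overrides what differs.

```shell
# Sketch: a Kustomize base with one topic and a production overlay that patches it.
mkdir -p base overlays/prod

cat > base/kafkatopic.yaml <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: KafkaTopic
metadata:
  name: demo-topic-1
  namespace: confluent
spec:
  partitionCount: 4
  replicas: 2
  kafkaRestClassRef:
    name: cp-demo-access
EOF

cat > base/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - kafkatopic.yaml
EOF

# The overlay reuses the base and overrides only the replica count.
cat > overlays/prod/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - target:
      kind: KafkaTopic
      name: demo-topic-1
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 3
EOF

echo "Wrote base/ and overlays/prod/"
# Apply to the production Kubernetes cluster with: kubectl apply -k overlays/prod
```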
You can learn more about this topic with the following resources:
Documentation: Confluent for Kubernetes
Video: Use GitOps as an efficient CI/CD pipeline for Data Streaming
Blog post: Self-Service GitOps for Confluent Cloud
Demo: Confluent for Kubernetes