
New Security Tools to Protect Your New Year’s Resolutions

Written By
  • Philip Wang, Technical Product Marketing Manager, Confluent

In the ever-evolving landscape of data streaming and processing, securing and efficiently connecting services has become paramount: it ensures the integrity, confidentiality, and availability of data amid increasing threats and complexity in cyber environments. Effective security measures must protect against data breaches and losses, fostering trust among stakeholders and supporting compliance with legal and regulatory standards. Meanwhile, efficient connections enable the seamless interoperability and real-time processing capabilities that are critical for optimizing operations and driving innovation.

Confluent, a data streaming platform built on Apache Kafka®, has been at the forefront of these initiatives, offering robust solutions to secure data traffic and ensure seamless connectivity. This blog explores three pivotal features: mutual TLS (mTLS) authentication for dedicated clusters, Private Link for Schema Registry, and Private Link for Apache Flink®, illustrating how they collectively fortify Confluent's position as the leader in data streaming.

With Confluent’s latest security features, Kafka and Flink developers can:

  • Simplify client authentication, requiring trusted certificates to access Kafka clusters with mTLS authentication.

  • Ensure data quality, securing access to Schema Registry without egressing to the public internet with Private Link for Schema Registry.

  • Enhance stream processing, developing and monitoring real-time data streams securely with Private Link for Flink.

Let’s explore how these capabilities come to life.

Protect sensitive data and reduce risk with mTLS authentication

Mutual TLS (mTLS) extends the traditional Transport Layer Security (TLS) protocol by requiring both the client and server to authenticate each other. This two-way authentication ensures a higher degree of security, specifically in scenarios where data in transit needs protection against interception or tampering.

mTLS offers enhanced security by adding a layer of authentication, ensuring both parties in a communication are verified. This is especially valuable in industries with strict data protection and privacy regulations, where it aids compliance. Additionally, by securely identifying each endpoint of the communication channel, mTLS substantially reduces the risk of man-in-the-middle (MITM) attacks, making it a robust solution for mitigating fraud risk.


Implementing mTLS in Confluent ensures that connections between components such as producers, consumers, and brokers are securely encrypted and mutually authenticated, protecting sensitive data streams.

How does it work? 

First, the client starts a TLS handshake with the Confluent Cloud cluster, which requests the client’s certificate. The client presents its certificate for verification, and the cluster responds with its own Let’s Encrypt certificate for the client to verify, achieving mutual authentication. After the TLS handshake, for granular access control, the cluster checks whether the client certificate’s metadata matches any pre-configured certificate identity pool principals for authorization. A certificate identity pool is a type of Confluent principal that maps client certificates to permissions in Confluent Cloud. If the certificate matches no identity pool, or if any step of the handshake fails, the connection is terminated with an authorization error.
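
If you want to watch the mutual handshake happen, openssl s_client can present a client certificate and print the certificate chain the server sends back. This is a minimal sketch for inspecting the exchange; the bootstrap endpoint is a placeholder, and client.pem and client.key are the client certificate files created in the steps below:

  • openssl s_client -connect <bootstrap-host>:9092 -cert client.pem -key client.key -showcerts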

Ready to get started with mTLS?

Here’s an example of how to configure mTLS in Confluent Cloud. The subsequent steps require a Certificate Authority (CA) PEM file, a client certificate, and a Kafka client. If you do not have a CA readily available, you can generate one using OpenSSL on the command line by following these steps:

1. Create the root key. This step generates a private key encrypted with a strong password.

  • openssl genrsa -aes256 -out rootCA.key 4096

2. Create the self-signed root certificate. For testing purposes, the -nodes flag allows you to examine the created certificate:

  • openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.pem

3. Create a client certificate file. First, generate a client private key:

  • openssl genpkey -algorithm RSA -out client.key -aes256

Next, create a certificate signing request (CSR):

  • openssl req -new -key client.key -out client.csr

Finally, sign the CSR using the root certificate:

  • openssl x509 -req -in client.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out client.pem -days 500 -sha256

4. Verify the client certificate. Check that the client certificate can be verified by the CA:

  • openssl verify -CAfile rootCA.pem client.pem
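
Optionally, also inspect the certificate’s subject. The common name (CN) in the subject is what the certificate identity pool filter will match against later in this walkthrough, so it is worth confirming before proceeding:

  • openssl x509 -in client.pem -noout -subject -dates

If you want the CN to be a specific value (for example, the CN=`test` filter used below), you can pass -subj "/CN=test" to the openssl req command in step 3 instead of filling in the subject interactively.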

Once the CA is ready, follow the steps below to start configuring mTLS in Confluent Cloud:

  1. In the Confluent Cloud Console, navigate to Accounts & Access, then select the new Workload Identities tab. This is where you can find your identity providers along with their provider types (like certificates).

    1. Here you will create a new provider and select the Certificate Authority authentication type, which will be used for X.509 client certificate authentication.

    2. Enter a name for the identity provider, such as “mTLS Knowledge Share”, and upload a PEM-formatted Certificate Authority (CA) file, which will be used to verify the client certificates. As an additional option, you can also upload a Certificate Revocation List (CRL) file or URL to ensure revoked client certificates are rejected.

    3. Once validated and saved, you can add an identity pool that contains a certificate identifier, a certificate metadata mapping, and RBAC roles to enable access to topics and consumer groups.

    4. For the identity pool setup, add a name, choose the certificate identifier “CN (up to 255 char)”, and set the filter to “CN=`test`”. Afterward, assign this pool to a Confluent Cloud cluster, where it serves as a ResourceOwner for all topics and consumer groups.

      Validate and save your new identity provider with mTLS.

  2. Once the identity provider and identity pool are created, you are ready to connect to a dedicated Kafka cluster using a valid client certificate with mTLS authentication.


  3. To test the connection, you will generate a client keystore and a client properties file alongside your Apache Kafka client, then use them to list the topics in the cluster.

    1. Create the Client Keystore


      Run the following command to export the client certificate into a .p12 file:


      openssl pkcs12 -export -in client.pem -inkey client.key -out client.p12
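
      Before wiring the keystore into a client, you can optionally confirm the export worked. Either of the following commands (both prompt for the password chosen during export) will inspect the file and confirm it contains your certificate and key:

      openssl pkcs12 -info -in client.p12 -noout

      keytool -list -keystore client.p12 -storetype PKCS12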


    2. Client Properties File


      Create a client.properties file with the following configuration:


      security.protocol=SSL
      ssl.keystore.location=client.p12
      ssl.keystore.type=PKCS12
      ssl.keystore.password=<keystore password>
      ssl.key.password=<key password>


  4. You can now run the following command to list topics in the cluster:


    kafka-topics.sh --bootstrap-server <bootstrap URL> --command-config client.properties --list
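
    If the listing succeeds, the same client.properties file can be reused with other Kafka command line tools. For example, here is a quick produce-and-consume smoke test against a hypothetical topic named test-topic:

    kafka-console-producer.sh --bootstrap-server <bootstrap URL> --producer.config client.properties --topic test-topic

    kafka-console-consumer.sh --bootstrap-server <bootstrap URL> --consumer.config client.properties --topic test-topic --from-beginning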

Ensure and secure data quality with Private Link for Schema Registry

Schema Registry is an essential component of the Confluent ecosystem, providing a serving layer for your metadata. It enables the definition, storage, and retrieval of schemas related to Kafka messages, ensuring data consistency and compatibility.

Initially, Schema Registry within Confluent Cloud was accessible only via public internet endpoints. Over the years, however, there has been growing demand for the option to connect to Schema Registry through a private endpoint, enabling access from within a customer’s virtual private cloud (VPC) without using the public internet.

Customers seek private access for several reasons:

  1. Integrating the Schema Registry endpoint within the private network that hosts all VPC-based client applications. This is often due to internal security policies that prohibit external internet access from within the VPC, ensuring all network traffic remains internal. In addition, companies with schema creation operations situated on private networks require a private endpoint for Schema Registry to facilitate schema updates as part of their operational workflows.


  2. Enhancing security by preventing cloud resources from being available over the public internet. In situations where API keys may be compromised, not having resources accessible via the public internet provides an extra layer of security.


  3. Simplifying connectivity between Confluent resources that leverage Schema Registry. Previously, users would self-manage a Schema Registry instance that connects to various fully managed Confluent Cloud features, complicating the connectivity process.

The integration of Private Link with Confluent's Schema Registry represents a significant advancement toward more robust connectivity and security. Private Link allows secure and private access to the Schema Registry within a virtual network, keeping traffic off the public internet and reducing exposure to potential threats. Additionally, numerous critical tools in Confluent Cloud that leverage Schema Registry (including Schema Validation, usage of schemas by Connectors, Flink, and ksqlDB) require private connectivity. Private Link now allows your connections to Schema Registry to remain shielded from the public internet at all times.

Leveraging Private Link leads to a variety of beneficial outcomes. First, enhanced security and privacy are achieved as Private Link ensures that data does not traverse the public internet, reducing exposure to potential cyberattacks. Additionally, companies can simplify their network architecture, resulting in reduced costs and technological complexity, especially in use cases that traditionally require VPNs or dedicated pipelines. Finally, direct connectivity through Private Link often results in lower latency and higher throughput, enhancing overall application performance.

Currently, Schema Registry Private Link supports only AWS Dedicated clusters and Enterprise clusters.

Create a PrivateLink Attachment (PLATT) to establish a connection between your client applications and Schema Registry.
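
Once the PLATT connection is established and DNS resolves to the private endpoint, a simple way to verify connectivity from inside your VPC is to call the Schema Registry REST API directly. This is a minimal sketch: the endpoint is a placeholder, and the credentials are assumed to be a Schema Registry API key pair you have already created:

  • curl -u <sr-api-key>:<sr-api-secret> https://<private-sr-endpoint>/subjects

A successful response returns a JSON array of registered subjects, confirming that clients in the VPC can reach Schema Registry without traversing the public internet.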

Develop and monitor stream processing with Private Link for Apache Flink 

Apache Flink is renowned for its ability to process streams of data in real time. Confluent’s integration with Flink via Private Link further augments this capability, ensuring that sensitive data processed by Flink remains secure and isolated from public access.

Flink requires a PrivateLink Attachment (PLATT), which facilitates secure connections between Flink and various client interfaces (such as the Confluent Cloud Console UI, the Confluent CLI, Terraform, and apps leveraging the Confluent REST API). Flink-to-Kafka traffic is routed internally within Confluent Cloud; therefore, the PLATT is primarily focused on submitting Flink statements and retrieving results from the client.

Flink with PrivateLink Attachment enhances the security landscape for Kafka clusters and for Flink statements and workspaces, requiring secure and private connectivity to access these critical resources. Flink statements can now access topics within both Dedicated and Enterprise clusters, covering all network types, including public, PrivateLink, VPC peering, and AWS Transit Gateway. At the same time, access to these clusters is closely regulated through strict access controls, preventing unauthorized access and the risk of data leaks to the public internet.

Both Flink statements and workspaces often contain sensitive information. Under Private Link, private statements and workspaces cannot be created, viewed, or accessed via public networking. With private networking enabled, users can read both public and private data. However, to mitigate the risk of data exfiltration when using Flink private networking, statements are restricted to writing only to clusters that also operate on private networks. Should users want to write to a public cluster, using Flink in a different environment without a PrivateLink Attachment is advised.

The new addition of Private Link for Flink provides the following benefits:

  • Security: Just like with the Schema Registry, data traffic between Flink and other Confluent components is kept off the public internet.

  • Low latency: Direct connectivity ensures minimal delay in data processing, crucial for real-time analytics.

  • Scalability: Flink's scalability combined with the secure and efficient connectivity provided by Private Link means businesses can scale their operations without compromising on security or performance.

Currently, Confluent Cloud for Apache Flink® supports only private networking for AWS Dedicated clusters and Enterprise clusters.

Leverage PrivateLink Attachment (PLATT) to connect your client applications and Flink.
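
From a client inside the connected VPC, you can then interact with Flink over the private endpoint. As one hedged example, the Confluent CLI can open an interactive Flink SQL shell against a compute pool; the IDs below are placeholders, and the exact flags may vary by CLI version:

  • confluent flink shell --compute-pool <compute-pool-id> --environment <env-id>

Statements submitted from this shell, and the results they return, travel over the PrivateLink Attachment rather than the public internet.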

What’s next?

Confluent's incorporation of Mutual TLS, along with Private Link for both Schema Registry and Flink, showcases our commitment to providing a secure, efficient, and user-friendly data streaming platform. These features not only enhance security but also improve connectivity and performance, enabling businesses to leverage real-time data streaming and processing in a more secure and efficient manner. As data becomes increasingly central to business operations, the importance of such robust solutions cannot be overstated. By prioritizing both security and connectivity, Confluent continues to set the standard for data streaming platforms.

Confluent Mutual TLS and Private Link for Flink are available now. Private Link for Schema Registry is in Limited Availability and will be broadly available this year. mTLS support for other Confluent services, including Schema Registry, will arrive later; we're looking for feedback and input from our customers and users before making this extension generally available on Confluent Cloud and Confluent Platform in 2025. We look forward to expanding and adding exciting new features in the quarters to come, especially in the security space, providing our users with a safe, productive, and efficient experience with real-time data streaming.

Want to learn more? Visit the product page for more information. Join Confluent Community and subscribe to our biweekly newsletter for the latest Kafka and Flink learning materials, news, community meetups and events, useful terminal hacks, and some fun finds from around the web.

If you’re new to Confluent and haven’t already, sign up for a free trial of Confluent Cloud and create your first cluster to explore new topics and create streaming pipelines and applications. New sign-ups receive $400 to spend within Confluent Cloud during their first 30 days. Use the code CCBLOG60 for an additional $60 of free usage.* If a self-managed deployment is more suitable for your use case, sign up for a free trial of Confluent Platform to get started with the only cloud-native and comprehensive platform for data in motion, built by the original creators of Apache Kafka.


Apache®, Apache Kafka®, Kafka®, Apache Flink®, Flink®, the Kafka logo, and the Flink logo are registered trademarks of the Apache Software Foundation.

  • Philip is a technical product marketer at Confluent, responsible for technical content creation of product launches, in-depth tutorials, keynotes, and tradeshows.
