Compliance requirements often dictate that services should not store secrets as cleartext in files. These secrets may include passwords, such as the values for ssl.key.password, ssl.keystore.password, and ssl.truststore.password configuration parameters (as shown below), or any other sensitive data in the configuration files or log files. Here is a snippet from a properties file with standard SSL configurations that users often don’t want in cleartext:
security.inter.broker.protocol=SSL
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=test1234
For Apache Kafka®, in which services read configuration files on startup, the question arises: how should you protect these secrets? Before Confluent Platform 5.3, you would have considered:
However, there was still a risk that a bad actor—a rogue employee, shoulder surfer, or hacker—could gain access to those configuration files or log files, which contain the cleartext secrets. Taking on a bit more complexity, you could encrypt data at the storage layer with encrypted volumes using specialized kernel modules that support process-based ACLs, but still someone who gained access could potentially see the values in cleartext.
A safer approach is to encrypt secrets so that even if someone were to gain access to the files, the secrets would not even be in cleartext. Confluent Platform 5.3 introduces a simple solution for secret encryption. Secret Protection, a commercial feature, encrypts secrets within the configuration file itself and does not expose the secrets in log files. It extends the security capabilities originally introduced in KIP-226 for brokers and KIP-297 for Kafka Connect, and provides additional functionality for encrypting the secrets across all of Confluent Platform. Now you can deploy end-to-end Secret Protection in your production event pipeline, including the brokers, Connect, KSQL, Confluent Schema Registry, Confluent Control Center, Confluent REST Proxy, etc.
Secret Protection uses envelope encryption, an industry standard for protecting encrypted secrets through a highly secure method. First, a user specifies a master passphrase that is used, along with a cryptographic salt value, to derive a master encryption key. A separate data encryption key is generated, and the master encryption key is used to encrypt the data encryption key before storing it in a secure file.
Both the master key and data encryption key are then used to encrypt the secrets in the configuration files. The service can decrypt these secrets because KIP-421 provides automatic resolution of indirect variables. The end result is that even if someone gains access to a configuration file, all they would be able to see are encrypted secrets, and they have no way to decrypt them without knowing the master encryption key.
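The key hierarchy described above can be illustrated with a minimal sketch that uses the openssl command-line tool. This is not how Secret Protection is implemented internally: the function name envelope_demo, the sample passphrase, and the choice of `openssl enc` with PBKDF2 and AES-256-CBC are all assumptions made for illustration, and OpenSSL 1.1.1 or later is assumed to be installed.

```shell
# Hypothetical demo of envelope encryption; NOT the Secret Protection implementation.
# Usage: envelope_demo <secret>   (prints the secret after an encrypt/decrypt round trip)
envelope_demo() {
  local secret="$1" dir recovered
  dir=$(mktemp -d) || return 1

  # Master passphrase lives in a file so it never appears on a command line.
  printf '%s\n' 'an example passphrase made of several words' > "$dir/passphrase.txt"

  # Random data encryption key (DEK), hex-encoded on a single line.
  openssl rand -hex 32 > "$dir/dek.hex" || return 1

  # Wrap the DEK: openssl derives the wrapping key from passphrase + random salt.
  openssl enc -aes-256-cbc -pbkdf2 -salt -a \
    -pass "file:$dir/passphrase.txt" \
    -in "$dir/dek.hex" -out "$dir/dek.wrapped" || return 1

  # Encrypt the secret under the DEK (used here as key material for openssl).
  printf '%s' "$secret" | openssl enc -aes-256-cbc -pbkdf2 -salt -a \
    -pass "file:$dir/dek.hex" -out "$dir/secret.enc" || return 1

  # Simulate a service start: only the wrapped DEK and the ciphertext are on disk.
  rm -f "$dir/dek.hex"
  openssl enc -d -aes-256-cbc -pbkdf2 -a \
    -pass "file:$dir/passphrase.txt" \
    -in "$dir/dek.wrapped" -out "$dir/dek.unwrapped" || return 1
  recovered=$(openssl enc -d -aes-256-cbc -pbkdf2 -a \
    -pass "file:$dir/dek.unwrapped" -in "$dir/secret.enc") || return 1

  rm -rf "$dir"
  printf '%s\n' "$recovered"
}
```

Calling `envelope_demo 'test1234'` should print the original value back. The point of the sketch is the layering: anyone holding only the wrapped data key and the ciphertext cannot recover the secret without the master passphrase.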
To get started with this feature, we will step through a few examples. For a list of all relevant commands, please review the Confluent command line interface (CLI) reference.
Before you start:
In the most common use case, you would want to encrypt passwords. The security tutorial provides an example of how to enable security features on Confluent Platform, but that takes extra steps to generate the keys and certificates and to add the TLS configurations. Therefore, instead of encrypting a password, we will encrypt a basic configuration parameter, but the steps are exactly the same.
First, choose your master encryption key passphrase, a phrase that is much longer than a typical password and is easily remembered as a string of words. Enter this passphrase into a file (e.g., /path/to/passphrase.txt) and pass that file to the CLI, so that the passphrase does not show up in logged command history. Then choose where the secrets file will reside on your local host, e.g., /path/to/secrets.txt (this is your local host, not where the Confluent Platform services run). The secrets file will contain encrypted secrets for the master encryption key, data encryption key, and configuration parameters along with their metadata, such as which cipher was used for encryption. Now, you are ready to generate the master encryption key:
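One possible way to create the passphrase file without the passphrase ever appearing as a command-line argument is sketched below. The helper name write_passphrase_file is hypothetical, and the owner-only file mode is an assumption about what a reasonable setup looks like rather than a documented requirement of the CLI.

```shell
# Hypothetical helper: store a passphrase read from stdin in an owner-only file.
# Usage: write_passphrase_file /path/to/passphrase.txt   (then type the passphrase)
write_passphrase_file() {
  local path="$1" passphrase
  # Read one line from stdin; accept input that lacks a trailing newline.
  IFS= read -r passphrase || [ -n "$passphrase" ] || return 1
  (
    umask 077                          # files created in this subshell get mode 0600
    # No trailing newline is written; whether the CLI trims one is not covered here.
    printf '%s' "$passphrase" > "$path"
  ) || return 1
  chmod 600 "$path"                    # also tighten a file that already existed
}
```

When run interactively, the typed passphrase is echoed to the terminal; running `stty -echo` before and `stty echo` after the call suppresses that. With the passphrase file in place, the master encryption key can be generated: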
$ confluent secret master-key generate --local-secrets-file /path/to/secrets.txt --passphrase @/path/to/passphrase.txt
Save the master key. It cannot be retrieved later.
+------------+----------------------------------------------+
| Master Key | Nf1IL2bmqRdEz2DO//gX2C+4PjF5j8hGXYSu9Na9bao= |
+------------+----------------------------------------------+
As the output indicates, the master encryption key cannot be retrieved later, so make sure to save it somewhere safe. Export this key into the environment on the local host as well as on every host that will have a configuration file with Secret Protection:
$ export CONFLUENT_SECURITY_MASTER_KEY=Nf1IL2bmqRdEz2DO//gX2C+4PjF5j8hGXYSu9Na9bao=
To protect this environment variable in a production host, you can set the master encryption key at the process level instead of at the global machine level. For example, you could set it in the systemd overrides for executed processes, restricting the environment directives file to root-only access.
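A minimal sketch of the systemd approach follows. It stages a drop-in override and the environment file it points to in a scratch directory, so that they can be reviewed before being installed as root. `EnvironmentFile=` is a standard systemd directive; the helper name stage_master_key_override, the file name master-key.env, and whatever unit name you pass in are assumptions for illustration.

```shell
# Hypothetical helper: stage a systemd drop-in plus the environment file it references.
# $1 = staging directory, $2 = unit name (assumed), $3 = final path of the environment file
stage_master_key_override() {
  local stage="$1" unit="$2" env_path="$3"
  [ -n "${CONFLUENT_SECURITY_MASTER_KEY:-}" ] || return 1
  mkdir -p "$stage/$unit.d" || return 1

  # Drop-in: the service reads extra environment variables from env_path at start.
  printf '[Service]\nEnvironmentFile=%s\n' "$env_path" \
    > "$stage/$unit.d/override.conf" || return 1

  # Environment file: created owner-only here; install it owned by root with mode 0600.
  (
    umask 077
    printf 'CONFLUENT_SECURITY_MASTER_KEY=%s\n' "$CONFLUENT_SECURITY_MASTER_KEY" \
      > "$stage/master-key.env"
  )
}
```

After review, the drop-in would be copied to /etc/systemd/system/<unit>.d/override.conf and the environment file to the path named in it, followed by `systemctl daemon-reload` and a restart of the service. Only that service's process then sees the key, rather than every login shell on the machine.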
Let’s use a configuration parameter available in a configuration file example that ships with Confluent Platform. We will encrypt the parameter config.storage.topic in $CONFLUENT_HOME/etc/schema-registry/connect-avro-distributed.properties.
First, make a backup of this file, because the CLI currently does in-place modification on the original file. Then choose the exact path for where the secrets file will reside on the remote hosts where the Confluent Platform services run. Now, you are ready to encrypt this field:
# Value before encryption
$ grep "config\.storage\.topic" connect-avro-distributed.properties
config.storage.topic=connect-configs

# Encrypt it
# remote-secrets-file: /path/to/secrets-remote.txt
$ confluent secret file encrypt --local-secrets-file /path/to/secrets.txt --remote-secrets-file /path/to/secrets-remote.txt --config-file connect-avro-distributed.properties --config config.storage.topic

# Value after encryption
$ grep "config\.storage\.topic" connect-avro-distributed.properties
config.storage.topic = ${securepass:/path/to/secrets-remote.txt:connect-avro-distributed.properties/config.storage.topic}
As you can see, the value of the configuration parameter config.storage.topic was changed from connect-configs to ${securepass:/path/to/secrets-remote.txt:connect-avro-distributed.properties/config.storage.topic}. This is a tuple that directs the service to look up the encrypted value of the file/parameter pair connect-avro-distributed.properties/config.storage.topic in the /path/to/secrets-remote.txt secrets file.
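To make the structure of that tuple concrete, here is a small sketch that splits a securepass reference into its two parts using plain shell parameter expansion. The helper name parse_securepass and its output format are hypothetical, and the sketch assumes the secrets file path itself contains no colon.

```shell
# Hypothetical helper: split ${securepass:<secrets file>:<file>/<parameter>} into parts.
parse_securepass() {
  local ref="$1" prefix='${securepass:' suffix='}' body
  case "$ref" in
    '${securepass:'*'}') ;;            # looks like a securepass reference
    *) return 1 ;;                     # anything else is treated as a plain value
  esac
  body=${ref#"$prefix"}                # strip the leading "${securepass:"
  body=${body%"$suffix"}               # strip the trailing "}"
  # Everything before the first colon is the secrets file; the rest is the lookup key.
  printf 'secrets_file=%s\n' "${body%%:*}"
  printf 'lookup_key=%s\n' "${body#*:}"
}
```

For the value shown above, this prints the secrets file /path/to/secrets-remote.txt and the lookup key connect-avro-distributed.properties/config.storage.topic; for a plain value such as connect-configs it returns a non-zero status.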
View the contents of the local secrets file /path/to/secrets.txt. It now contains the encrypted secret for this file/parameter pair along with the metadata such as which cipher was used for encryption:
$ cat /path/to/secrets.txt
...
connect-avro-distributed.properties/config.storage.topic = ENC[AES/CBC/PKCS5Padding,data:CUpHh5lRDfIfqaL49V3iGw==,iv:vPBmPkctA+yYGVQuOFmQJw==,type:str]
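The metadata in that entry can be inspected without decrypting anything. The sketch below pulls out the cipher name and reports the decoded sizes of the data and iv fields. The helper name describe_enc_value is hypothetical, the field order is assumed to match the example above, and a `base64` utility that accepts `-d` is assumed.

```shell
# Hypothetical helper: summarize an ENC[...] value without decrypting it.
describe_enc_value() {
  local value="$1" inner cipher rest data iv
  case "$value" in
    'ENC['*']') ;;
    *) return 1 ;;
  esac
  inner=${value#"ENC["}                # drop the "ENC[" wrapper
  inner=${inner%"]"}
  cipher=${inner%%,*}                  # first comma-separated field is the cipher
  rest=${inner#*,}
  data=${rest%%,*}; data=${data#data:} # second field: base64 ciphertext
  rest=${rest#*,}
  iv=${rest%%,*}; iv=${iv#iv:}         # third field: base64 initialization vector
  printf 'cipher=%s\n' "$cipher"
  printf 'data_bytes=%s\n' "$(printf '%s' "$data" | base64 -d | wc -c | tr -d ' ')"
  printf 'iv_bytes=%s\n' "$(printf '%s' "$iv" | base64 -d | wc -c | tr -d ' ')"
}
```

For the entry above, both fields decode to 16 bytes, which is consistent with AES in CBC mode: a 16-byte initialization vector, and a 15-character value (connect-configs) padded to a single 16-byte block.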
You can also decrypt the value into a file:
$ confluent secret file decrypt --local-secrets-file /path/to/secrets.txt --config-file connect-avro-distributed.properties --output-file decrypted.txt
$ cat decrypted.txt
config.storage.topic = connect-configs
You may have a requirement to update secrets on a regular basis to keep them from getting stale. The configuration parameter config.storage.topic was originally set to connect-configs. If you need to change the value in the future, you can update it directly using the CLI. In the command below, pass in a file /path/to/updated-config-and-value that contains the line config.storage.topic=newTopicName, so that the new value does not show up in logged command history:
$ confluent secret file update --local-secrets-file /path/to/secrets.txt --remote-secrets-file /path/to/secrets-remote.txt --config-file connect-avro-distributed.properties --config @/path/to/updated-config-and-value
The configuration file connect-avro-distributed.properties does not change, because it’s just a pointer to the secrets file. With the dynamic broker configuration of KIP-226, some configuration parameters can be updated without a broker restart; for other parameters and services, the service needs to be restarted. The secrets file, however, has a new encrypted value for this file/parameter pair:
$ cat /path/to/secrets.txt
...
connect-avro-distributed.properties/config.storage.topic = ENC[AES/CBC/PKCS5Padding,data:CblF3k1ieNkFJzlJ51qAAA==,iv:dnZwEAm1rpLyf48pvy/T6w==,type:str]
That’s cool, but does it work? Try it out yourself. Run Kafka and start the modified Connect worker with the encrypted value of config.storage.topic=newTopicName:
# Start ZooKeeper and a Kafka broker
$ confluent local start kafka

# Run the modified connect worker
$ connect-distributed connect-avro-distributed.properties > connect.stdout 2>&1 &

# List the topics
$ kafka-topics --bootstrap-server localhost:9092 --list
__confluent.support.metrics
__consumer_offsets
_confluent-metrics
connect-offsets
connect-statuses
newTopicName <<<<<<<
So far, we have covered how to create the master encryption key and encrypt secrets in the configuration files. We recommend that you operationalize this workflow by augmenting your orchestration tooling to enable Secret Protection on the destination hosts. There are four required tasks to do this:
These hosts may include Kafka brokers, Connect workers, Schema Registry instances, KSQL servers, Confluent Control Center, etc.—any service using password encryption. You can either do the secret generation and configuration modification on each destination host directly, or do it all on a single host and then distribute the encrypted secrets to the destination hosts. The CLI is flexible to accommodate either way.
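As part of that orchestration, it can be useful to verify that no password is left in cleartext on a destination host. The following is a heuristic sketch only: the helper name list_cleartext_passwords is hypothetical, it only understands simple key=value lines, and it treats any key ending in "password" whose value is not a securepass reference as cleartext.

```shell
# Hypothetical audit helper: print file:line:key for password settings that are
# still in cleartext. Values are never printed, so the report itself leaks nothing.
list_cleartext_passwords() {
  awk '
    /^[ \t]*[#!]/ { next }                       # skip comment lines
    {
      eq = index($0, "=")
      if (eq == 0) next                          # not a key=value line
      key = substr($0, 1, eq - 1)
      value = substr($0, eq + 1)
      gsub(/[ \t]/, "", key)
      sub(/^[ \t]+/, "", value)
      # Flag non-empty password values that do not start with a securepass reference.
      if (key ~ /password$/ && value != "" && index(value, "${securepass:") != 1)
        print FILENAME ":" FNR ":" key
    }
  ' "$@"
}
```

Running `list_cleartext_passwords /path/to/*.properties` on a host after distribution should print nothing once every password has been encrypted; any line it does print names a file, line number, and key that still needs attention.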
You may also have a requirement to rotate the master encryption key or data encryption key on a regular basis. You can do either of these with the CLI, and the example below is for rotating just the data encryption key:
$ confluent secret file rotate --data-key --local-secrets-file /path/to/secrets.txt --passphrase @/path/to/passphrase.txt
We think security is one of the top priorities for our enterprise customers, and if you’d like to learn more about it, you can check out Dani Traphagen and Brian Likosar’s talk from Kafka Summit San Francisco: The Easiest Way to Configure Security for Clients AND Servers. They discuss Kafka security best practices and how to leverage a variety of security features, including Secret Protection, to appropriately lock down a cluster.
For more self-paced learning, feel free to explore our security tutorials as well: