
How to Better Manage Apache Kafka by Removing Residue Data with Control Center Cleanup Script


This blog post is the fourth in a four-part series that discusses a few new Confluent Control Center features introduced with Confluent Platform 6.2.0. It focuses on removing residue data via a new cleanup script that helps you remove old Control Center instances easily. The series highlights new features that make managing Apache Kafka® clusters via Control Center an even smoother experience.

If you are not too familiar with Control Center, you can always refer to the Control Center overview first. Having a running Control Center instance at hand helps you explore the features discussed in this blog series better.

Tips:
To set up a simple Confluent Platform environment, including a Control Center instance, please refer to the quick start. If you prefer setting up a Confluent Cloud environment, please refer to the Cloud quick start. Keep in mind that some features discussed in this series are only available in Confluent Platform.


Now that you are ready, let’s delve into the fourth feature here in part 4: removing residue data with the Control Center Cleanup script.

What “removing residue data with cleanup script” is

With each version upgrade or ID update (explained more later), Control Center creates a new set of internal topics that correspond to the new Control Center instance. Consequently, after a Control Center upgrade/update, you may notice topics from old instances are left behind, cluttering the “Topics” overview page as shown below. These old topics are not used by the new Control Center instance, but they continue to take up disk space. Control Center does not automatically delete the old topics in order to avoid accidental removal of wanted data. Unfortunately, manual deletion of the old topics can make the Control Center upgrade/update process cumbersome and error prone.

Version 6.2.0 introduces a new cleanup script, bin/control-center-cleanup, that lets you interactively delete the old instances’ residue (topics and local directories) more easily and quickly when you upgrade/update Control Center. With this new script, you can delete the old instances’ residue while the current instance of Control Center is running.

Topic residue from the Control Center upgrade of version 5.4.1 to 6.2.0
The example above shows the topic residue from the Control Center upgrade of version 5.4.1 to 6.2.0, where the old set of internal topics prefixed with _confluent-controlcenter-5-4-1-1 is left behind and coexists with the new set prefixed with _confluent-controlcenter-6-2-0-1.

The same issue occurs if you change the Control Center unique identifier using confluent.controlcenter.id in your properties file. Control Center unique identifiers are useful if you want multiple instances of Control Center to coexist on the same server. However, if you decide to keep only one instance after an identifier change, you will encounter the same data residue issues. For example, if you have Control Center version 6.2.0 and change the ID from 1 to 2, then the old set of internal topics prefixed with _confluent-controlcenter-6-2-0-1 is left behind and coexists with the new set prefixed with _confluent-controlcenter-6-2-0-2.

How to run the cleanup script

Step 1 (important): Have a Control Center configuration file ready

The cleanup script requires a Control Center properties file to establish the initial connection to the Kafka cluster and to decide what the current running Control Center instance is in order to avoid deleting its data. The cleanup script uses:

  • confluent.controlcenter.name to determine the name of the running instance
  • The package that you’re running the script from to determine the version of the running instance
  • confluent.controlcenter.id to determine the unique identifier of the running instance
  • confluent.controlcenter.data.dir to determine the directory that contains local data of Control Center instances
Tip:
Please refer to the Control Center Configuration Reference for details.


For example, suppose the Control Center properties file etc/confluent-control-center/control-center.properties contains the following:

############################# Server Basics #############################
bootstrap.servers=localhost:9092
zookeeper.connect=localhost:2181
######################### Control Center Settings #########################
confluent.controlcenter.data.dir=/tmp/control-center
confluent.controlcenter.id=1
# using default confluent.controlcenter.name, “_confluent-controlcenter”

Therefore, when run from package confluent-6.2.0, the cleanup script determines that the running instance is _confluent-controlcenter-6-2-0-1 (<running instance name>-<version>-<id>). It also determines that the local data of all instances reside in /tmp/control-center.
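The naming rule above can be sketched in a few lines of Python. This is only an illustration of the <name>-<version>-<id> convention described in this post, not the script's actual implementation; the helper names parse_properties and running_instance are hypothetical, and the package version is passed in by hand rather than detected from the package.

```python
# Illustrative sketch only: mirrors the <name>-<version>-<id> convention
# described in this post. The function names are hypothetical.

DEFAULT_NAME = "_confluent-controlcenter"  # default name, as noted in the properties file above

EXAMPLE_PROPERTIES = """\
bootstrap.servers=localhost:9092
zookeeper.connect=localhost:2181
confluent.controlcenter.data.dir=/tmp/control-center
confluent.controlcenter.id=1
# using default confluent.controlcenter.name, "_confluent-controlcenter"
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def running_instance(props, package_version):
    """Build the instance name; dots in the version become dashes."""
    name = props.get("confluent.controlcenter.name", DEFAULT_NAME)
    instance_id = props["confluent.controlcenter.id"]
    return f"{name}-{package_version.replace('.', '-')}-{instance_id}"


props = parse_properties(EXAMPLE_PROPERTIES)
print(running_instance(props, "6.2.0"))  # _confluent-controlcenter-6-2-0-1
```

Passing "5.4.1" instead of "6.2.0" yields _confluent-controlcenter-5-4-1-1, which is why the package you run a script from matters.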

Step 2: Run the cleanup script

Assume that only the Control Center instance defined in the properties file—_confluent-controlcenter-6-2-0-1—is up and running.

Navigate to $CONFLUENT_HOME, run the script as ./bin/control-center-cleanup <props_file>, and you will get the following prompt:

Tip:
$CONFLUENT_HOME is the environment variable for your Confluent Platform directory. You can set it with export CONFLUENT_HOME=<path-to-confluent>, for example, export CONFLUENT_HOME=~/Downloads/confluent-6.2.0.

./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
============================================================================
The cleanup script found the following instance: 
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.

Here are the instances discovered for cleanup:
_confluent-controlcenter-5-4-1-1
_confluent-controlcenter-5-4-1-2
Cleanup ALL of the instances above? [y/N]:

The script avoids cleaning the running instance—_confluent-controlcenter-6-2-0-1—and discovers that there are two old Control Center instances from version 5.4.1 available for cleanup, _confluent-controlcenter-5-4-1-1 and _confluent-controlcenter-5-4-1-2.
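This post does not describe how the script discovers old instances internally. As a rough mental model only, the sketch below assumes that instances can be recognized from topic names that start with <name>-<major>-<minor>-<patch>-<id>, and that the running instance is simply excluded from the result. The function discover_instances and the regular expression are hypothetical, not taken from the script.

```python
import re

# Hypothetical model of instance discovery. ASSUMPTION: an instance is
# recognizable from a topic-name prefix of the form
# <name>-<major>-<minor>-<patch>-<id>, with a numeric ID.


def discover_instances(topics, running, name="_confluent-controlcenter"):
    """Return old instance names found in topic names, excluding the running one."""
    pattern = re.compile(rf"^({re.escape(name)}-\d+-\d+-\d+-\d+)-")
    found = set()
    for topic in topics:
        match = pattern.match(topic)
        if match:
            found.add(match.group(1))
    found.discard(running)
    return sorted(found)


topics = [
    "_confluent-controlcenter-5-4-1-1-cluster-rekey",
    "_confluent-controlcenter-5-4-1-2-cluster-rekey",
    "_confluent-controlcenter-6-2-0-1-cluster-rekey",
    "_confluent-metrics",
    "orders",
]
for instance in discover_instances(topics, "_confluent-controlcenter-6-2-0-1"):
    print(instance)
```

With the sample topic list above, the two 5.4.1 instances are reported and the running 6.2.0 instance is left out, matching the prompt shown earlier.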

You can type y to clean all of the old instances without intermissions or prompts.

You can type N to be prompted for individual instance cleanup instead.

Step 3: Individual instance cleanup

Assuming N is used in the previous step, you will receive the following prompt:

Do you want to cleanup _confluent-controlcenter-5-4-1-1 ? [y/N/dryRun]:

For each Control Center instance, you can type y to clean the instance’s topics and the instance’s local directories.

  • Instance topics: Every internal topic that is prefixed with the instance name. The following are a few of the internal topics for instance name _confluent-controlcenter-5-4-1-1:
    _confluent-controlcenter-5-4-1-1-AlertHistoryStore-changelog
    _confluent-controlcenter-5-4-1-1-MetricsAggregateStore-changelog
    _confluent-controlcenter-5-4-1-1-cluster-rekey
    _confluent-controlcenter-5-4-1-1-expected-group-consumption-rekey
    _confluent-controlcenter-5-4-1-1-actual-group-consumption-rekey
    
  • Instance local directories: <confluent.controlcenter.data.dir>/<instance id>/cp-command/<instance name> and <confluent.controlcenter.data.dir>/<instance id>/kafka-streams/<instance name>. For example:
    /tmp/control-center/1/cp-command/_confluent-controlcenter-5-4-1-1/
    /tmp/control-center/1/kafka-streams/_confluent-controlcenter-5-4-1-1/
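To make the two kinds of residue concrete, here is a small Python sketch that lists what belongs to one instance, following the rules in the list above. The helper names are hypothetical. The trailing dash appended to the topic prefix is an assumption of this sketch (so that an ID of 1 does not also match an ID of 10); it is not something this post states about the script.

```python
from pathlib import PurePosixPath

# Illustrative only: computes the residue described in the list above.
# instance_topics and instance_dirs are hypothetical helper names.


def instance_topics(all_topics, instance):
    """Topics prefixed with the instance name (trailing dash is an assumption)."""
    prefix = instance + "-"
    return [topic for topic in all_topics if topic.startswith(prefix)]


def instance_dirs(data_dir, instance_id, instance):
    """<data.dir>/<id>/cp-command/<instance> and <data.dir>/<id>/kafka-streams/<instance>."""
    base = PurePosixPath(data_dir) / instance_id
    return [
        str(base / "cp-command" / instance),
        str(base / "kafka-streams" / instance),
    ]


old = "_confluent-controlcenter-5-4-1-1"
sample_topics = [
    "_confluent-controlcenter-5-4-1-1-AlertHistoryStore-changelog",
    "_confluent-controlcenter-5-4-1-1-cluster-rekey",
    "_confluent-controlcenter-6-2-0-1-cluster-rekey",
]
print(instance_topics(sample_topics, old))
print(instance_dirs("/tmp/control-center", "1", old))
```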

You can type N to skip cleanup for the instance at hand.

You can type dryRun to see what topics and local directories will be deleted without any actual impact. After dryRun, you will be prompted to clean up the same instance again with option [y/N/dryRun] until you either type y or N, deciding to clean or skip the instance.
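The per-instance prompt behaves like a small loop: dryRun previews and asks again, while y or N ends the loop. The Python sketch below models that flow under stated assumptions. It is not the script's code; prompt_instance, preview, and clean are hypothetical names, and ignoring unrecognized answers is a choice made for this sketch only.

```python
# Hypothetical model of the [y/N/dryRun] prompt for a single instance.
# ASSUMPTION: answers other than y, N, or dryRun are ignored and the
# prompt repeats; the real script may behave differently.


def prompt_instance(instance, answers, preview, clean):
    """Consume answers until y or N; return 'cleaned' or 'skipped'."""
    for answer in answers:
        if answer == "dryRun":
            preview(instance)  # show what would be deleted, change nothing
            continue           # then prompt for the same instance again
        if answer == "y":
            clean(instance)
            return "cleaned"
        if answer == "N":
            return "skipped"
    return "skipped"  # no more input: nothing is deleted


actions = []
result = prompt_instance(
    "_confluent-controlcenter-5-4-1-1",
    iter(["dryRun", "y"]),
    preview=lambda name: actions.append(("preview", name)),
    clean=lambda name: actions.append(("clean", name)),
)
print(result, actions)
```

In this model a dry run never removes anything; only an explicit y reaches the clean step.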

Step 4: Repeat until no more instances are left for cleanup

If you would like to avoid being prompted to clean each instance, type y in step 2, Cleanup ALL of the instances above? [y/N].

Steps 1–4: A complete example

A majority of the logs are omitted below, except for high-level logs. Lines that start with # are comments added later and are not part of the original log.

./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
================================================================================
The cleanup script found the following instance: 
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.
Here are the instances discovered for cleanup:
_confluent-controlcenter-5-4-1-1
_confluent-controlcenter-5-4-1-2
Cleanup ALL of the instances above? [y/N]: N
Do you want to cleanup _confluent-controlcenter-5-4-1-1 ? [y/N/dryRun]: dryRun
----Dry run displays the actions which will be performed when running Streams Reset Tool----
Reset-offsets for input topics [_confluent-monitoring, _confluent-command, _confluent-metrics]
Seek-to-end for intermediate topics [_confluent-controlcenter-5-4-1-1-cluster-rekey, _confluent-controlcenter-5-4-1-1-monitoring-message-rekey-store, _confluent-controlcenter-5-4-1-1-actual-group-consumption-rekey, _confluent-controlcenter-5-4-1-1-expected-group-consumption-rekey, _confluent-controlcenter-5-4-1-1-group-stream-extension-rekey, _confluent-controlcenter-5-4-1-1-monitoring-trigger-event-rekey, _confluent-controlcenter-5-4-1-1-MetricsAggregateStore-repartition, _confluent-controlcenter-5-4-1-1-metrics-trigger-measurement-rekey]
Following input topics offsets will be reset to (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Following intermediate topics offsets will be reset to end (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Deleting all internal/auto-created topics for application _confluent-controlcenter-5-4-1-1
(...)
Deleting intermediate topics (for consumer group _confluent-controlcenter-5-4-1-1)
(...)
Deleting local RocksDB data in /tmp/confluent/control-center/1
Deleting /tmp/confluent/control-center/1/cp-command/_confluent-controlcenter-5-4-1-1-command
Deleting /tmp/confluent/control-center/1/kafka-streams/_confluent-controlcenter-5-4-1-1
Done.
Finished dryRun for _confluent-controlcenter-5-4-1-1 . Do you want to clean it up? [y/N/dryRun]: y
# Logs omitted. Same steps as above:
# 1. For input topics, reset offsets to specified position (default EARLIEST) ← from Kafka Streams Reset Tool
# 2. For intermediate topics, seek offsets to the end, LATEST ← from Kafka Streams Reset Tool
# 3. Delete internal/auto-created topics ← from Kafka Streams Reset Tool
# 4. Delete intermediate topics
# 5. Delete local RocksDB data in directories
Do you want to cleanup _confluent-controlcenter-5-4-1-2 ? [y/N/dryRun]: y
# Logs omitted. Same 5 steps as above.
================================================================================

If you run the cleanup script again, you will see that _confluent-controlcenter-5-4-1-1 and _confluent-controlcenter-5-4-1-2 were cleaned up successfully and you won’t be prompted again.

./bin/control-center-cleanup etc/confluent-control-center/control-center.properties
================================================================================
The cleanup script found the following instance: 
_confluent-controlcenter-6-2-0-1
We believe this COULD be the instance defined in your config file so it will not be prompted for cleanup.
The cleanup script found no instances for cleanup.
================================================================================
Tip:
If you are curious and would like to learn a bit more, the cleanup script internally uses the Kafka Streams Application Reset Tool to reset offsets for input and intermediate topics and to delete internal/auto-created topics. Hence, you see logs like “Reset-offsets for input topics…” and “Seek-to-end for intermediate topics…”

Advantages of the cleanup script

Historically, Control Center has had a reset script, bin/control-center-reset, which supports the cleanup of one instance at a time without any guidance prompts. The script only deletes the instance defined in the provided properties file and does not automatically discover other instances. Therefore, in order to maintain a clean Control Center environment, it is recommended that you run the reset script upon each version upgrade or unique identifier update.

Before we dive into the benefits of the cleanup script, the following provides a bit more detail about the reset script.

How to run the reset script

Just like the cleanup script, the reset script requires a Control Center properties file. The file is used to establish the initial connection to the Kafka cluster and to determine the Control Center instance to delete (the reset script only deletes the instance defined in your properties file). New in version 6.2.0, the reset script supports a dryRun flag:

bin/control-center-reset <props_file> [--dryRun]

With the dryRun flag, the script previews the topics and directories pertaining to the Control Center instance defined in your properties file, without actually deleting them.

Tip:
If you are running a reset script from an older Confluent Platform package, make sure you have the correct target instance defined in your properties file—there’s no trial run!

Local directories deleted before vs. after 6.2.0

It is important to note that prior to version 6.2.0, the reset script would clean local directories more “drastically.” It finds the unique identifier in the properties file, confluent.controlcenter.id, and deletes the entire ID directory, not just the directories of the target instance.

For example, if the unique identifier is 1, and you have two Control Center instances with ID 1, _confluent-controlcenter-5-4-1-1 (target instance to delete) and _confluent-controlcenter-6-2-0-1, then the entire ID directory /tmp/control-center/1 would be deleted, not just the directories of _confluent-controlcenter-5-4-1-1.

Directories deleted prior to version 6.2.0: the entire ID directory

This reset script issue is fixed in version 6.2.0, where only the target instance’s directories are deleted.

Directories deleted in version 6.2.0: the target instance’s directories only
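The difference between the two behaviors can be expressed as a small Python sketch. It only restates the description above, using the directory layout given earlier in this post; reset_delete_targets and is_removed are hypothetical names, and the sketch does not touch the file system.

```python
from pathlib import PurePosixPath

# Illustrative comparison of what the reset script removes locally,
# as described in this post. Nothing here deletes real files.


def reset_delete_targets(data_dir, instance_id, instance, fixed):
    """Paths removed by the reset script: before (fixed=False) or since 6.2.0 (fixed=True)."""
    id_dir = PurePosixPath(data_dir) / instance_id
    if not fixed:
        return [str(id_dir)]  # the whole ID directory
    return [
        str(id_dir / "cp-command" / instance),
        str(id_dir / "kafka-streams" / instance),
    ]


def is_removed(path, targets):
    """True if path equals, or lies under, any deleted target."""
    p = PurePosixPath(path)
    return any(PurePosixPath(t) == p or PurePosixPath(t) in p.parents for t in targets)


target = "_confluent-controlcenter-5-4-1-1"
sibling_dir = "/tmp/control-center/1/kafka-streams/_confluent-controlcenter-6-2-0-1"

before = reset_delete_targets("/tmp/control-center", "1", target, fixed=False)
after = reset_delete_targets("/tmp/control-center", "1", target, fixed=True)
print(is_removed(sibling_dir, before))  # True: the 6.2.0 instance's data is lost too
print(is_removed(sibling_dir, after))   # False: only the target's directories go
```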

Benefits of the cleanup script

To summarize, despite their similarities, the reset script and the cleanup script work in opposite ways. The former can only delete the Control Center instance defined in your Control Center configuration file, while the latter can automatically discover and delete any instance except the one defined in your configuration file. To maintain a clean environment, the reset script needs to be run each time before you start a new instance, while the cleanup script can be run at any time (before or after starting a new instance, or just periodically). The cleanup script also provides a handful of guidance prompts, giving you full control over which instance(s) to delete.

There are a few benefits of the cleanup script that would make your Control Center upgrade/update process less error prone and cumbersome:

Reduced ops burden: Clean up multiple Control Center instances in one go

If you are an operator who strives to maintain a clean environment by keeping only the necessary Control Center instances, you can now use the cleanup script to periodically delete all the unused instances in one go. There is no need to manually hunt down each Kafka topic or local data from old Control Center instances anymore.

Easier Control Center upgrades and config changes using only the latest Control Center configuration file

Imagine you are an operator and just performed a Control Center unique identifier update. With the reset script, you would need to modify the properties file to target the instance that you want to delete and repeat the process until all the old instances are deleted. Now with the cleanup script, you only need the latest properties file, which will single out the running instance and delete the old ones.

Let’s say you are an operator and just performed a Control Center version upgrade. With the reset script, in order to delete an old instance, you would need to run the script in the Confluent Platform package whose version matches the target instance. For example, to delete an instance of version 5.4.1, you need to run the script in Confluent Platform package 5.4.1; running the reset script in Confluent Platform package 5.4.2 would delete the 5.4.2 instance, not the target 5.4.1 instance. Now with the cleanup script, you can run the script in any Confluent Platform package that provides it, and it will automatically discover old instances to delete. No need to match the package version with the target instance!

Safety prompts to prevent operator mistakes

For operators who want to make sure they do not delete the wrong Control Center instances, the cleanup script asks for confirmation before deleting anything and offers a dryRun preview for each instance.

Summary

In summary, removing residue data with the Control Center cleanup script allows you to maintain a clean environment by removing data from unused Control Center instances in one run, making the Control Center upgrade/update process more efficient and less error prone.

Get Started

To learn about other new features of Control Center 6.2.0, check out the remaining blog posts in this series.

Rinka Yoshida joined Confluent in 2020 as a backend developer for Confluent Control Center and currently works on projects that identify and grow users’ journeys with Confluent Platform. She earned a bachelor’s degree in computer science from the University of California, San Diego.
