Kafka Connect runs multiple connectors in the same process and offers no mechanism for isolating them. The connectors share the same resources (vCPUs, memory, etc.), with the unfortunate side effect that excessive resource consumption by a single connector can impair the health of the rest of the Kafka Connect cluster.
This talk proposes a novel, cloud-native deployment model for Kafka Connect that uses Kubernetes concepts to execute, scale, and isolate individual Kafka Connect connectors. In a nutshell, we build a unique container image for each Kafka Connect connector type. We run connectors as Kubernetes Deployments, which allows us to either set the number of connector instances (or tasks) manually or let Kubernetes scale the connectors elastically. We use Kubernetes' resource management to declare the resource requests and limits of each connector instance. As a consequence, we achieve fully self-contained connectors, a necessity for production deployments of Kafka Connect.
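To make the deployment model more concrete, here is a minimal sketch of what one connector's Kubernetes objects might look like, written as Python that emits JSON manifests. It is not the speakers' actual setup: the connector name, the image reference, and the resource figures are hypothetical, and using a HorizontalPodAutoscaler on CPU utilization is only one assumed way to "let Kubernetes scale the connectors elastically".

```python
import json


def connector_deployment(name, image, replicas, cpu, memory):
    """Build an apps/v1 Deployment manifest for a single connector.

    Each replica is one isolated connector instance with its own
    resource requests and limits.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "connector",
                            "image": image,
                            # Requests reserve capacity; limits cap what a
                            # misbehaving connector can consume.
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory},
                                "limits": {"cpu": cpu, "memory": memory},
                            },
                        }
                    ]
                },
            },
        },
    }


def connector_autoscaler(name, min_replicas, max_replicas, cpu_utilization):
    """Build an autoscaling/v2 HorizontalPodAutoscaler for the Deployment.

    Assumption: elastic scaling is driven by average CPU utilization.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_utilization,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical connector name and image, for illustration only.
    name = "jdbc-source-connector"
    image = "registry.example.com/connect/jdbc-source:1.0"
    manifests = [
        connector_deployment(name, image, replicas=2, cpu="500m", memory="1Gi"),
        connector_autoscaler(name, min_replicas=1, max_replicas=5, cpu_utilization=70),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, each printed manifest could be saved to a file and applied as-is. Setting requests equal to limits is a design choice in this sketch, not something the talk prescribes.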
In a comprehensive evaluation, we compared the presented approach with a Strimzi-based deployment of Kafka Connect. We discuss the results and highlight the benefits and disadvantages of the presented approach (there's no free lunch!). We answer questions such as: What is the impact of running single-connector clusters on overall resource consumption? How well does the elastic scaling of connectors work? Can a single connector go rogue without affecting the rest of the cluster?