Shaun Clowes, Confluent’s Chief Product Officer since December of last year, will tell you data is the lifeblood of modern organizations. But he will also tell you he isn’t a big fan of the word “data.”
“That’s because it means different things to different people,” Shaun said. “And for many it means data sitting in a warehouse. The problem with that data is it represents what the truth may have been at some point in the past. Real data is alive. And in a world that’s incredibly real time, stale data keeps you from being competitive. You can’t drive forward by looking into the rearview mirror.”
Read on to learn more about Shaun’s journey to Confluent, the importance of acting on data in real time, and the roadblocks that companies face when it comes to setting their data in motion.
Mekhala Roy: Could you share some of your past job experiences that prepared you for your role at Confluent?
Shaun Clowes: Prior to joining Confluent, I was the Chief Product Officer at Salesforce-owned MuleSoft, a company that integrates disparate enterprise software systems.
What I realized was that integrating applications is harder than it should be. Making them work together truly seamlessly means the integration needs to be real time, not batch-based.
That’s because you don’t want to end up with data that’s stale or not properly synchronized, which leads to subpar customer and employee experiences. And people underestimate the cost of fixing those problems.
I have also spent a lot of time in my career (including my time at Atlassian) working with developers. One thing developers are very good at is identifying when you are solving a problem in a whole new way that unlocks outlandish value.
For me, what Confluent does is unlock whole new value in a new way. It keeps applications at arm’s length from one another. There’s a whole new class of applications you can build once you adopt this approach.
Mekhala: What was the deciding factor that led you to join Confluent?
Shaun: My thought process was: people talk a lot about category creation, but what does it really mean?
For me, category creation means solving problems in new and unexpected ways that deliver outlandish value to customers and feel magical.
When I was joining Confluent, the question I asked myself was, “Is this magical?” And the more I spoke with customers and users and looked at the market, the answer became clear.
Confluent enables developers to be dramatically more productive. It sets the data in organizations free—sets it in motion. Data becomes an asset that people can use and reuse and remix and drive more value from.
Mekhala: What currently gets in the way of setting data in motion?
Shaun: I like to think of data as something that’s alive and free. Instead, what most organizations have today is data that used to be true.
Oftentimes, businesses worry that setting data in motion is not safe and that they will lose control of the data.
But even if they can control who has access to the data, other obstacles can still hinder access. Either their applications are real time while the data is stored in some database somewhere, or they need data from a different time frame, or joined with other pieces of data.
A huge amount of effort in the enterprise is spent literally just extracting data over and over again, changing it in slightly different ways, and then putting it somewhere else again. You can't solve this problem by doing more lifting and shifting of data. Even the world's fastest Sisyphus is still going to be dragging the rock up the hill over and over and over again.
Businesses also end up tightly coupling applications and systems.
What you really want is to enable people to interact with data as products and also contribute back to that same set of assets, making it richer and better so others can drive value from it. And you want all of that to still be in real time.
Mekhala: What Confluent product features do you see delivering the most impact?
Shaun: It’s the underlying capabilities of Apache Kafka® that create a strong foundation for us. But what dramatically amplifies those capabilities are Stream Governance and Stream Catalog, because they help organizations understand and govern the data they have. Most importantly, they enable different teams to collaborate on data products.
But knowing what exists is not useful without the ability to access it securely and at scale. This is where our Role-Based Access Control and OAuth features come in handy.
Our portfolio of pre-built Kafka connectors makes it easy to connect to popular data sources and sinks. And tools like Stream Designer enable you to work with that data and build streaming data pipelines in minutes.
Mekhala: What upcoming product updates are you most excited about?
Shaun: I’m most excited about our integration of Flink into the product portfolio, which is currently in progress. We've been doing stream processing for a long time, but this is going to turbocharge all of that.
Our acquisition of Immerok also means we gained significant contributors associated with the Flink open source project.
Ultimately, what we want to do with Flink is what we did with Kafka. We want to make it easy for everybody to capture the value of this incredible stream processing tool, which has seen outstanding adoption in the industry.
Flink offers a very flexible platform for all types of stream processing. You can use it through the DataStream API, the Table API, or SQL on top of it. You can build applications at any layer of abstraction and run them from this one capable tool set, which works for both batch and real-time streaming data.
Mekhala: What's next for Confluent from your perspective?
Shaun: We want to help our customers set their data in motion so they can unlock the full potential of their data.
We want organizations to have a healthy ecosystem of data—a world in which all of their data and systems work together in real time to deliver one coherent estate. The analytical datasets are the same as the online datasets and everything works from one distributed source of truth, the central nervous system.
Organizations today manage thousands of systems, built up over time to support various business needs. The problem? Every system that you build makes your world more fragmented. Our aim with the central nervous system is to solve that problem and enable all of these systems to work in concert to deliver really great outcomes.
Want to learn more about how to effectively use data in motion? Check out the “Innovation Insight for Streaming Data in Motion: The Collision of Messaging, Analytics, and DBMS” report from Gartner.
Check out Confluent’s data streaming resources hub for the latest explainer videos, case studies, and industry reports on data streaming.