Thank You

Today, Confluent became a publicly traded company. This is a big milestone in the short life of our company. To the employees, customers, partners, investors, and the larger developer community who made this possible: Thank you!

I thought I’d share this letter as I think it gives a sense of where we came from, where we’re going, and why I think Confluent is in the early days of something really significant:

There is a saying that “a fox knows many things, but a hedgehog knows one big thing.” Confluent is a company that knows a very big thing, and I want to tell you a little bit about how we came to know it.

I got my start in this area while working at LinkedIn, helping them to rebuild their data infrastructure in the late 2000s. At that time there was a revolution happening in the practical knowledge of how to build distributed systems and cloud infrastructure, but there was little available commercially or as open source. We had to build what we needed from scratch.

What struck us at the time was that although there were a hundred different technologies for storing data, our most acute need was not a problem of data storage. What we needed to do was to unite all the different applications and data stores that made up a global social network into one coherent system, one that could react and respond continuously and in real-time to everything that occurred across a complex fabric of interconnected software systems. This need seemed like it would be a common enough problem, so we assumed that surely there must be some product or technology that addressed it and that we must just be ignorant of it. But there wasn’t! We spent years trying out existing products, reading through computer science papers, and brainstorming around this topic. What we came to realize was that no off-the-shelf solution existed. Perhaps more surprisingly, though this problem was really at the heart of creating a unified digital business, it hadn’t received even a fraction of the commercial or intellectual investment that data storage and databases had. When we realized this, we started to build.

A small team consisting of Jun Rao, Neha Narkhede, and me, who later became the three co-founders of Confluent, built an initial version of an internal system called Kafka. We rolled it out at scale for early use cases at LinkedIn, handling data streams with billions of messages. But even then, our ambition was bigger. Kafka was built to be open source, and we wanted it to do much more than serve one use case in one company. Over the years we improved the software to handle hundreds and then thousands of use cases, donated it to the Apache Software Foundation, and helped to build the community of users and developers in the Silicon Valley tech world who were the initial adopters.

We found that these other companies were struggling with the same problems, and many of the biggest tech companies started to move to an architecture built around real-time streams—what we’d now call data in motion.

These tech companies were at the forefront of having businesses that are fully represented in software end-to-end. As a result, their applications were really different. They weren’t just made up of disjointed, disconnected parts. All the pieces of software had to integrate to carry out the activity of the business: to interact with users and execute the underlying business processes. In a company like that, it isn’t enough for data to sit in a pile; it has to flow continuously between all the other software systems that need it all the time.

As we were doing this, we started to talk with companies well beyond the world of Silicon Valley tech, and we found that they were facing the same problems. They had the same pressures to evolve the digital side of their businesses, to integrate across existing applications and new initiatives. They, too, lacked the infrastructure to accomplish this. We knew that soon enough the emerging blueprint for tech companies would be the blueprint for all companies. We also knew this would be an opportunity we couldn’t let pass by, and we founded Confluent to address it.

Confluent is seen from the outside as a company that has grown very fast, and in some ways that is true. We take very seriously the full scope of the opportunity before us and we want to get to scale as quickly as possible to be able to capture it. But in another sense, we are moving very deliberately against a plan that has existed for longer than the company itself. Much of what we set out to build, and indeed much of what we are still building, was in our initial pitch deck for the company. Creating software systems that can operate at scale and serve in this foundational role in companies is a long-term journey. We are lucky to have a problem worthy of that attention and a team built carefully over the years to accomplish this goal. We think we are still in the very earliest stages of what is possible.

Today the data architecture of a company is as important in the company’s operations as the physical real estate, org chart, or any other blueprint for the business. This is the underpinning of a modern digital customer experience, and the key to harnessing software to drive intelligent, efficient operations. Companies that get this right will be the leaders in their industries in the decades ahead. We know that there is a foundational missing layer at the heart of data infrastructure that allows companies to harness data as it occurs—data in motion—and that this is critical in the next evolution of the architecture of companies. We think this new stack will evolve to be the central nervous system of every company and will be the single most strategic layer in the modern data world.

Confluent is a company created to accomplish that goal. We’re here to set data in motion.

Jay Kreps

  • Jay Kreps is the CEO and co-founder of Confluent, the foundational platform for data in motion built on Apache Kafka. As a pioneer in a new category of data infrastructure, Confluent’s significant growth underscores the importance of data in motion across all industries. Prior to Confluent, he was the lead architect for data and infrastructure at LinkedIn. He is the initial developer of several open source projects, including Apache Kafka.