[Webinar] Bringing Flink to On-Prem and Private Clouds | Register Now
The Possibilities and Pitfalls of Writing Your Own State Stores. Building an event-driven system will inevitably lead you to exposing your data through APIs to make your data accessible to non-streaming solutions. At first glance, Kafka Streams provides statestores which we could use to build our APIs directly onto Kafka. But all store implementations are key/value based which is fine when you only retrieve information by key. API’s however require a bit more “searchability”.
Writing your own state store is certainly possible, but it is challenging. At KOR, we went through this process and implemented a state store on top of Nitrite Database, an embedded document database. This allows you not only to retrieve your documents using a key, but also to search through the values in the store using a MongoDB-like API. On the surface, state stores seem straightforward, but the devil is certainly in the details. How does partitioning fit into this story and how do you make sure everything keeps running smooth, even after restarting or scaling your applications?
We made the project open source for everyone out there wanting to try this approach, but most of all we want to tell you about the dragons we encountered. Join me in a journey of ups and downs that starts with a simple requirement (host an API), through implementing a custom state store and finishes off by describing the challenges we encountered getting our APIs deployed. Don’t expect all “roses and sunshine”. While hosting APIs on Kafka is possible, there are some consequences that we just couldn’t overcome … yet.