I’m happy to announce that Confluent will be hosting the first Stream Data Hackathon this April 25th in San Francisco! Apache Kafka has recently introduced two major new features: Kafka Connect for data integration and Kafka Streams for distributed, fault-tolerant stream processing. Together, these form a powerful toolset for building real-time data pipelines, and we’re hosting a hackathon to help the community build connectors and learn how to build Kafka Streams applications. Whether you’re a beginner with Kafka or a seasoned expert, join us to help improve the ecosystem of connectors, create proof-of-concept stream processing applications, and maybe win a prize in the process. Here are the key details:
WHEN: Monday, April 25, 2016 from 6:00 PM to 10:00 PM (PDT)
WHERE: Hilton San Francisco Union Square – Imperial B Ballroom – 333 O’Farrell Street, San Francisco, CA 94102
We’ll have food, drinks, and prizes for participants, and Kafka developers will be on hand to help with any questions. Already interested? Register here. Want to know more? Read on for details.
Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems. It makes it simple to quickly define connectors that move large streaming datasets into and out of Kafka. Kafka Connect can ingest entire databases or collect metrics from all your application servers into Kafka topics, making the data available for stream processing with low latency. A sink connector can deliver data from Kafka topics into secondary indexes like Elasticsearch or into batch systems such as Hadoop for offline analysis.
Kafka Connect abstracts away the common problems every connector to Kafka needs to solve: schema management, fault tolerance, partitioning, offset management and delivery semantics, operations, and monitoring. This allows connector developers to focus on the details specific to the system they are copying data from, while relying on Kafka Connect to solve the hard problems. Connect users can pick from a repository of open-source connectors without worrying about interoperability, and they get a single system on which to deploy, manage, and monitor multiple connectors.
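To make the offset management and delivery semantics mentioned above a little more concrete, here is a minimal, plain-Java sketch of the idea. It does not use the Kafka Connect API: `OffsetStore` and `copyBatch` are hypothetical names invented for illustration, and the in-memory lists stand in for a real source system and a Kafka topic. The point is only that progress is recorded after delivery, which is one common way to get at-least-once behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffsetResumeSketch {

    /** Hypothetical stand-in for durable offset storage; not a Kafka Connect class. */
    static final class OffsetStore {
        private int committed = 0;
        int read() { return committed; }
        void commit(int next) { committed = next; }
    }

    /**
     * Copies up to maxRecords records from the source into the sink, starting at
     * the last committed offset. The offset is committed only after delivery, so a
     * crash in between leads to a re-read (at-least-once) rather than a lost record.
     */
    static List<String> copyBatch(List<String> source, List<String> sink,
                                  OffsetStore offsets, int maxRecords) {
        int from = offsets.read();
        int to = Math.min(source.size(), from + maxRecords);
        List<String> batch = new ArrayList<>(source.subList(from, to));
        sink.addAll(batch);   // deliver first
        offsets.commit(to);   // then record progress
        return batch;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("r0", "r1", "r2", "r3", "r4");
        List<String> sink = new ArrayList<>();
        OffsetStore offsets = new OffsetStore();

        copyBatch(source, sink, offsets, 2);   // copies r0, r1
        // Simulated restart: only the stored offset survives, and copying resumes from it.
        copyBatch(source, sink, offsets, 2);   // copies r2, r3
        copyBatch(source, sink, offsets, 2);   // copies r4
        System.out.println(sink + " offset=" + offsets.read());
    }
}
```

With a framework such as Kafka Connect, a connector author would not write this bookkeeping by hand; the sketch only shows the kind of problem being taken off their plate.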
Kafka Streams is a library for building streaming applications, specifically applications that transform input Kafka topics into output Kafka topics. Kafka Streams has a very low barrier to entry, is easy to operationalize, and offers a natural DSL for writing stream processing applications. It achieves this feature set by working directly with Kafka and leveraging the existing distributed, fault-tolerant clients. Because it implements stream processing as a library instead of a framework, it remains agnostic to resource management and configuration tools, so it is easily adopted in any organization: you write and deploy your stream processing applications like you would any other. And because it builds upon important stream processing concepts, such as properly distinguishing between event time and processing time, supporting windowing, and managing application state simply yet efficiently, you get all the modern stream processing features you expect in a lightweight library.
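If the distinction between event time and processing time is new to you, the following plain-Java sketch may help. It is not Kafka Streams code and uses no Kafka classes: `Event`, `windowStart`, and `countByWindow` are hypothetical names chosen for illustration. It counts events per one-minute tumbling window using the timestamp carried by each event (event time), so an event that arrives late is still counted in the window in which it actually occurred.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EventTimeWindowSketch {

    /** Hypothetical event: a key plus the time it occurred (event time), in milliseconds. */
    static final class Event {
        final String key;
        final long eventTimeMs;
        Event(String key, long eventTimeMs) {
            this.key = key;
            this.eventTimeMs = eventTimeMs;
        }
    }

    /** Start of the tumbling window that contains the given timestamp. */
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    /**
     * Counts events per (window start, key) using each event's own timestamp,
     * not the order or time at which the event happened to arrive.
     */
    static Map<Long, Map<String, Long>> countByWindow(List<Event> arrivals, long windowSizeMs) {
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (Event e : arrivals) {
            long start = windowStart(e.eventTimeMs, windowSizeMs);
            counts.computeIfAbsent(start, s -> new TreeMap<>())
                  .merge(e.key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Listed in arrival (processing) order; the third event arrives late.
        List<Event> arrivals = Arrays.asList(
            new Event("page-a", 1_000L),
            new Event("page-b", 61_000L),
            new Event("page-a", 59_000L));
        // prints {0={page-a=2}, 60000={page-b=1}}
        System.out.println(countByWindow(arrivals, 60_000L));
    }
}
```

Grouping by arrival order instead would have put the late `page-a` event in the second minute, which is the kind of error that event-time windowing is meant to avoid.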
Interested in the hackathon but not sure what to build? No problem. Here are a few ideas to get the creative juices flowing:
Get creative! Databases and message queues are obvious targets for connectors, but you can connect all sorts of systems, from loading Wikipedia edits into Kafka to creating JIRA tickets from messages in a Kafka topic. Kafka Streams applications could leverage and combine data from any set of available connectors.
Still not sure what to build? During registration you can include systems you’re interested in and we’ll help connect you with other participants so you can work in a team to come up with and implement a project.
A panel of judges will evaluate entries at the end of the hackathon based on creativity, features, and completeness.
1st place: Lunch with Kafka co-creator and Confluent co-founder Jay Kreps
2nd and 3rd place: $100 gift cards for Amazon or iTunes
Everyone: T-Shirt and stickers
Prizes will be awarded only to entries whose authors open source their code by making it available on a code-sharing site such as GitHub or Bitbucket and are willing to list it on the Kafka Connector Hub.
Is attendance restricted to Kafka Summit attendees?
No, this is a community event and anyone is welcome to register and participate.
Do I need to already be familiar with Kafka Connect and Kafka Streams?
No previous experience with Kafka Connect or Kafka Streams is required, but we encourage you to review some of the resources listed below to get some basic familiarity with both. This will let you focus on designing and writing your connector or application during the hackathon.
Do I need to know what I’m going to build before I arrive?
No, although it will help you get up and running more quickly if you come with a few ideas. We’ve provided some examples of possible projects in the “Project Ideas” section above to give you an idea of the types of systems you might want a connector for and applications you might build with Kafka Streams.
Can I work in a team?
Absolutely, and we encourage it! To help form teams, you can include projects you are interested in building with your registration. We’ll connect you with other participants with similar interests at the beginning of the event.
What type of food will be provided?
Light dinner and drinks.
Am I required to submit my code or open source it?
You are not required to do either, but you must publish your code under an open source license to be eligible for the prizes. We recommend the Apache License 2.0, but other popular open source licenses are acceptable.
How complete are projects expected to be at the end of the hackathon?
The hackathon is just one evening, but that is enough time to get a prototype up and running. We hope this will motivate you to get started on a fully featured connector or Kafka Streams application, but we only expect a prototype by the end of the night.
Will a skeleton be provided to help get started?
Yes, a repository with a skeleton connector will be provided in the resources section before the event, and example applications for Kafka Streams can be found here. We encourage starting from a skeleton so you can make the most of the time during the hackathon.
Who will be available to provide help with Kafka Connect and Kafka Streams?
Kafka committers, Kafka Connect and Kafka Streams developers, Confluent engineers, and community members will attend the event to help you go from design to implementation of your connector or Kafka Streams application.
How will projects be judged?
Near the end of the hackathon we’ll ask you to give a brief overview of what you’ve built and provide us with a link to the repository. No need for a fancy demo, just a quick summary. A small panel of judges will select the most outstanding projects based on creativity, features, and completeness.
The Stream Data Hackathon is a free event, but all attendees must register. For more details and to complete your registration, please click here.
The hackathon will be most productive if you’ve done a bit of prep work so you can get straight to coding. Here are some resources you might find useful: