Considerations for Abstracting Complexities of a Real-Time ML Platform

« Current 2022

If you are a data scientist or a platform engineer, you probably can relate to the pains of working with the current explosive growth of Data/ML technologies and toolings. With many overlapping options and steep learning curves for each, it’s increasingly challenging for data science teams. Many platform teams started thinking about building an abstracted ML platform layer to support generalized ML use cases. But there are many complexities involved, especially as the underlying real-time data is shifting into the mainstream.

In this talk, we’ll discuss why ML platforms can benefit from a simple and ""invisible"" abstraction. We’ll offer some evidence on why you should consider leveraging streaming technologies even if your use cases are not real-time yet. We’ll share learnings (combining both ML and Infra perspectives) about some of the hard complexities involved in building such simple abstractions, the design principles behind them, and some counterintuitive decisions you may come across along the way.

By the end of the talk, I hope data scientists can walk away with some tips on how to evaluate ML platforms, and platform engineers learned a few architectural and design tricks.

Presenter

Zhenzhong Xu

Claypot AI

Z is currently co-founding a startup to make ML real-time. Previously, he led the real-time data infrastructure team at Netflix.

Considerations for Abstracting Complexities of a Real-Time ML Platform

Presenter

Zhenzhong Xu

Related Links

How Confluent Completes Apache Kafka eBook

Leverage a cloud-native service 10x better than Apache Kafka

Confluent Developer Center

Spend less on Kafka with Confluent, come see how