Apache Kafka is a key part of the Big Data infrastructure at Salesforce, enabling near-real-time publish/subscribe and data transport at enterprise scale, handling trillions of messages per day. In this session, hear from the teams at Salesforce that manage Kafka as a service, running over a hundred clusters across on-premises and public cloud environments with over 99.9% availability. Hear about best practices and innovations, including:
- How to manage multi-tenant clusters in a hybrid environment
- High-volume data pipelines with Mirus, replicating data to Kafka and blob storage
- Kafka Fault Injection Framework built on Trogdor and Kibosh
- Automated recovery without data loss
- Using Envoy as an SNI-routing Kafka gateway (see the illustrative configuration sketch after this list)
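To make the last item above concrete, here is a minimal sketch of what an SNI-routing Envoy gateway in front of Kafka can look like. It is illustrative only: the listener port, SNI hostnames, and broker addresses are placeholders, not Salesforce's actual configuration. The idea is that Envoy's TLS inspector reads the SNI value from each client connection and a matching filter chain forwards the raw TLS stream to the corresponding broker.

```yaml
# Illustrative Envoy config: route each Kafka client connection to the broker
# named in the TLS SNI field, without terminating TLS at the gateway.
static_resources:
  listeners:
  - name: kafka_sni_gateway
    address:
      socket_address: { address: 0.0.0.0, port_value: 9093 }
    listener_filters:
    # Peek at the TLS ClientHello so filter_chain_match can see the SNI value.
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filter_chain_match:
        server_names: ["broker-0.kafka.example.com"]   # placeholder hostname
      filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: broker_0
          cluster: broker_0
    # Additional brokers follow the same pattern: one filter chain per SNI name.
    - filter_chain_match:
        server_names: ["broker-1.kafka.example.com"]
      filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: broker_1
          cluster: broker_1
  clusters:
  - name: broker_0
    type: STRICT_DNS
    connect_timeout: 1s
    load_assignment:
      cluster_name: broker_0
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: broker-0.internal, port_value: 9092 }
  - name: broker_1
    type: STRICT_DNS
    connect_timeout: 1s
    load_assignment:
      cluster_name: broker_1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: broker-1.internal, port_value: 9092 }
```

In a pass-through design like this, TLS still terminates at the broker; Envoy only inspects the ClientHello, so each broker's advertised listener name typically needs to resolve to the gateway address for clients outside the cluster network.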
We hope the audience will come away with practical takeaways for building, deploying, operating, and managing Kafka at scale in the enterprise.
Presenter
Lei Ye
Salesforce
Lei Ye is a Principal Software Engineer at Salesforce focusing on infrastructure security, data lakes, and data processing.
Presenter
Paul Davidson
Salesforce
Paul Davidson is a Software Architect at Salesforce, working on the team responsible for providing Kafka-as-a-Service across the organization. Paul was also the original developer of the open-source Mirus data replication tool, which Salesforce uses to provide high-volume cross-site data pipelines.