How Akamai processes 10Gb/s of events in real time using Kafka and Spark

Details

17:30 – 18:00 – Mingling and food 🙂
18:00 – 18:20 – Opening session
18:20 – 19:00 – How to implement Kafka Exactly-Once – Yulia Antonovsky – Senior II Software Engineer @ Akamai
19:00 – 19:40 – Deep dive into Spark 3 Data source read API – Kineret Raviv, Principal Software Developer @ Akamai

*********************** Note: ***********************
– The event will also be streamed live
– All sessions will be delivered in Hebrew
*****************************************************

Title: How to implement Kafka Exactly-Once
Abstract:
Let's talk about Kafka batch processing and how complicated it can be: managing Kafka transactions the right way, and escaping endless rebalance storms when running hundreds of consumers on the same topic. We will review the issues we faced while building our ingest infrastructure on Azure, processing big-data malicious traffic at Akamai.
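The core idea behind Kafka's exactly-once semantics is that the output of a batch and the consumed offsets are committed atomically, so replaying a batch after a crash cannot produce duplicates. The toy model below illustrates just that idea in plain Python; it is not Akamai's implementation, and `AtomicStore`/`process_batch` are hypothetical names. In real Kafka the atomic step is the transactional producer's `sendOffsetsToTransaction()` followed by `commitTransaction()`.

```python
# Toy model of exactly-once processing: output records and the consumed
# offset are committed in one atomic step, so a replay after a crash
# resumes from the committed offset and never duplicates output.
# (Illustrative only -- real Kafka uses the transactional producer API.)

class AtomicStore:
    """Stands in for a Kafka transaction: offset and output commit together."""
    def __init__(self):
        self.committed_offset = 0
        self.output = []

    def commit(self, new_offset, records):
        # In real Kafka: producer.sendOffsetsToTransaction(...) then
        # producer.commitTransaction(). Here it is a single atomic update.
        self.output.extend(records)
        self.committed_offset = new_offset

def process_batch(store, events, batch_size=2):
    """Consume-transform-produce loop; resumes from the committed offset."""
    while store.committed_offset < len(events):
        start = store.committed_offset
        batch = events[start:start + batch_size]
        results = [e.upper() for e in batch]       # the "transform" step
        store.commit(start + len(batch), results)  # atomic commit

events = ["a", "b", "c", "d", "e"]
store = AtomicStore()
process_batch(store, events)
# Simulate a crash followed by a full replay: the loop resumes from the
# committed offset, so nothing is processed twice.
process_batch(store, events)
```

Running `process_batch` a second time is a no-op precisely because the offset was committed together with the output; that coupling is what "exactly-once" buys over at-least-once delivery with separate commits.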

Title: Deep dive into Spark 3 Data source read API
Abstract:
At Akamai, we developed a new custom Spark data source and would like to share with you how we did it!
Our input data to the analytics components is stored in a complex format, so we had to implement a custom data source to support it.
In this session, you will learn about the Spark 3 data source API: what it is, what it contains, and how we applied it to our use case and architecture.
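For readers unfamiliar with the Spark 3 data source read API (DataSource V2), its read path is a chain of small interfaces: a table exposes a scan builder, the builder produces a scan, the scan is split into input partitions, and each partition is read by a partition reader on an executor. The sketch below is a plain-Python analogy of that flow, not the real API; the actual interfaces are Java/Scala types in `org.apache.spark.sql.connector.read`, and every class name here is an illustrative stand-in.

```python
# Plain-Python analogy of the DataSource V2 read path:
# ScanBuilder -> Scan -> InputPartition -> PartitionReader.
# (Conceptual sketch only; the real API lives in
#  org.apache.spark.sql.connector.read.)

class InputPartition:
    """A chunk of the input that one Spark task will read."""
    def __init__(self, rows):
        self.rows = rows

class PartitionReader:
    """Iterates over the rows of a single partition on an executor."""
    def __init__(self, partition):
        self._rows = iter(partition.rows)

    def __iter__(self):
        return self._rows

class Scan:
    """Planned on the driver: splits the source into input partitions."""
    def __init__(self, data, num_partitions):
        self.data, self.n = data, num_partitions

    def plan_input_partitions(self):
        # Round-robin split, one InputPartition per future Spark task.
        return [InputPartition(self.data[i::self.n]) for i in range(self.n)]

class ScanBuilder:
    """Where pushdown (filters, column pruning) would be negotiated."""
    def __init__(self, data):
        self.data = data

    def build(self, num_partitions=2):
        return Scan(self.data, num_partitions)

# "Executing" the scan: read every partition and collect the rows,
# as Spark would do across its executors.
scan = ScanBuilder(list(range(10))).build(num_partitions=3)
rows = [row for part in scan.plan_input_partitions()
            for row in PartitionReader(part)]
```

The separation matters because planning (schema, pushdown, partitioning) happens once on the driver, while the per-partition readers run in parallel on executors; a custom source for a complex format mainly has to decide how to cut the data into partitions and how to decode one partition's rows.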
