Data Minutes #2: Scaling a Data Lake Gracefully with lakeFS

DataMinutes is the fastest event in the Microsoft Data Platform space yet!

Description

Data lakes offer unrivaled scalability and performance but are notoriously difficult to manage. Over time, an analytics team can end up spending more time fighting the technology than deriving useful insights from its data.

We’ll cover best practices for managing large-scale data lakes, focusing on three strategies:

— Isolated Ingestion
— CI/CD data deployment
— Dataset Versioning

Together, these strategies provide important guarantees that allow a lake to be managed elegantly even as data and team sizes grow.
