Data Lifecycle Management – Applying Engineering Best Practices for Data

Learn better Data Lifecycle Management practices with lakeFS

Description

Update: Recording on YouTube!

Details

Come out to the Data Engineering group monthly meeting for a presentation followed by group discussions. This is a virtual meeting and all are welcome to attend.

Presentation: Data Lifecycle Management – Applying Engineering Best Practices for Data
Presenter: Itai David

Abstract

Today, when working with data lakes over object storage, it is difficult to test changes in isolation, stage new data pipelines or ML models in parallel with production, enforce best practices, debug issues, or revert when a quality issue appears.

lakeFS is an open source project that enables managing data the same way as code, allowing for isolated development, safe data ingestion, and resilient production. It provides Git-like capabilities such as branches, merges, and commits on top of format-agnostic data repositories kept on object storage.

Itai David is a software engineer at Treeverse, the company behind lakeFS, with over 15 years of experience developing software. He is based in Winnipeg, Canada, so he is very jealous of the San Diego weather 😉
