lakeFS Community

Data Lifecycle Management – Applying Engineering Best Practices for Data

Learn better Data Lifecycle Management practices with lakeFS

Description

Update: Recording on YouTube!

Details

Come out to the Data Engineering group's monthly meeting for a presentation followed by group discussions. This is a virtual meeting and all are welcome to attend.

Presentation: Data Lifecycle Management – Applying Engineering Best Practices for Data
Presenter: Itai David

Abstract

Today, when working with data lakes over object storage, it is difficult to test changes in isolation, stage new data pipelines or ML models in parallel to production, ensure best practices, debug issues, or revert when a quality issue arises.

lakeFS is an open source project that lets you manage data the same way you manage code, enabling isolated development, safe data ingestion, and resilient production. lakeFS provides Git-like capabilities such as branches, merges, and commits on top of format-agnostic data repositories kept on object storage.

Itai David is a software engineer at Treeverse, the company behind lakeFS, with over 15 years of experience in developing software. He is based in Winnipeg, Canada, so he is very jealous of the San Diego weather 😉
