Update: Recording on YouTube!
Details
Come out to the Data Engineering group monthly meeting for a presentation followed by group discussions. This is a virtual meeting and all are welcome to attend.
Presentation: Data Lifecycle Management – Applying Engineering Best Practices for Data
Presenter: Itai David
Abstract
Today, when working with data lakes over object storage, it is difficult to test changes in isolation, stage new data pipelines or ML models in parallel with production, enforce best practices, debug issues, or revert when a quality issue arises.
lakeFS is an open source project that lets you manage data the same way you manage code, supporting isolated development, safe data ingestion, and resilient production. lakeFS provides Git-like capabilities such as branches, merges, and commits on top of format-agnostic data repositories kept on object storage.
Itai David is a software engineer at Treeverse, the company behind lakeFS, with over 15 years of experience developing software. He is based in Winnipeg, Canada, so he is very jealous of the San Diego weather 😉