
lakeFS Community

Create a Dev/Test Environment for Data Pipelines Using Spark and Python

Delivering high-quality data requires strict testing of pipelines before deploying them into production.

Today, testing ETLs means either working with a subset of the data or creating multiple copies of the entire dataset. Testing against sample data is not good enough, while copying the data is costly and time-consuming.

In this webinar we will demonstrate how to develop and test on the entire production data set with zero-copy.

We’ll explore:

  1. How to set up your environment in under 5 minutes
  2. How to create multiple isolated testing environments without copying data
  3. How to easily run multiple tests on your environment using git-like operations (commit, branch, revert, etc.)
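The zero-copy idea behind these environments can be sketched with a toy model: a branch duplicates only path-to-object pointers, never the underlying objects, so creating an isolated test environment costs metadata, not storage. This is an illustrative simplification, not lakeFS's actual implementation; `ToyRepo` and its methods are hypothetical names.

```python
import hashlib

class ToyRepo:
    """Toy model of metadata-only branching (not lakeFS internals)."""

    def __init__(self):
        self.objects = {}             # content-addressed store: hash -> bytes
        self.branches = {"main": {}}  # branch -> {path: object_hash}

    def put(self, branch: str, path: str, data: bytes) -> None:
        h = hashlib.sha256(data).hexdigest()  # stand-in for a content hash
        self.objects[h] = data
        self.branches[branch][path] = h

    def create_branch(self, name: str, source: str = "main") -> None:
        # Zero-copy: duplicate only the path -> hash mapping, not the data.
        self.branches[name] = dict(self.branches[source])

    def get(self, branch: str, path: str) -> bytes:
        return self.objects[self.branches[branch][path]]

repo = ToyRepo()
repo.put("main", "events/2024.parquet", b"production data")
repo.create_branch("test-env")  # isolated environment, no data copied
repo.put("test-env", "events/2024.parquet", b"modified in test")

print(repo.get("main", "events/2024.parquet"))      # production untouched
print(repo.get("test-env", "events/2024.parquet"))  # branch sees its own write
```

Because writes on `test-env` only remap pointers on that branch, `main` stays untouched, which is what makes running many parallel test environments against production-scale data cheap.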


Iddo Avneri

VP Customer Success, lakeFS

Git for Data – lakeFS
