
Guy Hardonag, Author

Last updated on May 22, 2024

This tutorial gives you a fast start with lakeFS and its Git-like operations in Spark. It covers the following:

  1. A quick start: installing lakeFS using Docker Compose.
  2. How to create a repository, add files to it, create a branch, and make changes to the repository using Spark jobs.
  3. How to review changes before exposing them to consumers by merging to master.
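To make the branch-then-merge flow above concrete: lakeFS exposes each repository through an S3-compatible gateway, so a Spark job addresses data as `s3a://<repository>/<branch>/<object key>`. The sketch below shows how those paths are built; the repository and branch names (`example-repo`, `new-feature`) are placeholders for illustration, not values from this tutorial.

```python
# Sketch: how a Spark job addresses data on a lakeFS branch.
# lakeFS serves repositories through an S3-compatible gateway, so a
# Spark path has the form s3a://<repository>/<branch>/<object key>.

def lakefs_path(repository: str, branch: str, key: str) -> str:
    """Build an s3a URI that Spark can read from or write to."""
    return f"s3a://{repository}/{branch}/{key.lstrip('/')}"

# Writing to a branch leaves master untouched until the branch is merged:
staging = lakefs_path("example-repo", "new-feature", "events/2024/05/")
production = lakefs_path("example-repo", "master", "events/2024/05/")

# A Spark job would then use these paths, for example (requires a running
# lakeFS instance and S3A credentials pointing at its gateway):
#   df = spark.read.parquet(production)
#   df.filter(...).write.parquet(staging)

print(staging)     # s3a://example-repo/new-feature/events/2024/05/
print(production)  # s3a://example-repo/master/events/2024/05/
```

Because the branch name is just a path prefix, reviewing changes before merging amounts to reading from the branch path instead of the master path.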

This simple flow gives a sneak peek at how seamless and easy it is to make changes to data using lakeFS. Once you see the value of a resilient data flow, you can map it to many use cases within your data architecture: validating writes of raw data, or providing a safety net for your ETL pipelines or your ML (or other algorithmic) pipelines. You can pull the trigger knowing your master data lake is safe.

For more detailed information check out our documentation.
