The data version control system of choice for

Experience the full benefits of working with lakeFS for Databricks. Sign up now and get 2 months free!

*Limited to the first 100 users

lakeFS, an official Databricks Technology Partner, offers a scalable data version control system for all your data use cases

Data Engineering

Integrate lakeFS with Databricks Workflows, Notebooks, Unity Catalog, or Delta Lake to increase data engineering velocity and ensure data quality.

ML & AI

Leverage lakeFS's comprehensive ML support and seamless integration with Python Notebooks, MLflow, and Git to accelerate your machine learning projects.

Analytics Engineering

Optimize your data management with lakeFS for Databricks. Develop ETLs with confidence while keeping production data safe and shortening project timelines.

Use Git-like operations to gain control over your data

Increase your data engineering velocity

Isolated Pipeline Development

Develop data pipelines in isolation without interfering with production data

Write-Audit-Publish Data with Databricks Jobs

Manage data flows and ensure data quality before your pipelines are deployed to production
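The write-audit-publish flow can be sketched with a toy in-memory model. This is not the lakeFS API: `ToyRepo`, `audit`, and `write_audit_publish` are hypothetical names, and the quality check (non-empty rows, no missing `id`) is an assumed example. The sketch only illustrates the pattern: write to an isolated branch, audit it, and publish to `main` only if the audit passes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """In-memory stand-in for a versioned repository (not the lakeFS API)."""
    # Each branch maps object paths to row lists. Creating a branch copies
    # only the path-to-object references, not the objects themselves.
    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name: str, source: str = "main") -> None:
        self.branches[name] = dict(self.branches[source])

    def write(self, branch: str, path: str, rows: list) -> None:
        self.branches[branch][path] = rows

    def merge(self, source: str, dest: str = "main") -> None:
        self.branches[dest].update(self.branches[source])

    def delete_branch(self, name: str) -> None:
        del self.branches[name]


def audit(rows: list) -> bool:
    """Hypothetical quality check: data is non-empty and every row has an id."""
    return bool(rows) and all(r.get("id") is not None for r in rows)


def write_audit_publish(repo: ToyRepo, path: str, rows: list,
                        branch: str = "staging") -> bool:
    repo.create_branch(branch)               # write: isolated from main
    repo.write(branch, path, rows)
    ok = audit(repo.branches[branch][path])  # audit: check the staged data
    if ok:
        repo.merge(branch)                   # publish: only audited data
    repo.delete_branch(branch)               # staging branch is discarded
    return ok


repo = ToyRepo()
good = write_audit_publish(repo, "tables/orders", [{"id": 1}, {"id": 2}])
bad = write_audit_publish(repo, "tables/users", [{"id": None}])
```

In this sketch the failed write to `tables/users` never reaches `main`, because the staging branch is dropped without being merged.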

Multi Delta-Table Transactions

Record changes made to multiple Delta tables as part of a logical pipeline step and leverage multi-table time travel

Versioned Medallion Architecture

Use distinct repositories for the Bronze, Silver, and Gold layers, and commit metadata to track the lineage of data changes

Enhance ML development for structured, semi-structured and unstructured data

Data Preparation in Isolation

Track all preprocessing changes and ensure only valid data reaches production

Parallel ML Experimentation

Run multiple experiments simultaneously, using different dataset versions, without duplicating data

Machine Learning Data Reproducibility

Maintain consistent datasets while adjusting model parameters and track them in the Databricks ML Experiments view

Fast Data Loading for Deep Learning Workloads

Localize data to reduce latency and cut costs by optimizing GPU utilization

Advanced Unstructured Data Filtering

Simplify model development by filtering objects using custom tags

Increase data quality with engineering best practices

Test Before Deploying

Run quality checks on your data before going to production

Effortless Team Collaboration

Easily share and edit specific data versions without stepping on each other’s toes

Fast Error Recovery

Quickly roll back to a stable state after data errors, ensuring smooth operations with minimal downtime

Reliable Data Audit Trails

Maintain a comprehensive log of all production data changes, including who made them and why

Keep Your Production Data Safe

Develop and test ETL changes on production data without actually modifying or copying data

Why choose lakeFS?

The lakeFS data version control system lets you manage all your projects more efficiently and streamline your data operations.

Support All Data Formats

Work with any of your data formats: plain text, open table formats, images, videos, you name it

Scalable and Performant

lakeFS supports billions of objects with negligible impact on critical-path storage operations

Data Stays in Place

No need to lift and shift: lakeFS manages your data wherever you store it, in the cloud or on-premises

Remote Compute

Use version control with Spark, Spark SQL, Spark Streaming, or any other distributed compute engine you choose
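As a rough illustration, a compute engine can address a specific data version purely through the object path. The minimal sketch below assumes the `s3a://<repository>/<ref>/<path>` layout used when reading through an S3-compatible endpoint. The helper `lakefs_path` and the repository and branch names are hypothetical, and rejecting slashes in the repository or ref is a choice of this sketch to keep the URI unambiguous.

```python
def lakefs_path(repo: str, ref: str, path: str, scheme: str = "s3a") -> str:
    """Build an object URI of the form scheme://repo/ref/path."""
    for value, label in ((repo, "repo"), (ref, "ref")):
        # Sketch-level validation: empty parts or embedded slashes would
        # make it unclear where the ref ends and the object path begins.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{scheme}://{repo}/{ref}/{path.lstrip('/')}"


# Same table, two versions: production and an isolated experiment branch.
prod = lakefs_path("example-repo", "main", "tables/orders/")
dev = lakefs_path("example-repo", "dev-experiment", "/tables/orders/")

# A Spark job would then read either path like any other location, e.g.:
#   df = spark.read.parquet(dev)
```

Switching a job between versions then only means changing the ref segment of the path, not the job logic.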

Quality Outcomes Guaranteed

Run quality checks over production data before deploying your pipelines

Access lakeFS for Databricks

Be one of the first 100 to sign up and receive:

First 2 months free

Full access to all the tools supported

Personalized configuration and setup with a dedicated lakeFS engineer

Sign up to lakeFS for Databricks
