Transform your object storage into a Git-like repository
lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data.
Features

Petabyte-scale version control

Git-like operations: branch, commit, merge, revert

Zero-copy branching for frictionless experiments

Full reproducibility of data and code

Pre-commit/merge hooks for data CI/CD

Instantly revert changes to data
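
To make the Git-like operations above concrete, here is a minimal in-memory sketch of how branch, commit, merge, and revert can work without copying any data. It is a toy model under stated assumptions, not the lakeFS API or its actual implementation: the names `ToyRepo` and `Commit` and the `s3://bucket/...` addresses are hypothetical, and the merge simply lets the source branch win on conflicts.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, eq=False)
class Commit:
    """Immutable snapshot mapping logical paths to physical object addresses."""
    objects: Dict[str, str]
    parent: Optional["Commit"] = None
    message: str = ""


class ToyRepo:
    """Hypothetical toy repository; illustrates the idea only, not lakeFS itself."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit(objects={}, message="initial")}
        self.staged: Dict[str, Dict[str, str]] = {"main": {}}

    def branch(self, name: str, source: str) -> None:
        # Zero-copy: the new branch is just another pointer to the same commit.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch: str, path: str, address: str) -> None:
        # Stage a path -> object address mapping; the data itself is never moved.
        self.staged[branch][path] = address

    def commit(self, branch: str, message: str) -> Commit:
        head = self.branches[branch]
        new = Commit({**head.objects, **self.staged[branch]}, parent=head, message=message)
        self.branches[branch] = new
        self.staged[branch] = {}
        return new

    def merge(self, source: str, dest: str) -> Commit:
        # Simplification: on conflicting paths the source branch wins.
        src, dst = self.branches[source], self.branches[dest]
        merged = Commit({**dst.objects, **src.objects}, parent=dst,
                        message=f"merge {source} into {dest}")
        self.branches[dest] = merged
        return merged

    def revert(self, branch: str) -> Commit:
        # Add a new commit that restores the parent's snapshot; history is kept.
        head = self.branches[branch]
        if head.parent is None:
            raise ValueError("nothing to revert")
        undo = Commit(dict(head.parent.objects), parent=head, message="revert")
        self.branches[branch] = undo
        return undo


repo = ToyRepo()
repo.put("main", "data/events.parquet", "s3://bucket/obj-001")
repo.commit("main", "load events")
repo.branch("experiment", "main")            # no objects copied
repo.put("experiment", "data/events.parquet", "s3://bucket/obj-002")
repo.commit("experiment", "reprocess events")  # main is unaffected
repo.merge("experiment", "main")             # main now sees obj-002
repo.revert("main")                          # main is back to obj-001
```

Because a branch is only a pointer to an immutable snapshot, creating one costs the same regardless of how much data the repository holds, which is the property the feature list calls zero-copy branching.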
Works seamlessly with all modern data frameworks















Deploy in the Cloud or On-Prem





And any S3 Compatible Storage
The latest from our blog
lakeFS Hooks: Implementing CI/CD for Data using Pre-merge Hooks
Continuous integration of data is the process of exposing data to consumers only after ensuring it adheres to best practices such as...
Data Quality Testing: Ways to Test Data Validity and Accuracy
Introduction If Sisyphus had been a data analyst or a data scientist, the boulder she’d be rolling up the hill would have...
Concrete Graveler: Committing Data to Pebble SSTables
Introduction In our recent version of lakeFS, we switched to base metadata storage on immutable files stored on S3 and other common...