Transform your object storage into a Git-like repository

lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data.

  • 2000+ installations
  • 1.5M+ daily API calls per installation
  • 1.8K stars
  • 1600+ community members

Features

  • Exabyte-scale version control
  • Git-like operations: branch, commit, merge, revert
  • Zero-copy branching for frictionless experiments
  • Full reproducibility of data and code
  • Pre-commit/merge hooks for data CI/CD
  • Instantly revert changes to data
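
To make the Git-like model above concrete, here is a minimal Python sketch of how branch, commit, merge, and revert can work without copying data. It is a toy in-memory model, not the lakeFS API or its storage format: `ToyRepo`, `Commit`, the `pre_merge` callback, and the example bucket addresses are all hypothetical names introduced for illustration. The idea it shows is that a commit only maps logical paths to object-store addresses, so creating a branch adds a pointer rather than duplicating objects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Snapshot = Dict[str, str]  # logical path -> object-store address


@dataclass
class Commit:
    """A snapshot of path-to-address mappings plus a link to its parent."""
    snapshot: Snapshot
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical in-memory model. Not the lakeFS API or storage format."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit({}, "initial commit")}
        self.staged: Dict[str, Snapshot] = {"main": {}}

    def branch(self, name: str, source: str) -> None:
        # Zero-copy: the new branch is just another pointer to the same commit.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def write(self, branch: str, path: str, address: str) -> None:
        # Stage a pointer to an object; the object itself is never duplicated.
        self.staged[branch][path] = address

    def commit(self, branch: str, message: str) -> Commit:
        head = self.branches[branch]
        new = Commit({**head.snapshot, **self.staged[branch]}, message, head)
        self.branches[branch] = new
        self.staged[branch] = {}
        return new

    def merge(self, source: str, dest: str,
              pre_merge: Optional[Callable[[Snapshot], bool]] = None) -> Commit:
        incoming = self.branches[source].snapshot
        # Assumed hook contract: return False to block the merge.
        if pre_merge is not None and not pre_merge(incoming):
            raise ValueError("pre-merge hook rejected the merge")
        head = self.branches[dest]
        # Naive "source wins" merge, not a real three-way merge.
        merged = Commit({**head.snapshot, **incoming},
                        f"merge {source} into {dest}", head)
        self.branches[dest] = merged
        return merged

    def revert(self, branch: str) -> Commit:
        head = self.branches[branch]
        if head.parent is None:
            raise ValueError("nothing to revert")
        # Record a new commit that restores the parent's snapshot.
        undo = Commit(dict(head.parent.snapshot), f"revert: {head.message}", head)
        self.branches[branch] = undo
        return undo


repo = ToyRepo()
repo.write("main", "events/day1.parquet", "s3://example-bucket/obj-001")
repo.commit("main", "load day 1")
repo.branch("experiment", "main")  # no data copied
repo.write("experiment", "events/day2.parquet", "s3://example-bucket/obj-002")
repo.commit("experiment", "load day 2")
repo.merge("experiment", "main",
           pre_merge=lambda s: all(p.endswith(".parquet") for p in s))
repo.revert("main")  # main points at the day-1 snapshot again
```

In this sketch the revert is itself a new commit, so the merged state stays in the history and can be inspected later. A real system would also need conflict detection during merge, which the "source wins" shortcut here deliberately skips.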

Works seamlessly with all modern data frameworks

Hive
Apache Spark
Kafka
MLflow
Apache Zeppelin
Presto
Hadoop
Amazon Athena
Amazon Kinesis
Jupyter
Delta Lake
Apache Iceberg
Apache Hudi
Databricks
Airflow

Deploy in the cloud or on-prem

Google Cloud
AWS
Azure
MinIO
Ceph

And any S3-compatible storage
