Transform your object storage into a Git-like repository

lakeFS enables you to manage your data lake the way you manage your code. Run parallel pipelines for experimentation and CI/CD for your data.

Features

Petabyte-scale version control
Git-like operations: branch, commit, merge, revert
Zero-copy branching for frictionless experiments
Full reproducibility of data and code
Pre-commit/merge hooks for data CI/CD
Instantly revert changes to data
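
To make the Git-like operations listed above concrete, here is a minimal conceptual sketch in Python. It is a toy in-memory model, not the lakeFS API or its implementation: the `ToyRepo` class, its method names, and the object addresses are all hypothetical. It assumes a commit is a mapping from paths to object addresses, so creating a branch only copies a pointer to a commit and never copies the data objects themselves.

```python
from dataclasses import dataclass
from typing import Dict, Optional
import itertools


@dataclass
class Commit:
    id: int
    parent: Optional[int]
    message: str
    snapshot: Dict[str, str]  # path -> address of an object in storage


class ToyRepo:
    """Hypothetical toy model of branch/commit/merge/revert over object pointers."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        root = Commit(next(self._ids), None, "initial", {})
        self.commits: Dict[int, Commit] = {root.id: root}
        self.branches: Dict[str, int] = {"main": root.id}
        self.staged: Dict[str, Dict[str, str]] = {"main": {}}

    def _new_commit(self, branch: str, message: str, snapshot: Dict[str, str]) -> int:
        commit = Commit(next(self._ids), self.branches[branch], message, snapshot)
        self.commits[commit.id] = commit
        self.branches[branch] = commit.id
        return commit.id

    def branch(self, name: str, source: str) -> None:
        # Zero-copy: the new branch is only a pointer to the source's head commit.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch: str, path: str, address: str) -> None:
        # Stage a path -> object address mapping; nothing is visible until commit.
        self.staged[branch][path] = address

    def commit(self, branch: str, message: str) -> int:
        head = self.commits[self.branches[branch]]
        snapshot = {**head.snapshot, **self.staged[branch]}
        self.staged[branch] = {}
        return self._new_commit(branch, message, snapshot)

    def merge(self, source: str, dest: str) -> int:
        # Naive merge: entries from the source win; no conflict detection or deletes.
        src = self.commits[self.branches[source]]
        dst = self.commits[self.branches[dest]]
        return self._new_commit(dest, f"merge {source}", {**dst.snapshot, **src.snapshot})

    def revert(self, branch: str) -> int:
        # Undo the head commit by committing its parent's snapshot again.
        head = self.commits[self.branches[branch]]
        if head.parent is None:
            raise ValueError("nothing to revert on the initial commit")
        parent = self.commits[head.parent]
        return self._new_commit(branch, f"revert {head.id}", dict(parent.snapshot))

    def read(self, branch: str, path: str) -> Optional[str]:
        return self.commits[self.branches[branch]].snapshot.get(path)


repo = ToyRepo()
repo.put("main", "events/day1.parquet", "obj-001")
repo.commit("main", "load day 1")
repo.branch("experiment", "main")           # no data objects are copied here
repo.put("experiment", "events/day1.parquet", "obj-002")
repo.commit("experiment", "reprocess day 1")
```

In this sketch the experiment branch sees the reprocessed object while main still points at the original one, which is the isolation that branching is meant to provide. Merging and reverting only move pointers between snapshots.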

Works seamlessly with all modern data frameworks

Hive
Apache Spark
Kafka
MLflow
Apache Zeppelin
Presto
Hadoop
Amazon Athena
Amazon Kinesis
Jupyter
Delta Lake
Apache Iceberg
Apache Hudi
Databricks
Airflow

Deploy in the Cloud or On-Prem

Google Cloud
AWS
Azure
MinIO
Ceph

And any S3-compatible storage
