Version-Control Your Iceberg Tables and Everything Around It
Turn your data lake into a versioned, collaborative platform.
With lakeFS, you gain control over your Apache Iceberg tables
and all surrounding data: structured, semi-structured, or unstructured.
Environment-Wide Version Control:
Manage Multimodal and Multi-Table Data Effectively
Managing multiple Iceberg tables or Iceberg tables along with unstructured data?
The lakeFS Iceberg REST Catalog brings Git-style versioning to your entire environment. While Iceberg offers table-level control, lakeFS adds environment-wide control across your data lake.
Version-Controlled Data Development
- Test changes across multiple tables without impacting production
- Merge safely using built-in conflict detection
- Create isolated feature branches for cross-table changes
Collaborative Data Development
- Enable multiple teams to develop in parallel without stepping on each other's toes
- Build CI/CD-style workflows for Iceberg data, automating data validation
- Collaborate using pull requests on changes to data and schema
Automate Data Contract Enforcement with Hooks
- Validate schemas, partitions, and data correctness before committing, catching issues early
- Prevent downstream failures with custom pre-commit hooks
- Ensure referential integrity across multiple Iceberg tables
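As a rough illustration of the kind of checks a pre-commit hook might run, here is a minimal Python sketch of a schema check and a cross-table referential integrity check. It does not call any lakeFS or Iceberg API. The table names, the expected schema, and the functions `schema_violations`, `orphaned_keys`, and `check_commit` are all hypothetical, and how the results would be wired into a hook is left open.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical data contract for an "orders" table: column name -> type name.
EXPECTED_ORDERS_SCHEMA: Dict[str, str] = {
    "order_id": "long",
    "customer_id": "long",
    "amount": "double",
}


def schema_violations(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return readable differences between the expected and actual schemas."""
    problems: List[str] = []
    for column, expected_type in expected.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            problems.append(
                f"type mismatch for {column}: "
                f"expected {expected_type}, got {actual[column]}"
            )
    return problems


def orphaned_keys(child_keys: Iterable[int], parent_keys: Iterable[int]) -> Set[int]:
    """Return foreign-key values in the child table with no match in the parent."""
    return set(child_keys) - set(parent_keys)


def check_commit(
    actual_schema: Dict[str, str],
    order_customer_ids: Iterable[int],
    customer_ids: Iterable[int],
) -> List[str]:
    """Collect all violations; an empty list means the change passes the checks."""
    problems = schema_violations(EXPECTED_ORDERS_SCHEMA, actual_schema)
    orphans = orphaned_keys(order_customer_ids, customer_ids)
    if orphans:
        problems.append(f"orders reference unknown customers: {sorted(orphans)}")
    return problems
```

In this sketch a hook would reject the change whenever `check_commit` returns a non-empty list, and the returned messages would tell the author what to fix.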
Reproducibility at Scale
- lakeFS tags capture the entire state of your environment – not just individual tables
- Anchor ML experiments to concrete data versions
- Roll back mistakes instantly, across tables and datasets
- Navigate your data lake’s full version history with confidence
Manage and Govern Access to Data
- Use detailed commit logs to track who changed what and when
- Enforce fine-grained access control with RBAC policies
- Recover quickly by rolling back commits atomically and safely