We’re excited to introduce a powerful new capability in lakeFS Enterprise: the lakeFS Iceberg REST Catalog – a fully standards-compliant implementation of the Apache Iceberg REST Catalog specification.
With this release, lakeFS now enables seamless version control for both structured and unstructured data at any scale.
Think Git-style workflows, now for your Iceberg tables. No lock-in, no extra tools, just powerful data & AI engineering done right.
Open Standards, Zero Lock-In
The lakeFS Iceberg REST Catalog implements the official Apache Iceberg REST Catalog spec. That means:
- Works out-of-the-box with Apache Spark, Trino, Flink, and other engines that support REST Catalogs
- No proprietary formats or vendor lock-in
- No extra libraries or plugins required
Use Cases: Isolated Development, Automation & Collaboration
With lakeFS Iceberg REST Catalog, you’ll be able to achieve:
Version-Controlled Data Development
- Create feature branches for table schema changes or data migrations
- Test modifications in isolation, across multiple tables
- Merge changes safely with conflict detection
Multi-Environment Management
- Use zero-copy branches to represent different environments (dev, staging, prod)
- Promote changes between environments through merges, with automated testing
- Maintain consistent table schemas and data across environments
Collaborative Data Development
- Multiple teams can work on different table features simultaneously without stepping on each other’s toes
- Maintain data quality through pre-merge validations
- Collaborate using pull requests on changes to data and schema
Manage and Govern Access to Data
- Use the detailed built-in commit log, which captures who changed the data, what changed, and how
- Manage access for users and groups with fine-grained RBAC policies
- Roll back changes atomically and safely to reduce time-to-recover and increase system stability
Reproducibility at Scale
Managing thousands of Iceberg tables across petabytes of data? lakeFS has you covered.
- Repositories version all namespaces and tables atomically
- Easily go back to any state of your catalog from any point in time
- Roll back mistakes instantly, across all affected tables
How does it work?
lakeFS Enterprise exposes an implementation of the REST catalog interface as published by the Apache Iceberg project.
Behind the scenes, a request to the catalog does the following:
- Given the table’s namespace, extract the repository and reference (e.g. for table my-repo.main.inventory.books, the repository is my-repo and the lakeFS reference is the branch main). The reference could also be a tag name or commit hash: any lakeFS reference should work.
- Use lakeFS’ versioned metadata to store or retrieve the current Iceberg metadata file for the requested repository and reference
- On modification to a table, create a new metadata file and replace its pointer for the requested branch
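The first step above, splitting a table identifier into its lakeFS and Iceberg parts, can be sketched in a few lines of Python. This is an illustrative sketch, not lakeFS code: `TableRef` and `parse_table_identifier` are hypothetical names. It assumes the first two dot-separated parts are the repository and reference, the last part is the table name, at least one namespace level sits in between, and no part contains a dot itself.

```python
from typing import NamedTuple, Tuple


class TableRef(NamedTuple):
    """Hypothetical container for the parts of a catalog table identifier."""
    repository: str
    reference: str              # branch name, tag name, or commit hash
    namespace: Tuple[str, ...]  # namespace levels inside the repository
    table: str


def parse_table_identifier(identifier: str) -> TableRef:
    """Split 'repo.ref.namespace...table' into its parts (illustrative only)."""
    parts = identifier.split(".")
    # Assumption: repository, reference, one or more namespace levels, table.
    if len(parts) < 4 or not all(parts):
        raise ValueError(f"expected repo.ref.namespace.table, got {identifier!r}")
    repository, reference, *namespace, table = parts
    return TableRef(repository, reference, tuple(namespace), table)


ref = parse_table_identifier("my-repo.main.inventory.books")
print(ref.repository, ref.reference, ref.namespace, ref.table)
```

Because the reference is just the second part of the identifier, pointing a client at a tag or commit instead of a branch only changes that one component.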

This approach has several benefits:
- It moves the versioning capabilities (branching, merging, committing, etc.) out of the critical path: clients read and write table data directly against the underlying object store, without any data going through lakeFS itself.
- It leverages existing lakeFS primitives, building on a solid, proven foundation to atomically branch, commit and merge changes at large scale.
- It allows versioning both structured and unstructured data together, in the same repository, ensuring reproducibility regardless of data type.
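To make the pointer-based design concrete, here is a toy in-memory model of per-branch metadata pointers. It is a sketch under loud assumptions: `ToyCatalog` and its methods are hypothetical, the file paths are made up, and lakeFS's real versioned metadata store is not a Python dictionary. It only illustrates why creating a branch copies no data, and why a table change on one branch is invisible to another.

```python
from typing import Dict


class ToyCatalog:
    """Toy model: each branch maps table names to a metadata file path."""

    def __init__(self) -> None:
        self._branches: Dict[str, Dict[str, str]] = {"main": {}}

    def create_branch(self, name: str, source: str) -> None:
        if name in self._branches:
            raise ValueError(f"branch {name!r} already exists")
        # Only the pointers are copied; metadata and data files are shared.
        self._branches[name] = dict(self._branches[source])

    def commit_table(self, branch: str, table: str, metadata_path: str) -> None:
        # A table change writes a new metadata file, then swaps the pointer
        # for this branch only.
        self._branches[branch][table] = metadata_path

    def current_metadata(self, branch: str, table: str) -> str:
        return self._branches[branch][table]


toy = ToyCatalog()
toy.commit_table("main", "inventory.books", "metadata/v1.json")
toy.create_branch("dev", source="main")
toy.commit_table("dev", "inventory.books", "metadata/v2.json")

print(toy.current_metadata("main", "inventory.books"))  # metadata/v1.json
print(toy.current_metadata("dev", "inventory.books"))   # metadata/v2.json
```

In this model the catalog only ever hands out a path to a metadata file; reading the table's data from that point on is between the client and the object store.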
To learn more about the architecture and design of the Iceberg REST Catalog, see the official integration page on the lakeFS documentation.
Example: Using the lakeFS Iceberg REST Catalog with PyIceberg
Using the PyIceberg client, you can interact with the lakeFS REST Catalog just like any Iceberg-native catalog:
import lakefs
from pyiceberg.catalog.rest import RestCatalog
# Initialize the catalog (replace the endpoint and credentials with your own)
catalog = RestCatalog(name="lakefs-catalog", **{
    'prefix': 'lakefs',
    'uri': 'https://lakefs.example.com/iceberg/api',
    'oauth2-server-uri': 'https://lakefs.example.com/iceberg/api/v1/oauth/tokens',
    # Format: '<access key ID>:<secret access key>'
    'credential': 'AKIAlakefs12345EXAMPLE:abc/lakefs/1234567bPxRfiCYEXAMPLEKEY',
})
# List namespaces in a branch
catalog.list_namespaces(('repo', 'main'))
# Query a table
catalog.list_tables('repo.main.inventory')
table = catalog.load_table('repo.main.inventory.books')
arrow_df = table.scan().to_arrow()
You can also create a branch with the lakeFS Python SDK and load the same table from it:
branch = lakefs.repository('repo').branch('dev').create(source_reference='main')
# The table is now accessible in the new branch
dev_table = catalog.load_table(f'repo.{branch.id}.inventory.books')
The lakeFS Iceberg REST Catalog works with any standard Iceberg client. See the official documentation for more examples and detailed usage instructions.
Get Started Today
The lakeFS Iceberg REST Catalog is available now as part of lakeFS Enterprise.
If you’re using Iceberg and need data versioning, reproducibility, production safety, and compliance, this is the way to do it.
Contact us for a free trial and see how lakeFS can power your data platform, structured or unstructured.


