Once you start using lakeFS, the files on your object store take on a new representation: their names and paths no longer look the way they did before.
This article provides a high-level overview of the lakeFS file representation and how it supports data versioning.
How Data Versioning Works in lakeFS
The lakeFS data version control system allows the following Git-like operations:
| Operation | What It Does |
|---|---|
| Branch | Creates a consistent repository copy, isolated from other branches and their changes. Branch creation is a metadata operation that doesn’t duplicate data |
| Commit | Creates an immutable checkpoint providing a complete repository snapshot |
| Merge | Atomically updates one branch with changes from another |
| Revert | Restores a repository to a prior commit |
| Tag | A pointer to an immutable commit with a meaningful name |
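To make the branch operation concrete, here is a minimal sketch (illustrative only, not the lakeFS implementation) showing why creating a branch is a metadata operation: it copies a commit pointer, never the data itself.

```python
# Illustrative sketch -- not actual lakeFS code. A branch is just a
# named pointer to a commit, so creating one copies a pointer only.
repo = {
    "commits": {"c1": {"files": {"data/part-0.parquet": "obj-abc123"}}},
    "branches": {"main": "c1"},
}

def create_branch(repo, name, source_branch):
    # Copy only the source branch's commit pointer; no objects move.
    repo["branches"][name] = repo["branches"][source_branch]

create_branch(repo, "experiment", "main")
assert repo["branches"]["experiment"] == "c1"
# Both branches now reference commit c1; no object data was duplicated.
```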
This new structure is not unique to lakeFS: other systems, such as Apache Iceberg, also create a metadata structure of their own on top of the object store.
File Storage Formats
For Commits: SSTable
Commits are stored as RocksDB-compatible SSTables. Three reasons made SSTables the storage format of choice:
- SSTables offer extremely high read throughput on modern hardware. Using commits representing a 200 million object repository (modeled after the S3 inventory of one of our design partners), we achieved close to 500k random GetObject calls per second. This provides a very high throughput/cost ratio, as high as can be achieved on public clouds.
- It’s a well-known storage format, making it straightforward to produce and consume. Keeping it on the object store makes it accessible to data engineering tools for analysis and distributed computation, reducing the silo effect of an operational database.
- The SSTable format supports delta encoding for keys, making it very space-efficient for data lakes, where many keys share common prefixes.
Each lakeFS commit is represented as a set of non-overlapping SSTables that make up the entire keyspace of a repository at that commit.
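To see why delta encoding of keys saves so much space, here is a simplified sketch of prefix compression over sorted keys. This is an illustration of the idea only, not the actual RocksDB SSTable block format:

```python
import os.path

# Simplified prefix (delta) encoding over sorted keys. Each entry stores
# the length of the prefix shared with the previous key, plus the suffix.
def encode(keys):
    encoded, prev = [], ""
    for key in keys:
        shared = len(os.path.commonprefix([prev, key]))
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded

def decode(encoded):
    keys, prev = [], ""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys

keys = [
    "collections/events/2023/01/part-0001.parquet",
    "collections/events/2023/01/part-0002.parquet",
    "collections/events/2023/02/part-0001.parquet",
]
encoded = encode(keys)
assert decode(encoded) == keys
# Only the first key is stored in full; later entries store short suffixes.
```

Because data lake paths are long and highly repetitive, storing only the non-shared suffix of each key shrinks the index dramatically.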
For Metadata: Graveler
lakeFS metadata is encoded into Graveler, a format that offers a standard way to encode content-addressable Key/Value pairs.
Requirements for the storage format
lakeFS has additional requirements for the storage format:
- Being space- and time-efficient when creating a commit – assuming a commit changes a single object out of a billion, there’s no need to write a full snapshot of the entire repository. Ideally, data files that haven’t changed are reused, making the commit operation proportional (in both space and time) to the size of the difference rather than to the total repository size.
- Allowing an efficient diff between commits – diffing should run in time proportional to the size of the commits’ difference, not their absolute sizes.
To support these requirements, lakeFS is based on a 2-layer Merkle tree that consists of:
- A set of leaf nodes (“Range”) addressed by their content address, and
- A “Meta Range,” which is a special range containing all ranges, representing an entire consistent view of the keyspace:
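The following sketch shows how a two-level structure of content-addressed ranges makes commits proportional to the diff. It is illustrative only, not Graveler’s actual on-disk format: changing one key rewrites only the range containing it plus the meta range, while all other ranges are reused by address.

```python
import hashlib
import json

def address(obj):
    # Content address: a hash of the serialized contents.
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()[:12]

def commit(ranges):
    # A range covers a sorted slice of the keyspace; the meta range
    # lists all range addresses, forming one consistent view.
    range_ids = [address(r) for r in ranges]
    return {"meta_range": address(range_ids), "ranges": range_ids}

ranges_v1 = [
    {"a/1": "obj1", "a/2": "obj2"},  # range covering keys under a/
    {"b/1": "obj3"},                 # range covering keys under b/
]
c1 = commit(ranges_v1)

# Change a single key: only its range gets a new content address.
ranges_v2 = [{"a/1": "obj1", "a/2": "obj2-new"}, {"b/1": "obj3"}]
c2 = commit(ranges_v2)

assert c1["ranges"][1] == c2["ranges"][1]    # unchanged range is reused
assert c1["ranges"][0] != c2["ranges"][0]    # changed range is rewritten
assert c1["meta_range"] != c2["meta_range"]  # meta range always changes
```

Diffing two commits can likewise skip any range whose address is identical in both, which is what makes the diff proportional to the difference.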
Representing References And Uncommitted Metadata
lakeFS always saves committed and uncommitted object data in your object store’s storage namespace. However, lakeFS object metadata may be stored in a key-value or object store.
Uncommitted (or “staged”) metadata is mutable and frequently written, unlike committed metadata. The same applies to “refs”: branches are pointers to an underlying commit, and they are updated on every commit or merge operation.
Both of these types of metadata are not only mutable but also require strong consistency guarantees while remaining fault-tolerant: if we can’t access the current pointer of the main branch, a large portion of the system is down.
Luckily, this metadata is also much smaller than the committed metadata.
References and uncommitted metadata are currently stored on a key-value store for consistency guarantees. See a list of supported databases for more details.
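To illustrate the kind of consistency guarantee the key-value store must provide, here is a toy sketch (not lakeFS code) of advancing a branch pointer with a compare-and-swap, so two concurrent commits can’t silently overwrite each other:

```python
# Toy compare-and-swap over an in-memory dict. Real deployments rely on
# the underlying key-value store to provide this guarantee atomically.
refs = {"main": "c1"}

def compare_and_swap(refs, branch, expected, new):
    # Only advance the pointer if nobody moved it underneath us.
    if refs.get(branch) != expected:
        return False
    refs[branch] = new
    return True

assert compare_and_swap(refs, "main", "c1", "c2")      # succeeds
assert not compare_and_swap(refs, "main", "c1", "c3")  # lost the race
assert refs["main"] == "c2"
```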
Understanding lakeFS File Representation
Let’s take a look at the files in the object store once they are managed with lakeFS.
Creating a repository
The first step is to create a new repository via the web interface:

This creates an “empty” lakeFS repository understand-lakefs-repo sitting on an S3 bucket my-lakefs-managed-bucket:

If you look at the cloud provider’s side, you’ll see a single file created at this time (this example is from AWS):

The dummy file is created to verify that the AWS role used by lakeFS has permission to write into the bucket.
Importing files
Importing doesn’t copy files to the lakeFS-managed bucket; lakeFS only records metadata pointing at the original objects. This means that if you change the data in the original bucket, the lakeFS metadata referencing it becomes stale, and lakeFS can no longer reliably manage it.
Let’s import data from a bucket s3://my-original-data, which contains a single directory product-reviews with two Parquet files:

Note that lakeFS never copies these files; it only creates metadata that points to their original locations.
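A simplified way to picture an import (illustrative only, not the actual import mechanism): each imported logical path maps to the object’s original physical address, and no bytes move.

```python
# Illustrative only: an import records logical-path -> physical-address
# metadata entries; the imported objects stay exactly where they are.
source_objects = [
    "s3://my-original-data/product-reviews/file1.parquet",
    "s3://my-original-data/product-reviews/file2.parquet",
]

def import_objects(physical_addresses):
    entries = {}
    for addr in physical_addresses:
        logical_path = addr.split("/", 3)[3]  # strip scheme and bucket
        entries[logical_path] = addr          # metadata points at the source
    return entries

imported = import_objects(source_objects)
assert imported["product-reviews/file1.parquet"] == source_objects[0]
```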
Writing new objects to the repository
Going forward, you may want to add new objects to the repository. These will be written to the lakeFS-managed bucket.
You can upload the files using the web interface:

Once uploaded via lakeFS, the files are stored under different, generated names. The references mapping their logical paths to these physical files are maintained in the range and metarange files of each commit.
You uploaded the files but haven’t committed the changes yet. You can do that via the UI:

Once committed, new files are added under the _lakefs directory:

Wrap up
lakeFS stores your data files in a data directory on the object store, and the range and metarange files that associate them with commits under the _lakefs directory. Since lakeFS keeps all of these files in your managed bucket (in your cloud, private cloud, or on premises), there are numerous ways to achieve high availability. Your object store holds the data, range, and metarange files regardless of where lakeFS itself runs.


