Fast data loading for deep learning workloads with lakeFS Mount
Virtually mount a lakeFS remote repository from your object storage onto a local directory and interact with the data as if it resides on your local filesystem
Use your favorite tools and libraries without disruption.
Reduce object storage roundtrips by 90%.
Accelerate your AI and ML workloads in no time!
Take any path from a lakeFS repository, on any branch, commit or tag, then mount and accelerate your AI/ML workloads!
Seamlessly scale from a few local files to millions without changing your tools or workflows. Use the same code from early experimentation all the way to production.
lakeFS Mount handles the most demanding workloads, supporting billions of files with fast data fetching. Choose between lazy and eager fetching strategies to optimize your GPU utilization.
When mounting a path within a Git repository, Git automatically tracks which version of the data was mounted, linking code and input data together. You can revisit any version of the code alongside the exact data it ran on.
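Because a mount point behaves like any local directory, existing data-loading code needs no changes to move from a handful of local files to a mounted repository path. A minimal sketch, assuming a hypothetical mount point at `./data` (the function name and file suffixes here are illustrative, not part of lakeFS Mount):

```python
import os

def iter_dataset_files(root, suffixes=(".jpg", ".png", ".npy")):
    """Walk a directory tree and yield paths to data files.

    Works identically whether `root` is a plain local directory or a
    lakeFS Mount mount point -- the mount behaves like a regular
    filesystem path.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(suffixes):
                yield os.path.join(dirpath, name)

# Hypothetical usage: "./data" stands in for a lakeFS Mount mount point.
# for path in iter_dataset_files("./data"):
#     with open(path, "rb") as f:
#         payload = f.read()  # with lazy fetching, data is downloaded on first read
```

The same loop works unchanged from early experimentation on a few local files through to production runs over a mounted repository.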
lakeFS enabled us to streamline and run 200+ dbt models in production, increase data deployment velocity, efficiently reproduce ML experiments, increase productivity of the data teams, and adhere to FDA compliance requirements
Watch an in-depth walkthrough of how to get started in this 7-minute tutorial.
lakeFS uses zero-copy branching to avoid data duplication. Creating a new branch is a metadata-only operation: no objects are actually copied. Only when an object changes does lakeFS store a new version of it in the object storage.
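The idea can be shown with a toy model. This is purely illustrative (the class, IDs, and storage layout below are invented for the sketch and bear no relation to lakeFS internals): a branch is just a mapping of logical paths to object IDs, so branching copies pointers, never data.

```python
class ZeroCopyRepo:
    """Toy model of zero-copy branching (illustrative only)."""

    def __init__(self):
        self.objects = {}              # object_id -> data (the object store)
        self.branches = {"main": {}}   # branch -> {logical path: object_id}

    def write(self, branch, path, data):
        # Writing stores a new object version and repoints the branch metadata.
        object_id = f"obj{len(self.objects)}"
        self.objects[object_id] = data
        self.branches[branch][path] = object_id

    def create_branch(self, name, source):
        # Metadata-only operation: copy path -> object pointers, not data.
        self.branches[name] = dict(self.branches[source])
```

Creating a branch adds nothing to the object store; only a subsequent write on that branch stores a new object version, while the source branch keeps pointing at the original.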
The data you wish to version control stays in place on your object storage. Onboarding is a metadata-only operation: lakeFS creates metadata for your existing objects without moving them. New data written through lakeFS is stored in the bucket you configure for it on your object storage.
We are extremely responsive on our Slack channel, and we make sure to prioritize the most pressing issues for the community. For SLA-based support, please contact us at support@treeverse.io.
lakeFS Mount is available for lakeFS Enterprise (cloud and on-prem) customers. You first need to contact our team and once your setup is complete, you’ll receive the steps necessary to access the lakeFS Mount binary.
lakeFS Mount supports Linux and macOS. Windows support is on the roadmap.
You can use lakeFS’s existing Role-Based Access Control mechanism, which includes repository and path-level policies. lakeFS Mount translates filesystem operations into lakeFS API operations and authorizes them based on these policies.
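To make the translation concrete, here is a simplified sketch of a path-level authorization check. The policy shape, field names, and wildcard matching below are stand-ins for illustration, not the actual lakeFS policy schema or enforcement logic:

```python
from fnmatch import fnmatch

# Hypothetical path-level policy: allow reads under one prefix of one repo.
POLICY = {
    "action": "fs:ReadObject",
    "resource": "my-repo/datasets/images/*",   # invented repo and prefix
    "effect": "allow",
}

def authorized(action, repo, path, policy=POLICY):
    """Return True if the policy allows `action` on repo/path."""
    resource = f"{repo}/{path}"
    return (policy["effect"] == "allow"
            and action == policy["action"]
            and fnmatch(resource, policy["resource"]))
```

Under such a policy, a filesystem read of a file beneath the allowed prefix would be authorized, while a read outside it would be denied.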
When using lakeFS Mount, scale limits are driven by the volume of data the local machine actually accesses, not by the total size of the dataset under the mounted prefix. This is because lakeFS Mount downloads lazily, fetching only the files that are accessed. Directory listing currently performs efficiently only for prefixes containing fewer than 8,000 objects, and we are working to raise this limit.
Ensure your cache size is large enough to accommodate the volume of files being accessed.
It is perfectly safe to mount a lakeFS path within a Git repository. lakeFS Mount prevents Git from adding mounted objects to the repository (e.g., when running git add -A) by adding a virtual .gitignore file to the mounted directory.
The .gitignore file instructs Git to ignore all files except .everest/source. When lakeFS Mount is invoked without an explicit lakeFS URI, it looks for a .everest/source file in the destination directory and reads the lakeFS URI from there. Since .everest/source is in source control, it will mount the same lakeFS commit every time!
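The lookup described above can be sketched in a few lines. This is a hypothetical reimplementation for illustration, assuming only what the paragraph states (the function name, error handling, and example URI are invented):

```python
import os

def read_mount_source(destination):
    """Read the pinned lakeFS URI from .everest/source, if present.

    Mirrors the described behavior: when no URI is given, the file
    .everest/source in the destination directory supplies it, pinning
    every mount to the same lakeFS commit.
    """
    source_file = os.path.join(destination, ".everest", "source")
    if not os.path.exists(source_file):
        return None
    with open(source_file) as f:
        return f.read().strip()
```

Because the file travels with the Git repository, every clone resolves to the identical lakeFS URI, and therefore the identical data version.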
Our creators and solution architects are happy to demonstrate how lakeFS works and answer any question that you may have.