Product

Machine Learning Product Tutorials

lakectl local: How to work with lakeFS locally using Git

Oz Katz

The massive increase in generated data presents a serious challenge to organizations looking to unlock value from their data sets. Data practitioners have to deal with many consequences of the huge data volume, including manageability and collaboration. This is where data versioning can help. Data version control is crucial because it allows data teams to […]

Machine Learning Product

Guide to lakeFS lakectl local for machine learning 

Idan Novogroder

The larger the data you work with, the less feasible it is to consume it on a single system. lakeFS tackles this issue by enabling efficient management of large-scale data stored remotely. In addition to managing massive datasets, lakeFS lets its users carry out partial checkouts when working with certain

Product

lakeFS Cloud vs. lakeFS Enterprise: Comparison

Iddo Avneri

Choosing the Right Version Control for Your Data Lake: lakeFS is an open-source project that brings Git-like version control mechanisms to your data lake. Many teams use the open-source solution available free of charge. However, some organizations might need additional features and expert support based on solid SLAs. This is where the lakeFS Cloud and

Product Tutorials

lakeFS + Unity Catalog Integration: Step-by-Step Tutorial

Amit Kesarwani, Jonathan Rosenberg

Efficient data management is a critical component of any modern organization. As data volumes grow and data sources become more diverse, the need for robust data catalog solutions becomes increasingly evident. Recognizing this need, lakeFS, an open-source data lake management platform, has integrated with Unity Catalog, a comprehensive data catalog solution by Databricks. In this

Data Engineering Machine Learning Product

lakeFS Samples: The Quickest Way to Get Started

Iddo Avneri

lakeFS is a powerful solution for data version control that enables data practitioners to manage data as code using Git-like operations and achieve reproducible, high-quality data pipelines. While getting started with lakeFS is simple through its Quickstart guide, many seek tailored examples that integrate with their existing tech stack or address specific use cases. To

Best Practices Product Tutorials

Introducing lakeFS Transactional Mirroring (Cross-Region Mirroring)

Ariel Shaqed (Scolnicov), Idan Novogroder, Guy Hardonag

What is mirroring? We are pleased to announce a preview of a long-awaited lakeFS feature: transactional mirroring across regions. Mirroring builds on top of S3 Replication to provide a consistent view of your versioned data in other regions. Once configured, it allows creating mirrors in all of your regions. Each mirror of a source repository

Product

lakeFS: Where’s my data?

Ariel Shaqed (Scolnicov)

If you’ve come across our content, you may have noticed blogs diving into the technical details of lakeFS, and this is one of them. These are lakeFS internals and you do not need to know any of the details below in order to use lakeFS at any level.  Either way, if you’re just curious about

Best Practices Product

dbt + Databricks: What are they and how do they work together best?

Tal Sofer

It’s clear that the adoption of dbt is picking up, as it now supports major big data compute tools like Spark and Trino, as well as platforms like Databricks. Incidentally, these technologies are a common choice among our community members, who often use dbt and Databricks together to manage a data lake (or lakehouse) over

Best Practices Product

lakeFS Transactions: Maintain Data Integrity Using ACID Principles

Nir Ozeri

We recently introduced the new High Level Python SDK, which provides a friendlier interface to interact with lakeFS, as part of our ongoing effort to make life simpler for data professionals. In this article, we will introduce you to a cool new addition to the High Level SDK: Transactions! Read on to learn what lakeFS

Git for Data – lakeFS
