Ready to dive into the lake?

lakeFS is currently only
available on desktop.

For an optimal experience, provide your email below and one of our lifeguards will send you a link to start swimming in the lake!

lakeFS Community

Tal Sofer

Data Engineering Integrations

One Spark job, Many Data Sources – How to Easily Use lakeFS with Spark

Jonathan Rosenberg, Tal Sofer
August 15, 2022

lakeFS is an interface to the data lake, or the parts of the data lake one chooses to version control. The lakeFS interface is S3 compatible, and hence easily used with all common data applications, including Spark. In some cases, lakeFS is first adopted by the teams responsible for the data ingested to the lake, …

One Spark job, Many Data Sources – How to Easily Use lakeFS with Spark Read More »

Integrations Machine Learning

Build Reproducible Experiments with Kubeflow and lakeFS

Tal Sofer, Paul Singman
November 21, 2022

Introducing Kubeflow and lakeFS Kubeflow is a cloud-native ML platform that simplifies the training and deployment of machine learning pipelines on Kubernetes. An ML project using Kubeflow will consist of isolated components for each stage of the ML lifecycle. And each component of a Kubeflow pipeline is packaged as a Docker image and executed in a …

Build Reproducible Experiments with Kubeflow and lakeFS Read More »

Data Engineering Project

Advancing lakeFS: Version Data At Scale With Spark

Tal Sofer
March 24, 2022

Combining lakeFS and Spark provides a new standard for scale and elasticity to distributed data pipelines. When integrating two technologies, the aim should be to expose the strengths of each as much as possible. With this philosophy in mind, we are excited to announce the beta release of the lakeFS FileSystem! This native Hadoop FileSystem …

Advancing lakeFS: Version Data At Scale With Spark Read More »

Git for Data – lakeFS

  • Get Started
    Get Started