lakeFS Community

Data Engineering

Best Practices Data Engineering

Loosely Coupled Monolith vs Tightly Coupled Microservices

Barak Amar

TL;DR: With some thoughtful engineering, we can achieve many of the benefits of a microservice-oriented architecture while retaining the simplicity and low operating cost of a monolith. What is lakeFS? lakeFS is an open source tool that delivers resilience and manageability to object-storage based data lakes. lakeFS provides Git-like capabilities […]

Best Practices Data Engineering

Data Mesh Applied: How to Move Beyond the Data Lake with lakeFS

Einat Orr, PhD

The Data Mesh paradigm was first introduced by Zhamak Dehghani in her article "How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh". Unlike traditional monolithic data infrastructures that handle the consumption, storage, transformation, and output of data in one central data lake, a data mesh supports distributed,

Data Engineering

Object Storage: Everything You Need to Know

Yael Rivkind

While object storage is not a novel technology, it can still be overwhelming when getting started. Here’s a definitive guide to object-based storage with everything you need to know. What is object storage? At its core, object storage or object-based storage represents a data storage architecture that allows you to store large amounts of unstructured

Data Engineering

Chaos Data Engineering

Oz Katz

Modern data lakes are a complexity tar pit. They involve many moving parts: distributed computation engines, running on virtualized servers connected by a software-defined network, running on top of distributed object stores, orchestrated by a distributed stream processor or pipeline execution engine. These moving parts fail. All the time. Handling these failures is not

Best Practices Data Engineering

System Tests: Lessons Learned From Developing For OSS Project

Itai Admi

Overview: In this article, I will try to cover some do’s and don’ts for system testing from the perspective of an open-source project. To keep things simple, it all boils down to running the system as our customers would: think of the different use-cases of your system, the environment where it runs, the configuration options,

Best Practices Data Engineering Tutorials

Building A Data Development Environment with lakeFS

Barak Amar

Overview: As part of our routine work with data we develop code, choose and upgrade compute infrastructure, and test new data. Usually, this requires running parts of our production pipelines in parallel to production, testing the changes we wish to apply. Every data engineer knows that this convoluted process requires copying data, manually updating configuration,

Best Practices Data Engineering

How to Manage Your Data the Way You Manage Your Code

Einat Orr, PhD

50 years ago it was very hard to collaborate over code. When developing large scale software projects it was difficult to manage changes to source code over time, as revision control tools were only starting to enter mainstream computing. The adoption of version control tools, first centralized and then distributed, changed all that, and now

Data Engineering

Diary of a Data Engineer

Oz Katz

A glimpse into the life of a data engineer. Day 1: Finally, an easy one Got a pretty simple task for a change – read a new type of event stream generated by sales, and publish it to the data lake. Sounds like a straightforward ETL. I estimate this as one day of work. I

Data Engineering

How to Pick the Right Postgres for your Application

Ariel Shaqed (Scolnicov)

Lots of applications require a Postgres database. Before you can install them, you will need a Postgres database. How do you pick the right Postgres for your application? There is a bewildering variety of possible ways to acquire a database running on a Postgres instance, but the biggest choice is “build or buy”: whether to
