The lakeFS Blog

Manage your data lifecycle at scale with lakeFS Enterprise

Best Practices

Data Isolation: Benefits, Challenges & Best Practices

Data is the foundation of every organization, so keeping it reliable and consistent is essential to driving informed decisions and ensuring sustainable growth. This

Idan Novogroder
October 29, 2024

Dataset Versioning in the Age of Open Table Formats

Originally presented at Big Data LDN 2024. More than two decades ago, data warehouses outgrew the capacity of single machines, and scaling them started to

Tal Sofer
October 28, 2024

Collaborating Over Data: Introducing Pull Requests in lakeFS

In modern software development, Pull Requests (PRs) are a fundamental tool for collaborating on code. They allow teams to review, discuss, and merge changes in

Oz Katz, Itai Gilo
October 21, 2024

Machine Learning

MLflow Data Versioning: Techniques, Tools & Best Practices

Data versioning is a central aspect of modern data management, especially in the context of GenAI and machine learning. Teams need a solution to version

Amit Kesarwani
October 14, 2024

Top 9 RAG Tools to Boost Your LLM Workflows

A team looking to build an application that uses a large language model (LLM) like OpenAI’s GPT-4 or Meta’s Llama 2 will inevitably run into

Idan Novogroder
October 8, 2024

Best Practices

Nessie Catalog: Key Features, Use Cases & How to Use

This article focuses on how to work with Nessie Catalog. Please note that since its first publication, fundamental support for Iceberg REST Catalog has been

Tal Sofer
September 30, 2024

Automated Testing in Isolated Environments with GitHub Actions and lakeFS

Promoting ETL code to production is a straightforward process. We have our code – usually stored in Git – and want to build and test

Amit Kesarwani
September 24, 2024

Guide To The lakeFS File Representation

Once you start using lakeFS, the files on your object store will form a new representation. The names and paths of the files on the

Iddo Avneri
September 22, 2024

Amazon S3 Mountpoint vs lakeFS Mount

What is a mount? A filesystem mount is the ability to present a local device or a remote location as a local directory. It is

Amit Kesarwani
September 12, 2024

RAG as a Service: Benefits, Use Cases & Challenges

Retrieval Augmented Generation (RAG) is on its way to becoming the dominant framework for implementing enterprise applications based on Large Language Models (LLMs). However, implementing

Idan Novogroder
September 11, 2024

Best Practices

Apache Iceberg Catalogs: Types & How to Choose the Right Catalog

Apache Iceberg is the most popular open table format. It originated at Netflix due to the need to provide a table representation for data saved

Tal Sofer
September 5, 2024
