
lakeFS Community

Data Engineering

Data Engineering Machine Learning Thought Leadership

The State of Data Engineering 2024

Einat Orr, PhD

Since 2021 we’ve been releasing the annual State of Data Engineering Report, a compilation of all the relevant categories that have a direct impact on data engineering infrastructure. In 2024, we see three primary trends influencing the categories covered in this report. Trend #1: GenAI influence on software infrastructure. As predicted […]

Data Engineering Machine Learning

Top 15 Data Catalog Tools in 2024

Idan Novogroder

Many businesses are dealing with increasing volumes of data spread over several databases and repositories across on-premises systems, cloud services, and IoT technology. This complicates data management and data quality, preventing data practitioners from locating important data and unlocking insights from it. This is where data catalogs come in. Initially, data catalogs required bespoke scripts

Best Practices Data Engineering Tutorials

ETL Testing Tutorial with lakeFS: Step-by-Step Guide

Iddo Avneri

ETL testing is critical in integrating and migrating your data to a new system. It acts as a safety net for your data, assuring completeness, accuracy, and dependability to improve your decision-making abilities. ETL testing may be complex owing to the volume of data involved. Furthermore, the data is almost always varied, adding an extra

Data Engineering Machine Learning Tutorials

Building A Data Lake For The GenAI And ML Era

Einat Orr, PhD

Despite data technology advancements, many organizations still struggle to access outdated mainframe data. Most of the time, they are dealing with a siloed data architecture that just doesn’t align with their strategic goals. At the same time, organizations are under pressure from their competitors. A good data strategy enables companies to go beyond function-specific and interdepartmental analytics

Data Engineering Machine Learning

Data Pipeline Automation: Benefits, Use Cases & Tools

Idan Novogroder

Data is the lifeblood of any business. It drives decision-making, powers strategies, and boosts customer relationships. However, due to the enormous volume of data collected or its poor quality, most businesses still struggle to unlock its value. With the right data pipeline automation system in place, teams can clean and prepare data to improve your

Data Engineering Machine Learning Thought Leadership

Why Is DataOps So Hard and What Tools Make It Easier?

Einat Orr, PhD

TL;DR: DataOps complexity arises from unclear roles and responsibilities (R&Rs), a lack of standardization in interfaces, distributed technology complexities, and difficulties in implementing engineering best practices. The solution is to define clear responsibilities, address missing requirements, and manage data pipelines efficiently using emerging solutions that enhance the manageability and resilience of DataOps. What makes DataOps so hard is,

Data Engineering Machine Learning Product

lakeFS Samples: The Quickest Way to Get Started

Iddo Avneri

lakeFS is a powerful solution for data version control that enables data practitioners to manage data as code using Git-like operations and achieve reproducible, high-quality data pipelines. While getting started with lakeFS is simple through its Quickstart guide, many seek tailored examples that integrate with their existing tech stack or address specific use cases. To

Data Engineering Thought Leadership

Data Engineering in 2024: Predictions

Oz Katz

This article was originally published on Datanami and is republished here with permission. As we officially kick off 2024, I realized I have a few thoughts on the direction of the data landscape that might be of interest to others. This is a recap of my “predictions.” I will admit that it’s a mix of what I
