Data Engineering

Data Engineering Machine Learning Thought Leadership

Why Is DataOps So Hard and What Tools Make It Easier?

Einat Orr, PhD

TL;DR: DataOps complexity arises from unclear roles and responsibilities (R&Rs), a lack of standardized interfaces, the complexity of distributed technologies, and difficulties in implementing engineering best practices. The solution is to define clear responsibilities, address missing requirements, and manage data pipelines efficiently using emerging solutions that enhance the manageability and resilience of DataOps. What makes DataOps so hard is, […]

Data Engineering Machine Learning Product

lakeFS Samples: The Quickest Way to Get Started

Iddo Avneri

lakeFS is a powerful solution for data version control that enables data practitioners to manage data as code using Git-like operations and achieve reproducible, high-quality data pipelines. While getting started with lakeFS is simple through its Quickstart guide, many seek tailored examples that integrate with their existing tech stack or address specific use cases. To

Data Engineering Thought Leadership

Data Engineering in 2024: Predictions

Oz Katz

This article was originally published on Datanami and is republished here with permission. As we officially kick off 2024, I realized I have a few thoughts on the direction of the data landscape that might be of interest to others. This is a recap of my “predictions.” I will admit that it’s a mix of what I

Data Engineering Machine Learning

Shallow Copy For Data: What Are Your Options?

Idan Novogroder

In the past five years, we’ve seen many concepts and new tools in the data ecosystem contribute to implementing engineering best practices in data. This trend includes the data mesh, data quality testing, observability, and data monitoring.  The practices we would like to borrow from software engineering and use in data engineering and data science

Best Practices Data Engineering

Databricks Autoloader: Ingesting Data with Ease and Efficiency

Idan Novogroder

You can ingest data files from external sources using a variety of technologies, from Oracle and SQL Server to PostgreSQL and systems like SAP or Salesforce. When putting this data into your data lake, you might run into the issue of identifying new files and orchestrating processes. This is where Databricks Autoloader helps. Databricks Autoloader

Data Engineering Machine Learning Product Tutorials

Introducing The New lakeFS Python Experience

Oz Katz, Nir Ozeri

Since its inception, lakeFS has shipped with a full-featured Python SDK. For each new version of lakeFS, this SDK is automatically generated, relying on the OpenAPI specification published by the given version. While this always ensured the Python SDK shipped with all possible features, the automatically generated code wasn’t always the nicest (or most Pythonic)

Data Engineering Machine Learning

What is Databricks and How Does It Unify the Power of Data Science and Engineering?

The lakeFS Team

Data-driven decision-making has become the foundation of business operations across every type of company, no matter the size or industry. Large volumes of data flow from many source systems to data warehousing, data lake, or analytics solutions.  What companies need to maximize their ROI from data is a fast, dependable, scalable, and user-friendly space that

Data Engineering Machine Learning Tutorials

Unlocking Data Insights with Databricks Notebooks

Idan Novogroder

Databricks Notebooks are a popular tool for interacting with data using code and presenting findings across disciplines like data science, machine learning, and data engineering. Notebooks are, in fact, a key offering from Databricks for generating processes and collaborating with team members thanks to real-time multilingual coauthoring, automated versioning, and built-in data visualizations.  How exactly
