Oz Katz

Data Engineering Thought Leadership

4 Ways to Reduce Cloud Data Storage Costs

Oz Katz
November 7, 2022

Over the past year, words like recession, business slowdown, and budget cuts have been heard more and more often. These discussions are taking place not just in the economic press and the media but in almost every company: in boardrooms, in management meetings, and when engaging with potential investors and customers. As …


Announcements Data Engineering

Proudly announcing lakeFS Cloud

Einat Orr, PhD, Oz Katz
November 7, 2022

What is lakeFS? As data practitioners, we use many different terms to talk about what we do – we call it business intelligence, analytics, data pipelines, or insights. But there’s one term that captures what we do really well: delivering products. When we were leading a large R&D organization, we couldn’t help but wonder about …


Project

The lakeFS playground is now live and everybody can play!

Oz Katz, Michal Wosk
March 2, 2022

What if you could manage your data lake just like you manage code? With rollback, versioning, and branching capabilities on top of your existing data lake? lakeFS is an open-source project that provides a Git-like version control interface for data lakes, with seamless integration to most data tools and frameworks. lakeFS enables you to easily …


Data Engineering

Hudi, Iceberg and Delta Lake: Data Lake Table Formats Compared

Oz Katz
March 26, 2022

Introduction When building a data lake, there is perhaps no more consequential decision than the format data will be stored in. The outcome will have a direct effect on its performance, usability, and compatibility. It is inspiring that by simply changing the format data is stored in, we can unlock new functionality and improve the …


Data Engineering Project

lakeFS Hooks: Implementing CI/CD for Data using Pre-merge Hooks

Oz Katz
March 2, 2021

Continuous integration of data is the process of exposing data to consumers only after ensuring it adheres to best practices such as format, schema, and PII governance. Continuous deployment of data ensures the quality of data at each step of a production pipeline. In this blog, I will present lakeFS’s webhooks and showcase a …


Data Engineering

Chaos Data Engineering

Oz Katz
May 19, 2021

Modern data lakes are a complexity tar pit. They involve many moving parts: distributed computation engines, running on virtualized servers connected by a software-defined network, running on top of distributed object stores, orchestrated by a distributed stream processor or pipeline execution engine. These moving parts fail. All the time. Handling these failures is not …

