Guy Hardonag

Integrations

dbt Tests – Create Staging Environments for Flawless Data CI/CD

Guy Hardonag, Paul Singman
May 11, 2022

Recently, we’ve heard from several community members experimenting with new development workflows using lakeFS and dbt. The timing isn’t surprising given dbt’s recent support for big data compute tools like Spark and Trino, which are among the technologies most commonly used by lakeFS users managing a data lake over an object store. The combination …

Project

New in lakeFS: Data Retention Policies

Yoni Augarten, Guy Hardonag
March 24, 2022

“I can remember everything. That’s my curse, young man. It’s the greatest curse that’s ever been inflicted on the human race: memory.” — Jedediah Leland, Citizen Kane (1941)

lakeFS makes data corruptions easy to avoid and fix by allowing you to travel back in time to any state of your data. This new capability has …

Integrations

Building Reproducible Data Pipelines with Airflow and lakeFS

Guy Hardonag
May 27, 2021

Update (May 26th, 2021): We officially released the lakeFS Airflow provider. Read all about it in the latest blog post. In this post, we’ll see how easy it is to use lakeFS with an existing Airflow DAG, to make every step in a pipeline completely reproducible in both code and data. This is done without …

Project

The lakeFS Katacoda Sandbox Environment – Interactive Data Versioning Learning

Guy Hardonag
March 2, 2022

If you’re interested in exploring lakeFS, you can now get started with the Katacoda demo, which provides a personalized sandboxed environment – all from your browser, without installing anything. lakeFS is an open source platform that delivers resilience and manageability to object-storage-based data lakes. With lakeFS you can build repeatable, …

Data Engineering Project

The Quick Guide for Running Presto Locally on S3

Guy Hardonag
May 19, 2021

This post covers our experience running Presto in a local environment with the ability to query Amazon S3 and other S3-compatible systems. We will: describe the components needed and how to configure them; provide a dockerized environment you can run; and show an example of running the provided environment and querying a publicly …
