Integrations

Building Reproducible Data Pipelines with Airflow and lakeFS

Guy Hardonag
February 3, 2021

In this post, we’ll see how easy it is to use lakeFS with an existing Airflow DAG to make every step in a pipeline completely reproducible in both code and data. This is done without modifying the code or logic of our jobs, by wrapping these operations with lakeFS commits. An example data …

Git-like Operations Over MinIO with lakeFS

Yoni Augarten
January 5, 2021

lakeFS is an open-source tool that delivers resilience and manageability to object-storage-based data lakes. lakeFS provides Git-like operations over your MinIO storage environment and works seamlessly with modern data frameworks such as Spark, Hive, Presto, Kafka, R, and native Python. Common use cases include creating a development environment without copying or mocking …
