Continuous validation of data quality through automated quality checks

Automate data quality checks inside your data pipelines with hooks, so that bad data never reaches production.


The Main Ingredients

Fully automated

Best practice

Production data is protected

How it works

Best Practices & Data Quality

Expose changes to consumers after quality has been assured with pre-merge hooks
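As a rough illustration of the idea, the sketch below shows a generic quality gate that a pre-merge hook could run before a change is exposed to consumers. It is not the lakeFS hooks API: the names `Check`, `run_quality_gate`, and `can_merge`, and the row-as-dictionary representation, are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]  # hypothetical row representation: column name -> value


@dataclass
class Check:
    """A named rule that every row must satisfy (illustrative only)."""
    name: str
    predicate: Callable[[Row], bool]


def run_quality_gate(rows: Iterable[Row], checks: List[Check]) -> List[str]:
    """Run every check on every row and collect human-readable failures."""
    failures = []
    for index, row in enumerate(rows):
        for check in checks:
            if not check.predicate(row):
                failures.append(f"row {index}: failed {check.name}")
    return failures


def can_merge(rows: Iterable[Row], checks: List[Check]) -> bool:
    """Decision a pre-merge hook would return: merge only if nothing failed."""
    return not run_quality_gate(rows, checks)


# Example rules; real pipelines would define their own.
checks = [
    Check("id_not_null", lambda r: r.get("id") is not None),
    Check("amount_non_negative", lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0),
]

good_rows = [{"id": 1, "amount": 10.0}]
bad_rows = [{"id": 1, "amount": 10.0}, {"id": None, "amount": -5.0}]
```

With these sample rows, `can_merge(good_rows, checks)` allows the merge, while `can_merge(bad_rows, checks)` blocks it and `run_quality_gate` reports which rows and rules failed.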

Version control

Create a discoverable history of the data lake as an ordered set of versions, and communicate clearly which versions are used where.
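To make "an ordered set of versions" and "which versions are used where" concrete, here is a minimal sketch of a version log. It is a toy model under stated assumptions, not how lakeFS stores commits: `VersionLog`, `commit`, `pin`, and `consumers_of` are hypothetical names, and a version is represented as a hash over a manifest of file paths and content hashes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VersionLog:
    """Toy ordered history of data lake versions (illustrative only)."""
    versions: List[str] = field(default_factory=list)    # commit ids, oldest first
    pinned: Dict[str, str] = field(default_factory=dict)  # consumer -> commit id

    def commit(self, manifest: Dict[str, str], message: str) -> str:
        """Record a new version; its id depends on the parent, message, and manifest."""
        parent = self.versions[-1] if self.versions else ""
        payload = parent + message + repr(sorted(manifest.items()))
        commit_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        self.versions.append(commit_id)
        return commit_id

    def pin(self, consumer: str, commit_id: str) -> None:
        """Declare that a consumer reads from a specific, known version."""
        if commit_id not in self.versions:
            raise ValueError(f"unknown version: {commit_id}")
        self.pinned[consumer] = commit_id

    def consumers_of(self, commit_id: str) -> List[str]:
        """List the consumers currently pinned to a version."""
        return sorted(c for c, v in self.pinned.items() if v == commit_id)


log = VersionLog()
v1 = log.commit({"events/part-0.parquet": "hash-a"}, "initial load")
v2 = log.commit({"events/part-0.parquet": "hash-b"}, "fix null ids")
log.pin("dashboard", v1)
log.pin("training-job", v2)
log.pin("report", v2)
```

Here `log.versions` preserves commit order, and `log.consumers_of(v2)` answers the question of who depends on a given version.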


How our customers are using CI/CD for data

Reproducibility for ML experiments

Increasing research velocity with isolated environments

Dev/test environments for complex pipelines

