Use Cases

Tutorials Use Cases

How to Build an Isolated Testing Environment for Data with lakeFS

Barak Amar
November 7, 2022

Overview: Our routine work with data includes developing code, choosing and upgrading compute infrastructure, and testing new and changed data pipelines. Usually, this requires running the pipelines under test in parallel with production in order to validate the changes we wish to apply. Every data engineer knows that this convoluted process requires copying data, manually updating …

Data Engineering Use Cases

How to Develop Spark ETL Pipelines in Isolation

Amit Kesarwani, Vino SD, Iddo Avneri
November 7, 2022

You’re bound to ask yourself this question at some point: Do I need to test the Spark ETLs I’m developing? The answer is yes, you certainly should, and not just with unit testing but also with integration, performance, load, and regression testing. Naturally, the scale and complexity of your data set matter a lot, so …
