Vino SD

Data Engineering Use Cases

How to Develop Spark ETL Pipelines in Isolation

Amit Kesarwani, Vino SD, Iddo Avneri
November 7, 2022

You’re bound to ask yourself this question at some point: Do I need to test the Spark ETLs I’m developing? The answer is yes, you certainly should, and not just with unit tests but also with integration, performance, load, and regression tests. Naturally, the scale and complexity of your data set matter a lot, so …

Case Studies

How Karius used lakeFS to comply with FDA regulations for the disease diagnostic studies.

Vino SD
August 11, 2022

A brief study, in collaboration with the Karius team, of how lakeFS enabled Karius to run its experimental studies effectively. Karius is a Forbes AI 50-listed life sciences company based in California that uses genomics and AI to advance infectious disease diagnostics. It applies advanced machine learning algorithms to complex, data-intensive genomics analysis in real time. Karius …

Data Engineering Machine Learning

Data+AI Summit 2022 Recap: Top 6 Industry trends and 9 major announcements!

Vino SD
July 25, 2022

It was June 27, 2022. San Francisco was bustling with 5,000+ data folks from around the world, there to attend the Data+AI Summit in person after two years. The four days were packed with information from keynotes, speakers, panels, expo booths, and Databricks trainings. A flurry of new product announcements followed. lakeFS cloud launch, Delta Lake …
