DevOps and Drinks: Ensuring data quality in a data lake environment with lakeFS

Learn how lakeFS simplifies maintaining high-quality data lakes in two ways: (1) by providing no-copy, isolated data development environments and (2) by enabling CI/CD workflows that allow for automated testing of data.

Description

Hi Everyone,
It’s about that time to take a long lunch, grab a beverage of your choice, and have a great time with two interactive topics related to both DevOps and Data Engineering. We are proud to announce that our next DevOps & Drinks Virtual Meetup will take place on Friday, February 11th at 1 PM EST via Zoom.

We will have Paul Singman, Developer Advocate at Treeverse, discuss:
Ensuring data quality in a data lake environment with lakeFS

The first problem faced with big data was the feasibility of processing data at such a high scale. In solving the scale problem, people developed technologies we know today like Kafka, Spark, Presto, Snowflake, etc. Now the problem people face is one of manageability. They no longer ask if they can handle a dataset but rather: How can I move faster when developing data-intensive applications? How do I utilize all of my data and ensure it is high-quality?

In this talk, you will learn how lakeFS simplifies maintaining high-quality data lakes in two ways: (1) by providing no-copy, isolated data development environments and (2) by enabling CI/CD workflows that allow for automated testing of data.

lakeFS (https://lakefs.io/) is an open source tool that brings Git-like operations and versioning to object storage (e.g. AWS S3, GCS, Azure Blob), enabling data teams to run parallel pipelines for experimentation and CI/CD for data. To learn more, join the lakeFS Slack community at https://lakefs.io/slack to talk data with other users, stay up-to-date with the latest features, and more.
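
To give a flavor of what "automated testing of data" can mean, here is a minimal sketch of the kind of check a CI/CD step might run against data on an isolated branch before it is merged. It is plain Python and does not call any lakeFS API; the column names (`user_id`, `event_id`) and the helpers `check_no_nulls`, `check_unique`, and `run_checks` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_no_nulls(rows, column):
    """Pass only if every row has a non-None value in `column`."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(f"no_nulls:{column}", missing == 0, f"{missing} missing")


def check_unique(rows, column):
    """Pass only if no value in `column` appears more than once."""
    values = [row.get(column) for row in rows]
    duplicates = len(values) - len(set(values))
    return CheckResult(f"unique:{column}", duplicates == 0, f"{duplicates} duplicates")


def run_checks(rows):
    """Run all checks; the first return value says whether a merge should proceed."""
    results = [
        check_no_nulls(rows, "user_id"),   # hypothetical column name
        check_unique(rows, "event_id"),    # hypothetical column name
    ]
    return all(r.passed for r in results), results


if __name__ == "__main__":
    sample = [
        {"event_id": 1, "user_id": "a"},
        {"event_id": 2, "user_id": "b"},
    ]
    ok, results = run_checks(sample)
    for r in results:
        print(r.name, "PASS" if r.passed else "FAIL", r.detail)
    print("safe to merge" if ok else "block merge")
```

In a real setup, the rows would be read from the branch under test, and a failing result would stop the merge so that bad data never reaches the production branch.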

Paul is a developer advocate for the lakeFS project, following several years as a data engineer on the analytics team at Equinox Fitness. He enjoys contextualizing the latest data trends and technologies in blog posts and talks, instead of getting caught up in the hype surrounding specific tools. He’s spoken at various conferences and meetups, including the Postgres Conference and AWS re:Invent. When not working, you can find him running, playing golf, and sleeping.

As always, the Averity DevOps & Security team will be there in full force, answering any questions about the insane job market we’re living in, providing face-melting guitar shredding, and taking requests from our guests before and after the presentations. Bring some good ones!

Looking forward to seeing everyone on Friday 2/11/22 at 1PM EST!
