lakeFS Blog

Data Engineering

The Guide to Data Versioning

Paul Singman
November 25, 2021

“I have never lied to you, I have always told you some version of the truth.” “The truth doesn’t have versions, okay?” (Something’s Gotta Give, 2003)

Jack Nicholson and Diane Keaton discuss data versioning in Something’s Gotta Give.

Introduction: A version of something is defined as “a particular form in which some details are different …

Data Engineering

Data Versioning – Does It Mean What You Think It Means?

Einat Orr, PhD.
November 24, 2021

Introduction When we first thought about a tagline for our open source project lakeFS, we instinctively gravitated to terms like “Data versioning”, “Manage data the way you manage code”, “Git for data”, or any variation of the three that is grammatically correct.  We were very pleased with ourselves for 5 minutes, or maybe 7, before …

Data Engineering Hive Metastore

Takeaways From the Future of Metadata After Hive Metastore Roundtable

Paul Singman
November 16, 2021

Overview of Hive’s Metastore: Let’s get right into it. This is not an objective recap of every topic covered at the Future of Metadata After Hive Roundtable last week. Rather, it is a summary of what I found most interesting in the discussion between panelists Lior Ebel, Ryan Blue, and Seshu Adunuthula and host Oz Katz. Watch the full talk below! …

Integrations

dbt Tests – Create Staging Environments for Flawless Data CI/CD

Guy Hardonag, Paul Singman
November 3, 2021

Recently, we’ve heard from several community members experimenting with new development workflows using lakeFS and dbt. The timing isn’t surprising given dbt’s recent support for big data compute tools like Spark and Trino, which are some of the technologies most commonly used by lakeFS users managing a data lake over an object store. The combination …

Project

lakeFS Community Call Recap – Oct. 2021

Paul Singman
November 24, 2021

Last week we held another lakeFS Community Call! We believe these calls are invaluable opportunities to have direct dialogue with our users on all things lakeFS. Oz covered important new lakeFS functionality, previewed what’s coming soon from the roadmap, and also shared two exciting updates from the community. Let’s recap! 6 Important lakeFS Releases 1. …

Data Engineering

3 Ways to Add Data to lakeFS

Paul Singman
October 26, 2021

Few people start using lakeFS without first having some data collected. Consequently, once lakeFS is up and running, one of the first things people commonly do is import their existing data into it. There isn’t a one-size-fits-all approach for doing this. Instead, there are ways that work great for a single file, …

Go

Building Rich CLI Applications with Go’s Built-in Templating

Barak Amar
October 20, 2021

Overview The templating package text/template implements data-driven templates for generating textual output. Although we do not benefit from executing the template output more than once, we found it easy to use and helpful for outputting text with colors, marshaling data, and rendering tabular information. By mapping additional functions by name, it is possible to extend …
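As a rough illustration of the idea in this excerpt (it is not the code from the post itself), the sketch below registers a helper with `template.FuncMap` so a template can call it by name, then renders a small padded, colored listing. The `Branch` type, the `green` helper, the `render` function, and the sample data are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// Branch is a hypothetical record used only for this illustration.
type Branch struct {
	Name   string
	Commit string
}

// funcs maps helper names to Go functions so templates can call them by name.
var funcs = template.FuncMap{
	// green wraps text in ANSI escape codes for a green foreground.
	"green": func(s string) string { return "\x1b[32m" + s + "\x1b[0m" },
}

// listTmpl pads each name to 8 columns with the built-in printf,
// then colors the padded text with the custom green helper.
const listTmpl = `{{range .}}{{.Name | printf "%-8s" | green}} {{.Commit}}
{{end}}`

// render parses text with the custom functions attached and
// executes it against data, returning the generated output.
func render(text string, data interface{}) (string, error) {
	t, err := template.New("out").Funcs(funcs).Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	branches := []Branch{
		{Name: "main", Commit: "a1b2c3"},
		{Name: "dev", Commit: "d4e5f6"},
	}
	out, err := render(listTmpl, branches)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Padding before coloring keeps the columns aligned, since the escape codes would otherwise count toward the field width.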

Project

lakeFS – Data Versioning at Scale

Paul Singman
November 9, 2021

If you think about it, lakeFS is about two things: version control and big data. We see ourselves as bringing version control to big data. This bridges a workflow gap that currently exists between working with data and working with code. This gap is purely artificial: there’s no conceptual reason why different workflows should be required for …
