lakeFS Acquires DVC, Uniting Data Version Control Pioneers to Accelerate AI-Ready Data

Product

Best Practices Product

How CytoReason Streamlined Nextflow with lakeFS for Smarter Data Pipelines

Ron Poches

TL;DR: CytoReason is a technology company that uses its AI platform of computational disease models to move biopharma decision-making from trial and error to a data-driven approach. Leveraging an extensive database of public and proprietary data, the company maps human diseases tissue by tissue and cell by cell. Researchers at leading pharma companies, including Pfizer and Sanofi, rely on CytoReason’s technology […]

Best Practices Product Thought Leadership

Git-Style Workflows for Multimodal AI Data Using Dremio and lakeFS

Alex Merced, Tal Sofer

This post recaps a comprehensive tutorial published by Alex Merced from Dremio and Tal Sofer from lakeFS, highlighting how version control transforms multimodal data management for AI teams. The Challenge: Keeping Diverse Data Types in Sync and Queryable. Modern AI pipelines consume more than just structured data. Training sets include images, model artifacts, logs, and

Product Thought Leadership

A Celebration of Shared Vision: lakeFS and DVC

Einat Orr, PhD

From Inspiration to Action. When we were still dreaming up lakeFS, one of the projects that inspired us was DVC (Data Version Control). It was one of those moments when you realize: “Ah, others see it too.” We weren’t alone in believing that data should be managed like code. DVC was built by data

Best Practices Product Tutorials

Adding Data Version Control Capabilities to MATLAB with lakeFS

Joe Pringle

Many lakeFS customers in the aerospace, automotive, healthcare and life sciences, and manufacturing industries are also heavy users of MATLAB. lakeFS solves a range of data ops challenges for these organizations by serving as a “control plane” for AI-ready data: versioning complex data pipelines, tracking metadata and lineage, and enabling team collaboration through git-like

Product Thought Leadership

lakeFS Named a Representative Vendor in the 2025 Gartner® Market Guide for DataOps Tools

Gottfried Sehringer

We’re excited to share that lakeFS has been named a Representative Vendor in the 2025 Gartner® Market Guide for DataOps Tools. We believe this recognition reflects what we’re seeing across the industry: the urgent need for data infrastructures that can provide AI-ready data efficiently, repeatably, and safely as organizations build production AI systems. DataOps Market

Data Engineering Machine Learning Product

How lakeFS Helps Ensure Data Compliance

Tal Sofer

Data compliance means adhering to the laws, regulations, standards, and internal policies that govern data use. Organizations must comply with regulations such as the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and the California Consumer Privacy Act (CCPA), as well as SOC 2 standards, to protect sensitive information and maintain trust. Data compliance plays

Best Practices Product Tutorials

Versioned Data with Apache Iceberg Using lakeFS Iceberg REST Catalog

Amit Kesarwani

lakeFS Enterprise offers a fully standards-compliant implementation of the Apache Iceberg REST Catalog, enabling Git-style version control for structured data at scale. This integration allows teams to use Iceberg-compatible tools like Spark, Trino, and PyIceberg without any vendor lock-in or proprietary formats. By treating Iceberg tables as versioned entities within lakeFS repositories and branches, users

Data Engineering Machine Learning Product

How We Built Our lakeFS Iceberg Catalog

Itai Gilo

A behind-the-scenes look at the design decisions, architecture, and lessons learned while bringing the Apache Iceberg REST Catalog to lakeFS. When we first announced our native lakeFS Iceberg REST Catalog, we focused on what it means for data teams: seamless, Git-like version control for structured and unstructured data, at any scale. But how did we