Machine Learning

Best Practices, Data Engineering, Machine Learning

lakeFS Top 10 Defining Product Milestones in 2025

Oz Katz

2025 was a defining year for lakeFS. Across open source and Enterprise editions, we shipped major capabilities that expanded lakeFS from a powerful data versioning layer into a control plane for AI-Ready Data – spanning structured and unstructured data, multiple public and private clouds, and a growing ecosystem of analytics and ML engines. Here’s our

Best Practices, Data Engineering, Machine Learning

Building a Data Center of Excellence for Modern Data Teams

Einat Orr, PhD

Sooner or later, every data team reaches a point where things stop working, whether due to team growth, changing business requirements, or growing pipeline complexity. When facing these issues, leaders start considering an approach that balances centralized and decentralized organizational models. A Data Center of Excellence (DCoE) is a centralized

Best Practices, Machine Learning

Iceberg Tables Management: Processes, Challenges & Best Practices

Itai Gilo

We all love data lakes. They’re just perfect for storing massive volumes of structured, semi-structured, and unstructured data in native file formats. And they let us explore, refine, and analyze petabytes of data constantly pouring in from various sources. But there’s a caveat. The individual files in a data lake lack the necessary information for

Data Engineering, Machine Learning

What is Iceberg Versioning and How It Improves Data Reliability

Itai Gilo

Apache Iceberg includes built-in table versioning to ensure that all changes to your data are logged, consistent, and recoverable. Instead of overwriting files or relying on task time, Iceberg saves each update as an immutable snapshot, so readers always see a consistent picture of the table, even during heavy writes. This boosts reliability by

Data Engineering, Machine Learning

Data Agility: Building Faster, Smarter, Scalable Workflows

Idan Novogroder

It pays for organizations to treat their data as a product that drives innovation, efficiency, and competitiveness. While data quality is important, companies operate in a rapidly changing environment, and their data needs to reflect this by adapting quickly. This is where data

Best Practices, Data Engineering, Machine Learning

How lakeFS Transactional Mirroring Keeps Your Data Available During Cloud Outages

Idan Novogroder

When AWS Goes Down, Your Data Shouldn’t

On October 20th, 2025, AWS experienced a significant outage centered in the us-east-1 region. What started as a DNS resolution issue affecting DynamoDB quickly cascaded into widespread failures across major services and applications. From gaming platforms like Fortnite and social apps like Snapchat to enterprise systems and IoT

Data Engineering, Machine Learning

Heterogeneous Data: Use Cases, Tools & Best Practices

Idan Novogroder

Organizations looking to unlock the value of their data are bound to encounter the challenge of dealing with diverse datasets. This includes data that varies in format, source, structure, and semantics, such as structured databases and spreadsheets, as well as unstructured text, photos, and sensor outputs. Digital ecosystems will only become more complex, so the ability

Data Engineering, Machine Learning

Distributed Data Management: Key Concepts, Tools & Best Practices

Idan Novogroder

Ask any data team, and you’ll quickly learn that nobody out there manages all the organization’s data in a single centralized location. Most teams operate across various clouds, locations, and platforms, facing increasingly fragmented, replicated, and decentralized data. This makes effective distributed data management an essential capability. Keep reading this article to explore the fundamental
