lakeFS Acquires DVC, Uniting Data Version Control Pioneers to Accelerate AI-Ready Data

Best Practices

Best Practices Machine Learning Thought Leadership

OpenAI’s Open Source Revolution: Why Enterprise AI Infrastructure Matters More Than Ever

Gottfried Sehringer

Yesterday, OpenAI launched gpt-oss-120b and gpt-oss-20b, marking the company’s first open-weight models since GPT-2 in 2019. This strategic shift represents far more than a product release—it signals a fundamental transformation in how large organizations, particularly in regulated industries, approach AI infrastructure and data management. OpenAI’s Strategic Return to Open Source The gpt-oss models—gpt-oss-120b and gpt-oss-20b—are […]

Best Practices Product Thought Leadership

The Evolving Equation: When Do You Move From Open Source to Enterprise with Data Version Control

Tal Sofer

Open source software has fundamentally reshaped technology—delivering unmatched flexibility, low friction, and rapid innovation. For some teams, it’s a philosophical commitment. For others, it’s the fastest path to building. lakeFS supports both models. For most data teams, the journey starts with open source and evolves over time. lakeFS open source offers a robust foundation for

Best Practices Machine Learning

AI-Ready Data: Characteristics, Challenges & Best Practices

Tal Sofer

Despite the increasing adoption of Artificial Intelligence (AI) applications, most organizations are bound to see implementation challenges. One of the issues lies in the data itself. A recent survey showed 80% of companies believe their data is suitable for AI, but more than half are actually dealing with challenges like internal data quality and categorization

Best Practices Machine Learning Product Tutorials

A Single Pane of Glass to Your Data: Multiple Storage Backends Support in lakeFS

Tal Sofer

Today’s organizations don’t just use a single data storage solution – they operate across on-prem servers, multiple cloud providers, and hybrid environments. This distributed approach has become necessary, but it comes with significant costs: teams struggle with siloed tools, duplicated processes, and an endless cycle of environment management that diverts focus from delivering actual value. 

Best Practices Data Engineering Machine Learning

6 Types of Metadata: Examples, Tools & Frameworks

Idan Novogroder

With the volumes of generated data increasing, metadata has become an essential component in organizing and comprehending massive datasets. Metadata plays a key role in any modern data strategy, especially among organizations that treat data as one of their most precious assets. This article dives into all the different metadata types, tools, and frameworks to

Best Practices Machine Learning

What is AI Data Storage? Benefits, Challenges & Best Practices

Tal Sofer

Many companies are modernizing their data storage infrastructure to capitalize on the opportunities of machine learning (ML) and advanced analytics. However, teams face several unique data management challenges, such as the increasing time required for AI training and inference workloads, as well as the cost and scarcity of resources, particularly GPUs. Storage is a key

Best Practices Machine Learning

AI Agents in Business and Automation

Amit Kesarwani

This article discusses AI Agents in business and automation, focusing on building an AI Agent using lakeFS, LangChain, OpenAI, and FAISS (Facebook AI Similarity Search) to answer questions based on documents. It explains what AI Agents and LangChain are, and how lakeFS is used for data version control. The article also provides an example of

Best Practices Machine Learning

Metadata Management Tools: Types, Features & Benefits

Tal Sofer

Managing complex and massive data sets is tricky, but metadata management tools can help teams keep their data in shape. Metadata management has become critical in data strategies created by organizations that treat data as an important asset. In this article, we dive into metadata management and give you an overview of tools teams use

Best Practices Machine Learning

What is Metadata? Examples, Benefits & Best Practices

Tal Sofer

What is the key element that guarantees all data published on portals is discoverable, comprehensible, reusable, and interoperable for people and technology like AI? You guessed right; it’s metadata. Metadata also plays a key role in data governance and management. According to Gartner, organizations that fail to adopt a metadata-driven strategy for IT modernization might
