Best Practices

Best Practices Product Thought Leadership

The Evolving Equation: When Do You Move From Open Source to Enterprise with Data Version Control

Tal Sofer

Open source software has fundamentally reshaped technology—delivering unmatched flexibility, low friction, and rapid innovation. For some teams, it’s a philosophical commitment. For others, it’s the fastest path to building. lakeFS supports both models. For most data teams, the journey starts with open source and evolves over time. lakeFS open source offers a robust foundation for […]

Best Practices Machine Learning

AI-Ready Data: Characteristics, Challenges & Best Practices

Tal Sofer

Despite the increasing adoption of Artificial Intelligence (AI) applications, most organizations are bound to see implementation challenges. One of the issues lies in the data itself. A recent survey showed 80% of companies believe their data is suitable for AI, but more than half are actually dealing with challenges like internal data quality and categorization

Best Practices Machine Learning Product Tutorials

A Single Pane of Glass to Your Data: Multiple Storage Backends Support in lakeFS

Tal Sofer

Today’s organizations don’t just use a single data storage solution – they operate across on-prem servers, multiple cloud providers, and hybrid environments. This distributed approach has become necessary, but it comes with significant costs: teams struggle with siloed tools, duplicated processes, and an endless cycle of environment management that diverts focus from delivering actual value. 

Best Practices Data Engineering Machine Learning

6 Types of Metadata: Examples, Tools & Frameworks

Idan Novogroder

With the volumes of generated data increasing, metadata has become an essential component in organizing and comprehending massive datasets. Metadata plays a key role in any modern data strategy, especially among organizations that treat data as one of their most precious assets. This article dives into all the different metadata types, tools, and frameworks to

Best Practices Machine Learning

What is AI Data Storage? Benefits, Challenges & Best Practices

Tal Sofer

Many companies are modernizing their data storage infrastructure to capitalize on the opportunities of machine learning (ML) and advanced analytics. However, teams face several unique data management challenges, such as the increasing time required for AI training and inference workloads, as well as the cost and scarcity of resources, particularly GPUs. Storage is a key

Best Practices Machine Learning

AI Agents in Business and Automation

Amit Kesarwani

This article discusses AI Agents in business and automation, focusing on building an AI Agent using lakeFS, LangChain, OpenAI, and FAISS (Facebook AI Similarity Search) to answer questions based on documents. It explains what AI Agents and LangChain are, and how lakeFS is used for data version control. The article also provides an example of

Best Practices Machine Learning

Metadata Management Tools: Types, Features & Benefits

Tal Sofer

Managing complex and massive data sets is tricky, but metadata management tools can help teams keep their data in shape. Metadata management has become critical in data strategies created by organizations that treat data as an important asset. In this article, we dive into metadata management and give you an overview of tools teams use

Best Practices Machine Learning

What is Metadata? Examples, Benefits & Best Practices

Tal Sofer

What is the key element that guarantees all data published on portals is discoverable, comprehensible, reusable, and interoperable for people and technology like AI? You guessed right; it’s metadata. Metadata also plays a key role in data governance and management. According to Gartner, organizations that fail to adopt a metadata-driven strategy for IT modernization might

Best Practices Machine Learning Product

The Holy Trinity of ML Reproducibility

Oz Katz

Reproducibility is a fundamental challenge in building reliable machine learning (ML) models and AI applications. It’s not just about debugging a model when it fails in production; it’s also about ensuring that experiments are consistent, avoiding unintended variance, and making incremental progress with confidence. Without reproducibility, ML teams risk wasting time on unreliable results and
