Best Practices

Best Practices

Data Silos: What Is It And How Can lakeFS Help?

Einat Orr, PhD

If you aspire to run a data-driven organization, it’s time to get interested in data silos. It’s a more common issue than you’d expect. Data silos impact company operations and the data analytics projects that underpin them. Silos prevent executives from using data to manage processes and make sound business decisions. Imagine what happens to […]

Best Practices

Cost-Efficient Data Science: Reduce Storage Costs with lakeFS

Nadav Steindler

The world is changing rapidly. The data revolution of the past couple of decades continues to gain steam as it spreads beyond Big Tech to many traditional businesses. Innovations in mobile, IoT, analytics, AI, and the public cloud have many firms discovering the benefits they can reap by collecting and analyzing more data. When a company

Best Practices

Top 8 AI Frameworks: Benefits & How to Choose the Right One

Idan Novogroder

With the incredible pace of advancement in artificial intelligence, we’re seeing the rise of more and more AI frameworks that teams can employ to get their projects off the ground faster. If they want to train a machine learning model, they don’t have to read a paper about it and then write hundreds of lines

Best Practices

Data Governance Frameworks: Pillars, Examples & Benefits

Iddo Avneri

Data governance is all about keeping data safe, maintaining its high quality, and making it easily accessible for data discovery and business intelligence projects. It’s thanks to data governance that validated data travels through secure pipelines to trusted endpoints and users. As new data sources, such as Internet of Things (IoT) technologies, generate more data,

Best Practices Product

2 Ways to Work with Data Locally Using lakeFS

Iddo Avneri

When working with large datasets stored in object stores (such as Amazon S3, Google Cloud Storage, or Azure Blob Storage in the cloud, or MinIO or Dell ECS on-prem), we often see a need for users to work locally with that data. Data scientists and engineers may prefer to work locally for several reasons. For instance,

Best Practices

Data Isolation: Benefits, Challenges & Best Practices

Idan Novogroder

Data is the foundation of every organization, so ensuring it’s reliable and consistent is essential to driving informed decisions and sustaining growth. This is where data isolation comes into play. Data isolation is all about separating and protecting individual transactions within a database. Keeping these transactions from interfering with one another promotes security,

Best Practices Product Thought Leadership

Dataset Versioning in the Age of Open Table Formats

Tal Sofer

Originally presented at Big Data LDN 2024. More than two decades ago, data warehouses outgrew the capacity of single machines, and scaling them started to become costly or inefficient. This prompted the tech industry to rethink the architecture and start to use distributed systems. If we wanted to store more data, we just bought more

Best Practices Product Tutorials

Collaborating Over Data: Introducing Pull Requests in lakeFS

Oz Katz, Itai Gilo

In modern software development, Pull Requests (PRs) are a fundamental tool for collaborating on code. They allow teams to review, discuss, and merge changes in a controlled and transparent way. But what if you could apply that same concept to data? At lakeFS, we’re excited to introduce Pull Requests for data, a new feature

Best Practices Machine Learning

Top 9 RAG Tools to Boost Your LLM Workflows

Idan Novogroder

A team looking to build an application that uses a large language model (LLM) like OpenAI’s GPT-4 or Meta’s Llama 2 will inevitably run into this issue: How can we ensure that the responses generated by these models align with the specific business context? This is where retrieval augmented generation (RAG) comes in. RAG brings

Best Practices

Nessie Catalog: Key Features, Use Cases & How to Use

Tal Sofer

This article focuses on how to work with Nessie Catalog. Please note that since its first publication, fundamental support for Iceberg REST Catalog has been added to lakeFS. Visit the lakeFS Iceberg REST Catalog article to learn more about this integration. Data is easily one of the most important assets in every organization, serving as

Best Practices Product Tutorials

Guide To The lakeFS File Representation

Iddo Avneri

Once you start using lakeFS, the files on your object store will form a new representation. The names and paths of the files on the object store will no longer look the same. This article provides a high-level overview to help you understand the lakeFS file representation and how it
