lakeFS Community

Machine Learning

Machine Learning Product

Guide to lakeFS lakectl local for machine learning 

Idan Novogroder

The larger your data, the less feasible it is to consume on a single system. lakeFS tackles this issue by efficiently managing large-scale data stored remotely. In addition to managing massive datasets, lakeFS lets its users perform partial checkouts when working with certain […]

Machine Learning

Machine Learning Components: Elements & Classifications

Idan Novogroder

Today, machines are able to emulate human intelligence through the use of artificial intelligence technology. Approaches such as machine learning, deep learning, natural language processing, and computer vision have been crucial in enabling machines to perform tasks that were once exclusive to the human brain. Machine learning allows systems to learn and improve from experience

Data Engineering Machine Learning Thought Leadership

Why Is DataOps So Hard and What Tools Make It Easier?

Einat Orr, PhD

TL;DR: DataOps complexity arises from unclear roles and responsibilities (R&Rs), a lack of standardization in interfaces, distributed technology complexities, and difficulties in implementing engineering best practices. The solution is to define clear responsibilities, address missing requirements, and manage data pipelines efficiently using emerging solutions that enhance the manageability and resilience of DataOps. What makes DataOps so hard is,

Machine Learning Tutorials

How to Toggle OpenAI Model Determinism

Amit Kesarwani

TL;DR: In the previous blog post, "Introducing the LangChain lakeFS Loader," and its sample notebook, we explained and demonstrated the integration of lakeFS with LangChain and LLMs (specifically OpenAI models). In this post, we explore a new beta feature from OpenAI that enables reproducible responses from a model. Introduction: Language models are stochastic models (stochastic refers

Data Engineering Machine Learning Product

lakeFS Samples: The Quickest Way to Get Started

Iddo Avneri

lakeFS is a powerful solution for data version control that enables data practitioners to manage data as code using Git-like operations and achieve reproducible, high-quality data pipelines. While getting started with lakeFS is simple through its Quickstart guide, many seek tailored examples that integrate with their existing tech stack or address specific use cases. To

Machine Learning

Machine Learning Architecture Diagram: Key Elements

Idan Novogroder

Machine learning solutions come in handy for addressing various problems and achieving a wide range of goals. However, if we look at ML applications from a distance, we’ll instantly see that the fundamental components are almost always the same.  Whether you want to better understand the skeleton of machine learning solutions or start building your

Best Practices Machine Learning

What is LLMOps? Key Components & Differences to MLOps

Idan Novogroder

Large Language Models (LLMs) are pretty straightforward to use when you’re prototyping. However, incorporating an LLM into a commercial product is an altogether different story. The LLM development lifecycle is made up of several complex components, including data intake, data preparation, engineering, model fine-tuning, model deployment, model monitoring, and more. The process also calls for

Data Engineering Machine Learning

Shallow Copy For Data: What Are Your Options?

Idan Novogroder

In the past five years, we’ve seen many concepts and new tools in the data ecosystem contribute to implementing engineering best practices in data. This trend includes the data mesh, data quality testing, observability, and data monitoring.  The practices we would like to borrow from software engineering and use in data engineering and data science

