lakeFS Acquires DVC, Uniting Data Version Control Pioneers to Accelerate AI-Ready Data
Rapid AI Development Starts with Scalable Data Version Control
Improve efficiency, collaboration and reproducibility across all your ML projects with lakeFS
Elevate your data operations
Simplify and optimize your machine learning projects across your AI workflows, from development to staging to production.
Data preparation in isolation
Create isolated branches of your data to experiment freely without disrupting your main dataset. Ensure data quality and boost productivity in your data preparation workflows.
Parallel ML experimentation
Conduct multiple tests on separate branches, compare experiment results, avoid data duplication and streamline resource management. Once you identify the best-performing models, effortlessly merge them into the main branch.
Machine learning data reproducibility
Maintain a detailed history of your data modifications using lakeFS data version control, synced with your code version control. Roll back to previous versions if needed and ensure every experiment is reproducible.
Fast data loading for deep learning workloads
Reduce data loading times and run large-scale data operations. Create branches without data duplication, read data efficiently with fast access and use caching to speed up data retrieval.
lakeFS enabled us to efficiently reproduce ML experiments, increase productivity of the data teams, and adhere to FDA compliance requirements
Leverage all the features available with lakeFS data version control
Enhance efficiency, security and data consistency throughout your development process.
Use Git-like operations to manage your AI projects
Use branches to create isolated dev/test environments for safe experimentation. Integrate successful experiments into your main datasets using merges. Use lakeFS Git integration to version code and data together and achieve model reproducibility.
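To make the branch, commit and merge workflow concrete, here is a minimal conceptual sketch in Python. It is a toy in-memory model and not the lakeFS API or implementation: `ToyRepo`, `Commit`, the object paths and the `addr-*` content addresses are all hypothetical names chosen for illustration. It assumes that a branch is a pointer to an immutable snapshot, which is one common way to create branches without copying data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """An immutable snapshot mapping object paths to content addresses."""
    objects: Dict[str, str]
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical in-memory model of branch/commit/merge (not the lakeFS API)."""

    def __init__(self) -> None:
        root = Commit(objects={}, message="init")
        self.branches: Dict[str, Commit] = {"main": root}
        self.staged: Dict[str, Dict[str, str]] = {"main": {}}

    def create_branch(self, name: str, source: str) -> None:
        # A new branch is only a pointer to the source commit: no data is copied.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def write(self, branch: str, path: str, address: str) -> None:
        # Writes are staged on one branch and stay invisible to the others.
        self.staged[branch][path] = address

    def commit(self, branch: str, message: str) -> Commit:
        head = self.branches[branch]
        snapshot = {**head.objects, **self.staged[branch]}
        new_head = Commit(objects=snapshot, message=message, parent=head)
        self.branches[branch] = new_head
        self.staged[branch] = {}
        return new_head

    def merge(self, source: str, destination: str) -> Commit:
        # Simplified merge: source entries win. A real system detects conflicts.
        src, dst = self.branches[source], self.branches[destination]
        merged = Commit({**dst.objects, **src.objects}, f"merge {source}", parent=dst)
        self.branches[destination] = merged
        return merged


repo = ToyRepo()
repo.write("main", "images/cat.png", "addr-1")
repo.commit("main", "add raw images")

repo.create_branch("experiment", source="main")
shares_snapshot = repo.branches["experiment"] is repo.branches["main"]

repo.write("experiment", "images/cat.png", "addr-2")
repo.commit("experiment", "relabel images")

# main is unaffected until the experiment branch is merged into it.
main_before = repo.branches["main"].objects["images/cat.png"]
repo.merge("experiment", "main")
main_after = repo.branches["main"].objects["images/cat.png"]
```

In this sketch the experiment branch starts out sharing the exact snapshot that `main` points to, and `main` only sees the relabeled object after the merge. Conflict handling, garbage collection and storage are deliberately left out.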
Work with managed data locally using lakectl local
Work with your data repositories directly on your local machine: perform version-controlled data operations and experiment with your datasets without the need for remote access. Clone data repositories to your local environment to enable offline data manipulation and testing. Ensure efficiency, productivity and controlled experimentation, all in a local environment.
Fast data loading for deep learning workloads with lakeFS Mount
Virtually mount your lakeFS repositories to get local filesystem access to data stored remotely. Minimize latency and ensure high performance, even with the frequent file accesses typical in deep learning applications.
Advanced unstructured data filtering
Use lakeFS enhanced object tagging and advanced querying mechanisms to manage and query unstructured data with greater precision and productivity, while maintaining a clean and organized data environment.
Additional Resources
Read the latest on data version control, explore tutorials, and pick up best practices.