lakeFS Community
The lakeFS team
June 22, 2022

lakeFS Cloud provides a Git-like repository for data lakes in a hosted version available in AWS Marketplace

NEW YORK and TEL AVIV, June 22, 2022 - lakeFS, the technology that brings streamlined data lifecycle management and version control to data lakes, is announcing lakeFS Cloud, a fully managed SaaS version of its open-source technology. The new hosted version is a direct response to users who prefer a fully managed service to running the lakeFS infrastructure themselves on their data lakes. lakeFS Cloud provides organizations with all the benefits and added peace of mind that come with a SaaS solution, a value-add that enterprise users requested. lakeFS remains committed to its open-source version and robust community. The SaaS solution is now available in AWS Marketplace.

Created in 2020 as an open-source platform, lakeFS provides a Git-like version control interface for data lakes, with seamless integration to popular data tools and frameworks. Built to maximize the manageability of open-source data analytics solutions that scale, lakeFS Cloud enables the fast creation of predefined workflows essential for managing surging amounts of data, including assuring data reliability, within every enterprise. Other benefits include:

  • Assurance of high availability, uptime, and security 
  • Outsourced deployment, installation, maintenance, and scaling
  • Enterprise support and SOC2 compliance 

“Data engineers are struggling to keep data pipelines in good shape, a result of the sheer volume stored in data lakes combined with a lack of manageability tools,” said Einat Orr, Ph.D., Co-founder and CEO at lakeFS. “Companies are justifiably concerned about data quality, and the high cost of error associated with flawed data. Our new SaaS model delivers all the benefits of lakeFS by ensuring highly resilient, high-quality data products, while providing a fully managed serverless experience.”

The new SaaS model relieves data engineering teams of the burden of maintaining open-source software, increasing their productivity and efficiency. Our open-source community continues to grow, proving the value of maintaining both segments of our business.

Gartner estimates that unstructured data represents an astounding 80 to 90% of all new enterprise data, and it’s growing 3X faster than structured data. lakeFS is the only data versioning tool that is format-agnostic and built by design to bring order and resilience to structured and unstructured data at scale. lakeFS’ new SaaS offering ensures the value of data for companies across industries, especially where data scales fast due to rapid growth, with benefits from better management of data engineering operations and best practices.

“Since introducing lakeFS to our production data environment, we’ve enjoyed the benefits of atomic and isolated operations in our data pipelines. This has allowed us to spend more time improving other aspects of our data platform, and less time dealing with the fallout from race conditions and partially failed operations,” writes Lior Resisi, data platform team lead at Windward. 

“The cloud never warned us about the data getting clouded. As the blessing of infinite storage quickly became an unmanageable mess, there is a need for technologies like lakeFS to make data accessible again.” - Sivan Bercovici, CTO, Karius

The ability to manage data the way we manage code is a fundamental capability, enabling faster development of data products while reducing errors. Teams no longer need to store and maintain copies of the entire data lake for every new job, pipeline or algorithm that they develop. lakeFS Cloud addresses the pain points of managing large volumes of data by applying to data the same processes that teams use for managing code. Best practices include:

  • Branches: for experimentation, data engineers can try tools and code in isolation; for debugging, a branch provides an isolated snapshot of the data at the time of the failure; for collaboration, team members can work with different tools, code, or versions of the data
  • Merges & Hooks: version control easily points to newly deployed data; pre-merge hooks enforce best practices and data quality
  • Commits & Reverts: recover from errors by instantly rolling data back to a former, consistent snapshot of the data lake; troubleshoot production errors by investigating a snapshot of the inputs to the failed process
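
To make the practices above concrete, here is a minimal in-memory sketch in Python of how branches, pre-merge hooks, commits, and reverts fit together. It is an illustration only and does not use the lakeFS API: the `ToyRepo` class, its method names, and the sample hook are all hypothetical, and a snapshot is modeled as a plain dictionary mapping object paths to content identifiers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Snapshot = Dict[str, str]  # object path -> content identifier (toy model)


@dataclass
class Commit:
    snapshot: Snapshot
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical in-memory model; NOT the lakeFS API."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit({}, "initial")}
        self.pre_merge_hooks: List[Callable[[Snapshot], bool]] = []

    def branch(self, name: str, source: str = "main") -> None:
        # A branch is only a pointer to a commit; no data is copied.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, changes: Snapshot, message: str) -> None:
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.snapshot, **changes}, message, head)

    def merge(self, source: str, target: str = "main") -> None:
        candidate = self.branches[source].snapshot
        # Every pre-merge hook must accept the data before the target moves.
        for hook in self.pre_merge_hooks:
            if not hook(candidate):
                raise ValueError("pre-merge hook rejected the merge")
        head = self.branches[target]
        merged = {**head.snapshot, **candidate}
        self.branches[target] = Commit(merged, f"merge {source}", head)

    def revert(self, branch: str) -> None:
        # Record a new commit that restores the previous snapshot.
        head = self.branches[branch]
        if head.parent is None:
            raise ValueError("nothing to revert")
        self.branches[branch] = Commit(dict(head.parent.snapshot), "revert", head)


repo = ToyRepo()
# Hypothetical data quality rule: only Parquet files may reach main.
repo.pre_merge_hooks.append(lambda snap: all(p.endswith(".parquet") for p in snap))
repo.branch("experiment")
repo.commit("experiment", {"events/day1.parquet": "v1"}, "add day 1")
repo.merge("experiment")  # main now points at the new data
repo.revert("main")       # main is back to its previous, empty snapshot
```

In this sketch the experiment branch never touches main until the merge succeeds, and a rejected merge leaves main unchanged, which is the isolation and enforcement behavior the list describes.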

To sign up for lakeFS Cloud, visit www.lakefs.io/cloud-registration.
