We are pleased to announce that lakeFS Cloud is now available as a self-service offering on Azure.
lakeFS Cloud is a fully managed lakeFS platform, providing version control for your data lake. As well as being secure and scalable, it includes enterprise features such as single sign-on (SSO), managed garbage collection, role-based access control (RBAC), auditing, and support for Databricks Unity Catalog.
lakeFS Cloud on Azure has been implemented to optimally support Azure’s stack of technologies. This means that as a user of lakeFS Cloud, you can store your data in either Azure Blob Storage or Azure Data Lake Storage Gen2. SSO is supported using Active Directory Federation Services (AD FS). In the backend, lakeFS Cloud stores its metadata in Cosmos DB.
With full support for the Azure data stack including Synapse and Databricks, lakeFS Cloud integrates seamlessly and provides version control over your data lake. The lakeFS Hadoop Filesystem integration enables you to use Spark on Azure with lakeFS. As an Azure user, this gives you all the tools that you need to adopt best practices around working with data, including:
- Reproducibility of data pipelines and ML experiments, using tools such as the lakeFS Python client to integrate with your ML notebooks.
- Continuous integration and deployment (CI/CD) of your data pipelines, using lakeFS hooks to validate changes to data before it is integrated.
- Isolated data environments for development, per user or function as required. lakeFS uses copy-on-write so new environments are fast and cheap to create.
- The ability to revert changes made to data, including accidental deletions and erroneous data produced by flawed pipeline code.
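To make the Spark integration and the idea of isolated environments a little more concrete, here is a minimal sketch in Python. It assumes the lakeFS Hadoop Filesystem is on the Spark classpath and builds the `fs.lakefs.*` settings it reads, along with `lakefs://` URIs of the form `lakefs://<repository>/<ref>/<path>`. The endpoint host, repository name, branch names, and the helper functions themselves are hypothetical placeholders, and your Azure setup may need storage settings that are not shown here.

```python
def lakefs_spark_conf(endpoint: str, access_key: str, secret_key: str) -> dict:
    """Build Spark settings for the lakeFS Hadoop Filesystem (illustrative helper)."""
    return {
        "spark.hadoop.fs.lakefs.impl": "io.lakefs.LakeFSFileSystem",
        "spark.hadoop.fs.lakefs.endpoint": endpoint,
        "spark.hadoop.fs.lakefs.access.key": access_key,
        "spark.hadoop.fs.lakefs.secret.key": secret_key,
    }


def lakefs_uri(repo: str, ref: str, path: str) -> str:
    """Address an object on a given branch, tag, or commit of a repository."""
    return f"lakefs://{repo}/{ref}/{path.lstrip('/')}"


# Hypothetical values: replace with your own endpoint and credentials.
conf = lakefs_spark_conf(
    endpoint="https://example.lakefscloud.io/api/v1",
    access_key="<lakefs-access-key-id>",
    secret_key="<lakefs-secret-access-key>",
)

# The same logical path on two branches: production data and an isolated copy.
prod_events = lakefs_uri("example-repo", "main", "tables/events/")
dev_events = lakefs_uri("example-repo", "dev-alice", "tables/events/")
```

Each entry in `conf` could be passed to `SparkSession.builder.config(key, value)`, after which a job would read `prod_events` or `dev_events` like any other path. Because the branch is part of the URI, pointing a pipeline at an isolated environment is a one-string change.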
lakeFS includes an S3-compatible gateway, meaning that data tools which already speak the S3 API can work with lakeFS without a dedicated integration.
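As a rough illustration of what this looks like from a client's point of view: through the gateway, the repository name plays the role of the bucket, and the object key starts with the branch (or other ref). The sketch below assumes an S3 client object that exposes `get_object(Bucket=..., Key=...)`, for example a boto3 client created with `endpoint_url` set to your lakeFS endpoint. The helper names and the repository, branch, and path values are hypothetical.

```python
from typing import Any, Tuple


def gateway_location(repo: str, ref: str, path: str) -> Tuple[str, str]:
    """Map a lakeFS repository, ref, and path to an S3 (bucket, key) pair."""
    return repo, f"{ref}/{path.lstrip('/')}"


def read_object(s3_client: Any, repo: str, ref: str, path: str) -> bytes:
    """Read one object through the S3-compatible gateway.

    s3_client is assumed to behave like a boto3 S3 client that was
    configured to talk to the lakeFS endpoint instead of AWS.
    """
    bucket, key = gateway_location(repo, ref, path)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


# Hypothetical example: the same file on two different branches.
main_location = gateway_location("example-repo", "main", "raw/users.csv")
dev_location = gateway_location("example-repo", "dev-alice", "raw/users.csv")
```

The client code itself does not change between branches; only the key prefix does.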
lakeFS Cloud on Azure is available for you to try for free today. Head over to lakefs.cloud to get started!