

Amit Kesarwani

Amit heads the solution architecture group at Treeverse, the company...

Published on February 18, 2025

Introduction

Role-Based Access Control (RBAC) is an effective way to minimize the risk of data breaches by ensuring users only have access to the data and systems necessary for their job roles. Here’s how you can use RBAC to avoid data breaches:

1. Principle of Least Privilege (PoLP)

  • Assign users the minimum level of access required to perform their job.
  • Regularly review permissions to remove unnecessary access.

2. Define Clear Roles and Responsibilities

  • Establish predefined roles (e.g., Administrator, Manager, Employee, Guest).
  • Ensure roles have well-defined permissions aligned with job functions.

3. Implement Role Hierarchies

  • Use a hierarchical model where higher roles inherit permissions from lower roles.
  • Prevent unauthorized access by granting elevated permissions only to the users who need them.

4. Regularly Audit and Review Access Controls

  • Conduct periodic audits to identify outdated or excessive permissions.
  • Implement automatic expiration for temporary access.

5. Restrict Administrative Access

  • Limit the number of users with administrative privileges.
  • Require justification and approval for privilege escalation.

6. Monitor and Log Access Activities

  • Track access logs to detect suspicious activities.
  • Set up alerts for unauthorized access attempts.

7. Segregate Duties

  • Divide critical tasks among multiple users to prevent fraud or data misuse.
  • Require dual approvals for highly sensitive actions.

8. Educate and Train Employees

  • Conduct security awareness training on access control best practices.
  • Ensure employees understand the importance of secure access management.

9. Automate Access Management

  • Utilize Identity and Access Management (IAM) solutions to enforce RBAC policies.
  • Integrate with HR systems to automatically adjust access based on role changes.

By implementing these RBAC best practices, organizations can significantly reduce the risk of unauthorized access, data breaches, and insider threats.
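
To make a few of these practices concrete, here is a minimal Python sketch of least privilege, role hierarchies, and automatically expiring temporary access. It is an illustration only: the role names come from the list above, while the permission strings, the `User` class, and the helper functions are hypothetical and not part of any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical hierarchy: each role names the lower role it inherits from.
ROLE_PARENTS = {
    "Guest": None,
    "Employee": "Guest",
    "Manager": "Employee",
    "Administrator": "Manager",
}

# Hypothetical permissions granted directly by each role.
ROLE_PERMISSIONS = {
    "Guest": {"docs:read"},
    "Employee": {"data:read"},
    "Manager": {"data:write"},
    "Administrator": {"users:manage"},
}


def permissions_for(role: str) -> set[str]:
    """Collect a role's own permissions plus everything it inherits."""
    collected: set[str] = set()
    current = role
    while current is not None:
        collected |= ROLE_PERMISSIONS[current]
        current = ROLE_PARENTS[current]
    return collected


@dataclass
class User:
    name: str
    role: str
    # Temporary grants map a permission to the time at which it expires.
    temporary: dict[str, datetime] = field(default_factory=dict)


def is_allowed(user: User, permission: str, now: datetime) -> bool:
    """Default deny: allow only role permissions or unexpired temporary grants."""
    if permission in permissions_for(user.role):
        return True
    expiry = user.temporary.get(permission)
    return expiry is not None and now < expiry


now = datetime(2025, 2, 18, tzinfo=timezone.utc)
alice = User("alice", "Employee", {"data:write": now + timedelta(hours=1)})

print(is_allowed(alice, "data:read", now))                         # True (inherited role permission)
print(is_allowed(alice, "data:write", now))                        # True (temporary grant still valid)
print(is_allowed(alice, "data:write", now + timedelta(hours=2)))   # False (temporary grant expired)
print(is_allowed(alice, "users:manage", now))                      # False (never granted)
```

Because the check starts from "deny" and only returns True for an explicit grant, forgetting to configure a permission fails closed rather than open.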

RBAC in lakeFS: Step-by-Step Configuration

Depending on your data protection needs, you might want to look at lakeFS Enterprise or lakeFS Cloud, which enable granular access control over data through Role-Based Access Control (RBAC) policies.

Next, we will review the steps needed to configure RBAC in lakeFS Enterprise/Cloud. This exercise assumes that lakeFS Enterprise/Cloud is already set up and running against your own storage, and it focuses only on setting up RBAC.

Step 1 – Create a User

  1. There are two types of users in lakeFS, API users and regular users:
    1. API users are mainly used to integrate other software tools and applications with lakeFS, and they require only API access.
    2. Regular users are real people who log in with an email address and password. If you use SSO (Single Sign-On) with lakeFS, regular users are created in lakeFS automatically.
  2. Let’s create a regular user first. Log in to your lakeFS instance and click Administration > Users > Invite User.
[Screenshot: Create a user in lakeFS]

  3. Enter the user’s email address and click the Invite button:
[Screenshot: Invite a user]

  4. The user will receive an email invitation from lakeFS to activate their account.
  5. Let’s create an API user next. Click Administration > Users > Create API User.
  6. Enter an API user name, e.g. Python, and click the Create button:
[Screenshot: Create API user]

  7. Once you create the user, you can click the User ID to review it. To create an access key for API access, click the “Access Credentials” tab and then the “Create Access Key” button:
[Screenshot: Create Access Key]

  8. A new key will be generated:
[Screenshot: Access key generated]

As instructed, copy the Secret Access Key and store it somewhere safe. You will not be able to access it again (but you will be able to create new ones).
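
One common way to keep the secret out of source code is to read it from environment variables at runtime. The sketch below assumes that approach. The variable names mirror the ones the lakectl CLI is understood to read, but treat them as an assumption and rename them to match your setup; `load_lakefs_credentials` and `redact` are hypothetical helpers, not part of lakeFS.

```python
import os


def load_lakefs_credentials(env=os.environ) -> dict[str, str]:
    """Read the endpoint and access key pair from environment variables."""
    # Assumed variable names; adjust them to whatever your deployment uses.
    names = {
        "endpoint": "LAKECTL_SERVER_ENDPOINT_URL",
        "access_key_id": "LAKECTL_CREDENTIALS_ACCESS_KEY_ID",
        "secret_access_key": "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        # Fail early and name the variables, but never print their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


def redact(secret: str) -> str:
    """Return a log-safe form of a secret that shows at most its last 4 characters."""
    if len(secret) <= 4:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]


# Usage with placeholder values (not real credentials).
example_env = {
    "LAKECTL_SERVER_ENDPOINT_URL": "https://lakefs.example.com",
    "LAKECTL_CREDENTIALS_ACCESS_KEY_ID": "EXAMPLEKEYID",
    "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY": "example-secret-value",
}
credentials = load_lakefs_credentials(example_env)
print(credentials["endpoint"])
print(redact(credentials["secret_access_key"]))
```

Passing the environment in as a parameter keeps the function easy to test, and the redaction helper makes it harder to leak the full secret into logs by accident.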

Step 2 – Create a Policy

  1. Let’s create an example policy that allows users to list branches and read commit information for those branches in all repositories (see the Actions and Permissions reference in the lakeFS documentation for all available actions). Permissions like these might be all that users running Python programs need; they may not need the additional permissions that the lakeFS UI requires.
  2. Log in to your lakeFS instance and click Administration > Policies > Create Policy.
[Screenshot: Create a policy]

You will now see this screen:

[Screenshot: Create Policy screen]

  3. Enter a unique Policy ID, e.g. ListBranchesGetCommit, copy and paste the following JSON into the Policy JSON Document box, and click the Save button:
{
    "statement": [
        {
            "action": [
                "fs:ListBranches",
                "fs:ReadCommit"
            ],
            "effect": "allow",
            "resource": "*"
        }
    ]
}
[Screenshot: ListBranchesGetCommit policy]

  4. Once you save the policy, you can click the ListBranchesGetCommit policy to review it. Click the toggle switch to see the JSON view.
[Screenshot: Toggle to get JSON view]

JSON View:

[Screenshot: JSON view]
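
To build intuition for what the policy document from this step permits, the following Python sketch evaluates requests against it locally. This is not the lakeFS authorization engine. It assumes default-deny semantics, that an explicit deny overrides any allow, and that `*` in an action or resource acts as a wildcard (approximated here with the standard library's `fnmatchcase`). The resource string in the usage example is an illustrative placeholder.

```python
import json
from fnmatch import fnmatchcase

# The policy document from the step above.
POLICY_JSON = """
{
    "statement": [
        {
            "action": ["fs:ListBranches", "fs:ReadCommit"],
            "effect": "allow",
            "resource": "*"
        }
    ]
}
"""


def _as_list(value):
    """Accept either a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def evaluate(policies: list[dict], action: str, resource: str) -> bool:
    """Return True only if some statement allows the request and none denies it."""
    allowed = False
    for policy in policies:
        for statement in policy["statement"]:
            action_match = any(
                fnmatchcase(action, pattern) for pattern in _as_list(statement["action"])
            )
            resource_match = any(
                fnmatchcase(resource, pattern) for pattern in _as_list(statement["resource"])
            )
            if not (action_match and resource_match):
                continue
            if statement["effect"] == "deny":
                return False  # assumed rule: an explicit deny wins over any allow
            allowed = True
    return allowed


policy = json.loads(POLICY_JSON)
branch = "arn:lakefs:fs:::repository/example-repo/branch/main"  # illustrative resource

print(evaluate([policy], "fs:ListBranches", branch))   # True
print(evaluate([policy], "fs:ReadCommit", branch))     # True
print(evaluate([policy], "fs:CreateBranch", branch))   # False (no statement allows it)
```

Running a handful of expected-allow and expected-deny requests through a model like this before attaching a policy is a cheap way to catch an overly broad `action` or `resource` pattern.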

Step 3 – Create a Group

  1. You control access by attaching Policies either directly to Users or to the Groups they belong to. In this step, you will create a group and attach the policy to it. If you prefer to attach policies directly to a user, skip to Step 5.
  2. Log in to your lakeFS instance and click Administration > Groups > Create Group.
[Screenshot: Create a Group]

You will now see this screen:

[Screenshot: Create Group screen]

  3. Enter a unique Group Name, e.g. PythonDevelopers, and click the Create button.
  4. Once you save the Group Name, click the PythonDevelopers group to review it. Click the “Attached Policies” tab and then the “Attach Policy” button:
[Screenshot: Attach Policy]

  5. Search for the ListBranchesGetCommit policy, select it, and click the “Attach Policies” button:
[Screenshot: Attach Policies button]

Step 4 – Add Users to a Group

  1. In this step, you will add users to the group. If you prefer to attach policies directly to a user, skip to Step 5.
  2. Log in to your lakeFS instance and click Administration > Groups.
  3. Click the PythonDevelopers group to review it. Under the “Group Memberships” tab, click the “Add Members” button:
[Screenshot: Add Users to a Group]

  4. Search for users and select one or more of them. Click the “Add to Group” button:
[Screenshot: Add to Group button]

Step 5 – Attach Policy to a User

  1. Instead of adding users to a group (the previous two steps), you can attach policies directly to a user.
  2. Log in to your lakeFS instance and click Administration > Users.
  3. Click the User ID of the user who requires access. Click the “Directly Attached Policies” tab and then the “Attach Policy” button:
[Screenshot: Attach Policy to a User]

  4. Search for the ListBranchesGetCommit policy, select it, and click the “Attach Policies” button:
[Screenshot: Attach Policies button]
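
Steps 3 to 5 give a user policies through two routes: group membership and direct attachment. The Python sketch below models that relationship in memory so you can see which policies a user effectively ends up with. It is a simplified illustration, not lakeFS code: the `Directory` class and its methods are hypothetical, and `analyst@example.com` is a made-up user.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """In-memory stand-in for the Users, Groups, and Policies pages."""
    group_policies: dict[str, set[str]] = field(default_factory=dict)
    group_members: dict[str, set[str]] = field(default_factory=dict)
    user_policies: dict[str, set[str]] = field(default_factory=dict)

    def attach_policy_to_group(self, group: str, policy: str) -> None:
        self.group_policies.setdefault(group, set()).add(policy)

    def add_member(self, group: str, user: str) -> None:
        self.group_members.setdefault(group, set()).add(user)

    def attach_policy_to_user(self, user: str, policy: str) -> None:
        self.user_policies.setdefault(user, set()).add(policy)

    def effective_policies(self, user: str) -> set[str]:
        """Union of directly attached policies and policies from the user's groups."""
        effective = set(self.user_policies.get(user, set()))
        for group, members in self.group_members.items():
            if user in members:
                effective |= self.group_policies.get(group, set())
        return effective


directory = Directory()
directory.attach_policy_to_group("PythonDevelopers", "ListBranchesGetCommit")     # Step 3
directory.add_member("PythonDevelopers", "Python")                                # Step 4
directory.attach_policy_to_user("analyst@example.com", "ListBranchesGetCommit")   # Step 5

print(sorted(directory.effective_policies("Python")))               # ['ListBranchesGetCommit']
print(sorted(directory.effective_policies("analyst@example.com")))  # ['ListBranchesGetCommit']
print(sorted(directory.effective_policies("someone-else")))         # []
```

Both routes lead to the same effective policy here. Groups are usually easier to audit, because removing a user from the group revokes every policy the group carries in one action.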

Summary

As you have seen in this post, Role-Based Access Control (RBAC) is an effective way to minimize the risk of data breaches by ensuring users only have access to the data necessary for their job roles. 

Configuring RBAC in lakeFS Enterprise/Cloud is also straightforward, and it works much like the RBAC functionality you might have used in other applications: you control access by creating Policies and attaching them either directly to Users or to the Groups they belong to.

Next steps

To learn more about lakeFS functionality and other data lake governance features, read the lakeFS blog or join the friendly lakeFS community on Slack.
