Superset OPA Authorization research #120

Closed
stefanigel opened this issue Oct 11, 2021 · 10 comments

@stefanigel
Member

stefanigel commented Oct 11, 2021

As an administrator I'd like to be able to centrally authorize actions my users are taking using OpenPolicyAgent.

Superset has a roles concept and I'm not sure how pluggable the authorization part is.
We might only be able to fetch the roles a user belongs to from OpenPolicyAgent, or we might not be able to do anything at all.

This ticket is a research ticket: Please check how pluggable the authorization system is and how we can plug in OpenPolicyAgent and for which decisions and data it can be used.
If it turns out that we cannot do anything here we need to evaluate how feasible it is to change Superset and if that's also not feasible we need to check if there's anything else we can do to make the authorization experience nicer out of the box.

Please come up with a foundation for a decision on how to go forward that we can talk about in the architecture meeting (or a separate one). It should cover our options from full OPA to no OPA.

@stefanigel stefanigel added this to the Release #3 milestone Oct 11, 2021
@lfrancke lfrancke removed this from the Release #3 milestone Nov 5, 2021
@lfrancke lfrancke changed the title Superset Authorization Superset OPA Authorization research Feb 9, 2022
@siegfriedweber siegfriedweber self-assigned this Feb 16, 2022
@siegfriedweber
Member

Security in Superset

The security concept of Superset consists of:

  • Users which are assigned to roles
  • Roles which are composed of sets of permissions
  • Permissions which allow access to models (e.g. dashboards), views (web pages like the SQL Lab view), data sources (e.g. database tables), and databases
  • Row level security filters which restrict the rows in a database table a role has access to.

By default, authentication and authorization are performed by Superset. All related data is stored in the Superset SQL database, and everything is configured in the Superset web UI, which provides auto-completion. Changes to the permissions and filters can be tested immediately. Other solutions probably do not provide this convenience.

Other supported authentication types are OpenID (e.g. Gmail), LDAP (e.g. Microsoft Active Directory), remote user (e.g. Kerberos), and OAuth. These authentication types often also provide the roles a user is a member of. Therefore, if an Open Policy Agent is used, it should not contain the mapping from users to roles.

Authorization with an Open Policy Agent

The authorization in Superset is not configurable but can be replaced with a custom security manager. This custom implementation could route the access checks to an Open Policy Agent which handles the roles and permissions. The Open Policy Agent could also return the row level security filters. Write access to the web pages containing the roles and row level security filters could then be revoked.
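
As a rough sketch of what such a routed access check could look like, the snippet below builds a request for OPA's REST Data API (`POST /v1/data/<path>` with an `input` document) and interprets the answer. The host `simple-opa`, the port 8081, the package `superset`, the rule name `allow`, and the shape of the input document are assumptions for illustration; nothing like this exists in Superset itself.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumed OPA endpoint and rule; adjust to the actual deployment.
OPA_URL = "http://simple-opa:8081/v1/data/superset/allow"


def build_opa_request(username: str, action: str, resource: str) -> urllib.request.Request:
    """Build a POST request for OPA's Data API with a hypothetical input document."""
    body = json.dumps(
        {"input": {"username": username, "action": action, "resource": resource}}
    ).encode("utf-8")
    return urllib.request.Request(
        OPA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(opa_response: Dict[str, Any]) -> bool:
    """Deny by default: only an explicit `true` result grants access."""
    return opa_response.get("result") is True


def check_access(username: str, action: str, resource: str) -> bool:
    """Send the request to OPA and evaluate the decision."""
    request = build_opa_request(username, action, resource)
    with urllib.request.urlopen(request, timeout=5) as response:
        return is_allowed(json.load(response))
```

A custom security manager would have to call something like `check_access` from every access check it overrides, which is where most of the effort would go.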

I have only found custom security managers which alter the authentication part slightly but none of them touches the authorization part.

Superset has some built-in roles like Admin, Alpha, and Gamma. The permissions assigned to these roles should not be adapted because they could differ between Superset versions. Best practice is to assign one built-in role and several custom roles to a user. For instance, an employee in the finance department would be assigned the Gamma role and additionally a custom finance role. To keep the Rego rules from depending on the Superset version, it could make sense not to manage the built-in roles in the Open Policy Agent.

The effort to connect Superset to an Open Policy Agent depends on the degree to which it should replace the Superset security manager. If everything should be configurable in the Open Policy Agent then it becomes a huge project (see the code sizes of the SupersetSecurityManager, the SecurityManager, the BaseSecurityManager, and the AbstractSecurityManager). If only some aspects like the access to dashboards and data sources should be moved into Rego rules then the effort becomes much smaller but it should not be underestimated because changes concerning the security must be implemented and tested very carefully.

Other options

The roles and permissions can be imported with the Flask CLI. The Open Policy Agent could provide static rules which are then converted, imported, and enforced by Superset. The row level security filters cannot be imported. Superset itself provides no API or tool except the web UI to change them.

The database credentials are stored in a Kubernetes secret, so it would be possible to update the tables in the Superset database. But it is not advisable to rely on implementation details because they can change between versions.

The security manager (or anything else which is configurable in the configuration script) could be adapted to load data into the Superset database via service objects. This solution is more robust to changes than directly touching the database.

@lfrancke
Member

I'd be happy - for now - to just have the roles provided by OPA.
Having read your comment I'm not entirely sure how feasible this would be but it seems possible?

@siegfriedweber
Member

siegfriedweber commented Mar 1, 2022

It is possible.

To verify that my ideas can actually be implemented, I created a technical spike. It is a quick and dirty solution to show that connecting Superset to OPA is feasible, but it should not be used as the base for a proper implementation.

I had a look at the security manager of the latest Superset release (version 1.4.1), but while skimming the source code I did not see a good entry point for the OPA integration. However, in the main branch of the Superset repository (commit 2cb3635256ee8e91f0bac2f3091684673c04ff2b) I found the function get_user_roles, which seemed to be a viable entry point. One downside is that the following implementation is highly dependent on the Superset version. Perhaps there is a hook in the Flask App Builder which should be used instead. Another issue is that the administrator must choose a user role when creating a user. This role is then overridden by the ones provided by OPA when the user logs in. In some circumstances, the Superset web UI needed a few refreshes before it correctly displayed the views which are accessible with the assigned roles.

opa_superset_security_manager.py

from flask import g
from flask_appbuilder.security.sqla.models import (Role, User)
from opa_client.opa import OpaClient
from superset.security.manager import SupersetSecurityManager
from typing import (Optional, List)

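class OpaSupersetSecurityManager(SupersetSecurityManager):

    def get_user_roles(self, user: Optional[User] = None) -> List[Role]:
        # Fall back to the user of the current request.
        if not user:
            user = g.user

        # Ask OPA for the role names of this user.
        client = OpaClient(host = 'simple-opa')
        response = client.check_policy_rule(
                input_data = {'username': user.username},
                package_path = 'superset',
                rule_name = 'user_roles')

        # user_roles is a set of role lists; take the first match.
        role_names = response['result'][0]
        # Look up the Superset role objects by name.
        roles = list(map(self.find_role, role_names))

        # Override the roles assigned in Superset with the ones from OPA.
        user.roles = roles

        return roles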

superset_config.py

from opa_superset_security_manager import OpaSupersetSecurityManager

CUSTOM_SECURITY_MANAGER = OpaSupersetSecurityManager

Dockerfile

FROM apache/superset:2cb3635256ee8e91f0bac2f3091684673c04ff2b

RUN pip install OPA-python-client

COPY superset_config.py /app/pythonpath/
COPY opa_superset_security_manager.py /app/pythonpath/

data.json

{
    "users": [
        {"username": "admin", "roles": ["Admin"]},
        {"username": "testuser", "roles": ["Gamma"]}
    ]
}

roles.rego

package superset

import data.users

user_roles[roles] {
    some i
    user := users[i]
    roles := user.roles
    user.username == input.username
}
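
The spike indexes `response['result'][0]` directly, which raises an error when the rule yields no match, for example for a user who is missing from `data.json`. A slightly more defensive way to read the response could look like the sketch below. The helper name `extract_role_names` is hypothetical, and falling back to an empty role list is an assumption about the desired behavior.

```python
from typing import Any, Dict, List


def extract_role_names(response: Dict[str, Any]) -> List[str]:
    """Return the role names from an OPA response for the `user_roles` rule.

    The rule is a partial set of role lists, so `result` is expected to look
    like [["Admin"]]. An undefined or empty result yields no roles.
    """
    result = response.get("result") or []
    if not result:
        return []
    first_match = result[0]
    # A match that is not a list is treated as "no roles".
    if not isinstance(first_match, list):
        return []
    # Drop entries that are not strings.
    return [name for name in first_match if isinstance(name, str)]
```

In the security manager, `role_names = extract_role_names(response)` would then replace the direct indexing.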

@lfrancke
Member

lfrancke commented Mar 2, 2022

Thank you.
We'd like to integrate OPA everywhere but if it's just not feasible now then we might have to live with(out) it.
This also means that Superset has no concept of getting Roles from LDAP either, correct?

We could also play the long game and kick this off upstream first and basically build the proper integrations and maybe also change the User creation flow. It might be worth starting a discussion on the mailing list. I could do that as well but you're probably much closer to the problem.

Let me know what you think.
I'll discuss with @soenkeliebau and will see what he thinks.

@lfrancke
Member

lfrancke commented Mar 2, 2022

Another question: Do you need to create users in Superset first when using an external authentication service (e.g. LDAP)?

In general though: We'd like to kick off the discussion upstream and I believe we should start it at the beginning.
Basically checking whether there's any interest in thinking through the whole authentication/authorization workflow.
Ideally there'd be a hook/API that just gets a user id and a desired action and it returns a result.
This'd mean that the whole Permissions/Role stuff currently in Superset would be optional and/or just one possible implementation.
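
To make the idea of such a hook more concrete, it could be pictured as a single-method interface. Everything in the sketch below (the names `Authorizer`, `RoleBasedAuthorizer`, and `is_allowed`, and the permission tuples) is hypothetical and does not exist in Superset; the role-based class only stands in for the current Superset concept as one possible implementation.

```python
from typing import Dict, Protocol, Set, Tuple

# (action, resource), e.g. ("can_read", "Dashboard")
Permission = Tuple[str, str]


class Authorizer(Protocol):
    """Hypothetical hook: user id and desired action in, decision out."""

    def is_allowed(self, user_id: str, action: str, resource: str) -> bool:
        ...


class RoleBasedAuthorizer:
    """One possible implementation, modelled on users, roles, and permissions."""

    def __init__(
        self,
        user_roles: Dict[str, Set[str]],
        role_permissions: Dict[str, Set[Permission]],
    ) -> None:
        self.user_roles = user_roles
        self.role_permissions = role_permissions

    def is_allowed(self, user_id: str, action: str, resource: str) -> bool:
        # Access is granted if any role of the user carries the permission.
        roles = self.user_roles.get(user_id, set())
        return any(
            (action, resource) in self.role_permissions.get(role, set())
            for role in roles
        )


authorizer: Authorizer = RoleBasedAuthorizer(
    user_roles={"testuser": {"Gamma"}},
    role_permissions={"Gamma": {("can_read", "Dashboard")}},
)
```

An OPA-backed class with the same `is_allowed` method could then be swapped in without touching the callers.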

Is this (the discussion) something that you'd like to kick off? If not I can do it.

@siegfriedweber
Member

> This also means that Superset has no concept of getting Roles from LDAP either, correct?

The LDAP groups can be easily mapped to a list of Superset roles when using the LDAP authentication.

# a mapping from LDAP DN to a list of FAB roles
AUTH_ROLES_MAPPING = {
    "cn=fab_users,ou=groups,dc=example,dc=com": ["User"],
    "cn=fab_admins,ou=groups,dc=example,dc=com": ["Admin"],
}

# the LDAP user attribute which has their role DNs
AUTH_LDAP_GROUP_FIELD = "memberOf"
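
To illustrate how such a mapping turns group memberships into roles, the following sketch resolves a user's group DNs against `AUTH_ROLES_MAPPING`. It is a simplified illustration and not the Flask App Builder implementation; the function name `roles_for_groups` is hypothetical.

```python
from typing import Dict, Iterable, List

AUTH_ROLES_MAPPING: Dict[str, List[str]] = {
    "cn=fab_users,ou=groups,dc=example,dc=com": ["User"],
    "cn=fab_admins,ou=groups,dc=example,dc=com": ["Admin"],
}


def roles_for_groups(
    group_dns: Iterable[str], mapping: Dict[str, List[str]]
) -> List[str]:
    """Collect the role names mapped to the given LDAP group DNs."""
    roles: List[str] = []
    for dn in group_dns:
        # Groups without a mapping contribute no roles.
        for role in mapping.get(dn, []):
            if role not in roles:
                roles.append(role)
    return roles
```

A user whose `memberOf` attribute lists only the `fab_admins` group would end up with the `Admin` role here.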

> Do you need to create users in Superset first when using an external authentication service (e.g. LDAP)?

No, the user data should be pulled from LDAP. The mapping from the LDAP schema to Superset is configurable:

AUTH_USER_REGISTRATION = True  # allow users who are not already in the FAB DB
AUTH_USER_REGISTRATION_ROLE = "Public"  # this role will be given in addition to any AUTH_ROLES_MAPPING
AUTH_LDAP_FIRSTNAME_FIELD = "givenName"
AUTH_LDAP_LASTNAME_FIELD = "sn"
AUTH_LDAP_EMAIL_FIELD = "mail"  # if null in LDAP, email is set to: "{username}@email.notfound"

I want to clarify my comments above. This ticket is about the integration of the Open Policy Agent into Superset, so I wrote down what must be considered in this case, because OPA is not supported by Superset. LDAP, on the other hand, is supported very well and the configuration options should be sufficient. The statement of my first post was that Superset allows the usage of a self-implemented security manager where we can call OPA to authorize roles and permissions. This is in my opinion the "hook/API that just gets a user id and a desired action and it returns a result", but it will nevertheless be a large effort, especially because it handles security, which must be done right. My second post just answers the question whether it is possible "to just have the roles provided by OPA". I wanted to answer this question without guessing, so I tried to find one solution with minimal effort, even if it is not a good solution. A better solution can be worked out if we decide to investigate further in this direction.

I am not sure what exactly to kick off upstream. The API exists in the form of the custom security manager. The OPA integration could be done there or in the security manager provided by Superset, but in both cases it would be a lot of work. If we decide to only have the roles provided by OPA, then we should ask upstream whether deriving from the SupersetSecurityManager is the way to go and what a good entry point would be.

@lfrancke
Member

lfrancke commented Mar 8, 2022

I have a feeling that I don't quite understand yet.
If we use the SupersetSecurityManager, can we ignore the whole role and permission concept in Superset or is that still being used?

Ideally we would be able to completely hide the whole roles/permissions view from Superset and manage it all in OPA.
And I thought that this is not yet possible.
That's the process that I wanted to kick off. I'd suggest that the two of us sit together for 30 min, you show me the UI, and we talk through it.

@lfrancke
Member

We decided to abandon OPA integration in Superset entirely for now.
@siegfriedweber did give me a demonstration on the current capabilities and supported features.

Roles do have permissions and those permissions are very fine-grained and can change between versions.
Superset makes sure to update built-in roles when this happens.

Were we to use OPA for this, our users would need to add those permissions manually.
There are similar concerns around usability for assigning Roles to users.

@soenkeliebau
Member

That's very annoying, but can't be helped if it would be too much effort to maintain.

@siegfriedweber
Member

The situation in Superset 4.1 is still the same. The SupersetSecurityManager is large and tightly coupled to the SecurityManager of the Flask App Builder. There is no small subset of functions which would define an interface for authorization. Delegating the authorization to OPA would be a huge effort.

After talking to @sbernauer, I now understand the benefits of having at least the roles provided by OPA:

  • For the customers who already use the User Info Fetcher, it is easier to use it to get the roles from the various authentication backends than to figure out how to configure Superset accordingly.
  • The roles could be automatically created in Superset. The permissions for the roles would still have to be set manually in the Superset UI. But the task of creating the roles and assigning the users (or implementing a role mapper) is no longer necessary.

The proposed solution is now to override the get_user_roles function in SupersetSecurityManager and delegate the request to OPA. Missing roles must be created in Superset and the user must be updated in the App Builder session and database (see the ancestor classes of the SupersetSecurityManager). The current roles should be visible in the Superset UI and a comprehensive integration test should verify that this implementation still works if a new Superset version is released.
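
The role synchronization described above could be sketched roughly as follows. The function name `sync_user_roles` is hypothetical, and the sketch assumes that `find_role`, `add_role`, and `update_user` behave as they do in the security manager of the Flask App Builder (lookup by name returning `None` if absent, creation by name, and persisting the user); this would need to be verified against the Superset version in use.

```python
from typing import Any, Iterable, List


def sync_user_roles(
    security_manager: Any, user: Any, role_names: Iterable[str]
) -> List[Any]:
    """Create missing roles, assign them to the user, and persist the user."""
    roles: List[Any] = []
    for name in role_names:
        role = security_manager.find_role(name)
        if role is None:
            # New roles start without permissions; those are still set in the UI.
            role = security_manager.add_role(name)
        # Skip roles that could neither be found nor created.
        if role is not None:
            roles.append(role)
    user.roles = roles
    security_manager.update_user(user)
    return roles
```

An overridden `get_user_roles` could call this helper with the role names returned by OPA and return its result.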
