Airflow OPA Authorization #446

Open
4 tasks
Tracked by #438
adwk67 opened this issue May 21, 2024 · 1 comment

@adwk67

adwk67 commented May 21, 2024

Issue checklist

Possible duplicate and/or overlapping issue.

As an administrator, I'd like to centrally authorize the actions my users take, ideally using OpenPolicyAgent. However, as documented in this ticket, Airflow (like Superset) is built on Flask and therefore offers only its own user/role authorization or other Flask-related mechanisms.

Airflow does not support Open Policy Agent, which is what we use wherever possible.
Instead, it delegates access control of the webserver UI to Flask directly and offers the following authentication types:

  • Database
  • OpenID
  • LDAP
  • Remote User
  • OAuth

Airflow ships with a number of default roles and it is advised to leave these unaltered.
LDAP offers authorization (via group membership) as well as authentication and is probably the most suitable way of implementing Airflow authorization, where appropriate, via Flask. It should be verified that the Flask search filters enable recursive mapping through group memberships.

Tasks

@siegfriedweber

siegfriedweber commented Nov 14, 2024

Approach

Authorization with OPA can be implemented for UI users:

  • Derive an OpaFabAuthManager class from FabAuthManager.
  • Override the following functions and delegate the decision to OPA:
    • is_authorized_configuration
    • is_authorized_connection
    • is_authorized_dag
    • is_authorized_dataset
    • is_authorized_pool
    • is_authorized_variable
    • is_authorized_view
    • is_authorized_custom_view
  • Implement the Rego rules.
  • Set the OpaFabAuthManager in the auth_manager option in the [core] section of the configuration file.
  • Disable the FAB security views by setting FAB_ADD_SECURITY_VIEWS to False.
  • Document the new feature and mention the advantages and disadvantages:
    • Centralized authorization management in OPA
    • Flexible rules, e.g. with pattern matching for DAG names
    • No visual security management via the web browser
    • Granting permissions for new DAGs could be more laborious if DAG-specific permissions are required.
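
The override-and-delegate step in the list above could look roughly like the following sketch. To stay runnable without an Airflow installation, it does not subclass FabAuthManager. The OpaAuthorizer class, the simplified is_authorized_dag parameters, the OPA URL and the airflow/allow policy path are all assumptions made for illustration; they are not taken from this issue or from Airflow. Only the request shape (a POST to OPA's Data API with the document under "input" and the decision returned under "result") follows OPA's REST API. The input keys mirror the example in the Background section of this comment.

```python
import json
import urllib.request
from typing import Any, Callable

# Hypothetical endpoint and policy path; the issue does not specify either.
OPA_URL = "http://localhost:8181/v1/data/airflow/allow"

Transport = Callable[[str, dict[str, Any]], dict[str, Any]]


def post_json(url: str, payload: dict[str, Any]) -> dict[str, Any]:
    """POST a JSON document and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


class OpaAuthorizer:
    """Illustrative stand-in for the authorization half of an OpaFabAuthManager."""

    def __init__(self, url: str = OPA_URL, transport: Transport = post_json) -> None:
        self.url = url
        self.transport = transport

    def _is_allowed(self, opa_input: dict[str, Any]) -> bool:
        try:
            response = self.transport(self.url, {"input": opa_input})
        except (OSError, ValueError):
            # Fail closed: deny when OPA is unreachable or replies with bad JSON.
            return False
        # OPA omits "result" when the rule is undefined; treat that as a denial.
        return response.get("result") is True

    def is_authorized_dag(self, *, method, user_id, roles, dag_id=None) -> bool:
        """Simplified signature; Airflow's real method takes different parameters."""
        return self._is_allowed({
            "user": {"id": user_id, "roles": roles},
            "action": method,
            "resource-type": "DAG",
            "resource-details": {"id": dag_id},
        })
```

Passing the transport in as a parameter keeps the decision logic testable without a running OPA server. Denying on any error is one possible design choice; whether a real implementation should fail closed is a decision for the maintainers.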

Background

The security model of Airflow involves different types of users (see Airflow Security Model):

  • Deployment Managers - overall responsible for the Airflow installation, security and configuration
  • Authenticated UI users - users that can access Airflow UI and API and interact with it
  • DAG Authors - responsible for creating DAGs and submitting them to Airflow

This ticket covers only the "Authenticated UI users".

Airflow uses the Flask App Builder (FAB) but the security model is decoupled from it, see AIP-56 Extensible user management.
Airflow Core only calls the AirflowSecurityManagerV2 and in turn the abstract BaseAuthManager. There exist auth providers which derive these classes. The default auth provider is the FAB provider, containing the FabAuthManager and the FabAirflowSecurityManagerOverride. The responsibility of the security manager is mostly authentication and role assignment. The authorization is done by the auth manager. FAB synchronizes and stores all users, roles and permissions in the database.

The idea for using OPA with Airflow is to re-use the authentication part of the FAB provider and replace only the authorization part. This has the advantage that the existing authentication methods are not affected. However, the roles and permissions should no longer be read from the database; instead, each authorization request should be delegated to OPA. This means that the is_authorized_* functions in the FabAuthManager must be overridden. The input for the Rego rule would look as follows:

{
    "user": {
        "id": "test-user",
        "roles": ["test-role"]
    },
    "action": "DELETE",
    "resource-type": "DAG",
    "resource-details": {
        "id": "my-dag-id",
        "tags": ["example1", "example2"],
        "dag-folder": "/dags/marketing"
    }
}
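
As a rough illustration of how such an input could be assembled before it is sent to OPA, here is a minimal sketch. The build_opa_input helper and its parameter names are hypothetical; only the key names are taken from the example above.

```python
import json
from typing import Any, Optional


def build_opa_input(
    user_id: str,
    roles: list[str],
    action: str,
    resource_type: str,
    resource_details: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble the document that would be sent to OPA as the rule input."""
    return {
        "user": {"id": user_id, "roles": list(roles)},
        "action": action,
        "resource-type": resource_type,
        "resource-details": dict(resource_details or {}),
    }


opa_input = build_opa_input(
    user_id="test-user",
    roles=["test-role"],
    action="DELETE",
    resource_type="DAG",
    resource_details={
        "id": "my-dag-id",
        "tags": ["example1", "example2"],
        "dag-folder": "/dags/marketing",
    },
)

# OPA's Data API expects the document wrapped under an "input" key.
request_body = json.dumps({"input": opa_input}, indent=4)
print(request_body)
```

Serializing with json.dumps guarantees valid JSON, which a hand-written literal does not.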

The default roles and permissions (see Access control) must then be replicated in Rego. If this is not desired, e.g. because they can change between Airflow versions, these permissions could instead be added to the input of the Rego rule, at the cost of larger requests.
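
The alternative of shipping the permissions along with each request could be sketched as follows. The "permissions" key, the with_permissions helper and the contents of the ROLE_PERMISSIONS table are assumptions made for illustration; in Airflow this data would come from the FAB database rather than from a literal.

```python
import copy
import json
from typing import Any

# Illustrative role-to-permission table; not the actual Airflow defaults.
ROLE_PERMISSIONS = {
    "test-role": [["can_read", "DAGs"], ["can_edit", "DAGs"]],
}


def with_permissions(
    opa_input: dict[str, Any],
    role_permissions: dict[str, list[list[str]]],
) -> dict[str, Any]:
    """Return a copy of the input with the permissions of each user role attached."""
    enriched = copy.deepcopy(opa_input)
    enriched["user"]["permissions"] = {
        role: role_permissions.get(role, []) for role in enriched["user"]["roles"]
    }
    return enriched


base_input = {
    "user": {"id": "test-user", "roles": ["test-role"]},
    "action": "DELETE",
    "resource-type": "DAG",
    "resource-details": {"id": "my-dag-id"},
}
enriched_input = with_permissions(base_input, ROLE_PERMISSIONS)

# Every request now carries the permission table, so its size grows with it.
print(len(json.dumps(base_input)), len(json.dumps(enriched_input)))
```

With this variant the Rego rules no longer need to replicate the default roles, but every request grows with the number of roles and permissions a user holds.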

@sbernauer sbernauer moved this to Refinement Acceptance: Waiting for in Stackable Engineering Nov 18, 2024
@sbernauer sbernauer changed the title Airflow Authorization Airflow OPA Authorization Nov 20, 2024
@siegfriedweber siegfriedweber moved this from Refinement Acceptance: Waiting for to Ready for Development in Stackable Engineering Nov 26, 2024
@siegfriedweber siegfriedweber moved this from 🔖 Ready to 🏗 In progress in Stackable Marispace-X Backlog Nov 27, 2024
@siegfriedweber siegfriedweber moved this from Ready for Development to Development: In Progress in Stackable Engineering Nov 27, 2024
@siegfriedweber siegfriedweber moved this from Next to In Progress in Stackable End-to-End Coordination Nov 27, 2024