
fab_auth_manager: allow get_user method to return the user authenticated via Kerberos #43662

Merged
merged 1 commit into from
Nov 5, 2024

Conversation

brouberol
Contributor

@brouberol brouberol commented Nov 4, 2024

The issue this PR fixes was initially discussed in #39683.

@jijoj-hmetrix and I noticed that, starting from Airflow 2.8.0, Kerberos authentication does not seem to work with the stable API. Even when a user provides a valid Kerberos ticket, the whole GSSAPI authentication dance succeeds, and the user has the required permissions, the API returns a 403 response.

$ curl --negotiate -u: -s --service-name airflow https://airflow-test.xxxx.com/api/v1/pools  | jq .
{
  "detail": null,
  "status": 403,
  "title": "Forbidden",
  "type": "https://airflow.apache.org/docs/apache-airflow/2.10.2/stable-rest-api-ref.html#section/Errors/PermissionDenied"
}

I found that airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager.get_user relies on flask-login's current_user to get the currently logged in user from the session.

However, the Kerberos auth backend stores the authenticated user in g and not in the session.

This patch allows the current user to be pulled either from g or the session, which allows the API to detect the user authenticated via Kerberos, and then link it to Fab permissions.
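
The lookup described above can be sketched roughly as follows. This is a simplified, framework-free illustration rather than the code in the PR: `User`, `ANONYMOUS`, and `resolve_user` are hypothetical names, `SimpleNamespace` stands in for Flask's `g`, and the precedence (session user first, `g.user` as a fallback) is an assumption.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class User:
    """Hypothetical stand-in for a FAB user model."""

    username: str
    is_anonymous: bool = False


# Stand-in for what a session-based lookup yields when nobody is logged in.
ANONYMOUS = User(username="", is_anonymous=True)


def resolve_user(request_globals, session_user: User) -> User:
    """Return the session user, falling back to a user stored on the request globals.

    Assumption: the session user takes precedence when it is authenticated.
    """
    g_user: Optional[User] = getattr(request_globals, "user", None)
    if session_user.is_anonymous and g_user is not None and not g_user.is_anonymous:
        # No session login, but an auth backend attached a user to the request.
        return g_user
    return session_user


if __name__ == "__main__":
    # API call authenticated by a backend that sets `user` on the request globals.
    api_request = SimpleNamespace(user=User("admin"))
    print(resolve_user(api_request, ANONYMOUS).username)

    # Regular UI request: nothing on the request globals, user comes from the session.
    ui_request = SimpleNamespace()
    print(resolve_user(ui_request, User("alice")).username)
```

With a lookup like this, a request that carries no session login but has a user attached by the auth backend is no longer treated as anonymous, which is the situation that produced the 403 above.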

Here's an example from an instance running with the patch, with an admin user associated with a User account that has Admin permissions:

$ curl --negotiate -u: -s --service-name airflow https://airflow-test.xxx.com/api/v1/pools
{
  "pools": [
    {
      "deferred_slots": 0,
      "description": "Default pool",
      "include_deferred": false,
      "name": "default_pool",
      "occupied_slots": 0,
      "open_slots": 128,
      "queued_slots": 0,
      "running_slots": 0,
      "scheduled_slots": 0,
      "slots": 128
    }
  ],
  "total_entries": 1
}

I accompany the change with a small unit test.




boring-cyborg bot commented Nov 4, 2024

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using the Breeze environment for testing locally. It's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

Contributor

@vincbeck vincbeck left a comment


Thanks for the catch and the fix!

@vincbeck
Contributor

vincbeck commented Nov 4, 2024

Some static check failures, should be easy to fix :) See documentation

…ted via Kerberos

The issue this PR fixes was initially discussed in apache#39683.

@jijoj-hmetrix and I noticed that, starting from Airflow 2.8.0, Kerberos
authentication does not seem to work with the stable API. Even when a
user provides a valid Kerberos ticket, the whole GSSAPI authentication
dance succeeds, and the user has the required permissions, the API
returns a 403 response.

```console
$ curl --negotiate -u: -s --service-name airflow https://airflow-test.xxxx.com/api/v1/pools  | jq .
{
  "detail": null,
  "status": 403,
  "title": "Forbidden",
  "type": "https://airflow.apache.org/docs/apache-airflow/2.10.2/stable-rest-api-ref.html#section/Errors/PermissionDenied"
}
```

I found that [`airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager.get_user`](https://github.com/apache/airflow/blob/baf2b3cb4453d44ff00598a3b0c42d432a7203f9/providers/src/airflow/providers/fab/auth_manager/fab_auth_manager.py#L185-L189) relies on flask-login's [current_user](https://github.com/maxcountryman/flask-login/blob/main/src/flask_login/utils.py#L25) to get the currently logged in user from the session.

However, the Kerberos auth backend stores the authenticated user [in `g`](https://github.com/brouberol/airflow/blob/main/providers/src/airflow/providers/fab/auth_manager/api/auth/backend/kerberos_auth.py#L136)
and not in the session.

This patch allows the current user to be pulled either from `g` or the session,
which allows the API to detect the user authenticated via Kerberos, and
then link it to Fab permissions.

Here's an example from an instance running with the patch, with an admin
user associated with a User account that has Admin permissions:

```console
$ curl --negotiate -u: -s --service-name airflow https://airflow-test.xxx.com/api/v1/pools
{
  "pools": [
    {
      "deferred_slots": 0,
      "description": "Default pool",
      "include_deferred": false,
      "name": "default_pool",
      "occupied_slots": 0,
      "open_slots": 128,
      "queued_slots": 0,
      "running_slots": 0,
      "scheduled_slots": 0,
      "slots": 128
    }
  ],
  "total_entries": 1
}
```

I accompany the change with a small unit test.

Signed-off-by: Balthazar Rouberol <[email protected]>
@brouberol brouberol force-pushed the fix-kerberos-api-auth branch from 416fa1a to c89676d Compare November 5, 2024 07:03
@vincbeck vincbeck merged commit d536ec4 into apache:main Nov 5, 2024
56 checks passed

boring-cyborg bot commented Nov 5, 2024

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@brouberol
Contributor Author

brouberol commented Nov 5, 2024 via email

ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
@brouberol brouberol deleted the fix-kerberos-api-auth branch November 15, 2024 08:10
@nicolasge

@brouberol do you know why the code from your PR is missing from the latest Airflow image, e.g. 2.10.3?

@brouberol
Contributor Author

brouberol commented Dec 15, 2024

@nicolasge if that's indeed the case, I wasn't aware of it, sorry.

Edit: it seems that the fix is indeed missing from 2.10.3: https://github.com/apache/airflow/blob/2.10.3/airflow/providers/fab/auth_manager/fab_auth_manager.py#L170-L174

@potiuk
Member

potiuk commented Dec 15, 2024

@brouberol see the comment in #44943: providers are always released from main. You need to check which provider version the fix was released in and install that provider version. If it was not released with 2.10.3, look at 2.10.4rc1, which is currently being voted on; it may contain a newer provider version with the fix.

Look at https://airflow.apache.org/docs/apache-airflow-providers/index.html to learn how providers vs. core work.
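
One way to see which FAB provider version an environment actually ships is to read the installed distribution metadata. A minimal sketch, assuming the provider is installed under the distribution name `apache-airflow-providers-fab`; the helper name `installed_version` is made up for illustration, and this thread does not state which provider release first contains the fix, so the result still has to be compared against the provider changelog.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    fab_version = installed_version("apache-airflow-providers-fab")
    if fab_version is None:
        print("apache-airflow-providers-fab is not installed")
    else:
        print(f"apache-airflow-providers-fab {fab_version}")
```

Running this inside the Airflow image in question reports the provider version bundled with it, independently of the Airflow core version.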

@brouberol
Contributor Author

brouberol commented Dec 15, 2024 via email

4 participants