Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

authentication: direct external OIDC provider #1632

Merged

Conversation

liouk
Copy link
Member

@liouk liouk commented May 29, 2024

This PR adds an enhancement for the implementation of using a direct external OIDC provider instead of the built-in oauth stack in OCP.

See also https://issues.redhat.com/browse/OCPSTRAT-306.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2024
@openshift-ci openshift-ci bot requested review from syed and tkashem May 29, 2024 13:11
@liouk
Copy link
Member Author

liouk commented May 29, 2024

/retest

1 similar comment
@liouk
Copy link
Member Author

liouk commented May 30, 2024

/retest

@liouk liouk force-pushed the direct-external-oidc-provider branch 6 times, most recently from 4c0e494 to d130e8c Compare June 5, 2024 13:41
@liouk liouk force-pushed the direct-external-oidc-provider branch from d130e8c to e871efd Compare June 17, 2024 09:43
@liouk liouk force-pushed the direct-external-oidc-provider branch from e871efd to 914f4c0 Compare June 19, 2024 14:15
@openshift-bot
Copy link

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 18, 2024
@openshift-bot
Copy link

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 25, 2024
@liouk
Copy link
Member Author

liouk commented Jul 29, 2024

/remove-lifecycle rotten

@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 29, 2024
@liouk liouk force-pushed the direct-external-oidc-provider branch from 914f4c0 to 98f31c3 Compare August 13, 2024 11:46
@liouk liouk changed the title WIP: authentication: direct external OIDC provider authentication: direct external OIDC provider Aug 13, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 13, 2024

## Graduation Criteria

The goal for this enhancement is to go directly to GA, because the feature is already available and in GA in HCP.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will continue through techpreview, but I'm hopeful we do it quickly.

Overall, for this feature there must be e2e tests that cover the following:
- configuring an external OIDC provider on a cluster that uses the built-in OAuth stack (good/bad configurations should be tested)
- on a cluster that uses an external OIDC provider, test reverting configuration back to the built-in OAuth stack (good/bad configurations should be tested)
- on a cluster that uses an external OIDC provider, test monitoring and cluster-authentication-operator status when the provider becomes unavailable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to be sure that the user mapping capabilities work as well.


### Implementation Details/Notes/Constraints

#### cluster-kube-apiserver-operator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these trigger kube-apiserver revisions or simply trigger an all-at-once instant rollout using live file reloads for the kube-apiserver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've agreed that this decision must be documented in this enhancement. Putting a hold until this is resolved.

/hold

@deads2k
Copy link
Contributor

deads2k commented Aug 19, 2024

looks like a good overall direction. I'll approve, but leave the lgtm with the team

/approve

Copy link
Contributor

openshift-ci bot commented Aug 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liouk liouk force-pushed the direct-external-oidc-provider branch from 818c4bb to aa34cb5 Compare September 23, 2024 14:55
@liouk
Copy link
Member Author

liouk commented Sep 23, 2024

Changing to WIP in order to clarify the open questions.

/retitle [WIP] authentication: direct external OIDC provider

@openshift-ci openshift-ci bot changed the title authentication: direct external OIDC provider [WIP] authentication: direct external OIDC provider Sep 23, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 23, 2024
2. this means that the `spec.oauthMetadata` field will be set to empty; however `status.integratedOAuthMetadata` will still have a value
3. KAS-o config observer picks up the auth type change, configures KAS pods for OIDC, triggers a rollout
4. once rollout is successful, CAO picks up the change to OIDC, removes oauth-specific resources, including `status.integratedOAuthMetadata`
5. KAS-o oauth-metadata config observer picks up the change in `status.integratedOAuthMetadata` and since `spec.oauthMetadata` is also unset, removes the oauth-metadata configmap; this results in a new config, which triggers a new rollout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mhm, I think that if you configure the external OIDC and remove the destination CM in the same step, no additional rollout will be required after cleaning up the status field, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking more about it, I believe we cannot guarantee that these two operations will be performed atomically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens after configuring the external OIDC server? Does kas use the new one, or both?

Copy link
Member Author

@liouk liouk Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you configure the external OIDC and remove the destination CM in the same step, no additional rollout will be required after cleaning up the status field, right?

Right -- that's what I've tried to achieve with the alternative proposal.

we cannot guarantee that these two operations will be performed atomically

If I understand this correctly, we could achieve atomicity if we use the same config observer to configure either the webhook or OIDC depending on the auth type, in a mutually exclusive way, i.e. configuring the one will remove configuration of the other. Indeed we cannot guarantee this will happen atomically if we use different observers.

Does kas use the new one, or both?

If we keep the webhook in the configuration and configure OIDC as well, both methods will be part of the authentication chain. If the first one in the chain fails to validate a token, then the next one will be used until it gets validated or the chain is finished. So both can be active at the same time.

One benefit of doing that, would be that we can keep the webhook up until the KAS-o has validated that OIDC works as expected; then it can mark its auth status as available, and the CAO can then clean up the oauth metadata and webhook (thus triggering a second rollout). If there's any problem with OIDC, the KAS-o won't mark OIDC as available, the CAO won't take down the webhook, and thus the webhook will keep working until the user fixes the OIDC config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back to this for some further details I learned.

If we keep the webhook in the configuration and configure OIDC as well, both methods will be part of the authentication chain.

While in theory this could be done, in practice that's not really possible due to our authentication CRD validation. The webhook can be defined only if type is not OIDC, which means that if we switch configuration to OIDC in the auth CR then we have to get rid of the webhook.

@liouk liouk force-pushed the direct-external-oidc-provider branch 7 times, most recently from 2c6b6f3 to f7772dd Compare October 8, 2024 13:38
@liouk liouk force-pushed the direct-external-oidc-provider branch from f7772dd to 13a83f2 Compare October 30, 2024 15:39
@liouk liouk changed the title [WIP] authentication: direct external OIDC provider authentication: direct external OIDC provider Oct 30, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 30, 2024
@liouk liouk force-pushed the direct-external-oidc-provider branch 3 times, most recently from 7a41c9f to 8d170a5 Compare November 5, 2024 13:56
@liouk liouk force-pushed the direct-external-oidc-provider branch from 8d170a5 to 03439ca Compare November 12, 2024 10:15
@liouk
Copy link
Member Author

liouk commented Nov 12, 2024

Fixups have been squashed, and open questions addressed; cancelling hold.

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2024
Copy link
Contributor

openshift-ci bot commented Nov 12, 2024

@liouk: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


#### Authentication Resource

The cluster's Authentication CR (`authentication.config/cluster`) must be modified and updated with the configuration of the external OIDC provider in the `OIDCProviders` field.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The cluster's Authentication CR (`authentication.config/cluster`) must be modified and updated with the configuration of the external OIDC provider in the `OIDCProviders` field.
The cluster's Authentication CR (`authentication.config/cluster`) must be modified and updated with the configuration of the external OIDC provider in the `OIDCProviders` field. The `WebhookTokenAuthenticator` field should be removed (by setting it `null`).

I add this explicitly because readers who are not aware of this may just set oidcProviders and ignore webhookTokenAuthenticator. This makes readers who attempt to run oc patch or oc edit on authentication.config/cluster the first time get:

The Authentication "cluster" is invalid: spec.webhookTokenAuthenticator: Invalid value: v1.WebhookTokenAuthenticator{KubeConfig:v1.SecretNameReference{Name:"webhook-authentication-integrated-oauth"}}: this field cannot be set with the "OIDC" .spec.type

Copy link
Contributor

@ibihim ibihim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 13, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 060ebf1 into openshift:master Nov 13, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants