feat: mutator tracing #1050

Merged
merged 14 commits into from
Jan 17, 2023

Conversation

Collaborator

@alnr alnr commented Jan 6, 2023

#1001

This slightly refactors #1049 and keeps more of the timeout behavior unchanged.

@alnr alnr requested a review from aeneasr as a code owner January 6, 2023 12:46
@alnr alnr mentioned this pull request Jan 6, 2023
7 tasks

codecov bot commented Jan 6, 2023

Codecov Report

Merging #1050 (ddda29c) into master (8f42940) will decrease coverage by 0.35%.
The diff coverage is 69.23%.

@@            Coverage Diff             @@
##           master    #1050      +/-   ##
==========================================
- Coverage   78.09%   77.74%   -0.36%     
==========================================
  Files          83       83              
  Lines        3977     3841     -136     
==========================================
- Hits         3106     2986     -120     
+ Misses        593      578      -15     
+ Partials      278      277       -1     
Impacted Files                          Coverage Δ
driver/registry_memory.go               93.75% <ø> (+3.92%) ⬆️
x/registry.go                           0.00% <0.00%> (ø)
pipeline/mutate/mutator_hydrator.go     67.82% <81.81%> (+1.73%) ⬆️
internal/driver.go                      77.77% <0.00%> (-22.23%) ⬇️

@alnr alnr self-assigned this Jan 6, 2023
Member

@aeneasr aeneasr left a comment

Just one minor thing for tests otherwise looks good!

pipeline/authn/authenticator_oauth2_introspection_test.go (outdated, resolved)
Collaborator Author

alnr commented Jan 9, 2023

  • updated config.schema.json
  • bumped ory/x
  • reverted test change (that snuck in during 7c7f21f)

pipeline/mutate/mutator_hydrator.go (outdated, resolved)
if len(cfg.Api.Retry.MaxDelay) > 0 {
	maxRetryDelay := time.Second
Member

IMO this needs to be moved one scope up, like it was before, so that the default is also set when the config is not set.

Member

And we need a test for this, so e.g. only setting cfg.Api.Retry.MaxDelay, and expecting the giveUpAfter default to be set.

Collaborator Author

After looking at this in more detail, the retry config was completely broken in 2a97e05.

I'll revisit this...

Collaborator Author

Alrighty... the retry configuration is broken all over Oathkeeper. Here's my understanding of what is currently happening:

  • The give_up_after property specifies the per-retry timeout. That's surprising to me, since it sounds like it means the overall timeout, including retries.
  • The max_delay specifies the maximum backoff wait time in between retries.
  • The maximum number of retries is not configurable, and is always 4 (the default in hashicorp/retryablehttp).
  • This config migration is incorrect. It thinks give_up_after is the overall timeout (including retries).

Not sure now is the time to fix this fully, since that could also break users.

To move this PR over the line, I propose keeping as much backward compatibility as possible. Previously, this particular mutator had no overall timeout and performed no retries if the retry config was not specified at all. If the retry config was specified (even as {}), it performed up to 4 retries. If the retry config was empty ({}), the timeouts also changed.

This patch keeps that behavior, while fixing bugs:

  • the fallback timeouts (retry config present but empty) were wrong: 50ms timeout and 1s delay, even though the config schema promises a more reasonable 1s timeout and 100ms delay. Those defaults now at least work as advertised.
  • faulty error messages

We should consider revamping the timeout handling all over Oathkeeper in a follow-up ticket.

Thoughts? @zepatrik @aeneasr @daviddelucca ?

Contributor

This PR is about enabling tracing on the mutator. I think we could merge it and open another ticket/branch to revamp the timeouts.

Does that make sense?

pipeline/mutate/mutator_hydrator.go (outdated, resolved)
spec/config.schema.json (outdated, resolved)
spec/config.schema.json (outdated, resolved)
Member

@zepatrik zepatrik left a comment

Nice, I agree to merge this as is and fix the other issues you found in a separate PR 👍

zepatrik previously approved these changes Jan 11, 2023
@alnr alnr marked this pull request as draft January 12, 2023 23:58
Collaborator Author

alnr commented Jan 12, 2023

This needs more work to synchronize the tracing config schemata with ory/x.

Collaborator Author

alnr commented Jan 13, 2023

I've synced the tracing config schema with the one from ory/x. Initially, I tried to reference it by URL, but that (logically) requires Oathkeeper to load the remote schema on startup, which is often not possible due to network segmentation.
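
As a rough illustration of that trade-off, a loader that only resolves schemas shipped with the binary might look like the sketch below. Everything in it is hypothetical: the `local://tracing-config` identifier, the schema body, and the `loadSchema` helper are made up for this example and are not how Oathkeeper or ory/x load schemas.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bundledSchemas maps a schema identifier to a copy compiled into the binary.
// Both the identifier and the schema body are made up for this sketch.
var bundledSchemas = map[string]string{
	"local://tracing-config": `{"type": "object", "properties": {"provider": {"type": "string"}}}`,
}

// loadSchema resolves ref from the bundled copies only. It never performs a
// network request, so startup does not depend on outbound connectivity.
func loadSchema(ref string) (map[string]interface{}, error) {
	raw, ok := bundledSchemas[ref]
	if !ok {
		return nil, fmt.Errorf("schema %q is not bundled and will not be fetched remotely", ref)
	}
	var schema map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &schema); err != nil {
		return nil, fmt.Errorf("decoding bundled schema %q: %w", ref, err)
	}
	return schema, nil
}
```

The cost of this approach is the one mentioned above: the bundled copy has to be kept in sync with its upstream source by hand.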

@daviddelucca can you try this branch, please?

@alnr alnr marked this pull request as ready for review January 13, 2023 15:43
@aeneasr aeneasr merged commit f74e8e8 into master Jan 17, 2023
@aeneasr aeneasr deleted the refactor/mutator-tracing branch January 17, 2023 09:32