-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Be smarter about retries #409
Comments
Thanks for filing this @embano1 |
It seems this might be the standard among implementations right now |
I wonder why this is overwritten at all and not the default option used? https://github.com/cloudevents/sdk-go/blob/c6de9589bb3578364689adc64414000dabe93c01/v2/protocol/http/protocol.go#L40 This looks like it does what we need (although retrying on 500 would be nice too, I may want to file a PR for the SDK)? cloudevents/sdk-go#718 |
Yeah, it definitely seems like something the sdk should be handling |
just in case you haven't seen this: https://github.com/knative/eventing/blob/main/pkg/kncloudevents/retries.go and this: knative/eventing#5285 |
@lionelvillard are you suggesting we should be using kncloudevents as client? |
I'm just raising awareness, not suggesting anything :-) kncloudevents is meant to be Knative eventing compliant so maybe it makes sense for you to use it. |
@benmoss thoughts? |
Gotcha! So IMHO we have couple of options here (most, if not all of them breaking):
|
I'm confused what the desired behavior even is. We have a bunch of different definitions of when to retry and not, but not much clarity on which one is correct. I think I'd be most comfortable for us to follow the the eventing spec, but happy to hear arguments for alternative views. |
I agree Ben, we should align with the Knative Eventing spec then - even though I'm not 100% on the same page with some decisions here. But this also means we have to implement our own retry code mapping [1] since the CloudEvents SDK has a diverging (from the above spec) set of defaults: var defaultRetriableErrors = map[int]string{
404: "Not Found",
413: "Payload Too Large",
425: "Too Early",
429: "Too Many Requests",
502: "Bad Gateway",
503: "Service Unavailable",
504: "Gateway Timeout",
} [1] |
The dispatcher currently considers all !2xx || !3xx status codes as retryable, which is somewhat against the HTTP protocol spec:
https://github.com/knative-sandbox/eventing-rabbitmq/blob/928976461374266e0c029bc61719c68b3dffad8b/pkg/dispatcher/dispatcher.go#L160
I'm happy to file a PR if we get consensus on which codes we want to retry. Related work:
The text was updated successfully, but these errors were encountered: