Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interceptor not invoked when connection refused #6143

Closed
SamBarker opened this issue Jul 17, 2024 · 11 comments · Fixed by #6144
Closed

Interceptor not invoked when connection refused #6143

SamBarker opened this issue Jul 17, 2024 · 11 comments · Fixed by #6144
Assignees
Milestone

Comments

@SamBarker
Copy link
Contributor

SamBarker commented Jul 17, 2024

Describe the bug

I'm trying to migrate the Apache Flink kubernetes operator from using an OkHttp interceptor to using the new fabric8 http interceptor API. Which seems to have gone reasonably smoothly apart from one final test: https://github.com/apache/flink-kubernetes-operator/blob/aaf26aec8bdfedff0116c5897bd89b17b4454606/flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/KubernetesClientMetricsTest.java#L376

Which is failing because afterFailure is not invoked when there is no http response (I think). I've added a failing unit test in kubernetes-client which demonstrates what I expected to work.

So I'm wondering if the test I've written should work? Or have I missed something in how the API is expected to be used

Fabric8 Kubernetes Client version

SNAPSHOT

Steps to reproduce

  1. checkout https://github.com/SamBarker/kubernetes-client/tree/afterFailureInvocation
  2. run io.fabric8.kubernetes.client.http.AbstractInterceptorTest#afterHttpFailureRemoteOffline (I've been testing with the OkHttp client but I don't think it matters)
  3. see test fail

Expected behavior

The test passes

Runtime

Kubernetes (vanilla)

Kubernetes API Server version

1.25.3@latest

Environment

macOS

Fabric8 Kubernetes Client Logs

No response

Additional context

No response

@manusa
Copy link
Member

manusa commented Jul 17, 2024

/cc @shawkins

@manusa
Copy link
Member

manusa commented Jul 17, 2024

AFAIR the interceptors (Fabric8) are applicable once the connection is performed. For the other cases I kind of remember we had other callbacks, but I need to check.

@manusa manusa moved this to Planned in Eclipse JKube Jul 17, 2024
@shawkins
Copy link
Contributor

AFAIR the interceptors (Fabric8) are applicable once the connection is performed. For the other cases I kind of remember we had other callbacks, but I need to check.

That is correct, the fabric8 interceptors make this callback based upon the response code. The retry logic will look for IOExceptions, but does not invoke the interceptors.

I don't think we had a usecase where the handling of this situation needed delegated to an interceptor - it was assumed that the retry logic was all that was needed. If I remember correctly some of the initial attempt at this logic was to have a retry interceptor, but in the end it was easier to separate the concepts.

So at the very least this would require some kind of an enhancement, or could your operator rely on just the built-in retry behavior?

@SamBarker
Copy link
Contributor Author

I don't think we had a usecase where the handling of this situation needed delegated to an interceptor - it was assumed that the retry logic was all that was needed. If I remember correctly some of the initial attempt at this logic was to have a retry interceptor, but in the end it was easier to separate the concepts.

So at the very least this would require some kind of an enhancement, or could your operator rely on just the built-in retry behavior?

As far as I'm aware the retry behaviour is fine, the current OkHttp interceptor is purely for tracking metrics of failed and successful requests. The Fabric8 interceptor gives everything I think it needs for metrics when there is a HTTP response.

It feels a bit awkward that before is invoked for every retry but after|afterFailure is only invoked on HTTP response.

What is the lifecycle of request.id()? Is it regenerated on each retry or is it stable? If its stable I can possibly catch the missing failures...

@SamBarker
Copy link
Contributor Author

I've confirmed that request.id is updated for each retry so I can't fudge it by detecting the repeated ID and incrementing the failed counter.

@shawkins
Copy link
Contributor

I've confirmed that request.id is updated for each retry so I can't fudge it by detecting the repeated ID and incrementing the failed counter.

Sounds like you'll want an enhancement then for a new callback - it would be simplest if it were just a notification method. It seems like a bigger effort to allow the interceptor to handle retry logic as well.

@SamBarker
Copy link
Contributor Author

Yeah I don't have a use case for doing anything to the behaviour of the retry.

I think it looks relatively straightforward would you be happy if I open a PR? Or is there a design type process to go through?

@manusa
Copy link
Member

manusa commented Jul 19, 2024

I think it looks relatively straightforward would you be happy if I open a PR?

Sure, I was going to provide a fix myself, but it's always better to have community contributions. A PR would be much appreciated 🙌

Or is there a design type process to go through?

Maybe we can just discuss the implementation details, but it all seems quite straightforward.

As I see it:

As far as the Interceptor interface goes I think we just need to add the new callback method.

I'd suggest something like:

// Interceptor.java

  /**
   * Called in case the initial HTTP connection request fails after the request has been sent.
   * 
   * @param request the HTTP request.
   * @param failure the Java exception that caused the failure.
   */
  default void afterConnectionFailure(HttpRequest request, Throwable failure) {
  }

Then, implement the handling in the StandardHttpClient class within the consumeBytesOnce method:

// StandardHttpClient.java
    cf = cf.exceptionally(e -> {
      builder.getInterceptors().values().forEach(i -> i.afterConnectionFailure(effectiveRequest, e));
      if (e instanceof CompletionException) {
        throw (CompletionException) e;
      }
      throw new CompletionException(e);
    });
  • Provides a callback for exception but no options for further processing
  • Not sure if this might also get called in other situations (maybe the afterConnectionFailure is not a suitable method name)
  • It's also handled by the retry, so it will be called multiple times (once per retry)

Thoughts?

@SamBarker
Copy link
Contributor Author

Thanks @manusa that very much fits with what I was thinking.

Should this call back accept RequestTags as an argument? Or is that only relevant when using the builder interfaces? Similarly we don't need to try and distinguish between websocket and non-websocket connections, do we?

Should this interface be invoked when a websocket connection is broken? If not maybe the name should be afterInitialConnectionFailure.

SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 21, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 21, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 21, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 22, 2024
@manusa
Copy link
Member

manusa commented Jul 22, 2024

Should this call back accept RequestTags as an argument? Or is that only relevant when using the builder interfaces?

The idea of the method I proposed is to be invoked as a consumer callback whenever there's a connection failure without providing any output or outcome.

I think RequestTags are only used in case you need to modify the request to create a new connection. IMO, I don't think it's necessary, but I don't really have a use-case for this.

In any case, it's always easy to add extra-arguments to an interface method, but hard to remove them once the API is public, so we can always add it in the future.

Similarly we don't need to try and distinguish between websocket and non-websocket connections, do we?

The current Interceptor definition allows to distinguish failures for non-websocket failures, but doesn't provide distinction for web-socket failures.

This method is supposed to be called before any connection negotiation happens, so I'm not sure why would the user need a distinction between one or the other.

Should this interface be invoked when a websocket connection is broken?

Yes.

I initially suggested to bind the call to the new interface method in the StandardHttpClient.consumeBytesOnce method, but this will only invoke it in case of connection failures for regular HTTP connections.

So it's probably better to bind it in the StandardHttpClient.retryWithExponentialBackoff method or in both the consumeBytesOnce and buildWebSocket methods.

If not maybe the name should be afterInitialConnectionFailure.

Websocket connections are negotiated through HTTP and then upgraded.

@SamBarker
Copy link
Contributor Author

I've got the tests passing on https://github.com/fabric8io/kubernetes-client/pull/6144/files as discussed above. It's mostly done modulo a changelog entry (still in draft as I was working without agreement here :) )

I'll moving the failure handling to StandardHttpClient.retryWithExponentialBackoff and see how I get on.

I think RequestTags are only used in case you need to modify the request to create a new connection. IMO, I don't think it's necessary, but I don't really have a use-case for this.

Yeah that fits my understanding of how they are used but did want to clarify.

Websocket connections are negotiated through HTTP and then upgraded.

Yes, what I was getting at though is if its not invoked when an established connection fails then its only an initial connection handler and not handling all connection failures. That said the intention is to move it so this is moot.

SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 22, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 24, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Jul 24, 2024
@rohanKanojia rohanKanojia moved this from Planned to In Progress in Eclipse JKube Jul 25, 2024
SamBarker added a commit to SamBarker/kubernetes-client that referenced this issue Aug 8, 2024
@manusa manusa added this to the 7.0.0 milestone Aug 9, 2024 — with automated-tasks
@manusa manusa closed this as completed in 8147542 Aug 9, 2024
@github-project-automation github-project-automation bot moved this from Review to Done in Eclipse JKube Aug 9, 2024
manusa pushed a commit that referenced this issue Aug 9, 2024
…ed connection attempts (6144)

Add test which demonstrates what I expect to work
---
Introduce afterConnectionFailure to complement  `after` & `afterFailure`.

Fixes #6143
---
Add a bit more explanatory Javadoc to Interceptor
---
Extract private method to ensure changes to DefaultMockServer usage are consistent.
---
Add test which covers future being completed exceptionally with a CompletionException.

This happens in the Jetty implementation but  gets wrapped in an IOException by the OkHttp impl but this should be enough for the coverage checker.
---
Move afterConnectionFailure callback to `retryWithExponentialBackoff` so its invoked when an established websocket connection fails
---
restart the mock server after a connection failure is detected to speed up tests.

Rather than waiting ~20s to give up retrying.
---
Add changelog entry
---
Clarify javadoc
---
@see -> @link
---
Add unit test to ensure we unwrap CompletionExceptions.

Really just making the coverage checker happy.
---
Drop exception handling test from AbstractInterceptorTest as its now better covered elsewhere.

(cherry picked from commit 8147542)
Signed-off-by: Marc Nuri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants