Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Timeout for token retrievals in consequent failures #158

Closed
orenbm opened this issue Sep 9, 2020 · 4 comments
Closed

Fix Timeout for token retrievals in consequent failures #158

orenbm opened this issue Sep 9, 2020 · 4 comments

Comments

@orenbm
Copy link
Member

orenbm commented Sep 9, 2020

Summary

After we get the access token into the shared volume, we hit the reset button on the expBackoff clock
before waiting the token refresh timeout. This means that the next
interval will start 6 minutes (by default) into the clock and only one
cycle will run (as we exceed the max elapsed time of 2 minutes).

We should reset the clock after the timeout waits to ensure that the next cycle
will run for at most 2 minutes as intended.

Here is a log that shows the failure, in which we run only once before crashing (logs are written from bottom to top):

{"log":"ERROR: 2020/09/07 07:58:57 main.go:76: CAKC031E Retransmission backoff exhausted\n","stream":"stderr","time":"2020-09-07T07:58:57.784828076Z"}
{"log":"ERROR: 2020/09/07 07:58:57 main.go:48: CAKC016E Failed to authenticate\n","stream":"stderr","time":"2020-09-07T07:58:57.784823538Z"}
{"log":"ERROR: 2020/09/07 07:58:57 authenticator.go:233: CAKC027E Failed to send https authenticate request or receive response. Reason:...\n","stream":"stderr","time":"2020-09-07T07:58:57.784781083Z"}
{"log":"INFO: 2020/09/07 07:58:47 requests.go:47: CAKC012I Authn request to: ...\n","stream":"stdout","time":"2020-09-07T07:58:47.784448565Z"}
{"log":"INFO: 2020/09/07 07:58:47 authenticator.go:183: CAKC010I Buffer time:  30s\n","stream":"stdout","time":"2020-09-07T07:58:47.783685174Z"}
{"log":"INFO: 2020/09/07 07:58:47 authenticator.go:182: CAKC009I Current date: 2020-09-07 07:58:47.783493151 +0000 UTC\n","stream":"stdout","time":"2020-09-07T07:58:47.783680325Z"}
{"log":"INFO: 2020/09/07 07:58:47 authenticator.go:181: CAKC008I Cert expires: 2020-09-09 05:28:18 +0000 UTC\n","stream":"stdout","time":"2020-09-07T07:58:47.783676009Z"}
{"log":"INFO: 2020/09/07 07:58:47 main.go:45: CAKC006I Authenticating as user '...'\n","stream":"stdout","time":"2020-09-07T07:58:47.7836441Z"}
{"log":"\n","stream":"stdout","time":"2020-09-07T07:52:47.783548198Z"}
{"log":"INFO: 2020/09/07 07:52:47 main.go:63: CAKC013I Waiting for 6m0s to re-authenticate\n","stream":"stdout","time":"2020-09-07T07:52:47.783544229Z"} 

You can see that we start a new wait (CAKC013I) and this happens after we run expBackoff.Reset(). So we are supposed to run a new backoff cycle that will run for up to 2 minutes before crashing. However, we run only one time and finish the backoff before writing CAKC031E Retransmission backoff exhausted to the log.

Steps to Reproduce

Steps to reproduce the behavior:

  1. Run the authn-client as a side car
  2. Fail the second interval

Expected Results

The authenticator retries to authenticate

Actual Results (including error logs, if applicable)

The authenticator exists with failure after one try

@orenbm orenbm changed the title Second interval Fix Timeout for token retrievals in consequent failures Sep 9, 2020
orenbm added a commit that referenced this issue Sep 13, 2020
### Fixed
- Logs now correctly print only the Conjur identity without the policy branch prefix.
  ([#126](#126))
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  ([#158](#158))

### Changed
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [#119](#119)
orenbm added a commit that referenced this issue Sep 13, 2020
### Fixed
- Logs now correctly print only the Conjur identity without the policy branch prefix.
  ([#126](#126))
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  ([#158](#158))

### Changed
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [#119](#119)
orenbm added a commit that referenced this issue Sep 13, 2020
### Fixed
- Logs now correctly print only the Conjur identity without the policy branch prefix.
  ([#126](#126))
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  ([#158](#158))

### Changed
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [#119](#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
orenbm added a commit to cyberark/secretless-broker that referenced this issue Oct 8, 2020
This version introduces some changes that we can benefit from, especially these:
- Errors in the certificate injection process on login are now printed to the client logs.
  [cyberark/conjur-authn-k8s-client#/170](cyberark/conjur-authn-k8s-client#170)
- When authentication fails, the exponential backoff retry is correctly reset so
  that it will continue to attempt to authenticate until backoff is exhausted.
  [cyberark/conjur-authn-k8s-client#158](cyberark/conjur-authn-k8s-client#158)
- Wait slightly for the client certificate file to exist after login before
  raising an error.
  [cyberark/conjur-authn-k8s-client#119](cyberark/conjur-authn-k8s-client#119)
@orenbm
Copy link
Member Author

orenbm commented Oct 20, 2020

fixed by #157

@orenbm orenbm closed this as completed Oct 20, 2020
@alexkalish
Copy link

@orenbm: Could you please explain this fix from the user/customer's perspective? I don't understand the details of the K8s token refresh feature/logic and need to detail this in the release notes. Also, is this logic captured in our docs anywhere? I don't see anything about it here. Thanks!

@orenbm
Copy link
Member Author

orenbm commented Nov 10, 2020

i don't think the logic is captured in the docs. Our authn-k8s docs are lacking a lot of details on the flow.

What are you missing here? I was about to write the fix but it won't be different than what is explained in the description above and in the PR description.

CAn you please elaborate. on what is not clear?

@alexkalish
Copy link

@orenbm: I just find the description a little confusing. For example, "expBackoff clock" seems like an implementation detail. The clock isn't exposed to users, right? And the rest of the description is written around this clock. Why would someone care about this fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants