Rework agent retry config, extend it to cover proxy cache as well #11113

Merged
ncabatoff merged 11 commits into master from rework-agent-retry-config
Mar 18, 2021

Collaborator

@ncabatoff ncabatoff commented Mar 16, 2021

Remove template_retry config section. Add new vault.retry section which only has num_retries field; if num_retries is 0 or absent, default it to 12 for backwards compat.

Configured retries are used for both templating and the API proxy. If template requests go through the proxy (which currently requires persistence to be enabled), we only configure retries for the proxy, to avoid duplicate retrying. Some duplicate retrying already exists, though: whenever the template server retries without going through the proxy, the Vault client it uses allows two behind-the-scenes retries for some 400/500 HTTP error codes.
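To make the placement rule in the description concrete, here is a minimal Go sketch. It is not the code from this PR: `retryPlan` and `planRetries` are hypothetical names introduced only for illustration, and the sketch assumes the single configured `num_retries` value is handed to the proxy in all cases and to the template server only when templates bypass the proxy.

```go
package main

import "fmt"

// retryPlan is a hypothetical type for this sketch: it records how many
// retries each component ends up with.
type retryPlan struct {
	TemplateRetries int
	ProxyRetries    int
}

// planRetries is a hypothetical helper, not a function from the PR.
// The proxy always gets the configured retries. The template server gets
// them only when its requests do not go through the proxy, so the same
// request is not retried at two layers.
func planRetries(numRetries int, templatesUseProxy bool) retryPlan {
	plan := retryPlan{TemplateRetries: numRetries, ProxyRetries: numRetries}
	if templatesUseProxy {
		plan.TemplateRetries = 0
	}
	return plan
}

func main() {
	fmt.Printf("%+v\n", planRetries(12, false)) // templates talk to Vault directly
	fmt.Printf("%+v\n", planRetries(12, true))  // templates go through the proxy
}
```

The sketch deliberately ignores the Vault client's own behind-the-scenes retries mentioned above, which sit below both layers.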

Missing: tests for proxy retries; tests for retries with persistent cache.
@@ -193,8 +188,19 @@ func LoadConfig(path string) (*Config, error) {
return nil, errwrap.Wrapf("error parsing 'vault':{{err}}", err)
}

if err := parseRetry(result, list); err != nil {
return nil, errwrap.Wrapf("error parsing 'retry': {{err}}", err)
if result.Vault == nil {
Contributor

Can this ever be nil if we're always assigning it within parseVault (L236, result.Vault = &v)?

Collaborator Author

It can be nil if there's no vault{} stanza.

Contributor

Ah, I see. We could perform the assignment first within that func (similar to what we do in here), but I'll leave that up to you.

Comment on lines +196 to +203
if result.Vault.Retry == nil {
result.Vault.Retry = &Retry{}
}
switch result.Vault.Retry.NumRetries {
case 0:
result.Vault.Retry.NumRetries = ctconfig.DefaultRetryAttempts
case -1:
result.Vault.Retry.NumRetries = 0
Contributor

Can we move this logic into parseRetry?

Collaborator Author

I had it there initially, I just moved it out in the most recent commits so that we could have a consistent default behaviour even when there's no vault{} or no vault.retry{} stanza.
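As a rough illustration of the point above, the following self-contained Go sketch applies the defaults after parsing, so the result is the same whether or not a vault{} or vault.retry{} stanza was present. The `Config`, `Vault`, and `Retry` types here are simplified stand-ins for the real ones, `applyRetryDefaults` is a hypothetical name, and `defaultRetryAttempts` stands in for `ctconfig.DefaultRetryAttempts`, assumed to be 12 based on the PR description.

```go
package main

import "fmt"

// defaultRetryAttempts stands in for ctconfig.DefaultRetryAttempts.
// The value 12 is taken from the PR description, not from the library source.
const defaultRetryAttempts = 12

// Simplified stand-ins for the agent config types (assumptions for this sketch).
type Retry struct{ NumRetries int }
type Vault struct{ Retry *Retry }
type Config struct{ Vault *Vault }

// applyRetryDefaults is a hypothetical helper. It fills in missing stanzas
// first, then maps the user-facing values: 0 or absent means "use the
// default", and -1 means "disable retries".
func applyRetryDefaults(c *Config) {
	if c.Vault == nil {
		c.Vault = &Vault{}
	}
	if c.Vault.Retry == nil {
		c.Vault.Retry = &Retry{}
	}
	switch c.Vault.Retry.NumRetries {
	case 0:
		c.Vault.Retry.NumRetries = defaultRetryAttempts
	case -1:
		c.Vault.Retry.NumRetries = 0
	}
}

func main() {
	noStanza := &Config{}
	disabled := &Config{Vault: &Vault{Retry: &Retry{NumRetries: -1}}}
	custom := &Config{Vault: &Vault{Retry: &Retry{NumRetries: 5}}}
	for _, c := range []*Config{noStanza, disabled, custom} {
		applyRetryDefaults(c)
		fmt.Println(c.Vault.Retry.NumRetries)
	}
}
```

Running the defaults outside the stanza parser is what lets the empty `Config{}` case end up with the same default as a config that has an empty retry block.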

Contributor

That's fair. I was trying to think if we could keep the assignment/modification of result.Vault.Retry within its parseRetry func.

Contributor

@calvn calvn left a comment

Looking good! Left a few additional minor comments.

command/agent_test.go (outdated, resolved)
command/agent.go (outdated, resolved)
Comment on lines +252 to +253
// is enabled. The templating engine and the cache have some inconsistencies
// that need to be fixed for 1.7x/1.8
Contributor

Can you expand on the specific inconsistencies (backoff behavior?), and perhaps turn the comment into a // TODO?

Collaborator Author

I don't know what they are; I'm just citing a Slack convo with Jason. I put this comment in here because I was surprised that persistence was related to template use of the cache, and decided to include his reply for the sake of others who might be similarly confused.

Co-authored-by: Calvin Leung Huang <[email protected]>
Contributor

@calvn calvn left a comment

Can you add a CL entry for this change, and maybe backlink/mention the PR that added template_retry in this PR's description?

Contributor

@jasonodonnell jasonodonnell left a comment

LGTM and tested great with persistent caching.

@ncabatoff
Collaborator Author

Can you add a CL entry for this change, and maybe backlink/mention the PR that added template_retry in this PR's description?

Will do

@ncabatoff ncabatoff added this to the 1.7 milestone Mar 18, 2021
@ncabatoff ncabatoff merged commit 2548414 into master Mar 18, 2021
@ncabatoff ncabatoff deleted the rework-agent-retry-config branch March 18, 2021 18:14
ncabatoff added a commit that referenced this pull request Mar 18, 2021
…1113)

Remove template_retry config section.  Add new vault.retry section which only has num_retries field; if num_retries is 0 or absent, default it to 12 for backwards compat with pre-1.7 template retrying.  Setting num_retries=-1 disables retries.

Configured retries are used for both templating and api proxy, though if template requests go through proxy (currently requires persistence enabled) we'll only configure retries for the latter to avoid duplicate retrying.  Though there is some duplicate retrying already because whenever the template server does a retry when not going through the proxy, the Vault client it uses allows for 2 behind-the-scenes retries for some 400/500 http error codes.
ncabatoff added a commit that referenced this pull request Mar 19, 2021
…1113) (#11136)

@calvn
Contributor

calvn commented Jun 4, 2021

A side-effect from this PR that I noticed while working on #11711 is that moving template retry to be performed by the cache's client doesn't always behave the same as the internal retry via ctconfig.RetryConfig.Attempts. The latter will work for 400 response code errors returned by Vault, but the former will simply return the error immediately since go-retryablehttp will only retry on connection failures or 500's.

2021/06/04 19:46:57.039001 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:46:57.039985 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 1 after "250ms")
2021/06/04 19:46:57.293383 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:46:57.293407 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:46:57.295581 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 2 after "500ms")
2021/06/04 19:46:57.797885 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:46:57.797924 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:46:57.800669 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 3 after "1s")
2021/06/04 19:46:58.800856 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:46:58.800892 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:46:58.803075 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 4 after "2s")
2021/06/04 19:47:00.803449 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:47:00.803507 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:47:00.805684 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 5 after "4s")
2021/06/04 19:47:04.809820 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:47:04.809856 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:47:04.812631 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 6 after "8s")
2021/06/04 19:47:12.813067 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:47:12.813095 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:47:12.815464 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 7 after "16s")
2021/06/04 19:47:28.820148 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:47:28.820189 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:47:28.822619 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 8 after "32s")
2021/06/04 19:48:00.827528 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:48:00.827589 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:48:00.830290 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 9 after "1m0s")
2021/06/04 19:49:00.832486 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:49:00.832664 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:49:00.835545 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 10 after "1m0s")
2021/06/04 19:50:00.840056 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:50:00.840077 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:50:00.842005 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 11 after "1m0s")
2021/06/04 19:51:00.843680 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:51:00.843708 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:51:00.846842 [WARN] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (retry attempt 12 after "1m0s")
2021/06/04 19:52:00.850004 [TRACE] (view) vault.read(secret/data/invalid) starting fetch
2021/06/04 19:52:00.850043 [TRACE] vault.read(secret/data/invalid): GET /v1/secret/data/invalid
2021/06/04 19:52:00.852776 [ERR] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (exceeded maximum retries)
2021/06/04 19:52:00.852791 [ERR] (runner) watcher reported error: vault.read(secret/data/invalid): no secret exists at secret/data/invalid
2021-06-04T12:52:00.852-0700 [ERROR] template.server: template server error: error="vault.read(secret/data/invalid): no secret exists at secret/data/invalid"
2021/06/04 19:52:00.852808 [INFO] (runner) stopping
2021/06/04 19:52:00.852810 [DEBUG] (runner) stopping watcher
2021/06/04 19:52:00.852812 [DEBUG] (watcher) stopping all views
2021/06/04 19:52:00.852817 [TRACE] (watcher) stopping vault.read(secret/data/invalid)

vs

2021-06-04T13:00:41.003-0700 [INFO]  cache: received request: method=GET path=/v1/secret/data/invalid
2021-06-04T13:00:41.003-0700 [DEBUG] cache.leasecache: forwarding request: method=GET path=/v1/secret/data/invalid
2021-06-04T13:00:41.003-0700 [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secret/data/invalid
2021-06-04T13:00:41.003-0700 [DEBUG] cache.apiproxy.client: performing request: method=GET url=http://localhost:8200/v1/secret/data/invalid
[ERR] (view) vault.read(secret/data/invalid): no secret exists at secret/data/invalid (exceeded maximum retries)
[ERR] (runner) watcher reported error: vault.read(secret/data/invalid): no secret exists at secret/data/invalid
[INFO] (runner) stopping
[DEBUG] (runner) stopping watcher
[DEBUG] (watcher) stopping all views
[TRACE] (watcher) stopping vault.read(secret/data/invalid)
2021-06-04T13:00:41.004-0700 [INFO]  template.server: template server stopped
