[licensing] add license fetcher cache #170006

Merged
merged 11 commits into elastic:main on Oct 30, 2023

@pgayvallet pgayvallet commented Oct 27, 2023

Summary

Related to #169788
Fix #117394

@pgayvallet pgayvallet added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:License v8.12.0 release_note:skip Skip the PR/issue when compiling release notes labels Oct 27, 2023
@@ -29,7 +30,7 @@ export function createLicenseUpdate(
 ) {
   const manuallyRefresh$ = new Subject<void>();

-  const fetched$ = merge(triggerRefresh$, manuallyRefresh$).pipe(
+  const fetched$ = merge(triggerRefresh$, manuallyRefresh$.pipe(debounceTime(1000))).pipe(

@gsoldevila gsoldevila Oct 27, 2023

Some plugins call licensing.refresh() (exposed on the licensing plugin contract) and await the returned license.

Using debounceTime here might have a negative impact, as it systematically adds an extra 1-second delay to those calls. throttleTime is probably a better alternative, but I have to confirm it works, as the description says subsequent emissions are "ignored".
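
To make the delay concern concrete, here is a minimal sketch in plain TypeScript that models the timing of debounceTime(1000) over a list of timestamped events. It is a simplified model, not the RxJS implementation: `TimedValue` and `simulateDebounceTime` are hypothetical names introduced for illustration, and the model assumes the source stays open after the last event.

```typescript
// Hypothetical helper type: a value and the time (ms) at which it occurs.
interface TimedValue {
  time: number;
  value: string;
}

// Simplified model of debounceTime(dueTime): a value is emitted only once
// `dueTime` ms have passed without a newer value arriving.
// Assumption: the source never completes, so the last value also waits.
function simulateDebounceTime(source: TimedValue[], dueTime: number): TimedValue[] {
  const out: TimedValue[] = [];
  let pending: TimedValue | null = null;
  for (const event of source) {
    // The previous value is emitted only if its quiet period elapsed
    // before this event arrived; otherwise it is replaced.
    if (pending !== null && pending.time + dueTime <= event.time) {
      out.push({ time: pending.time + dueTime, value: pending.value });
    }
    pending = event;
  }
  if (pending !== null) {
    out.push({ time: pending.time + dueTime, value: pending.value });
  }
  return out;
}

// A single manual refresh at t=0 only reaches the fetcher at t=1000.
const singleRefresh = simulateDebounceTime([{ time: 0, value: "refresh" }], 1000);
console.log(singleRefresh); // [{ time: 1000, value: "refresh" }]
```

In this model even an isolated call pays the full delay, which matches the concern raised for callers awaiting licensing.refresh().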

Contributor Author

Yeah, good catch, but actually both are problematic. Using throttleTime will ignore the subsequent emissions, meaning that the wait for a refresh could be up to 30s (the interval-based refresh).

TBH, I only added this because I thought it would be a quick win. As we've seen, the use of exhaustMap is likely sufficient.

So probably the correct thing to do is to revert this change and keep the code as it was, wdyt?
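
As a rough illustration of why exhaustMap alone may already limit redundant fetches, the sketch below models its behavior in plain TypeScript: a trigger starts a fetch only when no fetch is in flight, and triggers arriving in the meantime are dropped. This is a simplified timing model, not the RxJS operator or the licensing plugin's code; `simulateExhaustMap` and the fixed fetch duration are assumptions made for illustration.

```typescript
// Returns the trigger times that actually start a fetch under
// exhaustMap-like semantics. Assumption: every fetch takes exactly
// `fetchDurationMs`, and a trigger arriving at the exact moment a fetch
// ends is treated as arriving after it.
function simulateExhaustMap(triggerTimes: number[], fetchDurationMs: number): number[] {
  const started: number[] = [];
  let busyUntil = -Infinity;
  for (const t of triggerTimes) {
    if (t >= busyUntil) {
      started.push(t);
      busyUntil = t + fetchDurationMs;
    }
    // Otherwise a fetch is still in flight and this trigger is ignored.
  }
  return started;
}

// Three refreshes in quick succession result in a single fetch.
console.log(simulateExhaustMap([0, 100, 200, 1500], 500)); // [0, 1500]
```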

Contributor

I tested throttleTime and it does indeed ignore emissions, so it's not good either.
The picture on their website is misleading, as it suggests that events happening during the "throttle period" are taken into account.

@gsoldevila gsoldevila Oct 27, 2023

UPDATE: throttleTime has a 3rd parameter, configuration, that serves precisely this purpose.
It isn't documented at all, but after playing around with it, I think I understand what it does. The default config is:

export const defaultThrottleConfig: ThrottleConfig = {
  leading: true,
  trailing: false,
};
  1. { leading: false, trailing: false }: all events are ignored.
  2. { leading: true, trailing: false }: default behavior (ignore trailing).
  3. { leading: false, trailing: true }: similar to debounceTime (only the latest value is emitted, once the throttle period ends).
  4. { leading: true, trailing: true }: lets the first value (aka "leading") pass through and debounces subsequent (trailing) ones.

So, for our use case, we need the 4th option.
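
The four combinations listed above can be explored with a small plain-TypeScript model of throttleTime over timestamped events. This is a sketch of the observed behavior, not the RxJS source: `ThrottleEvent`, `ThrottleOptions` and `simulateThrottleTime` are hypothetical names, and events landing exactly on a window boundary are not modeled precisely.

```typescript
// Hypothetical types introduced for this sketch.
interface ThrottleEvent {
  time: number;
  value: string;
}
interface ThrottleOptions {
  leading: boolean;
  trailing: boolean;
}

// Simplified model of throttleTime(duration, scheduler, config).
function simulateThrottleTime(
  source: ThrottleEvent[],
  duration: number,
  config: ThrottleOptions
): ThrottleEvent[] {
  const out: ThrottleEvent[] = [];
  let windowEnd: number | null = null; // end of the current throttle window
  let pending: string | null = null; // latest value seen inside the window
  // The trailing `null` step flushes the last window after the final event.
  const steps: (ThrottleEvent | null)[] = [...source, null];
  for (const event of steps) {
    const now = event === null ? Infinity : event.time;
    // Close every window that has expired by `now`.
    while (windowEnd !== null && windowEnd <= now) {
      const end: number = windowEnd;
      windowEnd = null;
      if (config.trailing && pending !== null) {
        // A trailing emission opens a new throttle window.
        out.push({ time: end, value: pending });
        windowEnd = end + duration;
      }
      pending = null;
    }
    if (event === null) break;
    if (windowEnd === null) {
      // No active window: emit right away (leading) or hold the value.
      if (config.leading) out.push(event);
      else pending = event.value;
      windowEnd = event.time + duration;
    } else {
      pending = event.value; // inside a window: keep only the latest value
    }
  }
  return out;
}

const events: ThrottleEvent[] = [
  { time: 0, value: "a" },
  { time: 100, value: "b" },
  { time: 200, value: "c" },
  { time: 1500, value: "d" },
];
const show = (r: ThrottleEvent[]) => r.map((e) => e.value + "@" + e.time).join(", ");
console.log(show(simulateThrottleTime(events, 1000, { leading: false, trailing: false }))); // (empty)
console.log(show(simulateThrottleTime(events, 1000, { leading: true, trailing: false }))); // a@0, d@1500
console.log(show(simulateThrottleTime(events, 1000, { leading: false, trailing: true }))); // c@1000, d@2000
console.log(show(simulateThrottleTime(events, 1000, { leading: true, trailing: true }))); // a@0, c@1000, d@2000
```

In this model the 4th option lets the first refresh through immediately while still delivering the latest of any refreshes that arrive during the throttle window.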

return async () => {
  const client = isPromise(clusterClient) ? await clusterClient : clusterClient;
  try {
    const response = await client.asInternalUser.xpack.info(undefined, { maxRetries: 3 });

@gsoldevila gsoldevila Oct 30, 2023

Just noticed maxRetries: 3 wasn't part of the old code, so IIUC we're now asking elasticsearch-js to retry 3 times in a row before actually reporting an error 👍🏼

That's already an improvement, but I wonder if we could have a more resilient strategy using exponential backoff and perhaps a few more retries (5?). That would probably cover https://github.com/elastic/sdh-kibana/issues/4194 already.

I don't want to block the PR on this; it can be tackled in a separate issue.
I also agree that ES probably shouldn't report itself as available if the GET /_xpack API is not.
We can investigate/confirm whether that's the case.
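
For reference, the exponential backoff idea floated above could use a delay schedule along these lines. This is only a sketch under assumed values: `backoffDelays`, the 100ms base, the factor of 2 and the cap are hypothetical, and nothing here is an elasticsearch-js option or part of this PR.

```typescript
// Computes the wait (ms) before each retry: base * factor^attempt,
// capped at `maxDelayMs`. All parameter names and values are illustrative.
function backoffDelays(
  maxRetries: number,
  baseDelayMs: number,
  factor: number,
  maxDelayMs: number
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(maxDelayMs, baseDelayMs * Math.pow(factor, attempt)));
  }
  return delays;
}

// 5 retries, starting at 100ms and doubling each time, capped at 5s.
console.log(backoffDelays(5, 100, 2, 5000)); // [100, 200, 400, 800, 1600]
```

A caller would wait for the corresponding delay before each retry of the license fetch and report the error only once the schedule is exhausted.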

Contributor Author

> Just noticed maxRetries: 3 wasn't part of the old code, so IIUC we're now asking elasticsearch-js to retry 3 times in a row before actually reporting an error 👍🏼

It's actually the default value of the option. I'm not sure why I added it tbh, I think I wanted it to be more explicit. I should just remove it for now.

@gsoldevila gsoldevila left a comment

LGTM!

@pgayvallet pgayvallet marked this pull request as ready for review October 30, 2023 13:50
@pgayvallet pgayvallet requested a review from a team as a code owner October 30, 2023 13:50
@elasticmachine

Pinging @elastic/kibana-core (Team:Core)

@pgayvallet pgayvallet enabled auto-merge (squash) October 30, 2023 14:50
@pgayvallet pgayvallet merged commit 21c0b0b into elastic:main Oct 30, 2023
@kibana-ci

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| licensing | 9.7KB | 9.8KB | +66.0B |

@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Oct 30, 2023
jbudz added a commit to jbudz/kibana that referenced this pull request Oct 30, 2023
jbudz added a commit that referenced this pull request Oct 30, 2023

jbudz commented Oct 30, 2023

This was reverted with d7ab75f. It appears to have extended the runtime of a few test suites:

x-pack/test/saved_object_api_integration/security_and_spaces/config_basic.ts: 70.2 minutes
x-pack/test/saved_object_api_integration/security_and_spaces/config_trial.ts: 70.2 minutes

Builds:
https://buildkite.com/elastic/kibana-on-merge/builds/37538
https://buildkite.com/elastic/kibana-on-merge/builds/37539
https://buildkite.com/elastic/kibana-on-merge/builds/37540

@jbudz jbudz added the reverted label Oct 30, 2023
pgayvallet added a commit that referenced this pull request Oct 31, 2023
## Summary

#170006 was reverted because of
significant increases of run duration on some of our FTR test suites,
apparently because of the throttling that was added...

This PR reopens #170006 without
the throttling

---------

Co-authored-by: kibanamachine <[email protected]>
Labels
backport:skip This commit does not require backporting Feature:License release_note:skip Skip the PR/issue when compiling release notes reverted Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.12.0
Development

Successfully merging this pull request may close these issues.

[licensing] intermittent "license is not available" causing alerting rules to fail to execute
6 participants