Remove executionLog to reduce memory consumption #111911

Merged (4 commits) on Sep 14, 2021

@rudolf rudolf commented Sep 10, 2021

Summary

Saves at least 400MB of heap when migrating 100k saved objects. Because of GC it's hard to measure exactly, but this migration previously required --max_old_space_size=1000 and now passes with --max_old_space_size=600.

When I wrote the initial implementation I was quite worried we'd hit some weird failure edge case that we couldn't reproduce. So I added the executionLog so that if a migration failed we could log a lot of detail (i.e. the complete state and all action responses). However, we haven't really needed these detailed logs, and we can still get this information by asking users to run migrations with debug logging enabled. So, at the expense of losing some detail in our logs, migrations have a much better chance of succeeding when there's a huge number of saved objects, or many large saved objects.
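To illustrate the trade-off described above, here is a minimal sketch, not the actual migration code. The toy state machine and every name in it (`SketchState`, `next`, `runRetaining`, `runStreaming`) are hypothetical. It contrasts retaining every intermediate state in an in-memory log with handing each state to a debug logger and keeping only the latest one.

```typescript
// All names here are hypothetical; this is not Kibana's implementation.
interface SketchState {
  controlState: 'INIT' | 'REINDEX' | 'DONE';
  batch: number;
}

type SketchLogger = { debug: (message: string, meta?: object) => void };

// One step of a toy state machine: INIT -> REINDEX (3 batches) -> DONE.
function next(state: SketchState): SketchState {
  if (state.controlState === 'INIT') return { controlState: 'REINDEX', batch: 0 };
  if (state.batch < 2) return { controlState: 'REINDEX', batch: state.batch + 1 };
  return { controlState: 'DONE', batch: state.batch };
}

// Retaining approach: every intermediate state stays reachable until the run ends,
// so memory grows with the number of steps and the size of each state.
function runRetaining(initial: SketchState): { final: SketchState; executionLog: SketchState[] } {
  const executionLog: SketchState[] = [];
  let state = initial;
  while (state.controlState !== 'DONE') {
    state = next(state);
    executionLog.push(state);
  }
  return { final: state, executionLog };
}

// Streaming approach: each state is passed to the logger and then dropped,
// so only the current state is held by the loop itself.
function runStreaming(initial: SketchState, logger: SketchLogger): SketchState {
  let state = initial;
  while (state.controlState !== 'DONE') {
    const previous = state.controlState;
    state = next(state);
    logger.debug(`${previous} -> ${state.controlState}`, { state });
  }
  return state;
}

// Usage: collect only the short transition messages, not the states.
const messages: string[] = [];
const finalState = runStreaming(
  { controlState: 'INIT', batch: 0 },
  { debug: (message) => { messages.push(message); } }
);
const retained = runRetaining({ controlState: 'INIT', batch: 0 });
```

In the streaming variant, whether a state is ever serialized depends on the logger's level, which matches the idea of asking users to enable debug logging only when the detail is needed.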

@rudolf added the v7.15.0 and bug labels Sep 10, 2021
@rudolf marked this pull request as ready for review September 13, 2021 08:40
@rudolf requested a review from a team as a code owner September 13, 2021 08:40

rudolf commented Sep 13, 2021

@elasticmachine merge upstream

Comment on lines -198 to -199
await cleanup(client, executionLog, finalState);
dumpExecutionLog(logger, logMessagePrefix, executionLog);
Contributor

Looking at the code, it seems to be the case, but just to be sure:

  1. When the migration logger's level is DEBUG, we're still outputting the same information we previously stored in the executionLog, right?

  2. In case of failure, the full cause is still output in the logs even when not at DEBUG level?

Contributor Author

  1. Yes
  2. We still output the full fatal message, traces, etc. The difference is that before, we would also log the execution log, including the full state and the responses to each action. Because this was a meta field, it wasn't visible by default with the legacy logger config; you would have to switch to NP JSON logging to actually see it. So in practice, most on-prem and cloud users won't lose any details.

Contributor Author

I missed that we only logged the state transition meta information from the executionLog. So I'm adding debug logging when there's a state transition so that we can inspect the complete state after every step when debugging is enabled.


rudolf commented Sep 14, 2021

@elasticmachine merge upstream

@kibanamachine

💚 Build Succeeded

Metrics [docs]

✅ unchanged

Comment on lines +68 to +71
const outputStream = this.outputStream;
this.outputStream = undefined;

outputStream.end(() => {
Contributor

Can't we just

this.outputStream.end(() => {   
    resolve();
});
this.outputStream = undefined;

?

I think the handler will never be executed synchronously?
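For context on the question above, here is a minimal sketch of the dispose pattern under discussion. `SketchAppender` and the in-memory `sink` are hypothetical stand-ins, not the appender from this PR. As far as I know, Node.js invokes the `end()` callback once the stream has finished, which is not expected to happen synchronously inside the `end()` call, so both orderings should behave the same; the local-variable form just doesn't rely on that.

```typescript
import { Writable } from 'stream';

// Hypothetical appender, loosely modelled on the snippet under review.
class SketchAppender {
  private outputStream?: Writable;

  constructor(stream: Writable) {
    this.outputStream = stream;
  }

  public append(line: string): void {
    // After dispose() the field is undefined, so late appends are ignored.
    this.outputStream?.write(`${line}\n`);
  }

  public dispose(): Promise<void> {
    return new Promise<void>((resolve) => {
      if (this.outputStream === undefined) {
        resolve();
        return;
      }
      // Detach the stream from the appender first, then end it through the
      // local reference; the result does not depend on when the callback runs.
      const outputStream = this.outputStream;
      this.outputStream = undefined;
      outputStream.end(() => resolve());
    });
  }
}

// In-memory sink so the sketch runs without touching the file system.
const written: string[] = [];
const sink = new Writable({
  write(chunk, _encoding, callback) {
    written.push(chunk.toString());
    callback();
  },
});

const appender = new SketchAppender(sink);
appender.append('migration started');
const disposed = appender.dispose();
```

Calling `dispose()` a second time resolves immediately because the field has already been cleared.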

Comment on lines +55 to +62
{
kibana: {
migrations: {
state: currState,
duration: tookMs,
},
},
}
Contributor

This will only be visible when using the JSON layout (in addition to the debug level I mean), but I guess this can't be helped.
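A rough sketch of why the meta object only shows up with a JSON layout. The two layout functions below are simplified, hypothetical stand-ins and not Kibana's real layouts (which are configurable): a pattern-style layout that formats only the level and message, and a JSON-style layout that serializes the meta alongside them.

```typescript
// Hypothetical record and layouts, for illustration only.
interface SketchRecord {
  level: 'debug' | 'info';
  message: string;
  meta?: Record<string, unknown>;
}

// Pattern-style layout: only level and message reach the output, meta is dropped.
function patternLayout(record: SketchRecord): string {
  return `[${record.level.toUpperCase()}] ${record.message}`;
}

// JSON-style layout: meta fields are merged into the serialized record.
function jsonLayout(record: SketchRecord): string {
  return JSON.stringify({ level: record.level, message: record.message, ...record.meta });
}

// Meta shaped like the object in the diff above; the values are made up.
const record: SketchRecord = {
  level: 'debug',
  message: 'INIT -> REINDEX',
  meta: { kibana: { migrations: { state: { controlState: 'REINDEX' }, duration: 42 } } },
};

const asPattern = patternLayout(record);
const asJson = jsonLayout(record);
```

With the pattern-style layout the state and duration never appear in the output, which is the limitation the comment above points out.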

@rudolf merged commit 9b9286f into elastic:master Sep 14, 2021
@rudolf added the auto-backport label Sep 15, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Sep 15, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Sep 15, 2021
@kibanamachine

💚 Backport successful

Status Branch Result
7.15
7.x

The backport PRs will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request Sep 15, 2021
@rudolf rudolf deleted the migrations-memory branch September 15, 2021 20:26
@kibanamachine

Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.

@kibanamachine added the backport missing label Sep 16, 2021
@rudolf added the v7.15.1 label and removed the v7.15.0 label Sep 16, 2021
kibanamachine added a commit that referenced this pull request Sep 16, 2021
…12223)

* Remove executionLog to reduce memory consumption (#111911)

Co-authored-by: Kibana Machine <[email protected]>

* Fix tests

Co-authored-by: Rudolf Meijering <[email protected]>
@kibanamachine removed the backport missing label Sep 16, 2021
Labels
auto-backport, bug, Feature:Saved Objects, release_note:fix, v7.15.1, v7.16.0