[ML] Fix master node deadlock during ML daily maintenance #31836
Pinging @elastic/ml-core
@@ -79,7 +84,8 @@ public void remove(ActionListener<Boolean> listener) {

 SearchRequest searchRequest = new SearchRequest(RESULTS_INDEX_PATTERN);
 searchRequest.source(source);
-client.execute(SearchAction.INSTANCE, searchRequest, forecastStatsHandler);
+client.execute(SearchAction.INSTANCE, searchRequest, new ThreadedActionListener<>(LOGGER, threadPool,
+        MachineLearning.UTILITY_THREAD_POOL_NAME, forecastStatsHandler, false));
I think the removeDataBefore() method in ExpiredModelSnapshotsRemover should also use a ThreadedActionListener in exactly the same way this class does. It also has the problem of doing a (potentially) large amount of parsing on the network thread.
Good point. I pushed a commit to fix that too.
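For readers unfamiliar with the pattern discussed in this thread, here is a minimal plain-Java sketch of handing a response off from the thread that delivers it to a separate pool. It does not use Elasticsearch classes: `Listener`, `threaded`, `handlingThreadName`, and the thread names are all hypothetical stand-ins, and the real `ThreadedActionListener` may differ in its details.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ThreadedListenerSketch {

    /** Hypothetical stand-in for an asynchronous response callback. */
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    /** Wraps a delegate so its callbacks run on the executor, not on the calling thread. */
    static <T> Listener<T> threaded(ExecutorService executor, Listener<T> delegate) {
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                executor.execute(() -> delegate.onResponse(response));
            }

            @Override
            public void onFailure(Exception e) {
                executor.execute(() -> delegate.onFailure(e));
            }
        };
    }

    /** Delivers a response from a thread with the given name and reports where it was handled. */
    static String handlingThreadName(String callerThreadName) {
        // Single-thread pool standing in for a utility thread pool (assumed name).
        ExecutorService utility = Executors.newSingleThreadExecutor(r -> new Thread(r, "utility-pool"));
        AtomicReference<String> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        Listener<String> handler = new Listener<String>() {
            @Override
            public void onResponse(String response) {
                // Expensive parsing would happen here, off the delivering thread.
                seen.set(Thread.currentThread().getName());
                done.countDown();
            }

            @Override
            public void onFailure(Exception e) {
                done.countDown();
            }
        };

        Listener<String> wrapped = threaded(utility, handler);
        // This thread plays the role of the network thread that delivers the response.
        Thread network = new Thread(() -> wrapped.onResponse("search response"), callerThreadName);
        try {
            network.start();
            network.join();
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            utility.shutdown();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(handlingThreadName("network-thread")); // prints "utility-pool"
    }
}
```

The delivering thread only enqueues a task and returns, so any heavy work in the handler runs on the pool instead.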
LGTM
I changed the PR title to be identical to #31691 so it's clearer to anyone browsing the PR list that it's fixing the same problem.
This is the implementation for master and 6.x of elastic#31691. Relates elastic#31683
d830877 to 4e61e8a
* 6.x:
  [ML] Fix master node deadlock during ML daily maintenance (#31836)
  Build: Switch integ-test-zip to OSS-only (#31866)
  Build: Fix detection of Eclipse Compiler Server (#31838)
  SQL: Remove restriction for single column grouping (#31818)
  Docs: Inconsistency between description and example (#31858)
  Fix and reenable TribeIntegrationTests
  QA: build improvements related to SQL projects (#31862)
  muted test
  [Docs] Add clarification to analysis example (#31826)
  Check timeZone() argument in AbstractSqlQueryRequest (#31822)
  Remove obsolete parameters from analyze rest spec (#31795)
  SQL: Fix incorrect HAVING equality (#31820)
  Smaller aesthetic fixes to InternalTestCluster (#31831)
  [Docs] Clarify accepted sort case (#31605)
  Do not return all indices if a specific alias is requested via get aliases api. (#29538)
  [Docs] Fix wrong link in Korean analyzer docs (#31815)
  Fix profiling of ordered terms aggs (#31814)
  Fix handling of points_only with term strategy in geo_shape (#31766)
  Docs: Explain _bulk?refresh shard targeting
  REST high-level client: add get index API (#31703)
* master:
  [ML] Fix master node deadlock during ML daily maintenance (#31836)
  Build: Switch integ-test-zip to OSS-only (#31866)
  SQL: Remove restriction for single column grouping (#31818)
  Build: Fix detection of Eclipse Compiler Server (#31838)
  Docs: Inconsistency between description and example (#31858)
  Re-enable bwc tests now that #29538 has been backported and 6.x intake build succeeded.
  QA: build improvements related to SQL projects (#31862)
  [Docs] Add clarification to analysis example (#31826)
  Check timeZone() argument in AbstractSqlQueryRequest (#31822)
  SQL: Fix incorrect HAVING equality (#31820)
  Smaller aesthetic fixes to InternalTestCluster (#31831)
  [Docs] Clarify accepted sort case (#31605)
  Temporarily disable bwc test in order to backport #29538
  Remove obsolete parameters from analyze rest spec (#31795)
  [Docs] Fix wrong link in Korean analyzer docs (#31815)
  Fix profiling of ordered terms aggs (#31814)
  Properly mute test involving JDK11 closes #31739
  Do not return all indices if a specific alias is requested via get aliases api. (#29538)
  Get snapshot rest client cleanups (#31740)
  Docs: Explain _bulk?refresh shard targeting
  Fix handling of points_only with term strategy in geo_shape (#31766)
This is the implementation for master and 6.x of #31691.
Native tests are changed to use multi-node clusters in #31757.
Relates #31683