-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use system context for cluster state update tasks #31241
Use system context for cluster state update tasks #31241
Conversation
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left one comment about a potential other solution, to see if we like it better.
final ThreadContext threadContext = threadPool.getThreadContext(); | ||
final Supplier<ThreadContext.StoredContext> supplier = threadContext.newRestorableContext(false); | ||
try (ThreadContext.StoredContext ignore = threadContext.stashContext()) { | ||
threadContext.markAsSystemContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of doing this, I wonder if we should invest in a SystemThreadContext that we give to the executor and it does is under the system context. Alternatively we can have a "no context" executor that simply doesn't set any context. @tvernum do you see any other usage for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand this correctly. Would you have different executors based on whether they switch to the system context or stay in default context? Looking at the other usages of threadContext.markAsSystemContext();
, all of them seem to use executors that cannot be categorized as system or non-system (i.e. generic threadpool, ...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my suggestion referred to the entire pattern of generating a new context and then marking it as system and deal with on the executor level. This has two potential upsides:
- You don't need to remember to do this everywhere you submit a task (not so important here as there's just one place).
- We don't do all the "clean & capture and then restore in the thread pool" song and dance.
Since this is not a high throughput usage and we only have one place we submit tasks it doesn't have such a big impact on MasterService
but might be useful for other things that use an internal thread pool executor.
@@ -346,8 +346,8 @@ public void onFailure(String source, Exception e) { | |||
} | |||
|
|||
@Override | |||
public void clusterStatePublished(ClusterChangedEvent clusterChangedEvent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if ClusterStateUpdateTask should have a final empty method for this? I think it's always the wrong one in that contex.t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea, I've pushed 1b7eeb4 to implement this.
The doc tests ( Finally, I'm not sure what the implications of this PR are for cluster state publication, where we cross node boundaries (which does not seem to transfer the system context, only reestablish it because of internal action?) |
I've removed the assertion in
|
…-task-under-system-context
@ywelsch I think you can remove the assertion it's not valid anymore. The new assertion also isn't |
…-task-under-system-context
After removing the assertions in |
…-task-under-system-context
…-task-under-system-context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @bleskes @tvernum @davidkyle @s1monw |
This commit makes it so that cluster state update tasks always run under the system context, only restoring the original context when the listener that was provided with the task is called. A notable exception is the clusterStatePublished(...) callback which will still run under system context, because it's defined on the executor-level, and not the task level, and only called once for the combined batch of tasks and can therefore not be uniquely identified with a task / thread context. Relates #30603
* 6.x: Add get stored script and delete stored script to high level REST API Increasing skip version for failing test on 6.x Skip get_alias tests for 5.x (#31397) Fix defaults in GeoShapeFieldMapper output (#31302) Test: better error message on failure Mute DefaultShardsIT#testDefaultShards test Fix reference to XContentBuilder.string() (#31337) [DOCS] Adds monitoring breaking change (#31369) [DOCS] Adds security breaking change (#31375) [DOCS] Backports breaking change (#31373) RestAPI: Reject forcemerge requests with a body (#30792) Docs: Use the default distribution to test docs (#31251) Use system context for cluster state update tasks (#31241) [DOCS] Adds testing for security APIs (#31345) [DOCS] Removes ML item from release highlights [DOCS] Removes breaking change (#31376) REST high-level client: add validate query API (#31077) Move language analyzers from server to analysis-common module. (#31300) Expose lucene's RemoveDuplicatesTokenFilter (#31275) [Test] Fix :example-plugins:rest-handler on Windows Delete typos in SAML docs (#31199) Ensure we don't use a remote profile if cluster name matches (#31331) Test: Skip alias tests that failed all weekend [DOCS] Fix version in SQL JDBC Maven template [DOCS] Improve install and setup section for SQL JDBC Add ingest-attachment support for per document `indexed_chars` limit (#31352) SQL: Fix rest endpoint names in node stats (#31371) [DOCS] Fixes small issue in release notes Support for remote path in reindex api Closes #22913 [ML] Put ML filter API response should contain the filter (#31362) Remove trial status info from start trial doc (#31365) [DOCS] Added links in breaking changes pages [DOCS] Adds links to release notes and highlights Docs: Document changes in rest client QA: Fix tribe tests to use node selector REST Client: NodeSelector for node attributes (#31296) LLClient: Fix assertion on windows LLClient: Support host selection (#30523) Add QA project and fixture based test for discovery-ec2 plugin (#31107) [ML] Hold ML filter items in sorted set (#31338) [ML] Add description to ML filters (#31330)
* master: Add get stored script and delete stored script to high level REST API - post backport fix Add get stored script and delete stored script to high level REST API (#31355) Core: Combine Action and GenericAction (#31405) Fix reference to XContentBuilder.string() (#31337) Avoid sending duplicate remote failed shard requests (#31313) Fix defaults in GeoShapeFieldMapper output (#31302) RestAPI: Reject forcemerge requests with a body (#30792) Packaging: Remove windows bin files from the tar distribution (#30596) Docs: Use the default distribution to test docs (#31251) [DOCS] Adds testing for security APIs (#31345) Clarify that IP range data can be specified in CIDR notation. (#31374) Use system context for cluster state update tasks (#31241) Percentile/Ranks should return null instead of NaN when empty (#30460) REST high-level client: add validate query API (#31077) Move language analyzers from server to analysis-common module. (#31300) [Test] Fix :example-plugins:rest-handler on Windows Expose lucene's RemoveDuplicatesTokenFilter (#31275) Reload secure settings for plugins (#31383) Remove some cases in FieldTypeLookupTests that are no longer relevant. (#31381) Ensure we don't use a remote profile if cluster name matches (#31331) [TEST] Double write alias fault (#30942) [DOCS] Fix version in SQL JDBC Maven template [DOCS] Improve install and setup section for SQL JDBC SQL: Fix rest endpoint names in node stats (#31371) Support for remote path in reindex api - post backport fix Closes #22913 [ML] Put ML filter API response should contain the filter (#31362) Support for remote path in reindex api (#31290) Add byte array pooling to nio http transport (#31349) Remove trial status info from start trial doc (#31365) [DOCS] Adds links to release notes and highlights add is-write-index flag to aliases (#30942) Add rollover-creation-date setting to rolled over index (#31144) [ML] Hold ML filter items in sorted set (#31338) [Tests] Fix edge case in ScriptedMetricAggregatorTests (#31357)
#31241 changed the cluster state update tasks to run under system context. The context wrapping did not preserve response headers, though. This has led to a test failure on 6.x #31408, as the deprecation warnings were not carried back anymore to the caller when creating an index. This commit changes the restorable context supplier to preserve response headers.
#31241 changed the cluster state update tasks to run under system context. The context wrapping did not preserve response headers, though. This has led to a test failure on 6.x #31408, as the deprecation warnings were not carried back anymore to the caller when creating an index. This commit changes the restorable context supplier to preserve response headers.
This PR makes it so that cluster state update tasks always run under the system context, only restoring the original context when the listener that was provided with the task is called. A notable exception is the
clusterStatePublished(...)
callback which will still run under system context, because it's defined on the executor-level, and not the task level, and only called once for the combined batch of tasks and can therefore not be uniquely identified with a task / thread context. I've looked at usages of this, and adapted the two problematic ones to useclusterStateProcessed
instead.Relates #30603