Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add profiling plugin #91640

Merged
merged 19 commits into from
Nov 30, 2022
Merged

Conversation

danielmitterdorfer
Copy link
Member

@danielmitterdorfer danielmitterdorfer commented Nov 17, 2022

With this commit we add a new X-Pack plugin that retrieves data for Universal Profiler. This functionality is currently implemented in Kibana in the function getExecutablesAndStackTraces which we replicate here for efficiency purposes. The plugin is off by default and can be enabled by setting xpack.profiling.enabled to true.

With this commit we add a new plugin that retrieves data for Universal
Profiler. This functionality is currently implemented in Kibana in the
function `getExecutablesAndStackTraces` which we replicate here for
efficiency purposes.
@danielmitterdorfer danielmitterdorfer marked this pull request as ready for review November 22, 2022 13:56
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Nov 22, 2022
With this commit we keep the transport action enabled. This is to ensure
operator privileges can be checked in all cases (see
OperatorPrivilegesIT.testEveryActionIsEitherOperatorOnlyOrNonOperator).
@danielmitterdorfer danielmitterdorfer added >enhancement Team:Search Meta label for search team and removed needs:triage Requires assignment of a team area label labels Nov 23, 2022
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Search Meta label for search team labels Nov 23, 2022
@danielmitterdorfer danielmitterdorfer added the :Search/Search Search-related issues that do not fall into other categories label Nov 23, 2022
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team and removed needs:triage Requires assignment of a team area label labels Nov 23, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Copy link
Collaborator

Hi @danielmitterdorfer, I've created a changelog YAML for you.

@danielmitterdorfer
Copy link
Member Author

danielmitterdorfer commented Nov 23, 2022

The test failure on elasticsearch-ci/part-1 is unrelated to this PR and also reproduces on main. I've raised #91837 and pushed #91838 to mute the test.

@nik9000 nik9000 requested review from nik9000 and dnhatn November 23, 2022 16:11
@javanna javanna removed the request for review from dnhatn November 24, 2022 13:07
@javanna javanna self-requested a review November 24, 2022 13:07
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments and questions, nothing major. This looks pretty solid and well tested to me, thanks @danielmitterdorfer .

I would ask @nik9000 to double check the aggs bits in TransportProfilingAction and then we are good?

@danielmitterdorfer
Copy link
Member Author

danielmitterdorfer commented Nov 28, 2022

Thanks for the review @javanna! I addressed your feedback in c21262b. I'm uncertain how to use AbstractXContentSerializingTestCase in my setup (see my comment above). Also, I clarified that we currently intend that this plugin does not leverage CCS but I'm uncertain what is required to support it (see my other comment). Could you please take another look?

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple more questions/suggestions, LGTM otherwise


return channel -> {
RestStatusToXContentListener<GetProfilingResponse> listener = new RestStatusToXContentListener<>(channel);
RestCancellableNodeClient cancelClient = new RestCancellableNodeClient(client, request.getHttpChannel());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be good to test the automatic cancellation (on channel closure), especially that all children tasks are cancelled too. We have a similar test for the search API: SearchRestCancellationIT .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer! I'll look into that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test now in b445b6d.

TaskManager taskManager = internalCluster().getInstance(TransportService.class, nodeName).getTaskManager();
Task task = taskManager.getTask(taskId.getId());
assertThat(task, instanceOf(CancellableTask.class));
assertTrue(((CancellableTask) task).isCancelled());
Copy link
Member

@javanna javanna Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check that all children tasks are cancelled? My understanding is that we are now checking that the top-level profiling action is cancelled. Ideally, we'd make sure that the children search tasks either don't start before the cancellation, or get cancelled, or complete before the cancellation.

We could check here that there are no tasks with this specific parent task id, on the other hand that would be the case even if we did not set the parent task id properly. Should we instead try and collect those children tasks before the cancellation and double check that they do get cancelled afterwards?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure though where things get blocked with the script below, maybe always at the first inner search request? ideally it would block at any stage but that sounds like it would complicate this test quite a bit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I've implemented this now in 55c158a.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll merge it then when CI is green. :)

@danielmitterdorfer danielmitterdorfer merged commit 2b472c9 into elastic:main Nov 30, 2022
@danielmitterdorfer danielmitterdorfer deleted the profiler-plugin branch November 30, 2022 16:56
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Dec 2, 2022
* upstream/main: (209 commits)
  Remove unused methods and classes from HLRC (elastic#92030)
  Clean up on exception while chunking XContent (elastic#92024)
  Add profiling plugin (elastic#91640)
  Remove unused methods and classes from HLRC (elastic#92012)
  Remove IndexerState from HLRC (elastic#92023)
  Ensure cached time elapses in ClusterServiceIT (elastic#91986)
  Chunked encoding for RestGetIndicesAction (elastic#92016)
  Simplify shardsWithState (elastic#91991)
  [DOCS] Updates ML decider docs by mentioning CPU as scaling criterion (elastic#92018)
  Add chunking to ClusterState.Custom impls (elastic#91963)
  Speedup time_series agg by caching current tsid ordinal, parent bucket ordinal and buck ordinal (elastic#91784)
  Drop the ingest listener call count tracking (elastic#92003)
  [DOCS] fixes issue number 91889 - missing [discrete] header (elastic#91976)
  Fix PersistentTasksClusterServiceTests (elastic#92002)
  [docs] Update search-settings documentation to reflect the fact that the indices.query.bool.max_clause_count setting has been deprecated (elastic#91811)
  Clarify writability in Netty4HttpPipeliningHandler (elastic#91982)
  Load stable plugins as synthetic modules (elastic#91869)
  Handle any exception thrown while generating source for an IngestDocument (elastic#91981)
  fixing Apache HttpHost url on java-rest doc (elastic#91945)
  Implement repair functionality for aliases colliding with indices bug (elastic#91887)
  ...
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Dec 5, 2022
With this commit we use `profiling-events-all` to query profiling events
if the appropriate index for a sample count is not present yet. This can
happen on cluster startup when only a few events have been accumulated
because additional indices are only created when there are enough
events. This aligns behavior of the Elasticsearch plugin with the Kibana
plugin.

Relates elastic#91640
danielmitterdorfer added a commit that referenced this pull request Dec 6, 2022
With this commit we use `profiling-events-all` to query profiling events
if the appropriate index for a sample count is not present yet. This can
happen on cluster startup when only a few events have been accumulated
because additional indices are only created when there are enough
events. This aligns behavior of the Elasticsearch plugin with the Kibana
plugin.

We also refactor the profiling-related test cases so we can simulate situations
after cluster startup when not all profiling-related indices have been created.
As testing also cancellation would require additional complexity and 
decreases test maintainability (need to adjust `slack` based on the scenario
which is a very low-level change), we also separate tests for the regular case 
and cancellation.
    
Relates #91640
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Jan 10, 2023
Originally the profiling plugin required to set `search.max_buckets` to
a higher value than the current default. Had we enabled the profiling
plugin by default, we would have changed the default value of that
setting for all clusters implicitly. Due to a different bootstrapping
process of the profiling solution, we do not require to change any
settings anymore in the plugin. With this commit we therefore enable the
plugin by default.

Relates elastic#91640
danielmitterdorfer added a commit that referenced this pull request Jan 23, 2023
Originally the profiling plugin required to set `search.max_buckets` to
a higher value than the current default. Had we enabled the profiling
plugin by default, we would have changed the default value of that
setting for all clusters implicitly. Due to a different bootstrapping
process of the profiling solution, we do not require to change any
settings anymore in the plugin. With this commit we therefore enable the
plugin by default.

Relates #91640
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants