Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add thread state as attribute to process.runtime.jvm.threads.count #7636

Conversation

mateuszrzeszutek
Copy link
Member

Resolves #7006

This is just a draft presenting how adding the state attribute to the threads counter could look like. I did not bother making the tests work at this point; let's first discuss if we want to include the attribute.
I can create a spec issue/PR if we're okay with this.
cc @open-telemetry/java-instrumentation-approvers

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable to me

@trask
Copy link
Member

trask commented Jan 24, 2023

cc @kittylyst and @roberttoyonaga from a JFR future perspective

…elemetry/instrumentation/runtimemetrics/Threads.java

Co-authored-by: Trask Stalnaker <[email protected]>
@kittylyst
Copy link

@mateuszrzeszutek Can you explain the intent of this change a bit more, and maybe a couple of sample use cases? I'm not really sure I understand how this information would be used in practice.

@roberttoyonaga
Copy link
Contributor

I can create a spec issue/PR if we're okay with this.

I think it might be harder to do this with JFR. The jdk.JavaThreadStatistics event would have to be modified to carry this information which means that the JDK code would have to do the additional steps shown in this PR. Right now jdk.JavaThreadStatisticsonly provides active, daemon, and accumulated thread counts.

@mateuszrzeszutek
Copy link
Member Author

@mateuszrzeszutek Can you explain the intent of this change a bit more, and maybe a couple of sample use cases? I'm not really sure I understand how this information would be used in practice.

My main reason for adding this is parity with the micrometer thread metrics: https://github.com/micrometer-metrics/micrometer/blob/fa1a3798d74d4ec6e18e2bbd45bdda09498bf942/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/jvm/JvmThreadMetrics.java#L71-L75

I suspect that seeing the number of e.g. blocked threads might be valuable for some people; although at the same time I agree that thread dumps/profiling is probably a better tool to detect concurrency problems.

@gabrielgiussi as the author of #7006, can you tell us why you need that metric?

@gabrielgiussi
Copy link

gabrielgiussi commented Jan 27, 2023

@mateuszrzeszutek @kittylyst
I can't provide any concrete example about how that information is useful during debugging/troubleshooting (but I'll ask), I'm just trying to mirror some existing metrics we have in our services using the prometheus client, in a new kind of services where we are using opentelemetry.
The prometheus client exports a jvm_threads_state metric which is currently being included in our grafana dashboards.

I assumed there was no downside to adding the status label to the existing metric.

I suspect that seeing the number of e.g. blocked threads might be valuable for some people

I think the same, it is probably not enough to understand what is going on, but at least you would like to quickly see if the amount of threads in runnable state is what you were expecting.

@kittylyst
Copy link

I'm not seeing the use case. "Number of blocked threads" is somewhat useful in something like JFR (but actually, "number of threads holding a lock for longer than $THRESHOLD ms" is much more useful) - but that is very fine-grained and works because JFR is providing a constant stream of fairly low-level data. A sample taken once a minute(?) that just snapshots whatever is the current state is much less useful.

Now, a counter "Number of threads that blocked for longer than $THRESHOLD in the current period" would be useful, as that can show changes in the background number of long-held locks, and can indicate where to dig in, or can detect a change (e.g. a regression) in a blue / green deployment. But, my understanding is that this change does not provide this.

I want to be sure that we're not sailing too close to: "We need parity with other solutions that are shipping metrics that are not actionable"

@roberttoyonaga
Copy link
Contributor

roberttoyonaga commented Jan 27, 2023

I assumed there was no downside to adding the status label to the existing metric.

I'm a bit concerned about modifying the semantic conventions unless there is specific need for the addition. The downside is it could make it more challenging for other implementations (ie. JFR streaming) to fully implement the semantic conventions.

@mateuszrzeszutek
Copy link
Member Author

The facts brought up above:

  1. we don't have a good reason except parity with other metric libs, and
  2. it's very hard to implement this correctly in JFR

convinced me that we should close this as "won't do". I'll leave this PR & issue open for a couple of days; if anyone has any good reasons why we should include these metrics, please speak up.

@mateuszrzeszutek
Copy link
Member Author

There doesn't seem to be any good reasons to include this; I'm closing this PR and all related issues. If anyone really needs this change, please create a new issue and explain exactly how adding the state attribute benefits you (parity with other metrics solutions isn't a good reason; see the discussion above).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add thread state as attribute to process.runtime.jvm.threads.count
5 participants