[processor/resourcedetection, receiver/hostmetrics] Report total memory and CPU capacity numbers as resource attributes #22099

Closed
mx-psi opened this issue May 19, 2023 · 10 comments
Labels
enhancement New feature or request on hold This is blocked by another PR/issue priority:p2 Medium processor/resourcedetection Resource detection processor receiver/hostmetrics

Comments

@mx-psi
Member

mx-psi commented May 19, 2023

Component(s)

processor/resourcedetection, receiver/hostmetrics

Describe the issue you're reporting

As part of improving the infrastructure monitoring capabilities of the OpenTelemetry Collector, I want to report total memory and filesystem capacity as well as CPU cores.

Part of this information can already be derived from the output of the hostmetrics receiver; for example, counting the distinct cpu values on system.cpu.time gives the total number of cores. However, producing this information at the exporter requires all of a host's metrics to reach the same exporter, which makes the deployment stateful.
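To make the core-counting approach concrete, here is a minimal sketch in Go. The `dataPoint` type and the `countCores` function are hypothetical simplifications for illustration; the real receiver emits pdata structures, and this is not code from the Collector.

```go
package main

import "fmt"

// dataPoint is a hypothetical, simplified stand-in for one system.cpu.time
// data point. It is an assumption for illustration, not the Collector's type.
type dataPoint struct {
	CPU   string  // value of the "cpu" attribute, e.g. "cpu0"
	State string  // value of the "state" attribute, e.g. "user"
	Value float64 // cumulative CPU time in seconds
}

// countCores returns the number of distinct "cpu" attribute values seen.
// The result is only correct if every data point for the host is present.
func countCores(points []dataPoint) int {
	seen := make(map[string]struct{})
	for _, p := range points {
		seen[p.CPU] = struct{}{}
	}
	return len(seen)
}

func main() {
	points := []dataPoint{
		{CPU: "cpu0", State: "user", Value: 12.5},
		{CPU: "cpu0", State: "idle", Value: 340.0},
		{CPU: "cpu1", State: "user", Value: 9.1},
		{CPU: "cpu1", State: "idle", Value: 343.2},
	}
	fmt.Println(countCores(points)) // prints 2
}
```

If the cpu1 data points were routed to a different exporter instance, this count would come out as 1, which is the statefulness problem described above.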

I therefore want to add this information as resource attributes to avoid stateful deployments.

My remaining open question is where to add this. I see two possibilities:

  1. Make this part of the resource attributes on metrics generated by the host metrics receiver.
  2. Make this part of the resource attributes added by the resource detection processor system detector.

I am leaning towards (2), since that way this information can be leveraged by users that do not use the host metrics receiver but still want to have that kind of information, but I want some other opinions.
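As a rough illustration of what option (2) might compute, the sketch below gathers a CPU count into a map of resource attributes using only the Go standard library. The attribute key `host.cpu.logical_count` and the `detectCapacity` function are placeholders invented for this example, not agreed semantic conventions or existing detector code. Total memory is left out because the standard library has no portable call for it.

```go
package main

import (
	"fmt"
	"runtime"
)

// attrCPUCount is a placeholder attribute key chosen for illustration only;
// it is not an agreed semantic convention.
const attrCPUCount = "host.cpu.logical_count"

// detectCapacity returns capacity-style resource attributes for the local
// host. runtime.NumCPU reports the logical CPUs usable by the current
// process, which may be fewer than the CPUs physically installed.
func detectCapacity() map[string]int64 {
	return map[string]int64{
		attrCPUCount: int64(runtime.NumCPU()),
	}
}

func main() {
	for key, value := range detectCapacity() {
		fmt.Printf("%s=%d\n", key, value)
	}
}
```

Because the values are computed once on the host and attached to every signal, no downstream component needs to see all of the host's metrics to recover them.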

@mx-psi mx-psi added enhancement New feature or request priority:p2 Medium processor/resourcedetection Resource detection processor receiver/hostmetrics labels May 19, 2023
@github-actions
Contributor

Pinging code owners for receiver/hostmetrics: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

Pinging code owners for processor/resourcedetection: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@mx-psi mx-psi changed the title [processor/resourcedetection, receiver/hostmetrics] Report total capacity numbers as resource attributes [processor/resourcedetection, receiver/hostmetrics] Report total memory and CPU capacity numbers as resource attributes May 19, 2023
@dmitryax
Member

> I am leaning towards (2), since that way this information can be leveraged by users that do not use the host metrics receiver but still want to have that kind of information, but I want some other opinions.

Do you have a use case of metrics coming from another receiver that would need that attribute?

@mx-psi
Member Author

mx-psi commented Jun 1, 2023

> > I am leaning towards (2), since that way this information can be leveraged by users that do not use the host metrics receiver but still want to have that kind of information, but I want some other opinions.
>
> Do you have a use case of metrics coming from another receiver that would need that attribute?

I guess I am thinking of something like debugging an issue by filtering a telemetry signal (say, traces) based on the total capacity of a given resource (similar to the use case for something like host.type). For most users, using host.type for this is more useful, but if you are not in a cloud environment or want to get really detailed, these could also be useful.

@dmitryax
Member

dmitryax commented Jun 2, 2023

I believe these can be optional resource attributes on both the host metrics receiver and the resource detection processor, so the user can decide what scope to apply them to.

Another thing I would like to clarify is whether putting that information in resource attributes is a good approach. Do we have anything in the OTel specification/semconv to guide us here? If not, we should probably start there.

@frzifus
Member

frzifus commented Jun 7, 2023

> My remaining open question is where to add this. I see two possibilities:

Would it also be an option to provide this info as a metric? I would be interested in something like system.cpu.number. In my case, that would be the only metric I am interested in. With physical hardware, it's probably a bit boring. But with virtual machines, the number of available CPUs can be changed at runtime.

Update: that's what I had in mind: #23231


cc @chambridge

@dmitryax
Member

dmitryax commented Jun 8, 2023

> Would it also be an option to provide this info as a metric? I would be interested in something like system.cpu.number. In my case, that would be the only metric I am interested in. With physical hardware, it's probably a bit boring. But with virtual machines, the number of available CPUs can be changed at runtime.

Actually, I like this more than putting it into a resource attribute, because a metric can be used in computations on the backends, while that is hard to do with a resource attribute. But I would still like to see some guidance from the OTel spec regarding this. It'd be great if someone could look into that and open an issue if it's not specified anywhere.
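One example of such a backend computation is normalizing CPU usage by core count. The Go sketch below is an illustration under assumptions: `utilization` is a hypothetical helper, and `cores` stands for the value of a CPU-count metric such as the suggested system.cpu.number.

```go
package main

import "fmt"

// utilization returns the fraction of total CPU capacity used over an
// interval. busySeconds is the increase in non-idle CPU time summed across
// all cores, intervalSeconds is the wall-clock length of the interval, and
// cores is the reported CPU count. All three inputs are assumed to come from
// the backend's stored metrics.
func utilization(busySeconds, intervalSeconds float64, cores int) float64 {
	if intervalSeconds <= 0 || cores <= 0 {
		return 0 // avoid dividing by zero on missing or invalid data
	}
	return busySeconds / (intervalSeconds * float64(cores))
}

func main() {
	// 60 busy CPU-seconds over a 60-second window on 4 cores.
	fmt.Printf("%.2f\n", utilization(60, 60, 4)) // prints 0.25
}
```

With the count available as a time series, the divisor can follow changes in the number of CPUs over time, which matters for the virtual machine case mentioned above.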

cc @mx-psi

@mx-psi
Member Author

mx-psi commented Jun 9, 2023

Let's continue the discussion on the semantic-conventions repository first to clear up both the name and whether this should be a metric or a resource attribute. I will mark this as 'on hold' in the meantime.

@mx-psi mx-psi added the on hold This is blocked by another PR/issue label Jun 9, 2023
dmitryax pushed a commit that referenced this issue Aug 2, 2023
…and physical CPUs (#23231)

**Link to tracking Issue:** #22099 

Signed-off-by: Benedikt Bongartz <[email protected]>
Co-authored-by: Pablo Baeyens <[email protected]>
@dmitryax dmitryax closed this as completed Aug 2, 2023
@diranged

@dmitryax,
Coming back to this - I'd like to see these values as resource attributes because we're trying to map attributes from the OTel Collector into Datadog via https://docs.datadoghq.com/opentelemetry/schema_semantics/host_metadata/#cpu-conventions. Is that possible today, now that the values exist as metrics? Or do we need another patch to expose these as attributes? For what it's worth, I don't really understand why they were made metrics, given that they don't generally change.

4 participants