
resource_attribute not emerging with metrics #21477

Closed
zchef2k opened this issue May 3, 2023 · 13 comments
Labels
bug, receiver/sshcheck

Comments

zchef2k commented May 3, 2023

Component(s)

receiver/sshcheck

Describe the issue you're reporting

I'm appealing to splunk-otel-collector to include sshcheck in a release (see the linked issue below). In my testing of the code, I've noticed that the resource_attribute ssh.endpoint does not emerge with the metrics being collected; the net effect is that metrics are collected successfully, but I don't know which machines/endpoints they are for.

Please let me know if I've misinterpreted the available documentation, if there's a flaw in my testing, or if I could improve the quality of this issue. Thank you.

https://github.com/signalfx/splunk-otel-collector/issues/3050#issuecomment-1532787623

edit: typos, clarification.

zchef2k added the needs triage label on May 3, 2023

github-actions bot commented May 3, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

atoulme commented May 4, 2023

By default, resource attributes are not enabled. You will need to add a configuration entry:

receivers:
  sshcheck:
    resource_attributes:
      ssh.endpoint:
        enabled: true

Please try it out and let us know if it works.

atoulme removed the needs triage label on May 4, 2023

zchef2k commented May 4, 2023

This has been my config for testing; the resource_attributes section was the most recent addition, made just before I submitted the issue.

  sshcheck:
    endpoint: <redacted>:22
    username: zchef2k
    key_file: /etc/otel/collector/id_rsa
    known_hosts: /etc/otel/collector/known_hosts
    ignore_host_key: true
    collection_interval: 30s
    resource_attributes:
      ssh.endpoint:
        enabled: true

I do get the metrics, just not the resource_attribute with them.

atoulme commented May 4, 2023

So I'm looking back at the image you shared in the other issue. What's interesting is that ssh.endpoint is actually there, as a dimension, but its value is an empty string.

I'm not sure why that is, but it's worth digging into.

atoulme commented May 4, 2023

I found the issue: the scraper doesn't emit resource attributes. That's a bug.

You need a change along these lines:
atoulme@fd57c6a

It will also need a test and so on; code owners can help from there.
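
For context, the change is roughly of this shape (a sketch only, not the actual commit; the generated option name and the config field are assumptions on my part):

// scraper.go (fragment, illustrative only): attach the configured endpoint
// as the ssh.endpoint resource attribute when emitting. The option name
// metadata.WithSSHEndpoint and the Config field are assumptions; check the
// receiver's generated internal/metadata package for the real identifiers.
func (s *sshcheckScraper) scrape(ctx context.Context) (pmetric.Metrics, error) {
	// ... existing connect/measure logic that records data points on s.mb ...

	// Before the fix this was effectively s.mb.Emit(), which produced an
	// empty resource, so ssh.endpoint showed up downstream as an empty string.
	return s.mb.Emit(metadata.WithSSHEndpoint(s.Config.Endpoint)), nil
}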

atoulme added the bug label on May 4, 2023

zchef2k commented May 4, 2023

A couple of questions...

I queried the Infrastructure Monitoring (IM) API for the metric time series (MTS) on sshcheck.status, and this is what I got back:

[
    {
        "active": true,
        "created": 1683033728000,
        "creator": null,
        "customProperties": {
            "host.id": "<redacted>",
            "host.name": "<redacted>",
            "host_cpu_cores": "2",
            "host_cpu_model": "Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz",
            "host_kernel_name": "linux",
            "host_kernel_release": "4.18.0-425.13.1.el8_7.x86_64",
            "host_kernel_version": "#1 SMP Tue Feb 21 04:20:52 EST 2023",
            "host_linux_version": "<redacted>",
            "host_logical_cpus": "2",
            "host_machine": "x86_64",
            "host_mem_total": "5823932",
            "host_os_name": "<redacted>",
            "host_physical_cpus": "2",
            "host_processor": "x86_64",
            "os.type": "linux"
        },
        "dimensions": {
            "host": "<redacted>",
            "host.id": "<redacted>",
            "os.type": "linux",
            "sf_metric": null
        },
        "id": "FvIE1fFAwCM",
        "lastUpdated": 0,
        "lastUpdatedBy": null,
        "metric": "sshcheck.status",
        "metricType": "GAUGE",
        "source": null,
        "tags": []
    }
]

Out of curiosity, would you expect ssh.endpoint to be listed as a dimension with a null value, or are you not surprised that it's entirely absent because it is null? Or am I missing the point?

As for the bug and your fork that attempts to address it, can you advise how I'd go about testing? I've never built the contrib distribution, nor am I a Go developer, though I'm not afraid to jump in and help. I'm an ops/infra guy with just enough understanding of these things to be dangerous. :)
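
For what it's worth, here's roughly the shape of test I'd imagine is needed; the constructor names and config field path below are my guesses, not necessarily the receiver's real API:

import (
	"context"
	"testing"

	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/receiver/receivertest"
)

// Guessed names: createDefaultConfig, newScraper and the resource attribute
// config path are placeholders for whatever the receiver actually exposes.
func TestScrapeSetsSSHEndpointResourceAttribute(t *testing.T) {
	cfg := createDefaultConfig().(*Config)
	cfg.Endpoint = "localhost:22"
	cfg.ResourceAttributes.SSHEndpoint.Enabled = true

	s := newScraper(cfg, receivertest.NewNopCreateSettings())
	metrics, err := s.scrape(context.Background())
	require.NoError(t, err)

	// The emitted resource should carry the configured endpoint.
	attrs := metrics.ResourceMetrics().At(0).Resource().Attributes()
	v, ok := attrs.Get("ssh.endpoint")
	require.True(t, ok, "ssh.endpoint resource attribute missing")
	require.Equal(t, cfg.Endpoint, v.Str())
}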

zchef2k commented May 5, 2023

I took a stab at building splunk-otel-collector, this time pointing it at a local sshcheckreceiver module modified based on your example. In go.mod I added:

// Use atoulme's code for sshcheckreceiver
replace github.com/open-telemetry/opentelemetry-collector-contrib/receiver/sshcheckreceiver => /home/<redacted>/Code/opentelemetry-collector-contrib/receiver/sshcheckreceiver

To me this seemed to make sense, but I think it mostly highlights a gap in my understanding of how these pieces fit together, if not of Go itself. It doesn't build; the output is:

GO111MODULE=on CGO_ENABLED=0 go build -trimpath -o ./bin/otelcol_linux_amd64 -ldflags "-X github.com/signalfx/splunk-otel-collector/internal/version.Version= -X go.opentelemetry.io/collector/internal/version.Version=" ./cmd/otelcol
# github.com/signalfx/splunk-otel-collector/internal/receiver/discoveryreceiver
internal/receiver/discoveryreceiver/receiver.go:185:27: undefined: metric.NewNoopMeterProvider
# go.opentelemetry.io/collector/processor/batchprocessor
../../go/pkg/mod/go.opentelemetry.io/collector/processor/[email protected]/metrics.go:212:50: cannot use bpt.processorAttr (variable of type []"go.opentelemetry.io/otel/attribute".KeyValue) as type []metric.AddOption in argument to bpt.batchSizeTriggerSend.Add
../../go/pkg/mod/go.opentelemetry.io/collector/processor/[email protected]/metrics.go:214:48: cannot use bpt.processorAttr (variable of type []"go.opentelemetry.io/otel/attribute".KeyValue) as type []metric.AddOption in argument to bpt.timeoutTriggerSend.Add
../../go/pkg/mod/go.opentelemetry.io/collector/processor/[email protected]/metrics.go:217:48: cannot use bpt.processorAttr (variable of type []"go.opentelemetry.io/otel/attribute".KeyValue) as type []metric.RecordOption in argument to bpt.batchSendSize.Record
../../go/pkg/mod/go.opentelemetry.io/collector/processor/[email protected]/metrics.go:219:55: cannot use bpt.processorAttr (variable of type []"go.opentelemetry.io/otel/attribute".KeyValue) as type []metric.RecordOption in argument to bpt.batchSendSizeBytes.Record
make: *** [Makefile:120: otelcol] Error 2

zchef2k commented May 5, 2023

I shouldn't sell myself too short. I got it to build. In my local sshcheckreceiver's go.mod I had to change

go.opentelemetry.io/otel/metric/noop v0.38.1 // indirect

to

go.opentelemetry.io/otel/metric/noop v0.37.0 // indirect

Once built, I get the dimension:

[
    {
        "active": true,
        "created": 1683317168000,
        "creator": null,
        "customProperties": {
            "host.id": "<redacted>",
            "host.name": "<redacted>",
            "host_cpu_cores": "2",
            "host_cpu_model": "Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz",
            "host_kernel_name": "linux",
            "host_kernel_release": "4.18.0-425.13.1.el8_7.x86_64",
            "host_kernel_version": "#1 SMP Tue Feb 21 04:20:52 EST 2023",
            "host_linux_version": "<redacted>",
            "host_logical_cpus": "2",
            "host_machine": "x86_64",
            "host_mem_total": "5823932",
            "host_os_name": "<redacted>",
            "host_physical_cpus": "2",
            "host_processor": "x86_64",
            "os.type": "linux",
            "ssh.endpoint": "<redacted>"
        },
        "dimensions": {
            "host": "<redacted>",
            "host.id": "<redacted>",
            "os.type": "linux",
            "sf_metric": null,
            "ssh.endpoint": "<redacted>"
        },
        "id": "FvY-CT2A0AA",
        "lastUpdated": 0,
        "lastUpdatedBy": null,
        "metric": "sshcheck.status",
        "metricType": "GAUGE",
        "source": null,
        "tags": []
    }
]

@atoulme Can you advise me on what should happen next? I've done little more than help test the outcome of your code change. I don't think any of the code I've changed to package this up meets contribution standards.

atoulme commented May 5, 2023

We have two code owners on this thread: @nslaughter and @codeboten. As humanely as possible, you can try to exert gentle pressure on them, such as by promising them a frosty beverage if they follow up on this matter. Alternatively, if you are a Splunk customer, you can open a support case, state your need, open an idea ticket, and work with our folks to identify your use case and how best to adopt this receiver; we'll try to satisfy you however we can.

Given that this bug is identified and is a simple code change, I might get around to fixing it in my copious spare time if nothing else happens, but I cannot make guarantees from there.

zchef2k commented May 5, 2023

in my copious spare time
🤣

Thank you for your help. I will open a support case and reach out to my account team.

Maybe I can catch @nslaughter and @codeboten as we head into the weekend to make good on that frosty beverage.

github-actions bot commented Jul 5, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Jul 5, 2023

github-actions bot commented Sep 3, 2023

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned on Sep 3, 2023
atoulme reopened this on Nov 1, 2023

atoulme commented Nov 1, 2023

Reopening to close with good news: this was fixed with issue #24441.

atoulme closed this as completed on Nov 1, 2023