Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JMXFetch support dogstatsd over UDS #6565

Merged
merged 11 commits into from
Oct 30, 2020
Merged

Conversation

gh123man
Copy link
Member

@gh123man gh123man commented Oct 14, 2020

What does this PR do?

This PR enables JMXFetch to use UDS when configured. Since the rest of dogstatsd can use UDS instead of UDP, it makes sense for JMXFetch to also support this.

Related JMXFetch PR: DataDog/jmxfetch#335

Motivation

Tracking internal card: AC-646

Describe your test plan

Tested by running JMXFetch and inspecting trace level logs for metrics send by JMXFetch

@gh123man gh123man added the [deprecated] team/agent-core Deprecated. Use metrics-logs / shared-components labels instead.. label Oct 14, 2020
@gh123man gh123man requested a review from a team as a code owner October 14, 2020 20:24
@gh123man gh123man added this to the 7.24.0 milestone Oct 14, 2020
Copy link
Member

@olivielpeau olivielpeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes LGTM!

Since this PR doesn't work without the related JMXFetch changes, we should release a new version of JMXFetch with the related changes, and bump its "nightly" version in release.json in this PR before we can merge it.

Copy link
Member

@olivielpeau olivielpeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment inline, let me know.

Also, let's make sure that in containerized environments, JMXFetch metrics reported with UDS will not get origin tags attached by the agent (otherwise, they get the agent's origin tags added, which is very confusing to users): I believe DataDog/jmxfetch#264 ensures that the java statsd client will report the none value on the entity ID tag to avoid having these origin tags added, but reading through dogstatsd's parsing code it looks like the additional option dogstatsd_entity_id_precedence needs to be explicitly enabled as well. I'll ask container-integrations for some feedback here.

pkg/jmxfetch/jmxfetch.go Outdated Show resolved Hide resolved
@clamoriniere
Copy link
Contributor

Hi @olivielpeau

it looks like the additional option dogstatsd_entity_id_precedence needs to be explicitly enabled as well.

In the JMXFetch with UDS, the dogstatsd_entity_id_precedence should be set at false which is the default value.
So like this even if the DD_ENTITY_ID=none dogstatsd server will call the UDS origin detection.

Then later in the logic. the call to the tagger for the DD_ENTITY_ID is skipped because of the value none.

If you enable dogstatsd_entity_id_precedence the UDS origin detect is skipped if the DD_ENTITY_ID is not empty. So even with the none value it UDS origin is skipped. This logic may look weird but it was defined like this for backward compatibility.

@gh123man gh123man requested a review from a team as a code owner October 28, 2020 17:30
Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, I do have a small concern I'd like to talk about that might have to be addressed on the JMXFetch side of things (in particular with the statsd client).

With UDP, when we have drops we can often see these in the UDP OS-level stats, so we had that tool to understand what was happening when metrics didn't add up. With UDS, since the channel is reliable, all drops will happen in the java-dogstatsd-client. Without telemetry we'll be a little blind to this, so we should maybe consider adding that.

@olivielpeau
Copy link
Member

With UDP, when we have drops we can often see these in the UDP OS-level stats, so we had that tool to understand what was happening when metrics didn't add up. With UDS, since the channel is reliable, all drops will happen in the java-dogstatsd-client. Without telemetry we'll be a little blind to this, so we should maybe consider adding that.

Yes that's a good point. I'm still hesitant to enable telemetry in JMXFetch's dogstatsd client as-is though, I'm still anticipating confusion on where these metrics come from (I think a lot of users will assume they come from an actual java app submitting custom metrics, not JMXFetch).

In the short-term, maybe we could make JMXFetch pass a handler to java-dogstatsd-client (https://github.com/DataDog/java-dogstatsd-client/blob/eb5d1d680781e97d7b41d7bd1f1a8b3510eef7b2/src/main/java/com/timgroup/statsd/StatsDSender.java#L126) and keep track of the number of packet drops that way (and report that number to the agent with the status API, or with a dogstatsd metric with a specific name). It won't provide as much granularity as the telemetry metrics (only the number of packet dropped), but could be enough for now. WDYT?

Longer-term, it'd be nice to allow telemetry metrics to have a custom prefix so that jmxfetch can make the java dogstatsd client submit telemetry with that prefix, e.g. datadog.jmxfetch.dogstatsd_client or something similar?

Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have some visibility now into submission failures in jmxfetch over UDS, so let's merge this!

Copy link
Contributor

@derekwbrown derekwbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

little nit about the version numbers, but feel free to merge after fixing w/o re-review

@gh123man gh123man merged commit d3ce934 into master Oct 30, 2020
@gh123man gh123man deleted the brian/jmxfetch-support-uds branch October 30, 2020 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[deprecated] team/agent-core Deprecated. Use metrics-logs / shared-components labels instead..
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants