Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

win_perf_counters: New option "Expand" expands wildcards in counter definitions. #2336

Closed

Conversation

o7g8
Copy link
Contributor

@o7g8 o7g8 commented Jan 29, 2017

The change introduces a new option "Expand" to a definition of Windows counter which allows to expand wildcards in ObjectName, Instance and Counter.
Read "Remarks" in https://msdn.microsoft.com/en-us/library/windows/desktop/aa372605(v=vs.85).aspx about how the wildcards are expanded and how they can be used.

Required for all PRs:

  • CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
  • [x ] Sign CLA (if not already signed)
  • [ x] README.md updated (if adding a new plugin)

Oleg Grytsynevych added 2 commits January 29, 2017 16:44
@sparrc
Copy link
Contributor

sparrc commented Jan 29, 2017

I think this may solve #1291

@TheFlyingCorpse @discoduck2x could you review?

@o7g8
Copy link
Contributor Author

o7g8 commented Jan 29, 2017

My change only expands wildcards and therefore allows to cover wider selection of counters with a compact config entry. It solves part of #1291.

Apart from wildcards the #1291 also addresses another issue - currently missing periodical "refresh" of the set of counters monitored by Telegeraf.
The "refresh" thing is independent from wildcards because the "refresh" can also be useful for non-wildcard counters: currently if we specify a process-level counter for a particular process in Telegraf config, then if the process doesn't exist at the moment when Telegraf reads config for win_perf_counters, then the counter will be ignored. Even after the "monitored" process starts, we will not get telemetry for the process with Telegeraf.

I see the "refresh" thing as a separate option in the counter definition, and it should be implemented on a separate pull request,

@PierreF
Copy link
Contributor

PierreF commented Jan 30, 2017

@o7g8 did you know if PdhExpandCounterPath use localized name or English name for the counter ?

I ask this regarding #2261 which allow to use the same name (English name) on all system. Microsoft documentation is not always clear about this, but PdhExpandCounterPath doesn't require localized name to work ? If yes, it may disallow to use this option on non-English system as PdhExpandCounterPath will require localized name and later PdhAddEnglishCounter will require an English name.

@o7g8
Copy link
Contributor Author

o7g8 commented Jan 30, 2017

@PierreF , thank you for your comment! Here is the documentation describing handling of localized names:

Note If the counter path contains a wildcard character, the non-wildcard portions of the path will be localized, but wildcards will not be expanded before adding the localized counter path to the query. In this case, you will need use the following procedure to add all matching counter names to the query. 1.Make a query
2.Use PdhAddEnglishCounter with the string containing wildcards
3.Use PdhGetCounterInfo on the counter handle returned by PdhAddEnglishCounter to get a localized full path (szFullPath.) This string still contains wildcards, but the non-wildcard parts are now localized.
4.Use PdhExpandWildCardPath to expand the wildcards.
5.Use PdhAddCounter on each of the resulting paths

BTW there is another thing which I've just noticed and which is also important - the PdhExpandCounterPath doesn't support partial wildcards! I will change the code and will use PdhExpandWildCardPath instead of PdhExpandCounterPath, because it more flexible and allows partial wildcrads (on post-Vista systems).

Seems like my code requires some more work for non-English systems.

@sparrc sparrc added this to the 1.3.0 milestone Jan 30, 2017
@o7g8
Copy link
Contributor Author

o7g8 commented Jan 30, 2017

I have replaced PdhExpandCounterPath with PdhExpandWildCardPath, which supports the partial wildcards. It works, but it's not as flexible as one could expect. It expands correctly following queries:

  • \LogicalDisk(C*)\% Free Space - expands the query to the disk "C:"
  • \LogicalDisk(C*)\* - returns all counters for the disk "C:"
  • \LogicalDisk(*:)\* - returns all counters for the disk "C:" (on my machine)

But surprisingly it doesn't expand following obvious queries with partial wildcards in Counters part:

  • \LogicalDisk(C:)\%* - I would expect all counters starting with %, but it fails with "The specified counter could not be found"
  • \LogicalDisk(C:)\Av* - I would expect all counters starting with Av (e.g. "Average..."), but it fails with "The specified counter could not be found"
  • \LogicalDisk(C:)\*Time - I would expect all counters ending with Time, but it returns all the C: counters instead.

From the experiments I conclude that the PdhExpandWildCardPath does a good job with partial wildcards for Instance, but it seems to support only non-partial wildcards for Counters. Not perfect, but a step forward compared to PdhExpandCounterPath.

@PierreF I haven't tried the approach with the call to PdhGetCounterInfo, quoted in my previous message. It requires allocation of PDH_COUNTER_INFO structure, which looks a bit scary.
I have found some Go examples for the structure signature and for the PdhGetCounterInfo signature. Can anyone help with adaption of the struct signature to our codebase?
I wonder if the support for non-English systems coud be treated as a separate concern on another pull request?

@TheFlyingCorpse
Copy link
Contributor

This looks good to me!

@discoduck2x
Copy link

@sparrc , sorry missed the notify for the latest status on this one , will test asap and get back !

@discoduck2x
Copy link

@sparrc @TheFlyingCorpse can i find this commit in a binary somewhere? doesnt seem to be any nightly builds available for windows.

@sparrc
Copy link
Contributor

sparrc commented Mar 2, 2017

@PierreF could you provide a final review on this?

@PierreF
Copy link
Contributor

PierreF commented Mar 2, 2017

One issue: from my understanding this will not works on non-English Windows if we don't set PreVistaSupport = true.

PdhExpandWildCardPath seems to take localized name and later (if PreVistaSupport is False) we use PdhAddEnglishCounter.

I think we should write in README that when expand is true, counter need to use localized name and it would be great to change the "if m.PreVistaSupport" to include an "or PerfObject.Expand" so when Expand is enabled we use PdhAddCounter instead of PdhAddEnglishCounter.

That being said, while testing I got the following error:

panic: runtime error: slice bounds out of range

goroutine 14 [running]:
github.com/influxdata/telegraf/plugins/inputs/win_perf_counters.UTF16ToStringArray(0xc04200e480, 0x5c, 0x5c, 0xc042175f38, 0x0, 0x5c)
        /go/src/github.com/influxdata/telegraf/plugins/inputs/win_perf_counters/pdh.go:443 +0x134
github.com/influxdata/telegraf/plugins/inputs/win_perf_counters.expandCounterQuery(0xc042343c60, 0x20, 0x6, 0xc042343c60, 0xc042343c6f, 0xc042407e18, 0x0)
        /go/src/github.com/influxdata/telegraf/plugins/inputs/win_perf_counters/win_perf_counters.go:167 +0x142
github.com/influxdata/telegraf/plugins/inputs/win_perf_counters.expandQuery(0xc042343c60, 0x20, 0x1, 0x6, 0xc042343c60, 0x20, 0x0, 0x0)
        /go/src/github.com/influxdata/telegraf/plugins/inputs/win_perf_counters/win_perf_counters.go:193 +0xd7
github.com/influxdata/telegraf/plugins/inputs/win_perf_counters.(*Win_PerfCounters).ParseConfig(0xc042171600, 0xc042407ea0, 0x0, 0x0)
        /go/src/github.com/influxdata/telegraf/plugins/inputs/win_perf_counters/win_perf_counters.go:207 +0x285
github.com/influxdata/telegraf/plugins/inputs/win_perf_counters.(*Win_PerfCounters).Gather(0xc042171600, 0x16aa740, 0xc042343c00, 0x0, 0x0)
        /go/src/github.com/influxdata/telegraf/plugins/inputs/win_perf_counters/win_perf_counters.go:270 +0x822
github.com/influxdata/telegraf/agent.gatherWithTimeout.func1(0xc04232c8a0, 0xc0421716c0, 0xc042343c00)
        /go/src/github.com/influxdata/telegraf/agent/agent.go:153 +0x50
created by github.com/influxdata/telegraf/agent.gatherWithTimeout
        /go/src/github.com/influxdata/telegraf/agent/agent.go:154 +0xd3

Configuration used:

[[outputs.file]]
    files = ["stdout"]

[[inputs.win_perf_counters]]
  [[inputs.win_perf_counters.object]]
    ObjectName = "Processeur"
    Instances = ["*"]
    Counters = [
       "% d’inactivité",
    ]
    Expand = true

@sparrc
Copy link
Contributor

sparrc commented Mar 2, 2017

so when Expand is enabled we use PdhAddCounter instead of PdhAddEnglishCounter.

@PierreF could you remind me why we ever use "PdhAddEnglishCounter"? Why can't we just always use PdhAddCounter?

@PierreF
Copy link
Contributor

PierreF commented Mar 2, 2017

@sparrc Because with PdhAddCounter use need localized name, which means you can not provide one configuration file for all systems. You need one configuration per localization.

stringLine := UTF16PtrToString(&buf[0])
for stringLine != "" {
strings = append(strings, stringLine)
nextLineStart += len(stringLine) + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure that +1 should not be here.
If your string has a length of 1 character, the next string is one bytes later, not 1+1 bytes later :)
And this is very likely my issue with panic: runtime error

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PierreF The strings are NULL terminated, so that +1 is for hopping over that one I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very fishy to me, the length of stringLine tells us nothing about the offset into buf, because buf contains utf16 while stringLine contains utf8.

@sparrc
Copy link
Contributor

sparrc commented Mar 2, 2017

I see....I think I misunderstood how that worked.

@PierreF In that case, why do we ever use PdhAddCounter? Is it unreasonable to ask users in other countries to use the English names? From what I can tell they are fairly similar...

@PierreF
Copy link
Contributor

PierreF commented Mar 2, 2017

Yes, I think that too that PdhAddEnglishCounter should be always used.
The support for PdhAddCounter was to support pre-vista Windows. I would +1 a drop of this (pre-vista... means Windows XP, Vista support ends in few month, everyone should be running Vista or more :))

I haven't tried the approach with the call to PdhGetCounterInfo, quoted in my previous message. It requires allocation of PDH_COUNTER_INFO structure, which looks a bit scary.
[...]
I wonder if the support for non-English systems coud be treated as a separate concern on another pull request?

That seems reasonable (as long as we wrote in README that currently Expand require localized name as in my previous comment). But it could be a good idea to support only English name and properly implements #2336 (comment)

@discoduck2x
Copy link

@o7g8 @PierreF , anywhere i can get hold of a binary to test ? or will this be merged soon?

@TheFlyingCorpse
Copy link
Contributor

A note on the English name. some environments likely use only native language. forcing use of the english counter names can be a negative impact of forcing the removal of the localized alternative (pre-vista).

@matso42
Copy link

matso42 commented Mar 8, 2017

I would also be interested in a Windows binary nightly build on this.

@sparrc sparrc modified the milestones: 1.4.0, 1.3.0 Mar 8, 2017
@vtsingaras
Copy link

Hello, I have merged o7g8's branch into current master and tested on Windows 10 and Windows Server 2016. It seems to be working just fine. Is it possible to have this PR merged? I could open a new PR with the merges and resolved conflicts if necessary.

Copy link
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a stupid question, but what would happen if we just always treated it as if expand was enabled?

stringLine := UTF16PtrToString(&buf[0])
for stringLine != "" {
strings = append(strings, stringLine)
nextLineStart += len(stringLine) + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very fishy to me, the length of stringLine tells us nothing about the offset into buf, because buf contains utf16 while stringLine contains utf8.

var buf []uint16
ret := PdhExpandWildCardPath(query, nil, &bufSize)
for ret == PDH_MORE_DATA {
buf = make([]uint16, bufSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code does not seem defensive enough to me. Can bufSize be zero here, if so then the next line will panic. What if ret is not PDH_MORE_DATA at least once, then we will try to index it in UTF16ToStringArray and panic. In general we need to check anything that crosses an application boundary before we can assume anything about it.

@vtsingaras
Copy link

@danielnelson Hi, sorry for the delay. I agree with you that we should always treat that expand is enabled.

If we always expand the paths, then it would also make the code that decides if we should add the counter measurement inside Gather() simpler, as if we pre-expand when parsing config we know when inside Gather() that we want this measurement (with the exception of still filtering for _Total).
It would also make it simpler to correctly parse multiple instances with a '#' in their name.

@danielnelson danielnelson modified the milestones: 1.4.0, 1.5.0 Aug 14, 2017
@danielnelson danielnelson added area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Aug 23, 2017
@danielnelson danielnelson modified the milestones: 1.5.0, 1.6.0 Nov 30, 2017
@danielnelson danielnelson removed this from the 1.6.0 milestone Jan 4, 2018
@danielnelson
Copy link
Contributor

This PR has been superseded by #4189

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin platform/windows
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants