-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Moved Performance Counters to its own Telemetry Consumer #2122
Conversation
<?xml version="1.0"?> | ||
<package xmlns="http://schemas.microsoft.com/packaging/2010/07/nuspec.xsd"> | ||
<metadata> | ||
<id>Microsoft.Orleans.OrleansTelemetryConsumers.Counters</id> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive question, but do you think there is still value on having a separate nuget package for the perf counters installer? Maybe they can still live in different DLLs, but potentially shipped in a shared nuget?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A package for counter control already exists. Why not have this in that package? This would lead to two interdependent counter packages we'd need to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that. I just don't like the idea of have the counter control referenced in the project like we have today.
There is a tools folder on nuget that we can add it. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, counter control is already updated to use the consumer dll instead of the types that once where in OrleansRuntime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, @jason-bragg I think the intent is the user to Add/Register the consumer, and not the .exe from counter control. We should add the .exe inside the consumer nuget IMHO and get rid of the old nuget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@galvesribeiro, I've no objection with moving the exe to this package. IMO, what is important here is to have the consumer and exe in the same package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok @jason-bragg will do that. Please let me know if there nothing else to address and I'll make the change, rebase, push. Thanks :)
To preserve behavior in test code and bang on this change more (short of writing tests), I suggest adding the xml line to test configs, and the programmatic code to default test cluster configuration. |
internal static class OrleansCounterManager | ||
{ | ||
private static readonly Logger logger = LogManager.GetLogger("OrleansPerfCounterManager", LoggerType.Runtime); | ||
//private static readonly List<Tuple<StatisticName, bool>> counterNames = new List<Tuple<StatisticName, bool>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plz remove commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ops, forgot that.
Please also add a line in the Changelog.md saying that you now have to opt into perf counters (and potentially link to documentation too once we have that). Remember that this changelog is end-user facing, no need to go into a lot of technical details, just link to the PR (and later on the doc) |
@@ -585,6 +586,7 @@ private Silo LoadSiloInNewAppDomain(string siloName, Silo.SiloType type, Cluster | |||
new object[] { }); | |||
|
|||
appDomain.UnhandledException += ReportUnobservedException; | |||
appDomain.DoCallBack(RegisterPerfCountersTelemetryConsumer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify... This hack here and on TestSiloHost is something @jdom and myself agreed to leave for now since we can't configure properly via code config the telemetry consumer inside the test cluster/silo AppDomain from the xunit tester.
I'll revisit it later once I get the next task for the new config and silo host model and this will (finally) get out here.
Don't know why the CI is failing now... It is building and running all the tests here (even from build.cmd script)... Any ideas? |
<file src="OrleansTelemetryConsumers.Counter.XML" target="lib\net451" /> | ||
<file src="OrleansTelemetryConsumers.Counter.dll" target="lib\net451" /> | ||
<file src="OrleansTelemetryConsumers.Counter.pdb" target="lib\net451" /> | ||
<file src="$SRC_DIR$OrleansTelemetryConsumers.Counters\**\*.cs" target="src" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build error is when building the nugets.
File not found: 'OrleansTelemetryConsumers.Counter.XML'.
There is something odd here. Is there really a 'Counter' and 'Counters' assembly (and related XML doc files)?
@@ -20,7 +20,7 @@ | |||
<DefineConstants>DEBUG;TRACE</DefineConstants> | |||
<ErrorReport>prompt</ErrorReport> | |||
<WarningLevel>4</WarningLevel> | |||
<DocumentationFile>bin\Debug\OrleansTelemetryConsumers.Counters.XML</DocumentationFile> | |||
<DocumentationFile>obj\Debug\OrleansTelemetryConsumers.Counters.XML</DocumentationFile> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is wrong, should be bin
, not obj
@jason-bragg I was talking with @jdom... Do you know why @sergeybykov any clues? Can we remove it? |
Unfortunately I don't know the history of Counter control. I do know it loads assemblies and create specific counters for grains found, in addition to the hard coded runtime of counters. I suspect this dynamic counter generation based off of grains is what necessitated the tool. |
AHHH ok. That makes sense @jason-bragg . But in all cases, we could have that code inside the Installer class and avoid that exe IMHO. My point is that Installutil can perform the same thing as well. |
Understood. I think Orleans assemblies need some general way to publish its counters so monitoring integration points requiring upfront initialization (like performance counters, or ETW events, ...) know what metrics to register. This counter publishing mechanism, imo, should be extensible in a way that allows other extension points (providers, serializers, ..) to register their own counters. For instance, the OrleansAzureUtils could register performance counters for its table and blob storage calls. This would allow monitoring integration points to know about metrics in Orleans core as well as third party extensions. This is a much wider issue than should be addressed in this PR, as the recently added APM/KPI interfaces don't account for systems that require upfront registration, and the existing internal runtime counters don't expose counter data in a general way. Maybe since we need to replace perf counters for .NetCore port we can reframe the problem as a general metrics publishing problem rather than a monitoring problem? |
I don't agree with that. The telemetry APIs should not care about registration. Each consumer should have its own mechanism to register the metrics/counters on its own. There is nothing blocking the providers to report metrics. They can use the methods Today, the ugly thing is that metrics are inside LogManager, etc... Anyway, I think this should be debated in another issue. In fact, one of the things that I plan for 2.0 is a complete revamp on the telemetry, counters, metrics, log, trace, story. In other words... I want to keep us out of those business. We should just provide common set of interfaces and let people plug whatever they want to consume the data pushed from Orleans runtime, providers and even their own data from grains. Again, I'll open another issue later to discuss that when I start to work in it for 2.0. Regardless of the future, I still think we can remove that counter control. If not on this PR, in another but before we make a new release. There is no point on have that if the installer class can do the same thing. It is less one thing people need to care about in deployment. I suggest to get it done now since people will start using and will make easier when they read the change in changelog.md. |
"I'll open another issue later to discuss that when I start to work in it for 2.0." Agreed. We can discuss it then. |
The perf counter is a consumer just like any other. In the specific case of the perf counters, there is a requirement to the counter to have been previously created in order to report the values. The counter creation is the reponsibility of the consumer. If it will be created outside runtime, at runtime on first usage, or whatever way they want, this is the consumer implementation. It is also its responsibility to discard any kind of telemetry they don't want. For example, this implementation of IMetricTelemetryConsumer, only care about metrics pushed to previously created counters and discard the other calls. More open providers like AI, NewRelic, doesn't have any filter at all. It capture what is pushed to it, and push to Azure AI without do any kind of filter. The AI user, will filter that on its dashboard any way they want. Again, I don't want Orleans to have to deal with this business. The solo purpose of this specific PR is to move Perf Counters code outside the core and there is no better place to leave it today than on its own IMetricTelemetryConsumer. As a bonus, all the stats that were only published to the the PerfCounters, now are published to ALL registered IMetricTelemetryConsumers without caring about their implementation details, filters, registration, it just don't need to care. Ok, I think we are extending this discussion too much here :) Lets move on. I'm waiting @jdom perform the VSO/Perf tests so we can move forward on .Net Core. Later on, lets discuss that more but thanks for the comments, will think about it for 2.0 |
case UpdateMode.Increment: | ||
if (value.HasValue) | ||
{ | ||
cd.PerfCounter.IncrementBy((long)value.Value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think original code checked if cd was null, then if cd.PerfCounter was null. If cd.PerfCounter was null it would create it, which we don't want to do, but if it is null, we don't want to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right... Let me fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
<file src="$SRC_DIR$OrleansTelemetryConsumers.Counters\**\*.cs" target="src" /> | ||
<file src="OrleansCounterControl.exe" target="tools" /> | ||
<file src="OrleansCounterControl.exe.config" target="tools" /> | ||
<file src="Orleans.dll" target="tools" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to install the OrleansCounterControl as it was before. It needs to load application grains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok... More one reason to remove counter control and put that grain load logic inside the installer class and let people use installUtil.exe from within the server deployment directory... That could be even automated from the silo startup...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm all for using the installer util, instead of maintaining our own. It's just a question of resources. If you want to make that part of this change, great, but if not we need the existing patterns to be preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jason-bragg I`ll add it to the installer class tomorrow. Was just waiting to someone from the core-team to aknowledge that. Will push changes tomorrow. Thanks
<package xmlns="http://schemas.microsoft.com/packaging/2010/07/nuspec.xsd"> | ||
<metadata> | ||
<id>Microsoft.Orleans.CounterControl</id> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please leave counter control in place and continue to support until installUtil is used in production environments. Let's phase out counter control over time. 1.3.0 we support counter control or users can use installUtil. 1.3.1+ish we deprecate counter control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jason-bragg I'm against that... Why should we keep it? The command line is basically the same. There is no need to deploy that .exe anymore and if you need PerfCounter the dll will be there deployed with server code. The user can still use the old CounterControl nuget to register if they want. We are just not going to push anymore new nugets so why keep the project? I though by your comment here that you agree with me and if I add that to this PR we could remove the old one... Its totally pointless to have it if you leave the old nuget alive...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is based on the assumption that InstallUtil will work in all environments as well as the CounterControl. That it has no unexpected dependencies and/or side effects. That when errors occur they will be clear enough to users. These are all probably true, but we've very large and critical services running on Orleans and 1.3 already includes a fair amount of risk. Continuing to support this assembly until the new approach has been vetted in production environments is simply a risk mitigation strategy that costs us little but can save us and our customers hours (or days) should something go wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jason-bragg OK, I'll not arg anymore... I want to finish this work... But just for the sake of information, there is no difference at all from running a console app and using installutil.exe specially the way I did, which uses exactly same code as the console app and installutil.exe is just calling the Install()
method the same way the console does. But ok, will not enter in the debate. I just want this .Net Core blocker to go away. I'm going to revert everything and reapply some of the changes.
@jason-bragg counter control is back and now you are able to install both ways. Please let me know anything else you want. |
Great work. |
Thank you guys for reviewing that @jason-bragg and @jdom |
Thanks dude, this unblocks some of the work I'm currently doing in .net core :) |
More one step forward on .Net Core move following #368
In this PR, I moved the Perf Counter specific code to its own telemetry consumer and made it a
IMetricTelemetryConsumer
.Basically the
OrleansCounterControl.exe
still be used to install the default Orleans Counters however, now the Statistics are published not just to the perf counter but for all registeredIMetricTelemetryConsumer
.To use it install nuget
Microsoft.Orleans.OrleansTelemetryConsumers.Counters
and configure as:XML
or
Code