Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert to library #180

Merged
merged 7 commits into from
Oct 9, 2018
Merged

Conversation

mar-kolya
Copy link
Contributor

@mar-kolya mar-kolya commented Sep 17, 2018

Allow JMX fetch to be used as a library.

At this point this is more of an RFC and not intended to be merged directly. Main changes include:

  • Make JMXFetch releasable on maven. This PR currently releases into temporary repo and this would need to change.
  • Refactor code so System.exit() is never called from main run function.
  • Add support for loading configuration from Java resources.
  • Add additional parameters and constructor method to config object to allow creating it from the user of this library.

@mar-kolya mar-kolya mentioned this pull request Sep 26, 2018
Copy link
Member

@truthbk truthbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good to me, very non-invasive transition to a JAR usable by the APM tracer. Added a few small comments, but this looks real promising. Also, thanks for the cleanup on a few things! 🙇

@@ -24,8 +24,10 @@ public static void setup(Level level, String logLocation) {
Logger.getRootLogger().addAppender(fa);
LOGGER.info("File Handler set");
} else {
System.out.println("Log location is not set, will output log to stdout.");
ConsoleAppender consoleAppender = new ConsoleAppender(new PatternLayout(LOGGER_LAYOUT));
ConsoleAppender consoleAppender = new ConsoleAppender(new PatternLayout(LOGGER_LAYOUT), logLocation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the log4j docs I believe logLocation can be either "System.out" or "System.err", not sure if we're being defensive enough here, also unsure what the behavior is otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I should not have passed logLocation there, it is handled later. Is this what you meant the problem is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you revert this line, this looks good to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done. Sorry, I've reverted that a while back but apparently pushed into a wrong place. My bad, fixed now.

}

@Override
public void displayMatchingAttributeName(JMXAttribute jmxAttribute, int rank, int limit) {
System.out.println(" Matching: " + rank + "/" + limit + ". " + jmxAttribute);
LOGGER.debug(" Matching: " + rank + "/" + limit + ". " + jmxAttribute);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods have associated commands in datadog-agent (stuff like datadog-agent jmxfetch list-matching) that is expected to print to stdout always, as it's a debugging command. These LOGGER.debug changes, unless I'm missing something, would make it subject to the log level, thus potentially breaking the agent commands. If we're going to keep this, it should probably be info or warning at the least - still maybe preferable to print to stdout.... 🤔

Copy link
Contributor Author

@mar-kolya mar-kolya Oct 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConsoleReporter is accessible via tracing agent as well, so it would be nice to be consistent with it. I'll make this an info.

configurationList.add(new Configuration((LinkedHashMap<String, Object>) new YamlParser(this.getClass().getResourceAsStream("/jmx-2.yaml")).getParsedYaml()));
loadMetricConfigFiles(appConfig, configurationList);

ArrayList<LinkedHashMap<String, Object>> defaultConf = (ArrayList<LinkedHashMap<String, Object>>) new Yaml().load(this.getClass().getResourceAsStream("default-jmx-metrics.yaml"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in your opinion we should always be submitting the resource configs with jmxfetch, even if those beans aren't configured. That might be a valid point, still I think the issue here might be the fact that jmxfetch metrics are only "free" for the first 350 metrics - after this limit they will be billed as custom metrics. Although this limit is normally more than enough, we should double-check the implications, maybe make it an opt-in feature when using jmxfetch on "standalone" mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change specifically just replaces jmx-1 and jmx-2 with single configuration file, and extends that configuration slightly. So this is sort of preexisting behavior.
Do you see a problem with this?

return configs;
}

private void loadFileConfigs(AppConfig config, ConcurrentHashMap<String, YamlParser> configs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you probably saw, currently on Agent6 the default way to provide configs to jmxfetch is via the datadog-agent API. That's where we get the metrics, as a JSON payload from. Not sure if that use-case makes sense to you guys - maybe grabbing those from the agent itself, or from the APM agent. Just mentioning it in case that might apply, the more consistent we can be across use-cases (lib vs standalone, etc), the better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, 'load files' was a preexisting functionality - I've just refactored some code into this helper function.
The new thing is below - loading files via resources. And this is kind of useful because this allows APM agent to package resource with itself and pass it to jmxfetch without too much hassle.

Copy link
Member

@arbll arbll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. This is done in a way that does not affect the logic of Jmxfetch so I am not worried about that part.
The main thing we need to be careful with is keeping compatibility with preexisting setups.

- include:
domain: java.lang
type: GarbageCollector
name: Copy
Copy link
Member

@arbll arbll Oct 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing a mention to this filter in https://docs.datadoghq.com/integrations/java/#description-of-the-filters. What does it do ? And how do we differenciate jvm.gc.minor_collection_count from Copy vs ParNew for ex ? Does it appends something to the metric name or add a tag ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thad documentation says On top of these parameters, the filters support “custom” keys which means that you can filter by bean parameters. - which is what we are using here. We are just filtering by specific GC name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I understand now, seems like I missed that part 😅 . Now for the second part of my comment, I guess you can only have one Young Gen Collector and one Old Gen Collector ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, JVM allows only one Old and one New gen collector - by there are a few different possibly types of those.

@@ -0,0 +1,145 @@
# Memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see multiple differences between default-jmx-metrics.yaml and the default metrics that we used to collect in jmx-1 and jmx-2 . It's very important that we don't remove/rename any metrics as it would break the graphs, monitors, etc built around them.

Here are the diffs that I found Before => Now :

  • jvm.thread_count => thread.count
  • jvm.gc.parnew.time => jvm.gc.minor_collection_time ?
  • jvm.gc.cms.count => jvm.gc.minor_collection_count ?

There are also multiple new metrics introduced, do we need to add them and would it be possible to move them to another PR ? That way we can focus on keeping compatibility. Improving the testing here might make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially the idea is that JVM may run a few different types of garbage collection - but it is always one for new gen and one for old gen. Current ('old') version of the config doesn't really take this into account and sort of lumps the together - and also names metrics with GC type name, making it further confusing since it is not what actually is being reported.

The change I'm proposing is using example from here: https://github.com/DataDog/integrations-core/blob/bf11a2812c56b129c56f5cc389e74ee469bbd414/cassandra/datadog_checks/cassandra/data/conf.yaml (and couple of other places) that covers all known GC types and put them into predefined metrics. This would mean clients get GC metrics properly separated by GC type which seems like a useful thing.

Would it be ok if I leave current change in place and put back old metrics I've removed/changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed that change - please let me know if this is an acceptable solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that. This sounds acceptable and adding 4 new metrics should not be an issue. This would also have the nice side effect of allowing us to remove some complexity from the config files of other jmx integrations like cassandra.
We should remove jvm.gc.parnew.time and jvm.gc.cms.count from the docs and add the new ones after this is merged.

Copy link
Member

@arbll arbll Oct 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll discuss this changes with my team to see if there is anything I am missing on why this was not added previously. I'll then do some sanity check with A5 and A6 and if everything looks good I'll approve

@@ -0,0 +1,43 @@
# This is a reduced set of 'default' metrics to make tests more predictable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, let's keep compatibility with the metric names. You can use https://docs.datadoghq.com/integrations/java/#metrics

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is actually use only by tests, so it is not terribly important what specific metrics are inhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know but I was hoping that the tests would catch the metric name changes to avoid regressions. Did they not ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make some changes to tests to make build pass... So I guess they did to some extent.

pom.xml Outdated
<packaging>jar</packaging>
<groupId>datadog</groupId>
<artifactId>jmxfetch</artifactId>
<version>0.20.3</version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll bump that in another PR (and probably increase the minor since it's a pretty big change 😛 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed now.

Copy link
Member

@arbll arbll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, sorry that it was so long to merge this. Once https://github.com/DataDog/jmxfetch/pull/180/files#r222989521 is addressed, feel free to merge.

Copy link
Member

@arbll arbll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@mar-kolya mar-kolya merged commit dd406bb into DataDog:master Oct 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants