Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PAYARA-4121 Fixed Updating of Flashlight Class Instrumentation #4278

Merged
merged 5 commits into from
Oct 23, 2019

Conversation

jbee
Copy link
Contributor

@jbee jbee commented Oct 18, 2019

Symthom

When un-deploying and re-deploying an application the web and HTTP statistics would no longer update.

Finding

The reason for updates no longer reaching the statistics counters was that the connection done via class instrumentation was cut when un-deploying an app. This occurred because un-deploying an application does disable statistics providers specifically responsible for the application which wrongly triggered a removal of the class instrumentation for the probe provider class connecting the events to the statistics counters. So instead of just disconnecting the application specific statistics all statistics based on a particular class were disabled. Since the overall statistics for HTTP and web use the same class as the application specific ones this also disabled the overall statistics.

Solution

Because the trigger for changes of class instrumentation was a static call in the innermost part of the logic there was no clear way to only trigger updates to the class instrumentation after the logic is done updating the changed probes and invokers. Another mistake was that disabling a invoker caused disabling of the class even though there might be other invokers and other probes for the same provider class. This is why both transform and untransform became just update. If updating would require to update the instrumentation or remove it is now evaluated based on the probes and their active invokers. If there are active invokers and the probe is enabled the instrumentation is kept and updated. Methods with no invokers are no longer instrumented.

To avoid running the update for each trigger call, which is each method times the statistics provider instances, the trigger only flags for update and a asynchronous daemon thread is used to run the actual updates with a small delay. By granting a delay the logic causing multiple triggers is likely to have completed and flagged all needed updates before the thread actually performs them.

During shutdown the thread ends from interruption and does not perform (most of) the triggered transformation that a shutdown normally causes.

Note to the Reviewer

Main change is in ProbeProviderClassFileTransformer. Besides that I only added some missing generics and made some hash maps into concurrent ones while looking for the problem.

Note that synchronized was largely removed where present. Synchronisation is done by now using a concurrent map and atomics. To allow reclaim of key class key was changed to String which alled use of concurrent map. Other methods no longer need synchronisation because only the updater thread will call them.

This also fixes PAYARA-1285 which should be checked with the reproducer given in the ticket.

Testing

Easiest is to test this after #4274 is merged since it will be easy to verify in the monitoring console that un-deploying and re-deploying an application does no longer cause the request/sec to zero out.

With MC started enable web and HTTP monitoring until you see some request/sec in the Core view of MC. Redeploy MC and check that the requests/sec are still above 0, most likely 0.5 due to data polling every 2secs.

Testing should also check that enabling and disabling the web or HTTP monitoring has the expected effect. Again this is verified easiest after #4274 already being in master.

To check that instrumentation no longer is added or removed multiple times from provider classes I put a log output on INFO level in both cases of this if-statement https://github.com/payara/Payara/pull/4278/files#diff-16bec2e5206edbe4e95b050af4617992R193. Note that occasionally it is still possible that same class is handled twice due to unlucky async timing but not more often then that.

@jbee jbee self-assigned this Oct 18, 2019
private final Map<String, List<StatsProviderRegistryElement>> configToRegistryElementMap = new HashMap();
private final Map<Object, StatsProviderRegistryElement> statsProviderToRegistryElementMap = new HashMap();
private final Map<String, List<StatsProviderRegistryElement>> configToRegistryElementMap = new ConcurrentHashMap<>();
private final Map<Object, StatsProviderRegistryElement> statsProviderToRegistryElementMap = new ConcurrentHashMap<>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a concurrency fix that most likely is needed.

for (FlashlightProbe probe : probes.values()) {
if (probe.isEnabled() && probe.getInvokerCount() > 0) {
return true;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of trying to maintain a "dirty" flag telling us if instrumentation needs to be updated the decision is now made based on the probe state. Anything else would cause us to remove instrumentation when there are active invokers waiting to be called.

// we still need to write out the class file in debug mode if it is
// disabled.
cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES + ClassWriter.COMPUTE_MAXS);
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES + ClassWriter.COMPUTE_MAXS);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cw could be inlined.

}
Log.fine("untransformed", providerClass.getName());
return null;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: null means transform back to the class bytes as given in the original file.

}

private synchronized void addTransformer() {
private void addTransformer() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all updates are run by the same thread no synchronisation is needed any more.

}
catch (Throwable th) {
try (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just using try-with-resources instead

@@ -323,7 +261,7 @@ public MethodVisitor visitMethod(int access, String name, String desc, String si
MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);

FlashlightProbe probe = probes.get(makeKey(name, desc));
if (probe != null) {
if (probe != null && probe.isEnabled() && probe.getInvokerCount() > 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important change to the way this class works: We decide if a method gets instrumented based on the state of the probe. As all state changes should trigger flag for update this is now safe to do.

@jbee
Copy link
Contributor Author

jbee commented Oct 18, 2019

jenkins test please

@jbee jbee requested a review from Pandrex247 October 18, 2019 14:30
@jbee
Copy link
Contributor Author

jbee commented Oct 21, 2019

jenkins test please

@jbee
Copy link
Contributor Author

jbee commented Oct 22, 2019

jenkins test please

@jbee
Copy link
Contributor Author

jbee commented Oct 22, 2019

jenkins test please

@@ -118,7 +118,8 @@ StatsProviderRegistryElement getStatsProviderRegistryElement(Object statsProvide
}

List<StatsProviderRegistryElement> getStatsProviderRegistryElement(String configElement) {
return (this.configToRegistryElementMap.get(configElement));
List<StatsProviderRegistryElement> plain = configToRegistryElementMap.get(configElement);
return plain == null ? Collections.emptyList() : new ArrayList<>(plain);
Copy link
Contributor Author

@jbee jbee Oct 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unrelated fix for a problem I observed where the returned list is iterated and the iteration (or another process) causes changes to the list causing a concurrent modification exception. Since the method is only used in this module by methods doing this kind of loop I concluded it is best to always return a copy that can be iterated without problem.

@jbee jbee requested a review from jGauravGupta October 22, 2019 12:36
@jbee
Copy link
Contributor Author

jbee commented Oct 23, 2019

@Pandrex247 might want to review this was well. Waiting for his feedback on it.

@jbee jbee merged commit 4ad5418 into payara:master Oct 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants