-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APPSERV-114 Addresses possible sources of Race Conditions in InvocationManager #4602
Conversation
@@ -418,15 +419,16 @@ private void deregisterApplication(String applicationName) { | |||
public String getApplicationName() { | |||
InvocationManager invocationManager = Globals.getDefaultBaseServiceLocator() | |||
.getService(InvocationManager.class); | |||
if (invocationManager.getCurrentInvocation() == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB: while the use of thread local should ensure that multiple calls to getCurrentInvocation()
should return the same instance it is better to not take changes and do it once. Also improves readability.
private final ThreadLocal<Stack<ApplicationEnvironment>> applicationEnvironments = withInitial(Stack<ApplicationEnvironment>::new); | ||
private final ThreadLocal<Deque<Method>> webServiceMethods = withInitial(ArrayDeque<Method>::new); | ||
|
||
private Map<ComponentInvocationType, List<RegisteredComponentInvocationHandler>> regCompInvHandlerMap = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Using HashMap
here is definitely not thread-safe but should be. But is is hard to see how this can cause the observed issues. However, we do talk about multi-threading so everything is possible :D
// don't need to be synchronized because each thread has its own ArrayList. | ||
private InheritableThreadLocal<InvocationArray<ComponentInvocation>> frames; | ||
|
||
private final ThreadLocal<Stack<ApplicationEnvironment>> applicationEnvironments = withInitial(Stack<ApplicationEnvironment>::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Replacing Stack
here is more of a modernisation to avoid sychronized
in favour of CAS based synchronisation.
if (invHandlers != null) { | ||
for (ComponentInvocationHandler handler : invHandlers) { | ||
handler.beforePreInvoke(invocationType, previousInvocation, invocation); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Missing try-finally
here to make sure the frames.addLast(invocation);
will definitely happen is my best guess on what might have caused the issue observed,
|
||
if (invHandlers != null) { | ||
for (ComponentInvocationHandler handler : invHandlers) { | ||
handler.afterPostInvoke(invocation.getInvocationType(), prevInv, invocation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. You might have noticed that invocation
is passed to afterPostInvoke
instead of curInv
. I believe this is just a inconsistency in the chosen way to express the logic. At this point curInv
is (or really should be) invocation
. To avoid any future confusion I unified it to use invocation
everywhere as it is done in preInvoke
and as I think it makes most sense as this is what is passed to the method as "current". It is unfortunate how the API was designed and that it leaves room for inconsistencies like calling postInvoke
with a different value then preInvoke
. postInvoke
really should not have had an argument as the current is that pushed to the stack when invoking preInvoke
. Or the API should have made clear that the argument is only passed to postInvoke
to verify that the nesting is correct and that it indeed is the current top of the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth a warning when this assumption doesn't hold and you actually have a good spot since you already jump over the invocation head.
This might disclose bugs we have around nested transactions or cleanups.
jenkins test please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's tricky to fully think through all of the ramifications of your changes, but I can't spot any howlers.
I've ran it through MP Rest Client, OpenTracing, Fault Tolerance, and Metrics as these all poke it in various ways (OpenTracing was the one I was most worried about with the trace propagation) and seems to pass all of these!
It also seems to correctly propagate traces from my MicroProfileOnMicro demo app so the changes don't seem to have negatively affected tracing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall quite nice readability improvement
private final ThreadLocal<Deque<ApplicationEnvironment>> appEnvironments = withInitial(ConcurrentLinkedDeque::new); | ||
private final ThreadLocal<Deque<Method>> webServiceMethods = withInitial(ConcurrentLinkedDeque::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't ArrayDequeue
sufficient here, since it is guaranteed there will not be any cross-thread access as it is thread-local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say so, yes. Given the importance of the manager this was more of a "better safe then sorry" thing to give me peace of mind. Since we both think it is not necessary we might just make this unsynchronized.
} | ||
|
||
|
||
static class InvocationArray<T extends ComponentInvocation> extends ArrayList<T> { | ||
static final class InvocationFrames extends ConcurrentLinkedDeque<ComponentInvocation> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here I don't see InvocationFrames being used outside thread locals, so I'd infer it doesn't need additional CAS overhead and an ArrayDequeue
is sufficient.
I'm also afraid that extending rather than delegating makes too many public methods available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also afraid that extending rather than delegating makes too many public methods available.
InvocationFrames
is only used privately.
Again, done this for peace of mind. Could go to ArrayDequeue
.
setRegCompInvHandlers.add(handler); | ||
} | ||
typeHandlers.computeIfAbsent(type, | ||
key -> new ListComponentInvocationHandler(new CopyOnWriteArrayList<>())) // OBS! must be thread safe List here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this thread safety requirement valid for all constructions of ListComponentInvocationHandler
? If so, it is better handled in its constructors by copying whatever list is on input into its own list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the allTypesHandler
is composed once during construction and never changed so it does not need any form of synchronisation while the handlers per type can change while being used so they need it.
|
||
if (invHandlers != null) { | ||
for (ComponentInvocationHandler handler : invHandlers) { | ||
handler.afterPostInvoke(invocation.getInvocationType(), prevInv, invocation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth a warning when this assumption doesn't hold and you actually have a good spot since you already jump over the invocation head.
This might disclose bugs we have around nested transactions or cleanups.
…ds warning in case postInvoke called with an unecpected invocation instance
jenkins test please |
@pdudits Now uses unsynchronised collections within thread locals. Also added the logging you suggested with uncovered 2 issues in the test setup. 🎉 |
LOGGER.log(WARNING, "postInvoke not called with top of the invocation stack. Expected {0} but was: {1}", | ||
new Object[] { current, invocation }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last Nitpick would be, that only EjbInvocation
provides reasonable toString()
. It would be great to add one into WebComponentInvocation
and ComponentInvocation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I remember checking it myself but then forgot to do something about it. Too many context switches. Will look into it tomorrow.
…info in log messages
jenkins test please |
…manager APPSERV-114 Addresses possible sources of Race Conditions in InvocationManager
Background
Running FT TCK strongly suggested that a race condition in or around the
InvocationManger
leads to inconsistent context when trying to derive the current application like shown below:The issue observed was that different application names were returned in a non deterministic manner for the actual same application processing tests.
Summary
The most likely cause is related to the frames that push and pop the stack which is peeked by
getCurrentInvocation()
. But it could also be related to application environment stack peeked bypeekAppEnvironment()
for case that notComponentInvocation
was on the stack.The issue can either be within the implementation of the
InvocationMangerImpl
but also on the outside as theComponentInvocation
is essentially managed from the outside by callingpreInvoke
andpostInvoke
.Changes:
InvocationArray
/InvocationFrames
(renamed) inner classStack
withConcurrentLinkedDeque
(synchronized
vs. CAS based synchronisation)ArrayList
as bases of frames stack withConcurrentLinkedDeque
(as this is only used via thread local non synchronised should not be an issue but to be sure and to provide better readability of the base class was changed)ComponentInvocationHandler[]
handlers with a singleComponentInvocationHandler
that is initialised withListComponentInvocationHandler
in case multiple handlers are actually needed. The underlyingList
usesArrayList
as it is read-only after being initialised. This allows to simplify the rest of the implementation that only has to assume a single handler.ListComponentInvocationHandler
to manageRegisteredComponentInvocationHandler
. In this case the underlyingList
uses aCopyOnWriteArrayList
as multiple threads might register handlers concurrently.Map
forRegisteredComponentInvocationHandler
(that are per type) with a thread-safeConcurrentHashMap
as multiple threads can register handlers concurrently.try-finally
topreInvoke
to make sure that any exception thrown by handlers does not prevent adding of the invocation instance (I'd say this is the best candidate for the observed issue)Testing
The implementation of
InvocationMangerImpl
got covered with unit tests to a coverage > 95%.This mostly intends to make sure the changes in the implementation do not cause unexpected behaviour, in particular throwing exceptions when it should not. The tests do include a few tests that include multiple threads to verify the thread parent-child stack inheritance behaviour, not to show that the class is thread-safe. To allow better reasoning about thread safety the implementation was cleaned and simplified so the use of collection becomes more clear.
Testing Performed
The unit tests were added mostly to make sure changes did not introduce new bugs like NPEs and alike and that the implementation does behave as expected from reading the code.
In addition I ran the FT TCK as that had shown issues before and since it deploying lots of applications.