-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CUSTCOM-76 WebappClassLoader Synchronisation Test #4430
Conversation
Synchronisation issues in the WebappClassLoader have already been fixed, but not tested properly. This commit adds a test for the synchronisation issues that caused CUSTCOM-76.
jenkins test please |
Test will pass if all tasks complete without exception (-> future not completed -> TimeoutException -> swallowed -> pass)
Test will fail if any task complete with an exception (any) (-> future completed -> rethrowing the cause) Nevermind, the test passed even with 600 threads in the pool, 100000 tasks (*3) and 30 second timeout and Thread.yield instead of sleep. Then I played bit more with that ... see comments in code. I'm still not sure with the target of the test (maybe add javadoc with some human readable description of the purpose). |
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Show resolved
Hide resolved
for (JarEntry entry : Collections.list(jarFile.entries())) { | ||
webappClassLoader.findResource(entry.getName()); | ||
// System.out.println("Looked up " + resourceEntry); | ||
Thread.sleep(0, 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes slowing down of all lookup tasks. When I tried to run much more threads to cause deadlock, I replaced this with Thread.yield(). But it is not generally better, it depends on what we hunt.
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running the test on my system suggests that there is a deadlock.
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Show resolved
Hide resolved
appserver/web/war-util/src/test/java/org/glassfish/web/loader/WebappClassLoaderTest.java
Outdated
Show resolved
Hide resolved
The test could pass without any threads running in time when running on a slow machine. This commit makes sure that a minimum number of threads run before completing. A few other changes were made to make the test more reliable.
executor = Executors.newFixedThreadPool(60); | ||
|
||
// Require a minimum number of executions before completing | ||
latch = new CountDownLatch(EXECUTION_COUNT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means we just wait fore the first 100 tasks of the 300 tasks to be finished. With the current code (Thread.sleep(0, 100);
in the lookup() method) this means only the 'add' and 'closeJar' tasks are finished. No 'lookup' task is finished by the time the test checks for an error.
public void shutdown() throws InterruptedException { | ||
if (executor != null) { | ||
executor.shutdownNow(); | ||
assertTrue("Executor could not shutdown. This could mean there is a deadlock.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executor.awaitTermination termination always return true, also in the case the tasks isn't finished. In that case, the executor send an interrupted signal and InterruptedException occurs (verified with try catch in lookup method).
} | ||
|
||
// Wait for tasks to execute | ||
assertTrue("The tasks didn't finish in the allowed time.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It waits only for 100 tasks of the 300 submitted. Since 'add' and 'closeJar' are small/fast tasks, this will never be an issue (except in case of deadlock)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this is there exclusively for deadlocks
for (JarEntry entry : Collections.list(jarFile.entries())) { | ||
webappClassLoader.findResource(entry.getName()); | ||
// System.out.println("Looked up " + resourceEntry); | ||
Thread.sleep(0, 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sleep means that only a 'few' lookups are performed (in my test, around 3000 in the 130 ms the test is running).
By replacing it with Thread.yield()
, the number of lookup actions is increased to around 21000 which mean the chance of having a concurrency issue (which we like to detect with this test) is increased significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thread.yield()
causes a huge amount of false positives on the test. I imagine this is because it reduces the total time spent on this task
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build passed, but I still have little a doubt if with this counts of threads and pool capacity and tasks it can detect lock issues.On current intels and JDKs are synchronizations pretty fast, so from my experience running hundreds or thousands is better (but may take too much time on slower environments (hw/config)).
Also what I tested - too many files in classloader (thousand) slows down the classloading to nearly unusable state (999 threads waiting to call addJar and one working inside). I'm not sure if it is alright, it took also 8GB heap, so maybe there is some close missing ...? (but it is not a problem of this test)
It depends ... I believe you :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had an offline conversation about my comments and current state is better to see differences (with and without the fix in WebappClassloader)
Good for me to be merged.
jenkins test please |
CUSTCOM-76 WebappClassLoader Synchronisation Test
Description
Synchronisation issues in the WebappClassLoader have already been fixed,
but not tested properly. This commit adds a test for the synchronisation
issues that caused CUSTCOM-76.
Testing
Testing Performed
Ran the unit test with
-Dsurefire.rerunFailingTestsCount=500
before and after the changes in #4378.