Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky tests: IT dev-test #1633

Open
bmarwell opened this issue Jan 22, 2023 · 9 comments · Fixed by #1669
Open

Flaky tests: IT dev-test #1633

bmarwell opened this issue Jan 22, 2023 · 9 comments · Fixed by #1669

Comments

@bmarwell
Copy link
Contributor

Hi!

Not sure it is related to maybe #1626 (PR not merged yet), but this test will sometimes fail:

DevTest.configIncludesChangeTest():

[INFO] [INFO] --- maven-jar-plugin:3.2.2:jar (default-jar) @ dev-it ---
Warning: ARNING] JAR will be empty - no content was marked for inclusion!
[INFO] [INFO] Building jar: /home/runner/work/ci.maven/ci.maven/liberty-maven-plugin/target/it/dev-it/target/dev-it-1.0-SNAPSHOT.jar
[INFO] [INFO] 
[INFO] [INFO] --- maven-jar-plugin:3.2.2:test-jar (default) @ dev-it ---
[INFO] [INFO] Building jar: /home/runner/work/ci.maven/ci.maven/liberty-maven-plugin/target/it/dev-it/target/dev-it-1.0-SNAPSHOT-tests.jar
[INFO] [INFO] 
[INFO] [INFO] --- maven-failsafe-plugin:3.0.0-M7:integration-test (integration-test) @ dev-it ---
[INFO] [INFO] Using auto detected provider org.apache.maven.surefire.junit4.JUnit4Provider
[INFO] [INFO] 
[INFO] [INFO] -------------------------------------------------------
[INFO] [INFO]  T E S T S
[INFO] [INFO] -------------------------------------------------------
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.MultiModuleRunInstallEarTest
[INFO] [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.165 s - in net.wasdev.wlp.test.dev.it.MultiModuleRunInstallEarTest
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.DevCopyTestDependenciesTest
[INFO] [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.323 s - in net.wasdev.wlp.test.dev.it.DevCopyTestDependenciesTest
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.MultipleConcurrentLibertyModulesPlTest
[INFO] [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.805 s - in net.wasdev.wlp.test.dev.it.MultipleConcurrentLibertyModulesPlTest
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.MultiModuleRunEjbTest
[INFO] [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.328 s - in net.wasdev.wlp.test.dev.it.MultiModuleRunEjbTest
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.MultiModuleHotTestingTest
[INFO] [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.915 s - in net.wasdev.wlp.test.dev.it.MultiModuleHotTestingTest
[INFO] [INFO] Running net.wasdev.wlp.test.dev.it.DevTest
Error: [ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 127.036 s <<< FAILURE! - in net.wasdev.wlp.test.dev.it.DevTest
Error: [ERROR] net.wasdev.wlp.test.dev.it.DevTest.configIncludesChangeTest  Time elapsed: 23.799 s  <<< FAILURE!

…

[INFO] java.lang.AssertionError
[INFO] 	at org.junit.Assert.fail(Assert.java:87)
[INFO] 	at org.junit.Assert.assertTrue(Assert.java:42)
[INFO] 	at org.junit.Assert.assertTrue(Assert.java:53)
[INFO] 	at net.wasdev.wlp.test.dev.it.DevTest.configIncludesChangeTest(DevTest.java:107)
[INFO] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[INFO] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[INFO] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[INFO] 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[INFO] 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)

Line 107 is:

replaceString("<feature>servlet-4.0</feature>", "<!-- replace -->", srcServerXMLIncludes);

Assuming we are looking at the wrong line now.
Looking at the test, it depends on timeouts:

boolean foundUpdate = verifyLogMessageExists("<feature>servlet-4.0</feature>", 60000, targetServerXMLIncludes);

… which is 60 seconds. Maybe you might want to use Awaitility here? It would give you a much nicer exception message "for free"..

It would read to:

      await()
        .atMost(Duration.ofSeconds(60L))
        .until(() -> verifyLogMessageExists("<feature>servlet-4.0</feature>", 10, targetServerXMLIncludes));

(10 ms chosen randomly).

This would help to identify the actual problem.

@bmarwell
Copy link
Contributor Author

@scottkurz @cherylking Some other tests seem to fail, too. For example, in the current main build, there's this:

2023-02-01T20:33:52.2556285Z [INFO] [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 59.349 s <<< FAILURE! - in net.wasdev.wlp.test.dev.it.MultiModuleHotTestingTest
2023-02-01T20:33:52.2569955Z [INFO] [ERROR] net.wasdev.wlp.test.dev.it.MultiModuleHotTestingTest.runTest  Time elapsed: 37.299 s  <<< FAILURE!
2023-02-01T20:33:52.2570665Z [INFO] java.lang.AssertionError: 
2023-02-01T20:33:52.2571546Z [INFO] Last 100 lines of log at /home/runner/work/ci.maven/ci.maven/liberty-maven-plugin/target/it/dev-it/target/../resources/multi-module-projects/typeA/logFile.txt:
2023-02-01T20:33:52.2572098Z [INFO] ===================== START =======================
2023-02-01T20:33:52.2572695Z [INFO] 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
2023-02-01T20:33:52.2573400Z [INFO] 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
2023-02-01T20:33:52.2573957Z [INFO] 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)

[...]

2023-02-01T20:33:52.2599835Z [INFO] [INFO] 
2023-02-01T20:33:52.2600114Z [INFO] [INFO] Results:
2023-02-01T20:33:52.2600394Z [INFO] [INFO] 
2023-02-01T20:33:52.2600675Z [INFO] [ERROR] Failures: 
2023-02-01T20:33:52.2601061Z [INFO] [ERROR]   ConverterUnitTest.testHeightFeet:12 expected: <2> but was: <4>
2023-02-01T20:33:52.2601442Z [INFO] [INFO] 
2023-02-01T20:33:52.2601780Z [INFO] [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
2023-02-01T20:33:52.2602102Z [INFO] [INFO] 
2023-02-01T20:33:52.2602434Z [INFO] [ERROR] Unit tests failed: There are test failures.

Sadly, it seems to be hard to execute the profile online-its locally, see:

https://github.com/OpenLiberty/ci.maven/blob/main/.github/workflows/maven.yml#L59-L69

So, what's the next best way to proceed here?

@bmarwell
Copy link
Contributor Author

bmarwell commented May 7, 2023

So, looking again at the tests, I see:

2023-05-04T01:28:03.4175956Z [INFO] [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 25.536 s <<< FAILURE! - in net.wasdev.wlp.test.dev.it.MultiModuleRunInstallEarTest
2023-05-04T01:28:03.4186009Z [INFO] [ERROR] net.wasdev.wlp.test.dev.it.MultiModuleRunInstallEarTest.notOverwriteExistingM2Test  Time elapsed: 0.015 s  <<< FAILURE!

Now there's a catch here. I found forkMode=once in your test setup which is not supported anymore.
Instead use

                    <forkCount>1</forkCount>
                    <reuseForks>true</reuseForks>

Same for generate-features-it.
I wonder it only fails on OpenLiberty. Maybe the runtime does not have the requested features anymore?
Anyway, both fail on newFeatureFile not exists.

bmarwell added a commit to bmarwell/ci.maven that referenced this issue May 7, 2023
cherylking added a commit that referenced this issue May 7, 2023
[#1633] set forkCount/resuseForks instead of forkMode; surefire/failsafe 3.0.0
@cherylking
Copy link
Member

It still managed to fail in the latest build of the main branch after PR #1669.

https://github.com/OpenLiberty/ci.maven/actions/runs/4908575363/jobs/8764429807#step:8:9314

Error: [ERROR]   PollingDevTest>DevTest.configIncludesChangeTest:106->BaseDevTest.verifyLogMessageExists:445->BaseDevTest.readFile:369 » FileNotFound /tmp/temp11414764513369619137/target/liberty/wlp/usr/servers/defaultServer/extraFeatures.xml (No such file or directory)

@cherylking cherylking reopened this May 7, 2023
@bmarwell
Copy link
Contributor Author

bmarwell commented May 8, 2023

Yes, it's hard to verify for me. I cannot see the test, and the local arc runner takes ages.

@bmarwell bmarwell changed the title "Toggling" IT: dev-test Flaky tests: IT dev-test Jun 20, 2023
@scottkurz
Copy link
Member

I just looked at: https://github.com/OpenLiberty/ci.maven/actions/runs/5252160439/jobs/9487879161?pr=1692
from mid-June and I see a similar failure as back in #1633 (comment)

2023-06-13T07:39:19.7365999Z [INFO] [ERROR] Failures: 
2023-06-13T07:39:19.7366308Z [INFO] [ERROR]   ConverterUnitTest.testHeightFeet:12 expected: <2> but was: <4>

That strikes me as probably different than: #1633 (comment)

I wonder how many different problems we might be dealing with.

@bmarwell
Copy link
Contributor Author

@cherylking so, the ITs are failing for some reason. My approach would be:

  1. move the tests from src/it into their own module, one by one (more on that later)
  2. Use buildhelper-maven-plugin to get a random port for each so they can run in parallel

That is:

  • create a pom-module called integration-tests.
  • In integration-tests/pom.xml set the groupId to io.openliberty.tools.its or similar and the packaging to pom.
  • Move tests from src/it to integration-tests/${testname} one by one; replace the @{} parameters with their real ones.
  • Remove versions of dependencies, instead manage them in the root ./pom.xml.
  • Optional: set to not deploy the artifacts of integration-tests to central. I think those are not needed. We do the same in Shiro, we leave out samples and integration-tests.

That should help. Do you want me to create an example PR with just one project moved over to its own module?

@scottkurz
Copy link
Member

@bmarwell Thank you for bringing this up. I'm not trying to suggest the current approach is the best conceivable one, but what's the main goal or improvement you're targeting here? Is it the ability to run the tests in parallel, for a shorter overall execution time? Are you saying the logs are easy to read and debug from?

That should help. Do you want me to create an example PR with just one project moved over to its own module?

I think that would help me understand the idea better.

@bmarwell
Copy link
Contributor Author

@bmarwell Thank you for bringing this up. I'm not trying to suggest the current approach is the best conceivable one, but what's the main goal or improvement you're targeting here? Is it the ability to run the tests in parallel, for a shorter overall execution time? Are you saying the logs are easy to read and debug from?

Actually, all of that. Easier to maintain is probably the most interesting part for you both, which comes with readability. Parallel execution would only be one of the additional benefits.
The logs can be streamed easier to stdout, which helps, too.

@bmarwell
Copy link
Contributor Author

Here's a starter:
#1708

IF that worked (and it doesn't work with <extensions>true</>, then it would be a major rewrite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants