Flaky integration tests fix #343
@@ -34,4 +34,4 @@ jobs:
   run: ./gradlew build test
 - name: Build in Linux
   if: runner.os == 'Linux'
-  run: ./gradlew build check test integrationTest
+  run: ./gradlew build check test integrationTest -i
Is this to improve the readout on the build?
Exactly. I noticed when the test failed that there were almost no details about the failure.
ok cool, just wanted to make sure it was on purpose
LGTM
LGTM. Sorry, it looks like I commented instead of approving before.
static List<Integer> getKafkaListenerPorts() throws IOException {
    try (ServerSocket socket = new ServerSocket(0); ServerSocket socket2 = new ServerSocket(0)) {
        return Arrays.asList(socket.getLocalPort(), socket2.getLocalPort());
    } catch (IOException e) {
        throw new IOException("Failed to allocate port for test", e);
    }
Won't this code lead to hard-to-detect collisions and flaky tests?
Reasoning:
1. The socket is opened and a port is assigned.
2. We save the port number.
3. We close the socket as we exit the try block. (TIMEOUT LINGER now applies to the port number.)
4. The calling code then uses the port numbers.
5. If two tests are running and the timing is just right, the second test can retrieve the same port numbers that the first test retrieved.
Solution:
Return ServerSocket not port numbers. Then you can get as many sockets as needed and hold them until the code is finished.
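A minimal sketch of what that suggestion could look like, not code from this PR. The class name `HeldPorts` and its methods are hypothetical, chosen only for illustration. It opens the requested number of sockets on port 0 and keeps them open until the caller closes the holder:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of this PR): keeps the sockets open so the ports stay reserved. */
final class HeldPorts implements AutoCloseable {
    private final List<ServerSocket> sockets = new ArrayList<>();

    HeldPorts(int count) throws IOException {
        try {
            for (int i = 0; i < count; i++) {
                // Port 0 asks the OS to pick any free port.
                sockets.add(new ServerSocket(0));
            }
        } catch (IOException e) {
            close(); // release whatever was already opened
            throw new IOException("Failed to allocate port for test", e);
        }
    }

    /** Port numbers of the sockets currently held. */
    List<Integer> ports() {
        List<Integer> ports = new ArrayList<>();
        for (ServerSocket s : sockets) {
            ports.add(s.getLocalPort());
        }
        return ports;
    }

    @Override
    public void close() {
        for (ServerSocket s : sockets) {
            try {
                s.close();
            } catch (IOException ignored) {
                // Best effort: nothing useful to do if closing fails.
            }
        }
    }
}
```

Used as `try (HeldPorts held = new HeldPorts(2)) { ... }`, the ports stay distinct and reserved for as long as the block runs. Note that a port is only reserved while its socket is open, so whatever finally binds to that port still needs the socket closed first.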
Reserving a free port for future use is pretty tricky to do: even if you return the ServerSocket, you still have to close it to free the port before anything else can bind to it, and in that window any process on the host machine (including parallel tests) could sneak right in and occupy it.
The right thing to do might be to detect whether that specific failure occurred (the supposedly free port is no longer free) and retry, but I'd take the point of view that this change makes the test less flaky than it was, for now. The two free ports can no longer be assigned the same value during the same test.
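For illustration only, a rough sketch of the detect-and-retry idea. Nothing here is from the PR: `PortRetry`, `Starter`, and `startWithRetry` are hypothetical names, and the sketch assumes the code that starts the service surfaces a taken port as a `java.net.BindException`, which may not hold if the exception is wrapped.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.Arrays;
import java.util.List;

final class PortRetry {
    /** Hypothetical callback standing in for whatever starts the service on the given ports. */
    interface Starter<T> {
        T start(List<Integer> ports) throws IOException;
    }

    // Same approach as the snippet under review: both sockets are open
    // at the same time, so the two returned ports cannot be equal.
    static List<Integer> getFreePorts() throws IOException {
        try (ServerSocket a = new ServerSocket(0); ServerSocket b = new ServerSocket(0)) {
            return Arrays.asList(a.getLocalPort(), b.getLocalPort());
        }
    }

    /** Picks fresh ports and retries when the start attempt reports a bind failure. */
    static <T> T startWithRetry(Starter<T> starter, int maxAttempts) throws IOException {
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<Integer> ports = getFreePorts();
            try {
                return starter.start(ports);
            } catch (BindException e) {
                // Assumption: a port grabbed by someone else shows up as BindException.
                last = e;
            }
        }
        throw new IOException("No usable ports after " + maxAttempts + " attempts", last);
    }
}
```

Only bind failures are retried; any other `IOException` from the callback propagates immediately, so unrelated startup errors are not hidden.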
@Claudenw AFAIU you are talking about parallel test execution, which we are not doing as of now.
LGTM -- @Claudenw has a point, but this change leaves the code better than it was.
No description provided.