improve SteadyTimer tests #1129
Conversation
Instead of checking when the callback was actually called, check when it was added to the callback queue. This *should* make the test more reliable.
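For context, here is a minimal sketch (assumed names such as `g_call_times` and `onTimer`, not the actual test helper) of the kind of timing check involved. The callback records when it actually ran, which under load can lag well behind the moment it was pushed onto the callback queue, hence the switch to checking the queue-add time:

```cpp
// Minimal sketch, not the actual test helper: record when a SteadyTimer
// callback actually runs. Under load this can lag the moment the callback
// was added to the callback queue, which is what this PR checks instead.
#include <ros/ros.h>
#include <vector>

static std::vector<ros::SteadyTime> g_call_times;  // hypothetical storage

void onTimer(const ros::SteadyTimerEvent& /*event*/)
{
  g_call_times.push_back(ros::SteadyTime::now());
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "steady_timer_sketch");
  ros::NodeHandle nh;
  // SteadyTimer fires roughly every 100 ms on a monotonic clock.
  ros::SteadyTimer timer = nh.createSteadyTimer(ros::WallDuration(0.1), onTimer);
  ros::spin();
  return 0;
}
```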
And the final 9th build failed again 😞

@flixr Please look into this test and propose a patch which passes reliably. This is rather urgent since Lunar is waiting for …
Is there any point to the late check now? It's relaxed so much it might as well be removed. I vote we just get rid of it. The build farm is just not real-time enough. The whole test is by nature stochastic, which is not ideal. An improved (but extreme) test would be to make sure that …
With the introduction of …
Right, I meant just the late check, not the whole test. I just saw that there is the WallTimerHelper, which is essentially the same. Ideally there would be less code duplication between those two test helpers, so that they would be nearly identical. That raises the question: why is SteadyTimer failing and WallTimer not? The WallTimer test has been there for 7 years. Are the builds using Docker, or virtual machines? (edit: I see it's Docker, there goes that theory)
@dirk-thomas I could never reproduce a test failure on my machine, so this is a bit hard to investigate.
I am not sure what detailed information besides the console output and the JUnit result file you are referring to. Maybe you can clarify what you are looking for.
You can run the devel job locally by following these instructions: https://github.com/ros-infrastructure/ros_buildfarm/blob/master/doc/jobs/devel_jobs.rst#run-the-devel-job-locally
I only found the standard output.
Thanks, will check this out...
I don't have loads to contribute here, except to say that we run our tests in a heavily virtualized environment and experience a ton of this kind of thing, particularly as tests rise in complexity. And yet these kinds of tests still have value. Basically, I think this reality has to be accepted and should be fixed at the tooling layer: specifically, rostest should learn to execute N retries on a failing test to truly prove the failure, rather than giving up the first time.
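As a sketch of that retry idea in plain C++ (no such rostest option exists at the time of this discussion, and `passesWithRetries` is a hypothetical name):

```cpp
// Sketch of "retry N times before declaring failure". A test that passes
// on any attempt counts as a pass; failing every attempt suggests a real
// bug rather than scheduling noise.
#include <functional>

bool passesWithRetries(const std::function<bool()>& runTestOnce, int maxAttempts)
{
  for (int attempt = 0; attempt < maxAttempts; ++attempt)
  {
    if (runTestOnce())
      return true;
  }
  return false;
}
```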
I think this is a problem with rostest, which doesn't capture …
While that might be valuable in some cases, I disagree that it should be applied generally. A test which fails even only sometimes might indicate a real bug and shouldn't be swept under the carpet. IMO the tests should be written in a way that makes them resilient to these kinds of timing differences.
@mikepurvis Hypervisor scheduling + clock adjustment was my thought if the slaves were VMs, but Docker is pretty bare metal... What is strange is that there has never been a problem with WallTimer.

@flixr If you're going to try to debug locally, you could try testing under high load (to simulate parallel builds) with …
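One way to approximate that load locally (an illustrative C++ stand-in for an external load tool, not anything the buildfarm actually runs) is to saturate every core with busy threads while the timer test executes in parallel:

```cpp
// Pin every core with a busy loop for a while to mimic a loaded build
// slave; run the timer test alongside it. The 30 s duration is arbitrary.
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

int main()
{
  std::atomic<bool> stop{false};
  std::vector<std::thread> burners;
  for (unsigned i = 0; i < std::thread::hardware_concurrency(); ++i)
    burners.emplace_back([&stop] { while (!stop.load()) {} });

  std::this_thread::sleep_for(std::chrono::seconds(30));
  stop.store(true);
  for (std::thread& t : burners)
    t.join();
  return 0;
}
```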
@kmactavish Yeah, but the dockerized buildfarm slaves are themselves running on Amazon's infrastructure: http://build.ros.org/computer/ So tests that are perfectly reliable on someone's laptop or a robot computer can absolutely fall to pieces when run by the buildfarm machines. Having at least a per-test option in CMake for "please retry N times" would be great, and would help avoid what's happened here, where a test's timing constraints have been relaxed to the point where the test is no longer all that meaningful.
@dirk-thomas Is there an issue for this? Any idea how to fix that? In general I think it is also impossible to make these tests 100% reliable, as @mikepurvis pointed out.
No.
No.
I have looked at it in the past and it didn't seem like an easy problem to address.
The fact that it did work and pass its tests before and doesn't now is alarming enough for me not to release the current state. Maybe it is not that important and only a "test issue", but I don't want to take a risk with this.
Update: I can get the tests to fail for both WallTimer and SteadyTimer …
In short, these tests have always been flaky. Proposed fix: keep the tight check for the timer being called early, and change the late check for both helpers to 1 s or even 10 s, since it is unreliable in high-load, containerized test environments. (A sketch follows below.)

P.S. @flixr I added a …
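The proposed asymmetry might look roughly like this gtest-style sketch (illustrative bounds and a hypothetical `checkFiring` helper, not the actual test code):

```cpp
// Tight early bound (a timer must never fire before its deadline) plus a
// relaxed late bound that absorbs scheduler jitter on loaded, containerized
// hosts. The 10 s allowance mirrors the suggestion above.
#include <gtest/gtest.h>
#include <ros/ros.h>

void checkFiring(const ros::SteadyTime& expected, const ros::SteadyTime& actual)
{
  EXPECT_GE(actual, expected);                            // early check stays tight
  EXPECT_LT(actual, expected + ros::WallDuration(10.0));  // late check relaxed
}
```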
Replaced by #1132.
Rebased and squashed #1120 in the hope that it further reduces the flakiness of the SteadyTimer test.