-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for cleanup hook. #8
base: master
Are you sure you want to change the base?
Conversation
The hook may be useful if the embedder wishes to clean up some global state (which may leak out of flaky tests) to prevent interferention with other tests that may depend on this state. The hook is run once after tearing down all the processes in parallel pool and after each test in async pool to allow cleanup after each test is run (may be essential to fully isolate tests in serial retry runs).
Thanks for the patch! I'm trying to understand it and have some questions :) It seems like this patch is trying to do two different things. First, there's a "clean up at the end of everything" use case, where cleanup_fn is called at the end of each test run (i.e., once after the parallel round, and once after each serial retry round). Normally I'd expect the caller to just clean up whatever they needed to after the test run (i.e., after run() returned); however, that does not give you a chance to do any additional cleanup in between the initial parallel round and any subsequent serial rounds). Is that the scenario you're trying to address? Second, there's the "clean up in between invocations" use case, but that only takes effect in the serial mode (--jobs=1, i.e., using AsyncPool). If you are worried about cleaning up after each test run, shouldn't that be used in each worker as well? And is there some reason you feel like you need the additional cleanup beyond the work done by the normal teardown/teardownClass/teardownProcess ? It seems like it might be a bad idea to use the same mechanism for both cases, but I'm not sure about this yet, since I'm not sure I fully understand the intent. In the normal case (after tearing down all parallel processes), couldn't the caller just as easily invoke the cleanup function themselves after calling join()? |
Partly, I'll try to explain: some tests may brake and leak some common resource, which is required to be in certain state by the other ones. The real world example I'm facing in Telemetry tests in Chromium is the following:
Since Telemetry tests have serial retries by default, such failures are not a big deal: 99% of simple tests do not impose any exotic requirements and happily execute in the parallel run, while 1% of complicated ones may fail and finally succeed in the serial retry round. Now suppose that something breaks and one of simple tests fails and leaks some browser process in 1% of the runs. Now there's a big chance that this will lead to failures for some innocent complicated tests: after the resource leaks, it will be in such state until some outer level process cleans up (this may be the final cleanup after the test run on the buildbot, for example). But that will obviously happen too late, after all the serial retries, which in this case won't do any help. There appear to be a couple of solutions:
So the proposed solution tries to fit this restrictions by modelling a set of independent test runs each of which ends with a cleanup:
I surely expect there may be other solutions to the problem described above, and will be very glad if you can offer better alternative. |
Thanks, that more-or-less confirms what I thought you were trying to do. I think it makes sense to have a cleanup function that is invoked between running each set of tests, at least (i.e., between the parallel round and the first serial round, and in between each serial round). I'm less convinced that it makes sense to run an additional cleanup in between each serially executed test. The setup/teardown code for each test should be trying to ensure that the environment is in the state it needs to be in. There is a slight difference you're describing, as well, in the idea that some tests can be executed in parallel, and some can't. The TYP APIs allow you to indicate which sort of test is which, so if you know ahead of time that some tests have this restriction, you can tell typ that and it'll do the right thing. |
(a little less than one year passed, and all of a sudden I decided to show up again and reply :))
I have been thinking about it for a while and, eventually, came up with solution which does not require modifications in typ:
In this case we do not need any special cleanup, however, downside is that if we forget to mark some "special" test as serial-only, that test will try to run in parallel with others, killing innocent processes and almost certainly causing other tests to fail with mysterious reasons. Do you find this reasonable and doing a better job than cleaning up after each run? Or maybe there's another durable solution to this problem? |
Sorry for the delay in replying ... I think what you suggest could work. I think it is perhaps a bit more clunky than I would like. Looking at the code now -- which is a long while after I wrote it -- I'm realizing that the semantics around setup_fn and teardown_fn aren't very clearly documented. I think I meant for them to be things that were guaranteed to be called once at the start of every subprocess and once at the end of every subprocess, but there's not anything that really says when subprocesses are created, and so that isn't as useful as you might hope. I think adding a callback function that could be invoked in between the parallel set and the serial set still feels like a good idea. An alternative would be to break apart Runner.run() and expose the different phases of it so that the caller could effectively interpose stuff in between the calls of ._run_one_set(), but that exposes a lot of complexity just to be able to interject the one hook. The latter could also let the caller do mostly arbitrary things between each test, like what you were originally asking for, but that doesn't actually seem like a great idea to me. However, if you can make it work for now without needing modifications to TYP, I encourage you to do that and, once you have some experience with it, then maybe that can help us decide what we'd want to add and where. |
The hook may be useful if the embedder wishes to clean up some global state
(which may leak out of flaky tests) to prevent interferention with other tests
that may depend on this state.
The hook is run once after tearing down all the processes in parallel pool and
after each test in async pool to allow cleanup after each test is run (may be
essential to fully isolate tests in serial retry runs).