-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve ct_master #8946
base: master
Are you sure you want to change the base?
Improve ct_master #8946
Conversation
The ct_run:run_test function already takes care of the node's logs. The ct_master_logs module takes care of ct_master itself.
Needed to file:set_cwd like in normal CT.
Before this commit the CT docs were lying as ct_master only handled a small number of testspec instructions. This commit fixes that.
It makes more sense to sort by node name, than to have the results in the order they finished.
Breaking change: instead of returning just `ok` to indicate that the spec file was handled, we return an OK tuple with the results of the tests (number of successful, failed, user and auto skipped tests). This allows the caller to know whether any test error occurred.
At the end of a ct_master run. This uses the builtin CT Master event handler to gather the results.
CT Test Results 2 files 58 suites 1h 24m 23s ⏱️ For more details on these failures, see this check. Results for commit fd46027. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
The test failure is because of the breaking change mentioned earlier. |
At RabbitMQ we have started using
ct_master
to greatly speed up our test runs in CI, moving away from an approach that was caching test results and guessing what tests we had to run again, to an approach that runs all tests but with greater concurrency. (We changed approaches due to circumstances beyond our control, not based on technical merit, but that's a story for another time.)Greater concurrency here means running multiple test suites at the same time in a single machine (the same that
ct_master
runs on).With
ct_master
we quickly were able to run all test suites in therabbit
application (https://github.com/rabbitmq/rabbitmq-server/tree/main/deps/rabbit/test), and there are a lot, in under 30 minutes on our development machines.We pushed forward and applied the same principles to CI for our two biggest applications and were able to cut down the run time to around 13 minutes using 4 workers for
rabbit
, and 1 worker forrabbitmq_mqtt
, both usingct_master
. Our other applications are smaller and have not needed this treatment applied to them.I am also interested in making this parallel execution a feature of Erlang.mk at a later time, when the needed functionality is available in OTP directly.
While it is functional,
ct_master
is in a fairly bad state, and this PR aims to improve that. It includes a fix for #8911 as well as additional functionality. Most of the changes are not controversial, although the last two commits may be:ct_master
outside us (and we are using a forked module).I also noted that the ct_master_status module appears to be completely unused, happy to add a commit to delete it.
The equivalent RabbitMQ PR is at rabbitmq/rabbitmq-server#12502 and one of the comments there links to a few test runs that use
ct_master
with all changes included in our PR here.Note that I have not run (or added to!) OTP's CT tests at this point, hoping to get some feedback on the more controversial points, and whether
master
should be the target or ifmaint
would be OK.cc @lhoguin