[dask] speed up tests #7020
Codecov Report
@@ Coverage Diff @@
## master #7020 +/- ##
==========================================
+ Coverage 81.71% 81.86% +0.15%
==========================================
Files 13 13
Lines 3916 3916
==========================================
+ Hits 3200 3206 +6
+ Misses 716 710 -6
The slowness is probably caused by hypothesis rather than by dask. |
I see. The total runtime for the tests in this PR was 14 minutes and the current master is around 17 minutes. Should I add the xgboost/tests/python/test_with_dask.py Line 67 in 7beb2f7
xgboost/tests/python/test_with_dask.py Line 110 in 7beb2f7
That could probably reduce the runtime by a couple more minutes. Or if you have any other suggestions for maybe improving the hypothesis ones I could look into it. |
Some tests use a specific number of workers, so they have to define their own cluster.
Sorry I don't have any suggestion, you know these better than me. ;-) |
I ran the tests that take the most time, and pretty much all of the time is spent training, so I don't think there's anything that could be improved there. There are a couple of clusters that get created with This didn't improve as much as I hoped haha, so please let me know if it's useful at all. |
That's fine, the hypothesis tests take longer than we would like but are highly effective at catching bugs.
Thank you for looking into them!
I will follow up on making those changes. Since I wrote most of the tests, I should clean up my own mess.
Of course it's useful and thank you! After merging the PR we know that we should focus on other places. |
@trivialfis I think this is ready for review, looking forward to your thoughts. |
This aims to reduce the runtime of the dask tests. Following #6816 (comment), the first step was to replace the `client` fixture with one that reuses the same cluster and just creates a new client in every test, which requires the fewest changes to the existing code.
There are some other tests that build their own clusters instead of using the `client` fixture and that could benefit from this. However, adding this only reduced the runtime by about a minute, so I ran `pytest` with `--durations=0` and got this:
Will investigate the ones that take the most time.
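For readers who want the idea in isolation, here is a minimal sketch of the pattern described above: start the expensive cluster once and hand each test its own short-lived client. It is not the code from this PR. `StubCluster`, `StubClient`, `shared_cluster`, `fresh_client` and `run_tests` are hypothetical names, and the stubs only stand in for a real cluster and client so the example runs with the standard library alone.

```python
from contextlib import contextmanager


class StubCluster:
    """Hypothetical stand-in for a cluster that is expensive to start."""

    started = 0  # counts how many clusters were ever started

    def __init__(self, n_workers):
        StubCluster.started += 1
        self.n_workers = n_workers
        self.closed = False

    def close(self):
        self.closed = True


class StubClient:
    """Hypothetical stand-in for a client that is cheap to create per test."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def shared_cluster(n_workers=2):
    # Started once and torn down once, in the way a fixture with a
    # wider-than-function scope would be.
    cluster = StubCluster(n_workers)
    try:
        yield cluster
    finally:
        cluster.close()


@contextmanager
def fresh_client(cluster):
    # Created and closed for every test, so tests do not share client state.
    client = StubClient(cluster)
    try:
        yield client
    finally:
        client.close()


def run_tests(n_tests):
    """Run n_tests placeholder tests against one shared cluster."""
    clients = []
    with shared_cluster() as cluster:
        for _ in range(n_tests):
            with fresh_client(cluster) as client:
                clients.append(client)  # a real test body would go here
    return clients
```

In a pytest suite the two context managers would presumably become fixtures, with the cluster fixture given a wider scope (for example `scope="module"`) and the client fixture left function-scoped. Tests that need a specific number of workers would still have to build their own cluster.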