Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dask] speed up tests #7020

Merged
merged 1 commit into from
Jun 11, 2021
Merged

[dask] speed up tests #7020

merged 1 commit into from
Jun 11, 2021

Conversation

jmoralez
Copy link
Contributor

@jmoralez jmoralez commented Jun 3, 2021

This aims to reduce the runtime of the dask tests. Following #6816 (comment), the first step was to replace the client fixture with one that reuses the same cluster and just creates new clients in every test, and makes the least changes to the existing code.

There are some other tests that are building clusters instead of using the client fixture that could be benefited by this, however adding this only reduced the runtime by about a minute, so I ran pytest with --durations=0 and got this:

=============================================================== slowest durations ================================================================
155.91s call     python/test_with_dask.py::TestWithDask::test_approx
138.92s call     python/test_with_dask.py::TestWithDask::test_hist
22.91s call     python/test_with_dask.py::test_from_dask_array
9.20s call     python/test_with_dask.py::TestWithDask::test_feature_weights
7.02s call     python/test_with_dask.py::test_dask_ranking
6.61s call     python/test_with_dask.py::test_with_asyncio
6.44s call     python/test_with_dask.py::test_parallel_submit_multi_clients

Will investigate the ones that take the most time.

@jmoralez jmoralez marked this pull request as draft June 3, 2021 02:43
@codecov-commenter
Copy link

codecov-commenter commented Jun 3, 2021

Codecov Report

Merging #7020 (f159c58) into master (655e699) will increase coverage by 0.15%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7020      +/-   ##
==========================================
+ Coverage   81.71%   81.86%   +0.15%     
==========================================
  Files          13       13              
  Lines        3916     3916              
==========================================
+ Hits         3200     3206       +6     
+ Misses        716      710       -6     
Impacted Files Coverage Δ
python-package/xgboost/core.py 82.96% <0.00%> (+0.10%) ⬆️
python-package/xgboost/dask.py 82.02% <0.00%> (+0.66%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 655e699...f159c58. Read the comment docs.

@trivialfis
Copy link
Member

The slow test is probably not caused by dask but by hypothesis.

@jmoralez
Copy link
Contributor Author

jmoralez commented Jun 4, 2021

I see. The total runtime for the tests in this PR was 14 minutes and the current master is around 17 minutes. Should I add the client fixture to tests that currently build a cluster internally like these?

def test_from_dask_dataframe() -> None:

def test_from_dask_array() -> None:

That could probably reduce the runtime by a couple more minutes. Or if you have any other suggestions for maybe improving the hypothesis ones I could look into it.

@trivialfis
Copy link
Member

hould I add the client fixture to tests that currently build a cluster internally like these?

Some tests use specific number of workers so that they have to define their own cluster.

Or if you have any other suggestions for maybe improving the hypothesis ones I could look into it.

Sorry I don't have any suggestion, you know these better than me. ;-)

@jmoralez
Copy link
Contributor Author

jmoralez commented Jun 4, 2021

I ran the tests that take the most time and pretty much all the time is spent training so I don't think there's something that could be improved there. There are a couple of clusters that get created with kWorkers and no threads_per_worker, I could create an additional fixture like kCluster and pass that to those tests so that they reuse that one.

This didn't improve as much as I hoped haha so please let me know if its useful at all.

@trivialfis
Copy link
Member

I ran the tests that take the most time and pretty much all the time is spent training so I don't think there's something that could be improved there.

That's fine, the hypothesis tests take longer than we would like but highly effective at catching bugs.

There are a couple of clusters that get created with kWorkers and no threads_per_worker

Thank you for looking into them!

could create an additional fixture like kCluster and pass that to those tests so that they reuse that one.

I will follow up on making those changes since I wrote most of the tests, I should cleanup my own mess.

This didn't improve as much as I hoped haha so please let me know if its useful at all.

Of course it's useful and thank you! After merging the PR we know that we should focus on other places.

@jmoralez jmoralez marked this pull request as ready for review June 8, 2021 02:46
@jmoralez jmoralez changed the title [WIP][dask] speed up tests [dask] speed up tests Jun 9, 2021
@jmoralez
Copy link
Contributor Author

jmoralez commented Jun 9, 2021

@trivialfis I think this is ready for review, looking forward to your thoughts.

@trivialfis trivialfis merged commit 25514e1 into dmlc:master Jun 11, 2021
@jmoralez jmoralez deleted the cluster-fixture branch June 11, 2021 05:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants