Minimum Dask/Distributed versions #848
It would be even better if we could automate it, though that may be more work than it's worth. However, I want to bring up an issue with the proposal: when trying to debug regressions, devs often want to easily switch out versions of Dask/Distributed, possibly going further back than 2 versions. Pinning the conda version would probably be fine, but pinning in setup.py or in development environments would present a bit of a barrier to quickly swapping out Dask/Distributed versions.
The problem with that is users can also mistakenly install an older Dask/Distributed version. I saw this happen quite a few times last year when users installed conda packages incrementally, rather than installing everything at conda environment creation time. Given dask-cuda is pure Python, a solution for devs who want to go back further is to clone dask-cuda and change the pinnings locally.
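Since users can end up with an older Dask than dask-cuda expects, a runtime check at import time is one way to fail loudly instead of mysteriously. Below is a minimal, hypothetical sketch of such a check (the function names and version numbers are illustrative, not dask-cuda's actual code); Dask uses CalVer (`YYYY.MM.PATCH`), so versions can be compared as integer tuples:

```python
# Hypothetical sketch: warn when an installed Dask is older than the
# minimum pin. CalVer strings compare correctly as integer tuples,
# whereas plain string comparison would get e.g. "2021.9" vs "2021.10" wrong.

def parse_calver(version: str) -> tuple:
    """Parse a CalVer string like '2021.11.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def check_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version satisfies the minimum pin."""
    return parse_calver(installed) >= parse_calver(minimum)

if __name__ == "__main__":
    print(check_minimum("2021.11.1", "2021.10.0"))  # True
    print(check_minimum("2021.9.0", "2021.10.0"))   # False
```

In a real package, the installed version could be obtained with `importlib.metadata.version("dask")` and the check emitted as a warning rather than printed.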
My preference would be to keep the same pinning for setup.py and the conda recipe: we are still publishing packages on PyPI, and it could get confusing to have different Dask/Distributed pinnings between the two distributions. Beyond this, there are a decent number of ways for devs to get the latest dask-cuda coexisting with older Dask versions, so I don't think this should come up as an issue.
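One way to keep setup.py and the conda recipe from drifting apart is a single source of truth for the pin. This is an illustrative sketch only (the constant name and version are assumptions, not dask-cuda's actual layout):

```python
# Illustrative sketch: one constant drives the Dask/Distributed pin,
# so setup.py and the conda recipe cannot silently disagree.

DASK_PIN = ">=2021.11.1"  # example version, not a real recommendation

# setup.py would consume this directly:
install_requires = [
    f"dask{DASK_PIN}",
    f"distributed{DASK_PIN}",
]

# The conda recipe's meta.yaml could be rendered from the same constant
# (e.g. via a templating step in CI) to keep both distributions in sync.
print(install_requires)
```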
If we're okay with having automation that operates independently of RAPIDS' release cycle, it shouldn't be difficult to put together a workflow similar to the one updating Dask's gpuCI images to achieve this. Happy to open a PR doing that once we decide on a pattern moving forward 🙂
I'm +1 on that if you find an easy way to parse the latest Dask release. As a side note for you @charlesbluca: ops is maintaining a versioning system that can query the correct UCX-Py version based on the RAPIDS version and automatically update CI scripts, which I think could be useful in dask-cuda/ci/release/update-version.sh (lines 25 to 35 in 48c4929).
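Parsing the latest Dask release is mostly a sorting problem once the release list is in hand. In practice an automation workflow would fetch the list from PyPI's JSON API (`https://pypi.org/pypi/dask/json`); the sketch below hard-codes an example list to stay self-contained, and the release strings are illustrative:

```python
# Sketch of how automation might pick the newest Dask release from an
# unordered list of CalVer version strings (as returned by PyPI metadata).

def latest_release(releases):
    """Return the newest CalVer release string from an unordered list."""
    return max(releases, key=lambda v: tuple(int(p) for p in v.split(".")))

releases = ["2021.11.2", "2021.12.0", "2022.01.0", "2022.01.1", "2021.11.1"]
print(latest_release(releases))  # 2022.01.1
```

Comparing integer tuples rather than raw strings matters here: lexicographic string order would rank "2021.9.0" above "2021.12.0".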
This issue has been labeled
From experience with the past couple of releases, it seems unfeasible to be less restrictive with Dask releases. They simply occur too often for us to ensure that no bugs get introduced that impact RAPIDS use cases. Therefore, I am closing this; we will likely continue deciding on a single release pin at RAPIDS code freeze time. If anybody has further ideas on how to improve this in the future, please feel free to reopen or file a new issue.
I don't think this option is anywhere on the horizon (so I'm not reopening the issue for it), but would minimum version testing suit our needs here? I've been exploring this in dask-contrib/dask-sql#555 and it's been helpful in verifying that our core dependencies are constrained properly. There are some cases that PR misses (e.g., minimum version testing only runs on Python 3.8, and we still aren't testing core dependency versions between our minimum and the latest), but I feel like something like that could allow us to support a wider range of Dask versions with confidence that they will work as expected.
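The minimum version testing idea boils down to deriving a CI environment where every lower bound becomes an exact pin. A hedged sketch of that derivation, assuming a simple `>=` constraint format (the requirement strings are examples, not dask-cuda's actual dependencies):

```python
# Sketch of "minimum version testing": turn '>=X' lower bounds into
# exact '==X' pins, producing a requirements set that exercises the
# oldest versions we claim to support.

import re

def min_pins(requirements):
    """Rewrite '>=X' constraints as '==X'; pass other entries through."""
    pinned = []
    for req in requirements:
        match = re.match(r"([A-Za-z0-9_.-]+)>=([0-9][0-9A-Za-z.]*)", req)
        if match:
            pinned.append(f"{match.group(1)}=={match.group(2)}")
        else:
            pinned.append(req)  # unconstrained requirements stay as-is
    return pinned

print(min_pins(["dask>=2021.11.1", "distributed>=2021.11.1", "numpy"]))
```

CI would install the resulting pins in a dedicated job and run the full test suite against them, which is roughly what the dask-sql PR mentioned above explores.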
I don't think this would help. A minimum version would suffice if there were any guarantees of stability moving forward, as happens with software that follows SemVer, where it's guaranteed with reasonable probability that a patch release will still work with other Major.Minor versions. In Dask there is simply no forwards/backwards compatibility that would let us assume with high probability that things will still work. To exemplify, consider the latest Dask releases that we could potentially have considered for RAPIDS 22.06:
In the case above, if we assumed a minimum of 2022.05.0, we would be fine, except for 2022.05.1. However, if we decided to move to 2022.05.2, we would still have no guarantee that 2022.06.0 won't present issues that are too late for RAPIDS 22.06 to resolve. We have seen this sort of issue in the past, so unless we are able to test or verify in some way that all versions our pinning supports work well, I don't see this as feasible.
Given we don't really test old Dask and Distributed versions, I would propose we limit the oldest supported version (minimum) to 2 or 3 Dask releases back, as ensuring older versions work is unrealistic. For example, we currently pin `>=2021.11.1`, which at the time of writing is the 5th most recent release. With my proposal, we would instead pin `>=2022.01.0` (second most recent) or `>=2021.12.0` (third most recent). To avoid overburdening ourselves with changing pinnings after each new Dask release, we could do this only once per RAPIDS release, around code freeze time.

What are people's thoughts on the proposal?