Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dask-cudf wheel name #14713

Merged

Conversation

raydouglass
Copy link
Member

@raydouglass raydouglass commented Jan 5, 2024

A recent change in pypa/setuptools#4159 may have caused our dask-cudf wheels to be published as dask_cudf-cu12 instead of dask-cudf-cu12.

Additionally, cudf_kafka wheels would have this issue, but 1) we do not publish wheels for cudf_kafka and 2) the conda packages are published as cudf_kafka (with underscore), so be a larger refactor later on.

@raydouglass raydouglass added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 5, 2024
@github-actions github-actions bot added the Python Affects Python cuDF API. label Jan 5, 2024
@raydouglass raydouglass added bug Something isn't working non-breaking Non-breaking change labels Jan 5, 2024
@bdice
Copy link
Contributor

bdice commented Jan 5, 2024

Thanks @raydouglass. Currently there is no intention to publish wheels for cudf_kafka, because the wheel would have to include libcudf. I experimented a bit with this while working on #14292. I think we're safe with just fixing dask-cudf.

@github-actions github-actions bot added the ci label Jan 5, 2024
@raydouglass raydouglass removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 8, 2024
@raydouglass raydouglass marked this pull request as ready for review January 8, 2024 15:22
@raydouglass raydouglass requested review from a team as code owners January 8, 2024 15:22
@charlesbluca
Copy link
Member

Looking at the failures, seems like we might want to reconsider how we construct the sed commands updating ${package_dir}/${package_name}/_version.py?

sed: can't read python/dask_cudf/dask-cudf/_version.py: No such file or directory

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdice
Copy link
Contributor

bdice commented Jan 8, 2024

Do these lines also need to be changed?

No changes are needed there. Those are rules for telling isort how to sort imports. The import name is still dask_cudf because Python module names can't have dashes (only underscores). The package name is dask-cudf but the import name (module name) is dask_cudf.

@raydouglass raydouglass added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 8, 2024
@raydouglass
Copy link
Member Author

Adding back DO NOT MERGE because I want to examine the built wheel before merging.

@@ -23,7 +23,7 @@ pyproject_file="${package_dir}/pyproject.toml"

sed -i "s/^name = \"${package_name}\"/name = \"${package_name}${PACKAGE_CUDA_SUFFIX}\"/g" ${pyproject_file}
echo "${version}" > VERSION
sed -i "/^__git_commit__/ s/= .*/= \"${commit}\"/g" "${package_dir}/${package_name}/_version.py"
sed -i "/^__git_commit__/ s/= .*/= \"${commit}\"/g" "${package_dir}/${package_name//-/_}/_version.py"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the better fix here would be to rename the directory python/dask_cudf to python/dask-cudf, to align with the package name. However, that is more breaking. It would also affect cudf_kafka. It's probably good to separate that discussion from this PR since its scope is considerably larger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong feelings about this. I just don't think there's a good reason to favor a string replacement in a script (more of a workaround) over the principle of matching the directory structure to the package names. The change would come with some nonzero cost and potential for breakage, though, so I wouldn't do it lightly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the change here changes python/dask_cudf/dask-cudf => python/dask_cudf/dask_cudf basically fixing the module name.

But I think you're right that it should be python/$package_name/$module_name and therefore python/dask-cudf/dask_cudf.

@ajschmidt8
Copy link
Member

Do these lines also need to be changed?

No changes are needed there. Those are rules for telling isort how to sort imports. The import name is still dask_cudf because Python module names can't have dashes (only underscores). The package name is dask-cudf but the import name (module name) is dask_cudf.

Good catch. I didn't see those lines were in the isort section when I did my global search. There are a lot of dask_cudf strings in this repository.

@raydouglass
Copy link
Member Author

The wheel's contents including METADATA/Name looks correct now, so this is ready to merge.

@raydouglass raydouglass removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Jan 8, 2024
@bdice
Copy link
Contributor

bdice commented Jan 8, 2024

@raydouglass Feel free to admin merge if you'd like. Tests are blocked by #14723.

@raydouglass raydouglass merged commit fa8db7a into rapidsai:branch-24.02 Jan 8, 2024
63 of 66 checks passed
@raydouglass raydouglass deleted the test-dask-cudf-wheel-name branch January 8, 2024 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants