Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run pyupgrade 2.31.0. #10141

Merged
merged 13 commits into from
Feb 4, 2022
Merged

Run pyupgrade 2.31.0. #10141

merged 13 commits into from
Feb 4, 2022

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Jan 26, 2022

This PR runs pyupgrade to modernize some syntax in the Python code base. (I am preparing a "better code" talk for our team that will reference this PR.)

The kinds of modernizing changes that pyupgrade makes are similar in purpose to those made by clang-tidy (#9860), but for Python. These include things like:

  • Using literals for empty dicts {} and lists [], because they are a bit faster and more readable than dict() or list()
  • Removing extra parentheses or braces wrapping generator expressions if not needed
  • Preferring format strings
  • Using canonical error names, e.g. IOError became an alias for OSError in Python 3.3.
  • Replacing the "new-style" class definitions like class A(object): with the simpler syntax class A:
  • Removing unnecessary __future__ imports that were needed for Python 2 compatibility, like print_function and division
  • Using raw strings for regular expressions, to clarify intent and avoid extra escaping

The pyupgrade README has a long showcase of other rules.

Notes for reviewers:

  • pyupgrade has a lot of opinions on how to use type annotations, and I'm not sure if I like all of them (some may have compatibility issues?). I have skipped over the type annotations for now, to set a more reasonable scope for this PR. See commit 54b16b9 (since reverted) for examples. Please let me know what you think about these changes and I may investigate them further in a follow-up PR!
  • I have not added this tool to our CI style checks via a pre-commit hook yet. That can be done in a follow-up PR if we agree that we would like these types of changes to be required in the future.
  • I used the argument --py37-plus to only perform changes that are compatible with Python 3.7+.
  • I can revert the changes to the vendored code in versioneer.py if desired. (It's pretty outdated anyway, it looks like we're using versioneer 0.18, released on January 1, 2017.)

@github-actions github-actions bot added Python Affects Python cuDF API. gpuCI libcudf Affects libcudf (C++/CUDA) code. labels Jan 26, 2022
@bdice bdice self-assigned this Jan 26, 2022
@bdice bdice added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 26, 2022
@codecov
Copy link

codecov bot commented Jan 27, 2022

Codecov Report

Merging #10141 (8d33659) into branch-22.04 (a7d88cd) will increase coverage by 0.04%.
The diff coverage is 1.27%.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-22.04   #10141      +/-   ##
================================================
+ Coverage         10.42%   10.47%   +0.04%     
================================================
  Files               119      122       +3     
  Lines             20603    20500     -103     
================================================
- Hits               2148     2147       -1     
+ Misses            18455    18353     -102     
Impacted Files Coverage Δ
python/cudf/cudf/_fuzz_testing/fuzzer.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/main.py 0.00% <0.00%> (ø)
python/cudf/cudf/_version.py 0.00% <0.00%> (ø)
python/cudf/cudf/comm/gpuarrow.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <0.00%> (ø)
... and 36 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0581975...8d33659. Read the comment docs.

@bdice bdice marked this pull request as ready for review February 2, 2022 03:26
@bdice bdice requested review from a team as code owners February 2, 2022 03:26
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me. I made some comments, but in hindsight I think a significant chunk of them were in the versioneer files which I think we should just revert as you point out in your comment. No need to change vendored files. You should change the pre-commit hook to exclude those files when you do add it. I'm approving now because I suspect that once you remove the versioneer changes I only have 2-3 other comments and they're pretty trivial.

Regarding annotations, I like some of the changes but not all. Removing quoted types where possible is nice, but I don't really like using the pipe operator for or everywhere. I'm not sure what best practices are as far as using built-in types for annotations vs using their aliases from the typing module, I'd say let's see what some other projects do? But I definitely don't think we need to make these changes here

python/cudf/cudf/tests/test_api_types.py Show resolved Hide resolved
python/cudf/cudf/tests/test_groupby.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/versioneer.py Outdated Show resolved Hide resolved
python/cudf/versioneer.py Outdated Show resolved Hide resolved
python/custreamz/versioneer.py Outdated Show resolved Hide resolved
python/dask_cudf/dask_cudf/io/orc.py Outdated Show resolved Hide resolved
python/dask_cudf/dask_cudf/io/tests/test_parquet.py Outdated Show resolved Hide resolved
python/dask_cudf/versioneer.py Outdated Show resolved Hide resolved
python/dask_cudf/versioneer.py Outdated Show resolved Hide resolved
@bdice bdice mentioned this pull request Feb 3, 2022
@bdice
Copy link
Contributor Author

bdice commented Feb 3, 2022

I think a significant chunk of them were in the versioneer files which I think we should just revert as you point out in your comment. [...] You should change the pre-commit hook to exclude those files when you do add it.

Reverted changes to versioneer. Yup, I'll exclude those files if we get to the point of adding a pre-commit hook for pyupgrade.

Regarding annotations, I like some of the changes but not all. Removing quoted types where possible is nice, but I don't really like using the pipe operator for or everywhere.

I agree, with your general sentiment -- pyupgrade's typing changes are the part I'm least sure about. (As an aside, I do agree with pyupgrade that replacing Union with T1 | T2 and Optional with T | None is just fine, or perhaps slightly better, because it lets you remove from typing import ... statements.) The use of annotations-as-strings vs. annotations-as-objects was discussed extensively in light of some recent PEPs and changes to Python 3.10 (see this issue from pydantic, for example). I'm generally uninformed about typing at this level of detail, but I think that typing behavior varies significantly among recent Python versions so I am taking an approach of "if it ain't broke... don't fix it." Is having mypy pass our only requirement for typing support?

I'm not sure what best practices are as far as using built-in types for annotations vs using their aliases from the typing module, [...]

The best practice is probably to use the built-in dict / list. See StackOverflow answer quoted below:

If you are using Python 3.7 or newer, you can use dict as a generic type if you start your module with from __future__ import annotations, and as of Python 3.9, dict (as well as other standard containers) supports being used as generic type even without that directive.

rapids-bot bot pushed a commit that referenced this pull request Feb 3, 2022
This PR refactors some DataFrame tests, including [suggestions](#10141 (comment)) from @vyasr in #10141 that are not related to that PR's focus.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #10204
rapids-bot bot pushed a commit that referenced this pull request Feb 3, 2022
…data` (#10206)

As pointed out in [this issue comment](#10141 (comment)), we should be good to update some of dask-cudf's parquet testing to reflect bug fixes merged in dask/dask#6047.

cc @rjzamora in case you see some other tests that could be resolved here

Authors:
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

URL: #10206
@bdice
Copy link
Contributor Author

bdice commented Feb 4, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 42d86c4 into rapidsai:branch-22.04 Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants