Older stubs still explicitly using Any rather than Incomplete #9550

Open
Avasam opened this issue Jan 16, 2023 · 11 comments
Labels
stubs: improvement Improve/refactor existing annotations, other stubs issues

Comments

@Avasam
Collaborator

Avasam commented Jan 16, 2023

Thanks to typeshed-stats (#9386), I was able to make the following list of stubs that mostly use Any, indicating that they've likely not been switched over to Incomplete yet.
Ordered by the ratio "explicit Any / (explicit Incomplete + 1)", counted over parameters and variables and rounded down.
Excludes stubs marked as obsolete and stubs whose rounded-down ratio is <= 1.

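| Package Name | Explicit Any | Explicit Incomplete | Ratio |
|---|---|---|---|
| WebOb | 119 | 0 | 119 |
| WTForms | 114 | 0 | 114 |
| simplejson | 59 | 0 | 59 |
| python-gflags | 50 | 0 | 50 |
| regex | 41 | 0 | 41 |
| pyflakes | 40 | 0 | 40 |
| Markdown | 33 | 0 | 33 |
| jmespath | 60 | 1 | 30 |
| toml | 30 | 0 | 30 |
| decorator | 28 | 0 | 28 |
| python-xlib | 28 | 0 | 28 |
| uWSGI | 21 | 0 | 21 |
| gevent | 114 | 5 | 19 |
| polib | 18 | 0 | 18 |
| slumber | 16 | 0 | 16 |
| six | 15 | 0 | 15 |
| PyYAML | 216 | 14 | 14 |
| pynput | 14 | 0 | 14 |
| stdlib | 3295 | 237 | 13 |
| opentracing | 25 | 1 | 12 |
| hdbcli | 24 | 1 | 12 |
| python-jose | 56 | 4 | 11 |
| fanstatic | 11 | 0 | 11 |
| Flask-Cors | 10 | 0 | 10 |
| mypy-extensions | 10 | 0 | 10 |
| protobuf | 144 | 15 | 9 |
| parsimonious | 27 | 2 | 9 |
| capturer | 9 | 0 | 9 |
| croniter | 9 | 0 | 9 |
| greenlet | 8 | 0 | 8 |
| singledispatch | 8 | 0 | 8 |
| translationstring | 8 | 0 | 8 |
| xmltodict | 8 | 0 | 8 |
| html5lib | 179 | 24 | 7 |
| paramiko | 45 | 6 | 6 |
| console-menu | 11 | 1 | 5 |
| waitress | 5 | 0 | 5 |
| python-crontab | 18 | 3 | 4 |
| cachetools | 4 | 0 | 4 |
| colorama | 4 | 0 | 4 |
| qrbill | 4 | 0 | 4 |
| Pygments | 178 | 47 | 3 |
| vobject | 93 | 24 | 3 |
| PyMySQL | 88 | 28 | 3 |
| dateparser | 86 | 21 | 3 |
| Send2Trash | 3 | 0 | 3 |
| untangle | 3 | 0 | 3 |
| usersettings | 3 | 0 | 3 |
| seaborn | 186 | 65 | 2 |
| mock | 137 | 45 | 2 |
| psycopg2 | 97 | 37 | 2 |
| beautifulsoup4 | 72 | 33 | 2 |
| aws-xray-sdk | 61 | 21 | 2 |
| humanfriendly | 54 | 22 | 2 |
| requests | 53 | 19 | 2 |
| JACK-Client | 2 | 0 | 2 |
| chevron | 2 | 0 | 2 |
| first | 2 | 0 | 2 |
| ibm-db | 2 | 0 | 2 |
| libsass | 2 | 0 | 2 |
| retry | 2 | 0 | 2 |
| tabulate | 2 | 0 | 2 |
| toposort | 2 | 0 | 2 |
| gdb | 2 | 0 | 2 |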

(updated as of 2023-11-02)
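
For anyone wanting to reproduce the ordering, here is a minimal sketch of the ratio, assuming it is read as explicit Any divided by (explicit Incomplete + 1) and rounded down, which matches the rows in the table (e.g. jmespath: 60 / (1 + 1) = 30). The function name `any_ratio` and the `rows` list are made up for illustration; this is not how typeshed-stats itself computes anything.

```python
def any_ratio(explicit_any: int, explicit_incomplete: int) -> int:
    """Ratio used for ordering: Any / (Incomplete + 1), rounded down."""
    return explicit_any // (explicit_incomplete + 1)


# A few rows copied from the table: (package, explicit Any, explicit Incomplete)
rows = [
    ("jmespath", 60, 1),
    ("gevent", 114, 5),
    ("PyYAML", 216, 14),
    ("WebOb", 119, 0),
]

# Sort by ratio, highest first
ranked = sorted(
    ((name, any_ratio(n_any, n_incomplete)) for name, n_any, n_incomplete in rows),
    key=lambda item: item[1],
    reverse=True,
)

# Keep only packages whose rounded-down ratio is above 1
ranked = [(name, ratio) for name, ratio in ranked if ratio > 1]
```

The "+ 1" in the denominator avoids a division by zero for stubs that have no Incomplete at all.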


We can also list 3rd party stubs by how many module-level and class-level variables are annotated as Any:
for d in stubs/*/; do echo "$d,$( grep -ERoi '(^| )[[:alnum:]]+?: Any$' "$d" | wc -l )"; done

| Package Name | Var Any count |
|---|---|
| html5lib | 144 |
| ldap3 | 133 |
| Pygments | 131 |
| oauthlib | 85 |
| psutil | 83 |
| vobject | 78 |
| PyYAML | 77 |
| commonmark | 71 |
| google-cloud-ndb | 55 |
| passlib | 35 |
| PyMySQL | 35 |
| psycopg2 | 33 |
| dateparser | 32 |
| httplib2 | 32 |
| protobuf | 27 |
| beautifulsoup4 | 27 |
| aws-xray-sdk | 25 |
| humanfriendly | 20 |
| mock | 16 |
| fpdf2 | 12 |
| paramiko | 9 |
| python-jose | 9 |
| requests | 6 |
| Markdown | 5 |
| WebOb | 2 |
| boltons | 2 |
| pep8-naming | 2 |
| pyflakes | 2 |
| python-gflags | 2 |

(updated as of 2024-01-29)
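
A rough Python counterpart to the grep one-liner, in case it's easier to tweak. This is only a sketch: the pattern is a stricter simplification (case-sensitive, whole identifier at the start of a line) rather than an exact port, so counts may differ slightly from the table. `VAR_ANY`, `count_any_variables` and the sample stub are made up for illustration.

```python
import re

# A name followed by ": Any" at the very end of a line, optionally indented.
VAR_ANY = re.compile(r"^[ \t]*\w+: Any$", re.MULTILINE)


def count_any_variables(stub_source: str) -> int:
    """Count module-level and class-level variables annotated as bare Any."""
    return len(VAR_ANY.findall(stub_source))


sample = """\
from typing import Any

version: Any

class Parser:
    tree: Any
    errors: Any | None
    def parse(self, data: Any) -> Any: ...
"""

# Only `version` and `tree` are counted; the union and the function are not.
count = count_any_variables(sample)
```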

@Avasam
Collaborator Author

Avasam commented Jan 17, 2023

Do you think it's fine doing a single PR for those (single search & replace change across many third party stubs) instead of 40-70 PRs?

Still gotta validate that those with few Any actually meant it. Bigger ones can probably be mostly blindly changed?

@AlexWaygood
Member

AlexWaygood commented Jan 17, 2023

> Do you think it's fine doing a single PR for those (single search & replace change across many third party stubs) instead of 40-70 PRs?

Maybe we could start off doing a grep for (or using a script to auto-update) function signatures with `foo: Any | None = ...`. We can be pretty confident that those are artefacts of stubgen, so we can probably update all of those to `foo: Incomplete | None = ...` in a bulk PR pretty safely.
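
A minimal sketch of what such an auto-update script could look like, assuming a plain regex substitution is good enough for this pattern. `PATTERN` and `replace_optional_any_defaults` are hypothetical names, and the sketch does not add the `from _typeshed import Incomplete` import that the changed stub would then need.

```python
import re

# Matches the stubgen-style parameter annotation ": Any | None = ..."
PATTERN = re.compile(r": Any \| None = \.\.\.")


def replace_optional_any_defaults(source: str) -> str:
    """Rewrite `foo: Any | None = ...` to `foo: Incomplete | None = ...`."""
    return PATTERN.sub(": Incomplete | None = ...", source)


before = (
    "def connect(host: str, timeout: Any | None = ..., retries: int = ...)"
    " -> Any | None: ...\n"
)
after = replace_optional_any_defaults(before)
```

Because the pattern requires the `= ...` default, a return type such as `-> Any | None` is left alone.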

@srittau
Collaborator

srittau commented Jan 17, 2023

I'd say fairly safe, with only a few false positives, are:

  • Function arguments that contain a union with Any. (But not return types.)
  • Fields that contain Any, either standalone or as union.

It's also not a complete disaster if a wrong Any gets changed to Incomplete, it just means it needs to be rechecked at some point.
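
The two cases listed above could be located with the `ast` module instead of grep. This is a sketch under assumptions, not an agreed-upon tool: `union_members`, `is_any` and `find_candidates` are made-up names, only `X | Y` style unions are recognised, and `*args` / `**kwargs` are ignored.

```python
from __future__ import annotations

import ast


def union_members(node: ast.expr) -> list[ast.expr]:
    """Flatten `A | B | C` into [A, B, C]; a non-union yields [node]."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
        return union_members(node.left) + union_members(node.right)
    return [node]


def is_any(node: ast.expr) -> bool:
    return isinstance(node, ast.Name) and node.id == "Any"


def find_candidates(source: str) -> list[tuple[int, str]]:
    """Return (line, description) pairs for the two cases listed above."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            for param in args.posonlyargs + args.args + args.kwonlyargs:
                if param.annotation is None:
                    continue
                members = union_members(param.annotation)
                # Parameters: only unions that include Any.
                # node.returns is deliberately never inspected.
                if len(members) > 1 and any(is_any(m) for m in members):
                    found.append((param.lineno, f"parameter {param.arg}"))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Fields: Any either standalone or as part of a union.
            if any(is_any(m) for m in union_members(node.annotation)):
                found.append((node.lineno, f"field {node.target.id}"))
    return sorted(found)


sample = """\
from typing import Any

class Client:
    timeout: Any
    proxy: Any | None
    name: str
    def get(self, url: str, params: Any | None = ...) -> Any | None: ...
"""

candidates = find_candidates(sample)
```

In the sample, `timeout`, `proxy` and the `params` parameter are reported, while the `-> Any | None` return type is not.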

@Avasam
Collaborator Author

Avasam commented Jan 17, 2023

Similar to `: Any | None = ...`, `: Any | None\n` seems like an equivalent stubgen artefact for class variables.

@AlexWaygood
Member

> Fields that contain Any, either standalone or as union.

`: Any | None\n` seems very safe to do a search-and-replace for. For `: Any\n`, I feel a little queasy. But I'd feel more confident if we only replaced places which had, say, three `Any` attributes in a row, e.g.

class Foo:
    a: Any
    b: Any
    c: Any

We could find those using regexes or AST.
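
As a sketch of the AST route, assuming "in a row" means consecutive statements in a class body that are annotated with bare `Any`. The function name `classes_with_any_runs` and the `min_run` parameter are hypothetical, and the threshold of three is just the number floated above.

```python
import ast


def classes_with_any_runs(source: str, min_run: int = 3) -> list[str]:
    """Names of classes with at least `min_run` consecutive `x: Any` attributes."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        run = longest = 0
        for stmt in node.body:
            is_bare_any = (
                isinstance(stmt, ast.AnnAssign)
                and isinstance(stmt.annotation, ast.Name)
                and stmt.annotation.id == "Any"
            )
            # Any other statement (or a non-Any annotation) breaks the run.
            run = run + 1 if is_bare_any else 0
            longest = max(longest, run)
        if longest >= min_run:
            names.append(node.name)
    return names


sample = """\
from typing import Any

class Foo:
    a: Any
    b: Any
    c: Any

class Bar:
    x: Any
    y: int
    z: Any
"""

flagged = classes_with_any_runs(sample)
```

Here only `Foo` is flagged; `Bar` has two `Any` attributes, but they are separated by `y: int`.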

@Avasam
Collaborator Author

Avasam commented Jan 17, 2023

> Function arguments that contain a union with Any. (But not return types.)

Seems fine to me after running those we deem "safer".

> Fields that contain Any, either standalone [...]

+1 to what Alex said.


After `: Any | None = ...` and `: Any | None\n`, there are 15 `Any | None` left. Except for mock, they mostly look like they could be `Incomplete | None`.

  • 4 in mock
  • 1 in pyOpenSSL
  • 1 in redis
  • 8 in SQLAlchemy

After a series of "autofixes", I can update the table above to see if there are still any obvious ones.

@JelleZijlstra
Member

It might actually be fine to change every Any in old stubs to Incomplete. In theory, we'll go over the Incompletes later and change them back to Any if appropriate.

@srittau
Collaborator

srittau commented Jan 17, 2023

The only thing I'm a bit uneasy about is changing Any | None return types, because they are fairly common workarounds for the missing permissive union type, and they are not that easy to spot.

@Avasam
Collaborator Author

Avasam commented Jan 17, 2023

@srittau
I agree about return types. I didn't include them in my table above for similar reasons.

TypeAliases should probably also be left untouched, if I had to guess.

@AlexWaygood
Member

> Similar to `: Any | None = ...`, `: Any | None\n` seems like an equivalent stubgen artefact for class variables.

Fancy taking this on as "stage 2", @Avasam? (Regardless of how far we want to go, I think I'd prefer to keep doing this in stages, so we can evaluate the risk level for each stage independently.)

@Avasam
Collaborator Author

Avasam commented Jan 18, 2023

> so we can evaluate the risk level for each stage independently

plus we may learn things so the same can be done for stdlib

4 participants