-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stubgen improvements: preserving existing annotations #3169
Conversation
Just checked other stubgen issues and this seems to fix #1782 too |
test-data/unit/stubgen.test
Outdated
@@ -20,42 +20,52 @@ def g(arg): ... | |||
def f(a, b=2): ... | |||
def g(b=-1, c=0): ... | |||
[out] | |||
def f(a, b: int = ...): ... | |||
def g(b: int = ..., c: int = ...): ... | |||
def f(a, b: int=...): ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I like the old style better, also see this example from PEP 8:
Yes:
def munge(sep: AnyStr = None): ...
def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...
No:
def munge(input: AnyStr=None): ...
def munge(input: AnyStr, limit = 1000): ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeshed's convention is to put spaces around the =.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I was used to the old rule, not the special case for annotations. I'll change that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this PR, it will definitely make stubgen
more usable. Here are more detailed comments.
test-data/unit/stubgen.test
Outdated
|
||
y: C | ||
[out] | ||
x: 'C' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ForwardRefs are not needed in stub files. So that I would propose to keep all annotations simple, it would create less "noise" when reading stubs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know that (and a lot of the patch could be made simpler by removing forward ref handling). I couldn't find that documented anywhere (looking at PEP 484); is that actually a defined behaviour or an implementation artifact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a consequence of the fact that .pyi
files are actually never executed, and the only reason of ForwardRef's is to "hide" something from the Python runtime. This is already heavily used in typeshed stubs to simplify them, if you think it is worth mentioning this in PEP 484, then you could make a short PR to python/peps
repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, changed this
y: Any | ||
|
||
[case testMultipleAssignmentAnnotated] | ||
x, y = 1, "2" # type: int, str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add few tests with wrong annotations, for example with more types than variables, and other possible mismatches, just to be sure that stubgen will not crash on those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... the current code is not crashing, but generating "something" instead of erroring out. I can put an error message although I'm a bit converned that validation is a bit out of scope from the stubgen tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit converned that validation is a bit out of scope from the stubgen tool.
Just not crashing on invalid input is already OK, I think. If you can produce a reasonable warning, then it's perfect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. the current status is that it doesn't crash, just ignores the "extra" data
def f(self): ... | ||
def g(self): ... | ||
|
||
[case testExportViaRelativeImport] | ||
from .api import get | ||
[out] | ||
from .api import get as get |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this form should be preserved. There is currently rule in PEP 484 that says that such form should be used for names that are actually imported (and therefore re-exported by Python runtime). Mypy currently does not follow this rule and treats all names as re-exported, see #2927, but it will be fixed soon.
Therefore, to prevent breakage of stubgen stubs after the fix, and to comply with PEP 484 this form from mod import X as X
should be preserved for names that are actually imported in original file. At the same time, a short form from mod import X
should be used for auxiliary/fake imports created by stubgen (like typing
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after reading this comment and pep-484, I'm not sure I fully understand what is required here. What I have is:
- If a stub needs to export an imported name, it should import it with an alias (even if the alias is name is the same as the original). Is that correct?
- But then, how does stubgen know which imported names where "intended" to be reexported? should it use
__all__
and ignore everything else? should it reexport everything just in case? should it reexport only names imported in a certain way?
I could use a hand here at least with the test cases, once I have some boundaries defined by those I could probably tweak the implementation to match the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a stub needs to export an imported name, it should import it with an alias (even if the alias is name is the same as the original). Is that correct?
Yes.
But then, how does stubgen know which imported names where "intended" to be reexported? should it use all and ignore everything else? should it reexport everything just in case? should it reexport only names imported in a certain way?
I think the safest bet would be to re-export all globally visible names (i.e. ignore imports in function bodies), if there is not __all__
defined, and only re-export names form __all__
if there is one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm implementing the __all__
scenario. I'm not doing the "reexport everything if there's no __all__
" because I think the result ends up being unnecessarily noisy for most cases (you'll end up with tons of modules saying from typing import List as List
because they don't have an __all__
). In any case, it's easy to change our minds later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you'll end up with tons of modules saying
from typing import List as List
because they don't have an__all__
.
You can just manually exclude names imported from typing
. In general, this is an important point, since it already caused some bugs in typeshed, see e.g. python/typeshed#1484. I would prefer this test to work.
(Also I don't think people will be unhappy about a bit more bulky stub, but they will be certainly unhappy about false positives when as name
isn't present).
def visit_unbound_type(self, t: UnboundType)-> str: | ||
s = t.name | ||
base = s.split('.')[0] | ||
self.stubgen.import_tracker.require_name(base) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I follow the logic here. If I have an unbound type X.Y.Z
, will this require import X
? But how do we know X
is actually a module? Maybe it would be better to just use Any
for unbound types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when you have an X.Y.Z
annotatation, what other things could X
be? I'm assuming module here because if it's not the only choice, it's probably the best one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure it is a good idea to guess something for an unbound type. I still think it would be better to use Any
. Why do you think we should do something at all with unbound types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stubgen runs mostly at a syntactical level. Semantic analysis phase is not run within it. So essentially 99% of the type annotations it will find will still be unbound
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stubgen runs mostly at a syntactical level. Semantic analysis phase is not run within it.
OK, is it easy to fix this? If not, then maybe add a # TODO:
item here and/or open an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's easy to fix at all; it's probably a major change in the tool, the design decision of working syntactically is heavily embedded on it from day one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also have an issue for this. Maybe some day we will have time to improve this.
mypy/stubgen.py
Outdated
if alias: | ||
self.reverse_alias[alias] = name | ||
|
||
def add_import(self, module: str, alias: str=None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it was already discussed that PEP 8 specifies spaces around =
for this case, it should be alias: str = None
not alias: str=None
. Also note that we are moving towards explicit Optional
types for args with default None
.
(This also appears few time below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
self._state = CLASS | ||
|
||
def is_type_expression(self, expr: Expression, top_level: bool=True) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see more tests for this, to be sure that complex (generic) aliases like Union[T, List[T]]
are recognized, but things like type_map[int]
are not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a test for the former...
The latter will actually go through as an alias and I'm not sure how to prevent it with the kind of information I have in stubgen (I'm assuming that in most libraries that kind of things won't happen too much at a module top level).
mypy/stubgen.py
Outdated
def require_name(self, name: str) -> None: | ||
self.required_names.add(name.split('.')[0]) | ||
|
||
def import_lines(self) -> List[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is difficult to follow the logic of this method. Maybe a doctring (and comments) will help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mypy/stubgen.py
Outdated
self.module_for = {} # type: Dict[str, Optional[str]] | ||
self.direct_imports = {} # type: Dict[str, str] | ||
self.reverse_alias = {} # type: Dict[str, str] | ||
self.required_names = set() # type: Set[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would document the meaning of these four variables. Maybe just add a comment above every line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added these
test-data/unit/stubgen.test
Outdated
import urllib.parse | ||
__all__ = ['urllib'] | ||
[out] | ||
import urllib.parse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should still be an error comment in output like
# Names in __all__ with no definition:
# urllib
before or after the import statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running import urllib.parse
defines the urllib
name and attaches it to the correct module, why this should be an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, then ignore this comment (or add another test where this produce an error, like from urllib import parse
)
@@ -552,14 +602,148 @@ def f(a): ... | |||
[case testInferOptionalOnlyFunc] | |||
class A: | |||
x = None | |||
def __init__(self, a=None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also keep an additional test for __init__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
@dmoisset Are you still working on this? I think this would be a useful addition (in particular in the context of helping maintainers of third-party libraries that want to support typing). |
@ilevkivskyi I've been dealing with other stuff, I might take a look at this again during the rest of the month |
@dmoisset Sorry for pinging again, this is the oldest open PR now (and IMO it contains an important feature). |
This also changes the style of initializer in generated stubs to be closer to PEP8 style, this implied a few test changes
PEP8 actually suggest this eception to the normal (no spaces) rule for annotated arguments
e9cfae2
to
df0d5cf
Compare
Thanks for an update! I replied to your questions above, will make a thorough review soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is almost ready, just few minor comments/questions.
mypy/stubgen.py
Outdated
|
||
def get_init(self, lvalue: str, rvalue: Expression) -> Optional[str]: | ||
def get_init(self, lvalue: str, rvalue: Expression, | ||
annotation: Optional[Type]=None) -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing space around =
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
mypy/stubgen.py
Outdated
init_stmt = arg_.initialization_statement | ||
if init_stmt: | ||
if isinstance(init_stmt.rvalue, NameExpr) and init_stmt.rvalue.name == 'None': | ||
initializer = 'None' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a recent agreement that all optional types should be explicit in typeshed, and it is now checked with --no-implicit-optional
. If the implicit Optional
are added already at the parse stage, then I think the initializer should be always ...
, even for optional types. Otherwise, we can keep None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure of the exact context here, but the typeshed convention is that all defaults should be ...
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok removing this (actually the result was already ok, but this was a leftover when I tried to take advantage of the implicit optional)
mypy/stubgen.py
Outdated
@@ -35,6 +35,7 @@ | |||
- we don't seem to always detect properties ('closed' in 'io', for example) | |||
""" | |||
|
|||
import builtins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably leftover of some previous change, removing...
test-data/unit/stubgen.test
Outdated
|
||
[case testClassVariable] | ||
class C: | ||
x = 1 | ||
[out] | ||
class C: | ||
x = ... # type: int | ||
x: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For class attributes it is important to know whether an attribute was just declared like x: int
or actually defined like x: int = 1
. There is a difference for protocol classes -- in former case an explicit subclass cannot be instantiated. Therefore, I think the translation for class attributes should be like this:
x: int -> x: int
x = 1 -> x: int = ...
x: int = 1 -> x: int = ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have self.x = 1
in __init__
). Should those be just x: int
or should they also be x: int = ...
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... what about instance attributes that stubgen finds because of self accesses? (for example, when you have
self.x = 1
in__init__
). Should those be justx: int
or should they also bex: int = ...
?
I think it should be x: int = ...
if there is an assignment in __init__
(again mostly from the protocols POW).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've made the change (and some others), I will be pushing un update after some testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
test-data/unit/stubgen.test
Outdated
@@ -367,19 +413,19 @@ from re import match, search, sub | |||
__all__ = ['match', 'sub', 'x'] | |||
x = 1 | |||
[out] | |||
from re import match as match, sub as sub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget about fixing this as discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
noalias1: Any | ||
noalias2: Any | ||
noalias3: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you preserve @abstractmethod
and other decorators? I think it is important to preserve @abstractmethod
, @classmethod
, @staticmethod
, and @property
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't make any changes here wrt the original stubgen. I see some tests checking for property/staticmethod/classmethod being preserved, and other decorators removed (I think abstractmethod is not preserved). If you don't mind, I would add that as a separate issue to avoid scope creep here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, could you please open an issue for this?
Would love to see this land -- who's waiting for whom? |
I am waiting for @dmoisset to implement few last comments. |
As discussed in python/mypy#3169 (comment)
OK, I'm pushed a new update. There are a few issues that we've discussed in the comments that were actually issues in the previous version. Something that could be done if you agree is merge this and add these items as separate issues. What I have is:
|
(to clarify: I can add the issues, just wanted to sync on how we proceed) |
I haven't reviewed this but I am glad that it is moving forward and I hope
we can iterate more quickly in the future! (We dropped the ball on code
review here, sorry.)
|
Oh, the apologies are on me; code reviews have been very quick, and the delays have been on my side. My chances to work on this are limited so it can take me a long time to move forward after the reviews |
The problem is that
These tests might be older than PEP 484 itself, so I am not surprised. We just need to update the tests, although I am fine to do this in a separate PR. To summarize, I am fine with merging this PR as is to avoid unnecessary growth (it is already large). But, please open four separate issues for the problems you mentioned and mention this PR in every issue. |
@dmoisset |
Thanks so much Daniel for working on this! And Ivan for reviewing. Stubgen doesn't get enough developer love. |
This PR modifies heavily the stubgen to be able to create better stubs when processing already annotated files (Implements #2106 ). It essentially adds annotations on the stubs for:
It also tries to add all the required imports into the stub to make the resulting file valid, and handle properly forward references when necessary. It also adds some fixes (handling of generics as base classes, some formatting fixes in the output) that were natural consequences of some of the refactoring I made.
I know this is a large patch (and refactors some pieces of the generator), so feel free to comment if it needs discussion.