-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make it easier to work with circular module dependencies #481
Comments
Another idea is improving type inference when there are circular module dependencies. A multi-phase approach might work: if a function fails to type check because it depends on something that has no inferred type yet, don't report the errors but put the function in a work list. After a circular module group has been type checked, type check any functions in the work list again. This can be repeated until the work list is empty, up to a certain maximum number of iterations. |
I tentatively added this to the 0.4.0 milestone, as this could be a big issue as users start annotating code with big dependency cycles as we have at Dropbox. Some additional things we could do to help this:
|
A specific issue about decorators: #1303 |
Here is more detailed idea for import prioritization:
The intuition here is that runtime import initialization order is the "natural" order to type check modules, and it will ensure that base classes are checked before derived classes when they are defined in different modules. This should make many annotations to |
Some other thoughts:
|
Additional thoughts:
Failure to infer types for decorated functions is a common problem. A decorator can arbitrarily modify the signature of the decorated function, but we can't declare the signature of the decorated function anywhere, so in order to call a decorated function, we must have first type checked the definition. Example of how this could work, using Python 3.6 variable annotation syntax:
|
I want to propose a specific design: generalize the "deferred checking" implemented in checker.py to work within a cycle of imports. (I'm not claiming the idea is original, I'm just working out the details.) How it currently works: During type checking, types are inferred for variables (a concept which includes attributes). When a function contains a variable reference and the type of the variable has not yet been inferred, the function is added to a queue of "deferred nodes" and checking the function is abandoned. Then, when all type checking for a module is done, the functions in the queue are re-checked, under the assumption that all variables have now been inferred, and the re-checks will succeed. If this hypothesis fails, the error "Cannot determine type of 'Thing'" is issued. (There is only one place where that specific error is issued; it is issued immediately when not occurring inside a function or during the last pass.) The proposal: Lift the processing of the deferred queue out of TypeChecker.visit_file(). That function's logic is roughly:
It looks like steps (3) and (4) commute, and then we could refactor things a bit so that the caller (process_stale_sccs() in build.py) has access to the queue and controls when the second pass runs. It then can be changed so that it runs the first pass for all nodes in a cycle, collecting deferred queues for each, and then runs all the second passes. (And possibly third passes, etc.) How this helps: First, an example of how a forward reference within one file is handled by the current mechanism. def f() -> int:
return x
x = 1 + 1 In the first pass, But what if we have a function that references a variable in another module? And what if through import cycles that other module is checked after the one with # a.py
import b
def f() -> int:
return b.x # b.py
import a
x = 1 + 1 If we run
Now, in this case we might have solved this by better analysis of module dependencies, e.g. by noticing that # a.py
import b
def f() -> int:
return b.x
y = 0 + 0 # b.py
import a
def g() -> int:
return a.y
x = 1 + 1 Here tweaking the processing order just changes the error message -- we either get a complaint about But I believe that extracting the deferred queues and processing them after the first pass has run for both UPDATE: I have a prototype. |
Results from the prototype: Remember how I withdrew #2167 when I found that it disturbed the processing order of some internal codebase so as to introduce new errors? Combining that PR with this prototype made those errors go away. So I'm very optimistic about the combination of these. We need both, because the example in #2015 is not solved by the prototype, because that deferred processing only happens for functions, not at the module level (this makes sense since functions are typically called after a module is imported, while module-level code runs immediately at import). |
I haven't look at the prototype yet, but this all sounds good to me. One thing to note is that I'm not 100% convinced that the current deferred processing is correct in every scenario (it started as a quick hack). However, we can solve these issues (if any) independently of generalizing deferred processing to import cycles. |
I have tried it on a fairly large test set (our "C" repo), combined with
#2167 (adding a lower priority for imports inside "if MYPY"). The latter by
itself caused a disturbance in module processing order that reported a
bunch of new errors. Combining that with #2264 (my non-prototype
implementation of this idea) silenced those errors. From other (smaller)
tests I am confident that we still catch real errors due to import cycles
(e.g. two modules importing each other and referencing class variables in
each other during module loading).
Would you mind reviewing both of those?
|
Excellent! I can review both the PRs, likely on Tuesday or Wednesday. I also have #1748, the big binder PR to review this week, which I'll likely do first. |
Where this doesn't help: Deferred processing only happens for functions. Therefore cross-module references outside functions (either at the global level or in class bodies outside the methods) still require the modules to be processed in the right order. This problem also exists at runtime. Mypy does not exactly emulate the import order at runtime, but it uses heuristics to approximate it. These heuristics work by giving import dependencies different priorities based on the shape and position of the import (e.g. "from X import Y" has a higher priority than "import X", and imports inside functions or inside "if MYPY" are even lower. To break a tie the order in which imports were encountered dynamically is used. I expect that over time we'll refine these heuristics somewhat, but once #2167 is merged I think they are pretty good, and once we have deferred processing of functions the majority of cases will actually be taken care of already. What if you need more than two passes? It's theoretically possible to construct worst-case examples that need an arbitrary number of passes. The deferred code can be easily extended to support it (it can just iterate until nothing gets taken out of the queue) but those cases are pretty rare and I'd rather not go there yet -- most likely when we see this it's a sign of overly complex code. |
These are still extremely painful... Are they entirely fixed by following the advice in https://www.python.org/dev/peps/pep-0484/#forward-references ? |
@Juanlu001 They are entirely fixed by that, but are still a bit painful. You may be interested in https://www.python.org/dev/peps/pep-0563/ which may make this even easier. |
Thanks @ethanhs, will keep an eye on that. |
About time, since hammer_vlsi.py was getting really large (2k+ LoC!). In addition, mypy has trouble with import cycles: python/mypy#481
About time, since hammer_vlsi.py was getting really large (2k+ LoC!). In addition, mypy has trouble with import cycles: python/mypy#481
About time, since hammer_vlsi.py was getting really large (2k+ LoC!). In addition, mypy has trouble with import cycles: python/mypy#481
About time, since hammer_vlsi.py was getting really large (2k+ LoC!). In addition, mypy has trouble with import cycles: python/mypy#481
It's sometimes painful to work around issues related to circular module dependencies. Some examples:
Not sure how these should be resolved. Some ideas:
The text was updated successfully, but these errors were encountered: