Significantly speed up file handling error paths #17920
I'm probably measuring wrong, but I think I got a 1.5x speedup on the following when running compiled:
```
mypy -c 'import torch' --no-incremental
```
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Okay, I think I am benchmarking this right, but I think I only see the win on If you have a benchmarking setup, I'd be curious if you see an effect! |
I'm not seeing much of a difference on my Linux desktop:
Some hypotheses that could explain the differences:
My config:
If the differences are due to the speed of file system access (which seems likely), this PR looks great! Fast file system access is far from universal, and we shouldn't assume a fast local disk.
Another hypothesis is that the module search path has an impact:
Hm, this PR doesn't change the amount of file system access. The saving should just be CPU time from cloning, raising, and catching exceptions. My environment does have some differences from yours; I'll try to narrow it down tomorrow morning. I do have a very long module search path in this environment, which is the most likely explanation for what's going on.
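To illustrate the point that the cost is CPU time in exception handling rather than file system access, here is a minimal sketch that takes the file system out of the picture entirely. Everything in it is hypothetical: `KNOWN_FILES`, `stat_raising`, `stat_or_none`, and the two probe functions are stand-ins made up for the comparison, not mypy's actual code. It only shows how a miss that raises compares to a miss that returns `None` when the same lookup walks a long search path.

```python
import timeit

# Hypothetical in-memory "file system": only this one path exists.
KNOWN_FILES = frozenset({"/site-packages/torch/__init__.py"})


def stat_raising(path):
    """Stand-in for a stat call that signals a miss by raising."""
    if path not in KNOWN_FILES:
        raise FileNotFoundError(path)
    return path


def stat_or_none(path):
    """Stand-in for a stat call that signals a miss by returning None."""
    return path if path in KNOWN_FILES else None


def probe_raising(search_paths, name):
    """Walk the search path; every miss costs a raised and caught exception."""
    for directory in search_paths:
        try:
            return stat_raising(f"{directory}/{name}")
        except FileNotFoundError:
            continue
    return None


def probe_or_none(search_paths, name):
    """Walk the search path; a miss is just a None check."""
    for directory in search_paths:
        found = stat_or_none(f"{directory}/{name}")
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    # 200 directories that miss, then one that hits.
    paths = [f"/extra/dir{i}" for i in range(200)] + ["/site-packages"]
    for probe in (probe_raising, probe_or_none):
        seconds = timeit.timeit(
            lambda: probe(paths, "torch/__init__.py"), number=500
        )
        print(f"{probe.__name__}: {seconds:.3f}s")
```

The absolute numbers will vary by machine and Python version, and a compiled build may behave differently, so treat the output as a rough indication only.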
Would it help if we'd try to avoid some of these exceptions by doing |
Another random idea would be to cache results of |
I added 100 empty directories to PYTHONPATH. This slowed things down a bit, but now this PR gives me a 20% performance gain (13.1s vs 10.9s when using a compiled mypy).
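For anyone who wants to try a similar setup, a padded search path can be generated with a few lines of Python. This is only a sketch of one way to do it, not the exact setup used above; the helper name `make_padded_pythonpath` and the `empty_NNN` directory names are made up for illustration.

```python
import os
import tempfile


def make_padded_pythonpath(root, count=100):
    """Create `count` empty directories under `root` and join them
    into a string suitable for the PYTHONPATH environment variable."""
    directories = []
    for index in range(count):
        directory = os.path.join(root, f"empty_{index:03d}")
        os.makedirs(directory, exist_ok=True)
        directories.append(directory)
    # os.pathsep is ":" on POSIX and ";" on Windows.
    return os.pathsep.join(directories)


if __name__ == "__main__":
    # The temporary directory is left in place so the path stays valid
    # for the benchmark run; remove it by hand afterwards.
    root = tempfile.mkdtemp(prefix="mypy_bench_")
    print(make_padded_pythonpath(root))
```

The printed value can be exported as `PYTHONPATH` before running the same mypy command with and without this PR.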
Yep, I can confirm that a longer search path is what makes the error handling code here so hot. In my main work environment, we install all first-party packages editably, so it's common to have hundreds of entries in the search path. I tried this script:
Normal venv (v1):
Many search paths venv (v2):
When run interpreted, the gap is a little smaller in absolute terms (maybe mypyc exceptions are more expensive?), and obviously much smaller in relative terms. There's still some gap between this and the environment I was running in earlier; I'll look into it further.
Yeah, I do this in some other module resolving code I wrote a while back. It works pretty well, especially since scandir might let us avoid separate stats in some cases, so it could conceivably be a win on short search paths as well.
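A rough sketch of what a scandir-based approach could look like, assuming a per-directory listing cache. The names `list_dir` and `find_package_dir` are hypothetical, and this is not the code referred to above or anything in mypy. Each directory is listed once, and later lookups become dictionary checks, so a miss never raises.

```python
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=None)
def list_dir(directory):
    """Return a {name: is_dir} mapping for one directory, or None if
    the directory cannot be read. Cached, so each directory is
    scanned at most once per process."""
    try:
        with os.scandir(directory) as entries:
            # DirEntry.is_dir() can usually answer from the directory
            # scan itself, without a separate stat call.
            return {entry.name: entry.is_dir() for entry in entries}
    except OSError:
        return None


def find_package_dir(search_paths, name):
    """Return the first `directory/name` that is a directory, else None."""
    for directory in search_paths:
        listing = list_dir(directory)
        if listing is not None and listing.get(name, False):
            return os.path.join(directory, name)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        empty = os.path.join(root, "empty")
        site = os.path.join(root, "site")
        os.makedirs(empty)
        os.makedirs(os.path.join(site, "torch"))
        print(find_package_dir([empty, site], "torch"))
```

The cache here is never invalidated, so the sketch assumes directories do not change during a run. Real code would also need to think about case sensitivity on some file systems.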
This can have a huge overall impact on mypy performance when search paths are long.
It's late at night, so I'm probably measuring wrong, but I think I got a 1.5x speed up on the following when running at compile level 3, going from 50s to 33s on a Linux box:
Not really a measurable change when run interpreted.
See #17919