Add an explicit cache on Python entry points #614
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #614      +/-   ##
==========================================
+ Coverage   83.34%   83.54%   +0.20%
==========================================
  Files          66       66
  Lines        3794     3816      +22
  Branches      739      745       +6
==========================================
+ Hits         3162     3188      +26
+ Misses        557      554       -3
+ Partials       75       74       -1
I think we can do better than this, actually. I didn't know this, but the [...]
I didn't realize that the startup performance had regressed so badly. SSDs and OS caching hide how much I/O is happening here. I can imagine that cold invocations on spinning disks are brutal...
Alright, I dropped the [...]
Nice!
Whenever we enumerate Python entry points to load colcon extension points, we're re-parsing metadata for every Python package found on the system. Worse yet, accessing attributes on `importlib.metadata.Distribution` typically results in re-reading the metadata each time, so we're hitting the disk pretty hard.
We don't generally expect the entry points available to change, so we should cache that information once and parse each package's metadata a single time.
This change jumps through a lot of hoops to specifically use the `importlib.metadata.entry_points()` function wherever possible because it has an optimization that allows us to avoid reading each package's metadata while still properly handling package shadowing between paths.
This has a measurable impact on extension point loading performance.
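For readers skimming the thread, the caching idea described above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: the function name `cached_entry_points` is hypothetical, it memoizes per group rather than building one global table, and it does not attempt the shadowing handling the PR describes. It assumes only the standard `importlib.metadata.entry_points()` and `functools.lru_cache`.

```python
import functools
import sys
from importlib.metadata import entry_points


@functools.lru_cache(maxsize=None)
def cached_entry_points(group_name):
    """Return the entry points in a group, scanning metadata once per group.

    Hypothetical helper for illustration only. Repeated calls with the same
    group name are served from the in-process cache instead of re-reading
    package metadata from disk.
    """
    if sys.version_info >= (3, 10):
        # Python 3.10+ supports selecting a group directly.
        found = entry_points(group=group_name)
    else:
        # Python 3.8/3.9 return a plain dict mapping group name to entries.
        found = entry_points().get(group_name, ())
    # Store an immutable tuple so callers cannot mutate the cached value.
    return tuple(found)
```

The trade-off is the one mentioned in the description: packages installed after the first lookup are not seen until the cache is cleared, for example with `cached_entry_points.cache_clear()`.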
Closes #600