tl;dr
I'd like __spec__ to be passed through by importlib.util._LazyModule.__getattribute__ without triggering the full load of the module. That way, the regular, internal import machinery doesn't accidentally trigger the full load when fishing the lazy module out of sys.modules. This can be caused by re-(lazy-)importing the module, and the result, a fully loaded module, is pretty unexpected.
Full Story
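I've been trying to use importlib.util.LazyLoader lately and found a way it could be made more ergonomic. To start off, a demonstration of what I want to work, based on the lazy import recipe in the importlib docs: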
    import importlib.util
    import sys

    def lazy_import(name):
        # Personal addition to take advantage of the module cache.
        try:
            return sys.modules[name]
        except KeyError:
            pass

        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)
        return module

    lazy_typing = lazy_import("typing")

    # Let's import it a second time before actually using it here.
    # This could even happen in another file. Ideally, *still* doesn't execute yet
    # because we pull from the sys.modules cache.
    lazy_typing = lazy_import("typing")

    lazy_typing.TYPE_CHECKING  # Only *now* does the actual module execute.
The above recipe works, but without the sys.modules caching I added, the second import would cause the module to execute and populate, even though the user hasn't gotten an attribute from it yet.
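One way to check the behaviour described above is to look at the module's type before and after the first real attribute access. The following is only a minimal sketch: the throwaway module `lazy_demo_target` and the `demo()` helper are invented for illustration, and the `_LazyModule` type name is a private CPython detail that could change between versions.

```python
import importlib
import importlib.util
import sys
import tempfile
import types
from pathlib import Path


def lazy_import(name):
    """Docs-style lazy import, plus the sys.modules shortcut from the recipe."""
    try:
        return sys.modules[name]
    except KeyError:
        pass
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


def demo():
    """Lazily import a throwaway module twice, then touch an attribute."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module used only for this demonstration.
        Path(tmp, "lazy_demo_target.py").write_text("VALUE = 42\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            first = lazy_import("lazy_demo_target")
            second = lazy_import("lazy_demo_target")  # served from sys.modules
            same = first is second
            before = type(first).__name__  # type() does not resolve the module
            value = first.VALUE  # the first real attribute access resolves it
            after = type(first)
            return same, before, value, after
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("lazy_demo_target", None)
```

Calling `demo()` should report that both calls returned the same object, that it was still a lazy module before `VALUE` was read, and that it became a plain `types.ModuleType` afterwards.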
Fair enough, it's just a recipe for the docs. It's not meant to cover all the edge cases and use cases. What about a different code snippet that tries not to manually perform every part of the import process, though?
Let's try again, but using an import hook like, say, a custom finder on the meta path to wrap the found spec's loader with LazyLoader. That way, it'll take advantage of all the thread locks importlib uses internally and can even affect normal import statements. Here's an example:
    # NOTE: This is not as robust as it could be, but it serves well enough for demonstration.
    import importlib.util
    import sys

    # threading is needed due to circular import issues from importlib.util importing it while
    # LazyFinder is on the meta path. Not relevant to this issue.
    import threading


    class LazyFinder:
        """A module spec finder that wraps a spec's loader, if it exists, with LazyLoader."""

        @classmethod
        def find_spec(cls, fullname: str, path=None, target=None, /):
            for finder in sys.meta_path:
                if finder is not cls:
                    spec = finder.find_spec(fullname, path, target)
                    if spec is not None:
                        break
            else:
                raise ModuleNotFoundError(...)

            if spec.loader is not None:
                spec.loader = importlib.util.LazyLoader(spec.loader)

            return spec


    class LazyFinderContext:
        """Temporarily "lazify" some types of import statements in the runtime context."""

        def __enter__(self):
            if LazyFinder not in sys.meta_path:
                sys.meta_path.insert(0, LazyFinder)

        def __exit__(self, *exc_info):
            try:
                sys.meta_path.remove(LazyFinder)
            except ValueError:
                pass


    lazy_finder = LazyFinderContext()

    with lazy_finder:
        import typing  # Does the same thing as the earlier snippet, but for a normal import statement.
Unfortunately, the above code has the same flaw as the original importlib recipe when used directly: the module cache isn't being taken advantage of. This time, however, it can't be worked around from user code without a ton of copying.
Adding import typing again at the bottom will cause typing to get fully executed. This is demonstrable in two ways:
Adding print statements checking the type of the module:
    ...
    with lazy_finder:
        import typing

    print(type(typing))

    # Doesn't matter if we're using the context manager again or not, the result is the same.
    # with lazy_finder:
    import typing

    print(type(typing))
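For anyone who wants to try the type check above without touching `typing` (which is often already imported), here is a hedged, self-contained variant. The module name `lazy_repro_target` and the `reproduce()` helper are hypothetical, and unlike the finder above this sketch returns `None` when nothing is found, which is the usual "not found" signal for a meta path finder. What the second import reports depends on whether the running interpreter special-cases `__spec__`, so no particular result is claimed for it.

```python
import importlib
import importlib.util
import sys
import tempfile
import threading  # imported eagerly for the same reason as in the snippet above
from pathlib import Path


class LazyFinder:
    """Wrap whatever loader the other meta path finders produce in LazyLoader."""

    @classmethod
    def find_spec(cls, fullname, path=None, target=None, /):
        for finder in sys.meta_path:
            if finder is cls or not hasattr(finder, "find_spec"):
                continue
            spec = finder.find_spec(fullname, path, target)
            if spec is not None:
                break
        else:
            return None  # assumption: defer to the normal "not found" handling
        if spec.loader is not None:
            spec.loader = importlib.util.LazyLoader(spec.loader)
        return spec


def reproduce():
    """Return the module's type name after the first and the second import."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module used only for this demonstration.
        Path(tmp, "lazy_repro_target.py").write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            sys.meta_path.insert(0, LazyFinder)
            try:
                import lazy_repro_target
                first = type(lazy_repro_target).__name__
            finally:
                sys.meta_path.remove(LazyFinder)

            # Second import: the module is found in sys.modules this time.
            import lazy_repro_target
            second = type(lazy_repro_target).__name__
            return first, second
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("lazy_repro_target", None)
```

`print(reproduce())` shows both type names side by side; if the second one is `module` rather than `_LazyModule`, the repeated import forced the load.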
By putting the above code snippet in a file (e.g. scratch.py), then comparing the output of python -X importtime -c "import scratch" before and after adding a second import typing statement.
The reason for this mismatch, as I see it, is a small implementation detail in cpython/Lib/importlib/_bootstrap.py (lines 1348 to 1355 in 1c0a104). Because __spec__ is requested even when checking the module cache, and importlib.util._LazyModule makes no exceptions for attribute requests, the original loader will always execute and the module will populate. To get around this, a user would have to copy importlib.util._LazyModule and importlib.util.LazyLoader, modify them (see suggested patch below), and use those local versions instead.
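The claim that a cache hit still reads `__spec__` can be observed directly by planting a module subclass in sys.modules that logs attribute lookups. This is a sketch under assumptions: `RecordingModule`, `seen`, and the module name `recording_demo_module` are all made up for the demonstration, and it only shows which names get looked up, not the internals of the import system.

```python
import sys
import types

seen = []  # attribute names looked up on the recording module


class RecordingModule(types.ModuleType):
    """Hypothetical helper: a module that logs every attribute lookup."""

    def __getattribute__(self, attr):
        seen.append(attr)
        return super().__getattribute__(attr)


def attributes_read_by_import(name="recording_demo_module"):
    """Import a module that is already cached and report what was looked up."""
    module = RecordingModule(name)
    sys.modules[name] = module
    seen.clear()
    try:
        imported = __import__(name)  # same entry point as an import statement
    finally:
        del sys.modules[name]
    return imported is module, list(seen)
```

The returned list should include `__spec__`, which is exactly the lookup that makes a lazy module resolve when it is imported a second time.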
Thus, I propose adding a small special case within _LazyModule to make usage with sys.modules more ergonomic:
If __spec__ is requested, just return that without loading the whole module yet. That way, instances of _LazyModule within sys.modules won't be forced to resolve by the regular import machinery; they resolve only once the user visibly accesses the module. The diff would be quite small:
    --- current_3.14.py  2024-11-19 15:35:57.218717430 -0500
    +++ modified_3.14.py  2024-11-19 15:36:21.608717512 -0500
    @@ -171,6 +171,10 @@
         def __getattribute__(self, attr):
             """Trigger the load of the module and return the attribute."""
             __spec__ = object.__getattribute__(self, '__spec__')
    +
    +        if "__spec__" == attr:
    +            return __spec__
    +
             loader_state = __spec__.loader_state
             with loader_state['lock']:
                 # Only the first thread to get the lock should trigger the load
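To illustrate the effect of that special case in isolation, here is a toy model. It is not the real `importlib.util._LazyModule`: `ToyLazyModule`, `make_toy`, and `load_log` are invented names, there is no locking, and the "load" is just copying a prepared namespace into the module. It only shows the shape of the idea, namely that `__spec__` is answered without resolving while any other attribute resolves first.

```python
import types
from importlib.machinery import ModuleSpec

load_log = []  # names of toy modules that were forced to resolve


class ToyLazyModule(types.ModuleType):
    """Toy stand-in for a lazy module. NOT the real importlib class."""

    def __getattribute__(self, attr):
        spec = object.__getattribute__(self, "__spec__")
        if attr == "__spec__":
            # The special case from the diff: hand back the spec, stay lazy.
            return spec
        # Anything else resolves the module. Swap the class first so the
        # lookups below go through plain module attribute access.
        self.__class__ = types.ModuleType
        load_log.append(spec.name)
        self.__dict__.update(spec.loader_state)
        return getattr(self, attr)


def make_toy(name, namespace):
    """Build an unresolved toy module whose eventual contents are `namespace`."""
    module = ToyLazyModule(name)
    module.__spec__ = ModuleSpec(name, None, loader_state=dict(namespace))
    return module
```

With `toy = make_toy("toy", {"answer": 42})`, reading `toy.__spec__` leaves `load_log` empty and the type unchanged, while reading `toy.answer` records the load and turns the object into a plain module.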
I hope this makes sense and isn't too long-winded.
EDIT: Added a tl;dr at the top.
EDIT2: Added an easier way to demonstrate the full load being triggered.
EDIT3: Adjusted phrasing.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Sachaa-Thanasius changed the title from "Allow lazily loaded modules to be imported multiple times without resolving" to "Allow lazily loaded modules to be imported multiple times without forced resolution" on Nov 19, 2024.
Links to previous discussion of this feature:
No response
Linked PRs
importlib.util._LazyModule.__getattribute__ to special-case requests for __spec__. #127038