Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importlib.abc.Traversable.read_text() incompatible with importlib.resources._functional.read_text() usage (Python 3.13) #127012

Open
kurtmckee opened this issue Nov 19, 2024 · 6 comments · May be fixed by python/importlib_resources#321
Assignees
Labels
stdlib Python modules in the Lib dir topic-importlib type-bug An unexpected behavior, bug, or error

Comments

@kurtmckee
Copy link
Contributor

kurtmckee commented Nov 19, 2024

Bug report

Bug description:

I'm writing a custom importer and discovered that the function signature for importlib.abc.Traversable.read_text() is incompatible with the usage in importlib.resources._functional.read_text(), specifically on Python 3.13.

  1. importlib.abc.Traversable.read_text() is a concrete method; its implementation calls the .open() method, which is an abstract method that must be implemented. The expectation is therefore that implementing .open() in a Traversable subclass is sufficient for .read_text() to work.

    Note below that the .read_text() method is not marked as abstract, and includes only one parameter: encoding:

    def read_text(self, encoding: Optional[str] = None) -> str:
    """
    Read contents of self as text
    """
    with self.open(encoding=encoding) as strm:
    return strm.read()

  2. Application code that attempts to read a package resource, like importlib.resources.read_text(module, "resource.txt") ultimately leads to a call to importlib.resources._functional.read_text(), which attempts to call the .read_text() method of a Traversable subclass, but includes an errors parameter that doesn't exist in Traversable's default concrete method:

    def read_text(anchor, *path_names, encoding=_MISSING, errors='strict'):
    """Read and return contents of *resource* within *package* as str."""
    encoding = _get_encoding_arg(path_names, encoding)
    resource = _get_resource(anchor, path_names)
    return resource.read_text(encoding=encoding, errors=errors)

  3. Consequently, it appears to be necessary for all Traversable subclasses to not only re-implement its concrete .read_text() method, but also to override its signature.

I think that the Traversable .read_text() method signature, and the call site in importlib.resources._functional.read_text(), need to align with each other.

I'd like to submit a PR for this! However, I would like confirmation that an errors parameter should be added to the Traversable.read_text() method.

Note that adding an errors parameter was previously discussed in #88368.

Demonstration of TypeError bug

import io
import sys
import typing
import pathlib
import types
import importlib.abc
import importlib.machinery
import importlib.metadata
import importlib.resources.abc


class ExampleFinder(importlib.abc.MetaPathFinder):
    def find_spec(
        self,
        fullname: str,
        path: typing.Sequence[str] | None,
        target: types.ModuleType | None = None,
    ) -> importlib.machinery.ModuleSpec | None:
        if fullname != "demonstrate_error":
            return None

        print(f"ExampleFinder.find_spec('{fullname}')")
        spec = importlib.machinery.ModuleSpec(
            name=fullname,
            loader=ExampleLoader(),
            is_package=True,
        )
        return spec


sys.meta_path.append(ExampleFinder())


class ExampleLoader(importlib.abc.Loader):
    def exec_module(self, module: types.ModuleType) -> None:
        print(f"ExampleLoader.exec_module({module})")
        exec("", module.__dict__)

    def get_resource_reader(self, fullname: str) -> "ExampleTraversableResources":
        print(f"ExampleLoader.get_resource_reader('{fullname}')")
        return ExampleTraversableResources(fullname)


class ExampleTraversableResources(importlib.resources.abc.TraversableResources):
    def __init__(self, fullname: str) -> None:
        self.fullname = fullname

    def files(self) -> "ExampleTraversable":
        print("ExampleTraversableResources.files()")
        return ExampleTraversable(self.fullname)


# ----------------------------------------------------------------------------
# ExampleTraversable implements all five of the Traversable abstract methods.
# Specifically, it is expected that implementing `.open()` will be sufficient,
# but this will not be the case.
#

class ExampleTraversable(importlib.resources.abc.Traversable):
    def __init__(self, path: str):
        self._path = path

    def iterdir(self) -> typing.Iterator["ExampleTraversable"]:
        yield ExampleTraversable("resource.txt")

    def is_dir(self) -> bool:
        return False

    def is_file(self) -> bool:
        return True

    def open(self, mode='r', *args, **kwargs) -> typing.IO[typing.AnyStr]:
        return io.StringIO("Nice! The call to .read_text() succeeded!")

    # Uncomment this `.read_text()` method to make `.read_text()` calls work.
    # It overrides the `Traversable.read_text()` signature.
    #
    # def read_text(self, encoding: str | None, errors: str | None) -> str:
    #     print(f"ExampleTraversable.read_text('{encoding}', '{errors}')")
    #     return str(super().read_text(encoding))

    @property
    def name(self) -> str:
        return pathlib.PurePosixPath(self._path).name


# -------------------------------------------------------------------------------
# Everything above allows us to import this hard-coded module
# and demonstrate a TypeError lurking in the Traversable.read_text() signature.
#

import demonstrate_error


# The next line will raise a TypeError.
# `importlib/resources/_functional.py:read_text()` calls `Traversable.read_text()`
# with an `errors` argument that is not supported by the default concrete method.
print(importlib.resources.read_text(demonstrate_error, "resource.txt"))

CPython versions tested on:

3.13

Operating systems tested on:

Linux

@kurtmckee kurtmckee added the type-bug An unexpected behavior, bug, or error label Nov 19, 2024
@ZeroIntensity ZeroIntensity added topic-importlib stdlib Python modules in the Lib dir labels Nov 19, 2024
@kanishka-coder0809
Copy link

hey @kurtmckee I can work on this issue ...can you assign me.

@kurtmckee
Copy link
Contributor Author

kurtmckee commented Nov 20, 2024

@kanishka-coder0809 As I wrote in the text above, I'd like to fix this myself but I'm waiting for confirmation how to proceed, because I think this issue implicates importlib_resources and the standard library.

In any case, I don't have GitHub permissions to assign this to anyone.

@williamwen42
Copy link
Contributor

I'm encountering this issue while trying to add 3.13 support to PyTorch.

cc @brettcannon @jaraco @warsaw @FFY00

@jaraco
Copy link
Member

jaraco commented Nov 23, 2024

Thanks Kurt for the detailed report elucidating the issue. I do think I agree that adding the required parameter to the default read_text implementation sounds right.

When reasoning about where to create the fix, I'm fine if it goes here or in importlib_resources. I personally find it easier and faster to develop on importlib_resources first, in part because it will apply to older Python versions (3.9+) and also because the test suite is easier to run (just run tox). But if you choose to develop it here on CPython first, that's fine and I'll backport it to importlib_resources (and thus also get immediate feedback when it's released in short order).

At first, I was thinking that this change would be backward-incompatible, because it's a change to a protocol (on which downstream consumers rely), but since it's a change to a concrete method on the protocol, I'm thinking a backward-compatible change is possible, but we'll want to think carefully about that aspect of the change.

Please feel free to proceed with a proposed fix.

@kurtmckee
Copy link
Contributor Author

@jaraco Understood. I'll work to introduce a PR to the importlib_resources repository. Thanks for the quick response!

kurtmckee added a commit to kurtmckee/pr-importlib_resources that referenced this issue Nov 25, 2024
This adds an in-memory finder, loader, and traversable implementation,
which allows the `Traversable` protocol and concrete methods to be tested.

This additional infrastructure demonstrates python/cpython#127012,
but also highlights that the `Traversable.joinpath()` concrete method
raises `TraversalError` which is not getting caught in several places.
kurtmckee added a commit to kurtmckee/pr-importlib_resources that referenced this issue Nov 25, 2024
`importlib_resources.read_text()` calls the `Traversable.read_text()`
concrete method with an `errors` argument that doesn't exist in the
method signature, resulting in an `TypeError`.

This is resolved by adding an `errors` parameter to
`Traversable.read_text()`.

Fixes python/cpython#127012
@zooba
Copy link
Member

zooba commented Nov 28, 2024

adding the required parameter to the default read_text implementation sounds right.

That's fine, but we also need to remove the call that passes it, since code already exists that can't receive the parameter. Changing the protocol requires a deprecation cycle and can't be backported. Adding a new (optional) method that also accepts errors would be fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-importlib type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants