Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iter_files_distros will fail if sys.path contains a PosixPath (AttributeError: 'PosixPath' object has no attribute 'rstrip') #43

Open
afparsons opened this issue Jun 29, 2021 · 2 comments

Comments

@afparsons
Copy link

iter_files_distros(path=None, repeated_distro='first') will fail if one of the elements in sys.path is not a string. For example, my sys.path contained a PosixPath:

['/usr/local/bin', '/usr/local/lib/python38.zip', '/usr/local/lib/python3.8', '/usr/local/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/site-packages', PosixPath('/model_infrastructure')]

PosixPath objects do not implement an rstrip() method, and so the following happens [permalink]:

if path is None:
    path = sys.path

# `path` is now a list containing some strings and in my accidental case, a PosixPath

for folder in path:
    if folder.rstrip('/\\').endswith('.egg'):

# oops

I'll defer to someone with far more Python experience than I have to decide whether this is just a silly user error (i.e. entirely my fault), or something that should be handled within iter_files_distros(...).

Maybe it seems absurd that anything other than a string would find its way into sys.path, but for newcomers to Python (i.e. me, since ~3.5) who have grown up with pathlib, it doesn't feel too out of place. In fact, here's a PR.


For search engines and passers-by:

I had encountered this stacktrace in my Django + Dagster + MLflow + Gensim + Docker project:

AttributeError: 'PosixPath' object has no attribute 'rstrip'
  File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/plan/utils.py", line 42, in solid_execution_error_boundary
    yield
  File "/usr/local/lib/python3.8/site-packages/dagster/utils/__init__.py", line 382, in iterate_with_context
    next_output = next(iterator)
  File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/plan/compute_generator.py", line 65, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/model_infrastructure/solids/gensim.py", line 160, in cyclic_word2vec
    experiment = mlflow.get_experiment_by_name(experiment_name)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/fluent.py", line 815, in get_experiment_by_name
    return MlflowClient().get_experiment_by_name(name)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/client.py", line 434, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/_tracking_service/client.py", line 128, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
  File "/usr/local/lib/python3.8/site-packages/mlflow/store/tracking/rest_store.py", line 263, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
  File "/usr/local/lib/python3.8/site-packages/mlflow/store/tracking/rest_store.py", line 55, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/usr/local/lib/python3.8/site-packages/mlflow/utils/rest_utils.py", line 163, in call_endpoint
    response = http_request(
  File "/usr/local/lib/python3.8/site-packages/mlflow/utils/rest_utils.py", line 47, in http_request
    from mlflow.tracking.request_header.registry import resolve_request_headers
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/request_header/registry.py", line 39, in <module>
    _request_header_provider_registry.register_entrypoints()
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/request_header/registry.py", line 21, in register_entrypoints
    for entrypoint in entrypoints.get_group_all("mlflow.request_header_provider"):
  File "/usr/local/lib/python3.8/site-packages/entrypoints.py", line 236, in get_group_all
    for config, distro in iter_files_distros(path=path):
  File "/usr/local/lib/python3.8/site-packages/entrypoints.py", line 128, in iter_files_distros
    if folder.rstrip('/\\').endswith('.egg'):

It took me a while to find the cause.

My BASE_DIR in Django's settings.py was a pathlib Path object:

from pathlib import Path

# Build paths inside the project like this: BASE_DIR / 'subdir'.
BASE_DIR = Path(__file__).resolve().parent.parent

For whatever reason, I had to append BASE_DIR to sys.path from within a Dagster solid:

sys.path.append(BASE_DIR)

This resulted in a PosixPath object being added to sys.path. As explained above, sys.path is expected presently to hold only strings, and so iter_files_distros(path=None) will just call rstrip(...) on each element without type consideration.

A solution is ensure that a string is appended to sys.path:

sys.path.append(str(BASE_DIR.absolute()))
frosklis pushed a commit to frosklis/entrypoints that referenced this issue Jan 26, 2023
it can happen that the folders are actually `PosixPath`. They should be
converted to strings before applying rstrip.
@Scaramir
Copy link

Scaramir commented Sep 1, 2023

Would be awesome, if this could be included in a pypi release, 'cause it would be the same for WindowsPath.

@takluyver
Copy link
Owner

takluyver commented Sep 11, 2023

For now, the docs on the sys module say that sys.path should only contain strings:

A program is free to modify this list for its own purposes. Only strings should be added to sys.path; all other data types are ignored during import.

So I think the right thing to do in entrypoints is to just ignore pathlib objects, rather than converting them to [edit: string] paths.

I should also make it clear that entry points are now supported in the Python standard library, by the importlib.metadata module, so this package is no longer all that important. But I'm happy to merge a fix if people are still using it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants