-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dataset][usability] Dataset dependencies #18346
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do I need to install all of the above packages to get ray.data to work?
Can you just only install pyarrow and raise an error/import suggestion when user tries to call the others?
Ping on this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you address Clark's comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix
Why did you convert an import error to type error?
…On Mon, Sep 27, 2021, 7:39 PM Alex Wu ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In python/ray/data/datasource/file_based_datasource.py
<#18346 (comment)>:
> @@ -202,11 +201,17 @@ def _resolve_paths_and_filesystem(
raise ValueError("Must provide at least one path.")
if filesystem and not isinstance(filesystem, FileSystem):
+ err_msg = f"The filesystem passed must either conform to " \
+ f"pyarrow.fs.FileSystem, or " \
+ f"fsspec.spec.AbstractFileSystem. The provided " \
+ f"filesystem was: {filesystem}"
+ try:
+ import fsspec
+ except ModuleNotFoundError:
+ raise TypeError(err_msg)
What error do you think should be raised? This means they passed in a fs
object that isn't a pa.fs or fsspec fs.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#18346 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADUSXLA344B74FP2WCSKTUEETE5ANCNFSM5DMP54KQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
The import error just means that they didn't install the fsspec module, which should be fine. If they didn't install fsspec, that just implies that their file system type isn't an fsspec file system. |
Shouldn't we raise an error telling them to install fsspec? How else is the
user supposed to resolve the issue?
…On Mon, Sep 27, 2021, 8:15 PM Alex Wu ***@***.***> wrote:
The import error just means that they didn't install the fsspec module,
which should be fine. If they didn't install fsspec, that just implies that
their file system type isn't an fsspec file system.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#18346 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADUSRPHWWCFJLXCLFYNHDUEEXMBANCNFSM5DMP54KQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Well they don't have to pass in an fsspec file system, for example they could just pass in a regular |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the suggested changes
Co-authored-by: Clark Zinzow <[email protected]>
Co-authored-by: Clark Zinzow <[email protected]>
Why are these changes needed?
This creates extras for dataset optional dependencies, so they can be installed via
pip install ray[data]
.It also moves the fsspec import so it won't be invoked if a pyarrow fs is passed in.
Related issue number
#18262
Checks
scripts/format.sh
to lint the changes in this PR.