[dataset][usability] Dataset dependencies #18346

Merged: 12 commits, Sep 30, 2021

wuisawesome

Why are these changes needed?

This adds a "data" extra for the optional Dataset dependencies, so they can be installed via pip install ray[data].
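As a rough illustration of what an extra for optional dependencies looks like, here is a minimal sketch. The package list and the helper `requirements_for` are assumptions made for this example; the thread does not show the actual list used in the PR.

```python
from typing import Dict, List

# Illustrative only: the real package list from the PR is not shown in this thread.
EXTRAS_REQUIRE: Dict[str, List[str]] = {
    "data": ["pyarrow", "fsspec"],
}


def requirements_for(extra: str) -> List[str]:
    """Return the optional requirements that a given extra would pull in.

    Hypothetical helper for illustration; unknown extras yield an empty list.
    """
    return list(EXTRAS_REQUIRE.get(extra, []))


# In a real setup.py, the dict would be handed to setuptools:
#     setuptools.setup(..., extras_require=EXTRAS_REQUIRE)
print(requirements_for("data"))
```

With a declaration along these lines, pip install ray[data] installs the base package plus whatever is listed under the "data" key.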

It also moves the fsspec import so that it is not executed when a pyarrow filesystem is passed in.

Related issue number

#18262

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(


@richardliaw richardliaw left a comment

Why do I need to install all of the above packages to get ray.data to work?

Can you install only pyarrow, and raise an error with an install suggestion when the user tries to use the others?
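The suggestion above could be sketched roughly as follows. The helper name `require_optional` and the wording of the message are hypothetical and not taken from the PR; the sketch only uses the standard library's `importlib.import_module`.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str = "data") -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: the module is imported only when a feature
    that needs it is actually called, not when the package is imported.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user how to fix the problem.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Try: pip install 'ray[{extra}]'"
        ) from exc
```

A reader function would then call something like `pd = require_optional("pandas")` at the top of its body, so users who never touch that feature never need the package.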

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Sep 4, 2021

ericl commented Sep 8, 2021

Ping on this

@ericl ericl left a comment

Can you address Clark's comments?

@wuisawesome wuisawesome removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Sep 27, 2021

@ericl ericl left a comment

Please fix

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Sep 27, 2021

ericl commented Sep 28, 2021 via email

@wuisawesome

The import error just means that they didn't install the fsspec module, which should be fine. If they didn't install fsspec, that just implies that their file system type isn't an fsspec file system.


ericl commented Sep 28, 2021 via email

@wuisawesome

Well, they don't have to pass in an fsspec file system; for example, they could just pass in a regular pa.fs file system.
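The reasoning in this exchange can be sketched as below. This is not the code from the PR: `classify_filesystem` and its `native_types` parameter are hypothetical names chosen so the example runs without pyarrow installed. It assumes only that fsspec filesystems subclass `fsspec.spec.AbstractFileSystem`.

```python
from typing import Any, Tuple


def classify_filesystem(filesystem: Any, native_types: Tuple[type, ...]) -> str:
    """Return "native", "fsspec", or "unknown" for a user-supplied filesystem.

    Hypothetical sketch: `native_types` stands in for the pyarrow
    filesystem base class, e.g. (pyarrow.fs.FileSystem,).
    """
    if isinstance(filesystem, native_types):
        # Fast path: fsspec is never imported for native filesystems.
        return "native"
    try:
        # Deferred import: only reached for non-native filesystem objects.
        from fsspec.spec import AbstractFileSystem
    except ImportError:
        # fsspec is not installed, so the object cannot be an fsspec filesystem.
        return "unknown"
    return "fsspec" if isinstance(filesystem, AbstractFileSystem) else "unknown"
```

In real code the "fsspec" case would typically be wrapped for pyarrow, for example with `pyarrow.fs.PyFileSystem(pyarrow.fs.FSSpecHandler(filesystem))`, and the "unknown" case would raise a TypeError.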

@clarkzinzow clarkzinzow left a comment

LGTM with the suggested changes

@wuisawesome wuisawesome added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Sep 29, 2021
@ericl ericl merged commit 5709c65 into ray-project:master Sep 30, 2021