Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only import BlazingSQL as necessary #109

Merged
merged 1 commit into from
Nov 6, 2020
Merged

Conversation

esoha-nvidia
Copy link
Contributor

BlazingSQL is broken: BlazingDB/blazingsql#1037

This workaround allows the repo to work when BlazingSQL isn't enabled.

BlazingSQL is broken: BlazingDB/blazingsql#1037

This workaround allows the repo to work when BlazingSQL isn't enabled.
@esoha-nvidia
Copy link
Contributor Author

Any progress here?

Copy link
Member

@ayushdg ayushdg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing change looks good.
Another additional case where blazingsql is imported would be here: https://github.com/rapidsai/tpcx-bb/blob/53b1eb2546e83a0e4a8e8937ddb56c2aaf0c3c33/tpcx_bb/xbb_tools/cluster_startup.py#L169

Called from
https://github.com/rapidsai/tpcx-bb/blob/main/tpcx_bb/benchmark_runner.py#L31
and
https://github.com/rapidsai/tpcx-bb/blob/main/tpcx_bb/benchmark_runner.py#L46

It would be great to handle that case as well to ensure the benchmark can be run without the presence of blazinsql in the existing environment.

@esoha-nvidia
Copy link
Contributor Author

It would be great to handle that case as well to ensure the benchmark can be run without the presence of blazinsql in the existing environment.

I agree, but that change can be made separately from this one. It's better practice to have smaller commits that do just one thing each rather than one big commit which will be harder to revert if a revert is needed. As suggested here: https://softwareengineering.stackexchange.com/a/206984

So how about we submit this one and then do more in another one?

@beckernick
Copy link
Member

I agree, but that change can be made separately from this one. It's better practice to have smaller commits that do just one thing each rather than one big commit which will be harder to revert if a revert is needed. As suggested here: https://softwareengineering.stackexchange.com/a/206984

If the premise of the PR is "to make blazingSQL optional" the suggested change is necessary. It doesn't make this PR less atomic to include a necessary change to accomplish the goal. With that said, a follow-up is fine.

@esoha-nvidia
Copy link
Contributor Author

Would it even be possible to fix the other one? We can only do it at runtime. Does that code run after we already know if blazingsql will be used or not?

@beckernick
Copy link
Member

beckernick commented Nov 6, 2020

Python will throw an ImportError or ModuleNotFoundError if it's not available, which could be caught as an exception and skipped at runtime.

Alternatively, I believe that code only exists due to a period in the spring/summer in which the ordering of imports of RAPIDS libraries was critical to avoiding some subtle errors. I suspect this might no longer be necessary, in which case it could be ripped out.

EDIT: As @ayushdg pointed out offline, it was actually to avoid long import times with the interaction between cuml and UCX.

@ayushdg
Copy link
Member

ayushdg commented Nov 6, 2020

Approving existing changes, the rest can be handled in a followup Pr. Will raise an issue for tracking.

@ayushdg ayushdg merged commit 58bd34d into rapidsai:main Nov 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants