[FEA] Support SplitAndRetry for GpuFastSampleExec #8313

Open
revans2 opened this issue May 17, 2023 · 0 comments
Labels
feature request (New feature or request), reliability (Features to improve reliability or bugs that severely impact the reliability of the plugin)

revans2 (Collaborator) commented May 17, 2023

Is your feature request related to a problem? Please describe.
GpuFastSampleExec is a much faster, but not 100% Spark-compatible, implementation of GpuSampleExec.

We should add SplitAndRetry support to it. GpuFastSampleExec is off by default, so this should probably be lower priority.

Technically, the current implementation is not deterministic with respect to how the data is batched. However, none of the sampling implementations are agnostic to the order of the rows, and most of the time the order of the rows is not guaranteed, so it is probably good enough. With that context, we can probably just put the split/retry blocks around the main part of the code, and it should mostly do the right thing. If we really want to, we could also checkpoint/restore the index that is used to help generate the seed for the filtering.
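
To make the checkpoint/restore idea concrete, here is a rough, language-neutral sketch in Python rather than the plugin's Scala. Every name in it (`SplitAndRetryOOM`, `SampleState`, `sample_batch`, `sample_with_split_retry`, `max_rows`) is hypothetical and is not the plugin's actual API. It assumes the per-batch seed is derived from a base seed plus a running batch index, and it simulates an out-of-memory condition with a simple row limit.

```python
import random


class SplitAndRetryOOM(Exception):
    """Hypothetical signal: the batch is too large, so split it and retry."""


class SampleState:
    """Holds the running batch index that feeds the per-batch seed (assumed design)."""

    def __init__(self, seed):
        self.seed = seed
        self.index = 0
        self._saved = 0

    def checkpoint(self):
        self._saved = self.index

    def restore(self):
        self.index = self._saved


def sample_batch(rows, fraction, state, max_rows):
    """Sample one batch; raises SplitAndRetryOOM when the batch exceeds max_rows."""
    rng = random.Random(state.seed + state.index)
    state.index += 1  # the index is consumed before the simulated allocation
    if len(rows) > max_rows:
        raise SplitAndRetryOOM(f"batch of {len(rows)} rows exceeds {max_rows}")
    return [r for r in rows if rng.random() < fraction]


def sample_with_split_retry(rows, fraction, state, max_rows):
    """Wrap sample_batch in a split/retry loop, restoring the index on failure."""
    pending = [rows]
    out = []
    while pending:
        batch = pending.pop(0)
        state.checkpoint()
        try:
            out.extend(sample_batch(batch, fraction, state, max_rows))
        except SplitAndRetryOOM:
            state.restore()  # undo the index bump made by the failed attempt
            if len(batch) <= 1:
                raise  # cannot split any further
            mid = len(batch) // 2
            # Retry the two halves in their original order.
            pending[:0] = [batch[:mid], batch[mid:]]
    return out
```

In this sketch, restoring the index means a failed attempt does not shift the seeds used by later batches. The result still depends on where the splits happen, which matches the point above that the output is already tied to how the data is batched.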

@revans2 added the feature request, ? - Needs Triage, and reliability labels on May 17, 2023
@mattahrens removed the ? - Needs Triage label on May 24, 2023