[BUG] dask_cudf.repartition fails with OOM very often #178
Comments
Counter-intuitively, setting … Concatenating fewer large DataFrames seems to be easier on GPU memory than many smaller concatenations (which happens with the default parallel CSV reader). Even with …
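The comment above points at how the CSVs are split into partitions at read time. A minimal sketch of controlling that split, assuming a dask_cudf version whose read_csv accepts a blocksize argument (older releases called it chunksize); the path and the size value are hypothetical, not taken from the issue:

```python
import dask_cudf

# Read the same CSVs into fewer, larger partitions so that any later
# repartition/concatenation touches fewer pieces per worker.
# NOTE: path is hypothetical; older dask_cudf versions name this
# argument `chunksize` instead of `blocksize`.
ddf = dask_cudf.read_csv(
    "/data/csv/*.csv",   # hypothetical location of the ~50 input files
    blocksize="2 GiB",   # larger blocks -> fewer partitions per file
)
print(ddf.npartitions)
```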
Is this the on-disk size or the in-memory size? The two may differ considerably; CSV is actually a decently space-efficient format relative to common in-memory representations.
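As a rough way to compare the two sizes, one could do something like the following (the file path is hypothetical; memory_usage is the pandas-style API that cudf exposes):

```python
import os
import cudf

path = "/data/csv/part-000.csv"               # hypothetical single input file
on_disk = os.path.getsize(path)               # bytes on disk
df = cudf.read_csv(path)
in_memory = df.memory_usage(deep=True).sum()  # bytes of device memory used

print(f"on disk:   {on_disk / 2**30:.2f} GiB")
print(f"in memory: {in_memory / 2**30:.2f} GiB")
```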
My first step here would be to watch the dashboard to see what is going on.
I would also suggest setting …
Is this still relevant @randerzander?
Friendly nudge @randerzander 😉
Given that the TPCx-BB effort was successful, I'm assuming this has been resolved/improved. I'm tentatively closing this, but feel free to reopen if this is observed again.
Original issue description

Repartitioning about 350 GB of CSV files (50 uncompressed files, each about 7 GB) causes my dask-cuda cluster of eight 16 GB GPUs to fail with an OOM. I'm attempting to use npartitions=100 (about 3.5 GB per partition), which I wouldn't expect to tax individual workers to the point of causing OOMs. Am I thinking about this correctly?
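For reference, a minimal sketch of the workflow described above, assuming the files are read with dask_cudf.read_csv from a glob path and then repartitioned; the input and output paths are hypothetical and not taken from the issue:

```python
import dask_cudf

# Read ~350 GB of CSVs (glob path is an assumption) and repartition to
# 100 output partitions, i.e. roughly 3.5 GB of CSV data per partition.
ddf = dask_cudf.read_csv("/data/csv/*.csv")
ddf = ddf.repartition(npartitions=100)

# Triggering computation (e.g. writing the result out) is the point at
# which the reported OOM would surface.
ddf.to_parquet("/data/parquet/")  # hypothetical output path
```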