[FEA] Have a global pinned memory pool by default #15612
Benchmarking results: benchmarks consistently show improvement with the pooled resource compared to plain pinned allocations. TODO: run the benchmarks from #15585, since we expect a higher impact in multi-threaded use cases.
…5665) Issue #15612

Adds a pooled pinned memory resource that is created on the first call to `get_host_memory_resource` or `set_host_memory_resource`. The pool has a fixed size: 0.5% of the device memory capacity, capped at 100 MB. At 100 MB, the pool takes ~30 ms to initialize. The pool size can be overridden with the environment variable `LIBCUDF_PINNED_POOL_SIZE`. If an allocation cannot be satisfied from the pool, a new pinned allocation is performed. The allocator uses a stream from the global stream pool to initialize the pool and to perform synchronous operations (`allocate`/`deallocate`). Users of the resource don't need to be aware of this implementation detail because these operations synchronize before they complete.

Authors:
- Vukasin Milovanovic (https://github.com/vuule)
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Alessandro Bellina (https://github.com/abellina)
- Jake Hemstad (https://github.com/jrhemstad)
- Vyas Ramasubramani (https://github.com/vyasr)

URL: #15665
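For illustration only, here is a minimal sketch of the behavior described above: a fixed-size pinned pool that honors `LIBCUDF_PINNED_POOL_SIZE`, defaults to 0.5% of device memory capped at 100 MB, and falls back to fresh pinned allocations when exhausted. The names (`fixed_pinned_pool`, `default_pinned_pool_size`) and the toy bump allocation are assumptions for this sketch, not the actual libcudf implementation (which also routes work through a stream from the global stream pool, omitted here).

```cpp
// Minimal sketch (not the actual libcudf implementation): a fixed-size pinned
// pool with fallback to fresh pinned allocations when the pool is exhausted.
// fixed_pinned_pool and default_pinned_pool_size are hypothetical names.
#include <cuda_runtime.h>

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <new>
#include <stdexcept>

// LIBCUDF_PINNED_POOL_SIZE if set, otherwise 0.5% of device memory, capped at 100 MB.
inline std::size_t default_pinned_pool_size()
{
  if (char const* env = std::getenv("LIBCUDF_PINNED_POOL_SIZE")) {
    return std::strtoull(env, nullptr, 10);
  }
  std::size_t free_mem = 0, total_mem = 0;
  cudaMemGetInfo(&free_mem, &total_mem);
  return std::min<std::size_t>(total_mem / 200, 100 * 1024 * 1024);
}

class fixed_pinned_pool {
 public:
  explicit fixed_pinned_pool(std::size_t pool_size) : size_{pool_size}
  {
    if (cudaHostAlloc(&base_, size_, cudaHostAllocDefault) != cudaSuccess) {
      throw std::runtime_error("failed to allocate the pinned pool");
    }
  }
  ~fixed_pinned_pool() { cudaFreeHost(base_); }

  void* allocate(std::size_t bytes)
  {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      // Toy bump allocation; a real pool tracks and reuses freed blocks.
      if (offset_ + bytes <= size_) {
        void* ptr = static_cast<char*>(base_) + offset_;
        offset_ += bytes;
        return ptr;
      }
    }
    // Pool exhausted: fall back to a new pinned allocation, matching the old behavior.
    void* ptr = nullptr;
    if (cudaHostAlloc(&ptr, bytes, cudaHostAllocDefault) != cudaSuccess) { throw std::bad_alloc{}; }
    return ptr;
  }

  void deallocate(void* ptr)
  {
    auto* p    = static_cast<char*>(ptr);
    auto* base = static_cast<char*>(base_);
    // Only fallback allocations are freed here; pool memory is released when the pool is destroyed.
    if (p < base || p >= base + size_) { cudaFreeHost(ptr); }
  }

 private:
  void* base_{nullptr};
  std::size_t size_;
  std::size_t offset_{0};
  std::mutex mtx_;
};
```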
closes #15612

Expanded the set of vector factories to cover pinned vectors. The functions return `cudf::detail::host_vector`, which uses a type-erased allocator, allowing us to utilize the runtime-configurable global pinned (previously host) resource. The `pinned_host_vector` type has been removed, as it can only support non-pooled pinned allocations; its uses are now replaced with `cudf::detail::host_vector`. Moved the global host (now pinned) resource out of cuIO and changed the type to host_device. User-specified resources are now required to allocate device-accessible memory; the name has been changed to pinned to reflect the new requirement.

Authors:
- Vukasin Milovanovic (https://github.com/vuule)

Approvers:
- Alessandro Bellina (https://github.com/abellina)
- Yunsong Wang (https://github.com/PointKernel)
- Mark Harris (https://github.com/harrism)
- David Wendt (https://github.com/davidwendt)

URL: #15895
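As a rough illustration of what a type-erased allocator buys here: the element type is fixed at compile time while the backing memory resource is chosen at runtime, so a single vector type can be served by either the pooled or a plain pinned resource. The sketch below uses hypothetical names (`host_resource`, `erased_host_allocator`) and `std::vector` standing in for the actual `cudf::detail::host_vector` machinery.

```cpp
// Illustrative sketch only: a type-erased host allocator, so the same vector type
// can be backed by pooled or plain pinned memory chosen at runtime.
// host_resource and erased_host_allocator are hypothetical names.
#include <cstddef>
#include <functional>
#include <vector>

struct host_resource {
  std::function<void*(std::size_t)> allocate;
  std::function<void(void*, std::size_t)> deallocate;
};

template <typename T>
class erased_host_allocator {
 public:
  using value_type = T;

  explicit erased_host_allocator(host_resource* mr) : mr_{mr} {}

  template <typename U>
  erased_host_allocator(erased_host_allocator<U> const& other) : mr_{other.resource()}
  {
  }

  T* allocate(std::size_t n) { return static_cast<T*>(mr_->allocate(n * sizeof(T))); }
  void deallocate(T* ptr, std::size_t n) { mr_->deallocate(ptr, n * sizeof(T)); }

  host_resource* resource() const { return mr_; }

 private:
  host_resource* mr_;
};

template <typename T, typename U>
bool operator==(erased_host_allocator<T> const& lhs, erased_host_allocator<U> const& rhs)
{
  return lhs.resource() == rhs.resource();
}

template <typename T, typename U>
bool operator!=(erased_host_allocator<T> const& lhs, erased_host_allocator<U> const& rhs)
{
  return !(lhs == rhs);
}

// The vector type stays the same regardless of which resource backs it.
template <typename T>
using host_vector = std::vector<T, erased_host_allocator<T>>;
```

With this pattern, `host_vector<int> v(erased_host_allocator<int>{&some_pinned_resource}); v.resize(1024);` allocates through whichever resource is currently configured, without changing the vector's type.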
Users outside of Spark-RAPIDS still use the default, non-pooled host memory resource and thus pay the overhead of pinned memory allocations in `hostdevice_vector` and any other place where pinned memory is used for faster data transfer.

Proposal: default to a memory resource with a small pinned pool. When the pool is full, the resource should fall back to new pinned allocations, consistent with the old behavior when too much pinned memory is used.
To ensure we don't impact CPU performance, the default size of the pool can be a set percentage of the total system memory. Pinning a small fraction of system memory (~5%) should not have a negative impact.
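For reference, the proposed sizing amounts to a one-liner; the snippet below assumes a Linux `sysconf` query and the ~5% figure from this issue (the merged change ultimately sized the pool from device memory instead, as described in the comment above).

```cpp
// Illustration of the proposed default (assumes Linux sysconf; not what was shipped).
#include <unistd.h>

#include <cstddef>

inline std::size_t proposed_default_pinned_pool_size()
{
  auto const total_system_mem = static_cast<std::size_t>(sysconf(_SC_PHYS_PAGES)) *
                                static_cast<std::size_t>(sysconf(_SC_PAGE_SIZE));
  return total_system_mem / 20;  // ~5% of total system memory
}
```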
Initially, only `hostdevice_vector` would use this resource, but we can expand pinned memory use in libcudf once a default pool resource is in place.

Details to consider:
- The pool should probably be created on first use; this avoids a duplicate pool if users set the resource before the first use (a sketch follows this list).
- Switching the host resource should work at any point, even if we must have two pools alive at the same time.
- Can the default pool be safely destroyed at exit? Streams can't be destroyed on exit; not sure about `cudaFreeHost`.
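On the first point, here is a minimal sketch of create-on-first-use under a mutex. `make_default_pinned_resource()` and the `host_memory_resource` interface are hypothetical stand-ins, and the default pool is deliberately never destroyed, which sidesteps the `cudaFreeHost`-at-exit question the same way streams are handled.

```cpp
// Sketch of create-on-first-use; host_memory_resource and
// make_default_pinned_resource() are hypothetical stand-ins.
#include <cstddef>
#include <mutex>

struct host_memory_resource {
  virtual void* allocate(std::size_t bytes)             = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) = 0;
  virtual ~host_memory_resource()                       = default;
};

// Builds the default pooled pinned resource (definition omitted in this sketch).
host_memory_resource* make_default_pinned_resource();

namespace {
std::mutex resource_mutex;
host_memory_resource* current_resource = nullptr;  // guarded by resource_mutex
}  // namespace

host_memory_resource* get_host_memory_resource()
{
  std::lock_guard<std::mutex> lock(resource_mutex);
  // The default pool is only built here, on first use, and is deliberately never
  // destroyed, avoiding CUDA calls (e.g. cudaFreeHost) during static destruction.
  if (current_resource == nullptr) { current_resource = make_default_pinned_resource(); }
  return current_resource;
}

host_memory_resource* set_host_memory_resource(host_memory_resource* new_resource)
{
  std::lock_guard<std::mutex> lock(resource_mutex);
  // Setting a resource before the first get means the default pool is never created,
  // so no duplicate pool exists.
  auto* previous   = current_resource;
  current_resource = new_resource;
  return previous;
}
```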