[FEA] Currently the IO writers aren't using pinned memory #4020
Comments
Seems related to #4019 (at least, it seems like an abstract interface would solve both issues).
Agree this is related to #4019. We just need more control over how the memory is allocated. I don't think it's as simple as a boolean "use pinned or not" concept. Pinned memory is a critical resource that often takes a long time to allocate, so we tend to pool it to amortize the cost. The caller doesn't know how big the output will be, so it can't always make a good decision up front about whether to use pinned memory. Sometimes the answer will be "use pinned memory if it fits", or maybe even "use pinned memory for the parts that can fit".
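To make the "use pinned memory if it fits" policy concrete, here is a minimal sketch. Everything in it is hypothetical and not part of any existing API: `PinnedPool`, `HostBuffer`, and `allocate_output` are illustrative names, and the pool is backed by ordinary heap memory so the example runs without a GPU. A real pool would presumably obtain its storage once from a pinned allocation and would also need alignment handling and a way to return memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pre-allocated pinned-memory pool.
// Assumption: plain heap memory replaces real pinned memory here.
class PinnedPool {
 public:
  explicit PinnedPool(std::size_t capacity) : storage_(capacity), used_(0) {}

  // Returns a pointer into the pool, or nullptr if the request does not fit.
  std::byte* try_allocate(std::size_t size) {
    assert(used_ <= storage_.size());
    if (size > storage_.size() - used_) return nullptr;
    std::byte* p = storage_.data() + used_;
    used_ += size;  // simple bump allocation; nothing is ever freed
    return p;
  }

  std::size_t remaining() const { return storage_.size() - used_; }

 private:
  std::vector<std::byte> storage_;
  std::size_t used_;
};

enum class HostMemoryKind { Pinned, Pageable };

struct HostBuffer {
  HostMemoryKind kind;
  std::byte* data;
  std::size_t size;
  std::unique_ptr<std::byte[]> owned;  // set only for the pageable fallback
};

// "Use pinned memory if it fits": try the pool first, then fall back to
// ordinary pageable host memory.
HostBuffer allocate_output(PinnedPool& pool, std::size_t size) {
  if (std::byte* p = pool.try_allocate(size)) {
    return HostBuffer{HostMemoryKind::Pinned, p, size, nullptr};
  }
  auto owned = std::make_unique<std::byte[]>(size);
  std::byte* p = owned.get();
  return HostBuffer{HostMemoryKind::Pageable, p, size, std::move(owned)};
}
```

The "pinned for the parts that can fit" variant would apply the same decision per chunk rather than once for the whole output, which is why the decision probably belongs behind an allocation interface rather than a boolean flag.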
I think that's a perfectly acceptable solution. Another alternative is to not copy the output data anywhere and simply inform the caller where the output data has been generated, for example by returning a vector of (address, size) pairs that, taken in order, compose the desired output data. The caller can then decide how to extract them, be that copying to pinned memory, transferring to another device, etc. Both the callable-object approach and the memory-block-descriptor approach address not only this request but also the ownership request in #4019. Therefore I propose we close this issue and add pinned memory to the discussion in #4019.
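A rough sketch of the memory-block-descriptor idea, from the caller's side. `MemoryBlock`, `total_size`, and `gather` are hypothetical names chosen for illustration, and the blocks are assumed to live in host memory so the copy can be a plain `memcpy`. In practice the blocks could be device-resident and the copy would be whatever transfer the caller prefers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical descriptor: one contiguous piece of the writer's output.
struct MemoryBlock {
  const std::byte* address;
  std::size_t size;
};

// Total number of bytes described by the blocks.
std::size_t total_size(const std::vector<MemoryBlock>& blocks) {
  std::size_t total = 0;
  for (const auto& b : blocks) total += b.size;
  return total;
}

// Copies the blocks, in order, into a destination chosen by the caller
// (which could be a pinned buffer). Returns the number of bytes written.
std::size_t gather(const std::vector<MemoryBlock>& blocks,
                   std::byte* dst, std::size_t dst_capacity) {
  std::size_t offset = 0;
  for (const auto& b : blocks) {
    if (b.size == 0) continue;                // nothing to copy
    assert(b.size <= dst_capacity - offset);  // destination must be big enough
    std::memcpy(dst + offset, b.address, b.size);
    offset += b.size;
  }
  return offset;
}
```

Because the caller learns the total size before choosing a destination, it can make the pinned-versus-pageable decision with full information instead of guessing up front.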
Also, @jrhemstad is working on a |
Is your feature request related to a problem? Please describe.
We would like the writers to use pinned memory if possible before falling back to the default.
Describe the solution you'd like
@jlowe: "provide an abstracted writer object API that is passed and used during writing to allocate output memory, or maybe output isn't directly placed contiguously in host memory but instead a vector of <addrspace, addr, size> tuples/structs is returned so the caller can fetch the data themselves. addrspace in this context is host vs. device memory."
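One possible shape for the abstracted allocation API described in the quote, combined with the `<addrspace, addr, size>` records, is sketched below. None of these names exist in the library: `AddressSpace`, `OutputChunk`, `OutputAllocator`, `HeapAllocator`, and `write_strings` are all assumptions made for illustration, and the "writer" simply copies strings rather than encoding a real file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Where a chunk of output lives (host vs. device, as in the quote).
enum class AddressSpace { Host, Device };

// Hypothetical <addrspace, addr, size> record.
struct OutputChunk {
  AddressSpace addrspace;
  std::byte* addr;
  std::size_t size;
};

// Hypothetical interface the caller passes to the writer so that the caller,
// not the writer, controls how output memory is obtained.
class OutputAllocator {
 public:
  virtual ~OutputAllocator() = default;
  virtual OutputChunk allocate(std::size_t size) = 0;
};

// Simplest implementation: pageable host memory owned by the allocator.
// A pinned-pool implementation could be swapped in without touching the writer.
class HeapAllocator final : public OutputAllocator {
 public:
  OutputChunk allocate(std::size_t size) override {
    buffers_.push_back(std::make_unique<std::byte[]>(size));
    return {AddressSpace::Host, buffers_.back().get(), size};
  }

 private:
  std::vector<std::unique_ptr<std::byte[]>> buffers_;
};

// Toy writer: places each non-empty row into memory obtained from the
// allocator and reports where the pieces ended up, in output order.
std::vector<OutputChunk> write_strings(const std::vector<std::string>& rows,
                                       OutputAllocator& alloc) {
  std::vector<OutputChunk> chunks;
  for (const auto& row : rows) {
    if (row.empty()) continue;
    OutputChunk c = alloc.allocate(row.size());
    assert(c.addr != nullptr && c.size == row.size());
    std::memcpy(c.addr, row.data(), row.size());
    chunks.push_back(c);
  }
  return chunks;
}
```

In this sketch the allocator keeps ownership of the memory, so the returned chunks are only valid while the allocator is alive. Who owns the output is exactly the question raised in #4019, and a real design would have to settle it explicitly.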