[FEA] Currently the IO writers aren't using pinned memory #4020

razajafri · 2020-01-31T00:53:39Z

Is your feature request related to a problem? Please describe.
We would like the writers to use pinned memory when possible, falling back to the default (pageable) host allocation otherwise.

Describe the solution you'd like
@jlowe: "provide an abstracted writer object API that is passed and used during writing to allocate output memory, or maybe output isn't directly placed contiguously in host memory but instead a vector of <addrspace, addr, size> tuples/structs are returned so the caller can fetch the data themselves. addrspace in this context is host vs. device memory."
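A minimal sketch of the first option (an allocator object passed to the writer), written in Python for brevity. `OutputAllocator`, `RecordingAllocator` and `write_rows` are hypothetical names for illustration only, not part of libcudf. The point is that the writer never allocates output memory itself; it asks the caller-supplied object.

```python
from abc import ABC, abstractmethod
from typing import List


class OutputAllocator(ABC):
    """Hypothetical caller-supplied object the writer uses for output memory."""

    @abstractmethod
    def allocate(self, size: int) -> bytearray:
        """Return a writable buffer of exactly `size` bytes."""


class RecordingAllocator(OutputAllocator):
    """Toy implementation: plain host memory, remembering each request."""

    def __init__(self) -> None:
        self.requests: List[int] = []

    def allocate(self, size: int) -> bytearray:
        self.requests.append(size)
        return bytearray(size)


def write_rows(rows: List[bytes], allocator: OutputAllocator) -> bytearray:
    """Toy writer: concatenates rows into memory obtained from `allocator`."""
    total = sum(len(r) for r in rows)
    out = allocator.allocate(total)
    offset = 0
    for r in rows:
        out[offset:offset + len(r)] = r
        offset += len(r)
    return out


allocator = RecordingAllocator()
result = write_rows([b"abc", b"de"], allocator)
```

A pinned-memory implementation would only need to replace `RecordingAllocator`; the writer code would stay unchanged.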

OlivierNV · 2020-01-31T07:08:21Z

Seems related to #4019 (at least, it seems like an abstract interface would solve both issues).
Can the cpp layer call back an object whose functions are implemented on the Cython or Java side?

jlowe · 2020-01-31T14:37:38Z

Agree this is related to #4019. We just need more control over how the memory is allocated. I don't think it's as simple as a boolean "use pinned or not" concept. Pinned memory is a critical resource that often takes a long time to allocate, so we tend to pool it to amortize the cost. The caller doesn't know how big the output will be, so it can't always make a good decision up front about whether to use pinned memory. Sometimes the answer will be "use pinned memory if it fits" or maybe even "use pinned memory for the parts that can fit".
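To illustrate the "use pinned memory for the parts that can fit" policy, here is a rough Python sketch. `plan_allocations` and the `pinned_free` budget are hypothetical and only model the decision; no real pinned memory is allocated. Each chunk is placed as its size becomes known, rather than deciding once up front.

```python
from typing import List, Tuple


def plan_allocations(chunk_sizes: List[int], pinned_free: int) -> List[Tuple[str, int]]:
    """Assign each output chunk to 'pinned' while the pool has room, else 'pageable'.

    `pinned_free` is the number of bytes currently available in a
    hypothetical pre-allocated pinned pool.
    """
    plan = []
    for size in chunk_sizes:
        if size <= pinned_free:
            # The chunk fits in what is left of the pool.
            plan.append(("pinned", size))
            pinned_free -= size
        else:
            # Fall back to ordinary host memory for this chunk only.
            plan.append(("pageable", size))
    return plan


plan = plan_allocations([400, 700, 100], pinned_free=512)
```

With a 512-byte budget, the 400-byte and 100-byte chunks land in the pinned pool while the 700-byte chunk falls back to pageable memory.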

@OlivierNV: "Can the cpp layer call back an object whose functions are implemented on the Cython or Java side?"

I think that's a perfectly acceptable solution. Another alternative is to not copy the output data anywhere but simply inform the caller where the output data has been generated. For example, the writer could return a vector of (address, size) pairs that, taken in order, compose the desired output data. The caller can then decide how to extract them, be that a copy to pinned memory, a transfer to another device, etc.
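The descriptor alternative could look roughly like the Python sketch below. `MemoryBlock`, `gather` and the byte-string "arenas" are hypothetical stand-ins for illustration; in a real implementation the device blocks would be fetched with a device-to-host copy rather than a slice.

```python
from typing import Dict, List, NamedTuple


class MemoryBlock(NamedTuple):
    """Hypothetical descriptor: where one piece of the output lives."""
    addrspace: str  # "host" or "device"
    addr: int       # offset into the arena for that address space
    size: int       # number of bytes


def gather(blocks: List[MemoryBlock], arenas: Dict[str, bytes]) -> bytes:
    """Caller-side extraction: read the blocks in order and concatenate them."""
    out = bytearray()
    for b in blocks:
        arena = arenas[b.addrspace]
        out += arena[b.addr:b.addr + b.size]
    return bytes(out)


# Stand-ins for real host and device memory.
arenas = {"host": b"HEADER..", "device": b"xxPAYLOADxx"}
blocks = [MemoryBlock("host", 0, 6), MemoryBlock("device", 2, 7)]
data = gather(blocks, arenas)
```

Here the writer only reports where the pieces are; the caller owns the decision of where to copy them, which is what would let it choose pinned memory, pageable memory, or another device.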

Both the callable-object and the memory-block-descriptor approaches address not only this request but also the ownership request in #4019. I therefore propose we close this issue and add pinned memory to the discussion in #4019.

harrism · 2020-02-10T03:19:25Z

Also, @jrhemstad is working on a host_memory_resource for RMM that would allow allocating buffers of pinned host memory via RMM. I think a fix for this should use that functionality. See rapidsai/rmm#260.

GregoryKimball · 2022-07-01T06:12:07Z

Closed by #4231 as discussed in #4019

razajafri added the "feature request" and "Needs Triage" labels Jan 31, 2020

sameerz added the "Spark" label Jan 31, 2020

harrism added the "cuIO" and "RMM" labels and removed the "Needs Triage" label Feb 10, 2020

harrism mentioned this issue Feb 11, 2020 in [REVIEW] host_memory_resource rapidsai/rmm#272 (merged)

GregoryKimball closed this as completed Jul 1, 2022
