Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Add a argument or flag to control whether the constructor and operator= should be async #557

Open
JustPlay opened this issue Sep 16, 2020 · 12 comments

Comments

@JustPlay
Copy link

JustPlay commented Sep 16, 2020

Add a argument or flag to control whether the constructor and operator= should be async

device_buffer(void const* source_data,

device_buffer& operator=(device_buffer const& other)

void copy(void const* source, std::size_t bytes)

the copy() is async and the above ctor and operator= do not wait for the copy to be done, so the caller may need to sync t he stream explicitly (e.g. if the caller operate on more than two different stream(s))

I think it's better to add a extra parameter or flag to indicate whether we need a sync patten,and we can give it a default argument so no code need to be changed for RMM user (e.g. cuDF and rapids)

@JustPlay JustPlay changed the title [FEA] Add a argument to control whether the constructor and operator= should be async [FEA] Add a argument or flag to control whether the constructor and operator= should be async Sep 16, 2020
@JustPlay
Copy link
Author

JustPlay commented Sep 16, 2020

auto set_value(T host_value, cudaStream_t stream = 0)

auto set_value(T host_value, cudaStream_t stream = 0)

inline void _memcpy(void *dst, const void *src, cudaStream_t stream) const

Why the set operator is async by default? (i thinks this put too much work onto the caller side)

@harrism
Copy link
Member

harrism commented Sep 16, 2020

I think the intended design is that everything is async and stream-ordered. A user is free to add a synchronous wrapper if needed, but in general in RAPIDS we need asynchronicity.

You could use a thrust::device_vector but you may find, like we did, that the convenience it provides comes at a performance cost.

@jrhemstad
Copy link
Contributor

jrhemstad commented Sep 16, 2020

I think the intended design is that everything is async and stream-ordered. A user is free to add a synchronous wrapper if needed, but in general in RAPIDS we need asynchronicity.

You could use a thrust::device_vector but you may find, like we did, that the convenience it provides comes at a performance cost.

I think the original issue brings up valid points. There's a lot of things I don't like about device_buffer that I fixed in device_uvector:

  • There is no constructor to copy from arbitrary host/device data (only from another device_uvector)
    • I don't like controlling the (a)synchrony of a constructor with a parameter. Instead, I would make the constructor synchronous and use a factory for the async version, e.g., auto buff = device_buffer::make_async(n, stream).
  • operator= is deleted
  • Modifiers are split based on asynchrony, e.g., set_element and set_element_async
  • No default stream arguments

I would like to make similar changes to device_buffer.

@JustPlay
Copy link
Author

#570

@github-actions
Copy link

This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@github-actions
Copy link

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@harrism
Copy link
Member

harrism commented May 25, 2021

I think the original issue brings up valid points. There's a lot of things I don't like about device_buffer that I fixed in device_uvector:

1. There is no constructor to copy from arbitrary host/device data (only from another `device_uvector`)
  
  * I don't like controlling the (a)synchrony of a constructor with a parameter. Instead, I would make the constructor synchronous and use a factory for the async version, e.g., `auto buff = device_buffer::make_async(n, stream)`.

2. `operator=` is deleted

3. Modifiers are split based on asynchrony, e.g., [`set_element`](https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_uvector.hpp#L193) and [`set_element_async`](https://github.com/rapidsai/rmm/blob/6a3a62fce9c002c8390561566a71f544d7a017c7/include/rmm/device_uvector.hpp#L230)

4. No default stream arguments

I would like to make similar changes to device_buffer.

#775 addresses #2 and #4 in Jake's list. It deletes the copy assignment operator and copy constructor, and requires an explicit stream everywhere possible (e.g. resize() and shrink_to_fit()).

@jrhemstad are you suggesting we remove this constructor?

device_buffer(void const* source_data,
std::size_t size,
cuda_stream_view stream = cuda_stream_view{},
mr::device_memory_resource* mr = mr::get_current_device_resource())
: _stream{stream}, _mr{mr}
{
allocate(size);
copy(source_data, size);
}

And when you say "I would make the constructor synchronous ", you don't mean to synchronize the specified stream, do you?

@harrism
Copy link
Member

harrism commented Jun 24, 2021

@jrhemstad are you suggesting we remove this constructor?

@jrhemstad can you comment? I'd like to make progress on closing or implementing this FEA.

@jrhemstad
Copy link
Contributor

@jrhemstad are you suggesting we remove this constructor?

@jrhemstad can you comment? I'd like to make progress on closing or implementing this FEA.

I think I meant making that constructor synchronous (sync the specified stream) and have an async factory.

@harrism
Copy link
Member

harrism commented Jun 24, 2021

Wouldn't an sync factory need to call an async constructor?

@github-actions
Copy link

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@github-actions
Copy link

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

3 participants