[WIP] Remove device buffer default ctor #424
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
@jrhemstad what is blocking this? |
As I mentioned in the PR, it turned out to be more tedious than I expected, so I never got around to finishing it. I can try and do so this week. |
So while I have to preface this with saying that I haven't looked at I also have fundamental usability concerns about data types without a default constructor. If one uses such a type as a member of a class, that class itself becomes non-default-constructible, and the chain repeats from there. I think a default-constructed empty vector can well be considered a valid state. And as it doesn't have any data yet, it doesn't need to know a stream yet, either. Sometimes, you may construct a class containing a Also the default stream, in particular when using default-stream-per-thread, might well be just what you wanted, anyway. (And of course sometimes it's not, and it hurts performance; I see that point, too.) |
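To make the default-constructibility concern concrete, here is a minimal C++ sketch. All names in it (`stream_handle`, `stream_bound_buffer`, `holder`, `outer`, `explicit_holder`) are hypothetical and invented for illustration; none of them are RMM types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a stream handle; not an RMM or CUDA type.
struct stream_handle { int id; };

// Hypothetical buffer with no default constructor: a stream is always required.
class stream_bound_buffer {
 public:
  explicit stream_bound_buffer(stream_handle s) : stream_{s} {}
  stream_handle stream() const { return stream_; }

 private:
  stream_handle stream_;
};

// Merely holding such a member deletes the implicit default constructor.
struct holder {
  stream_bound_buffer buf;
};

// The restriction propagates to anything that contains a holder.
struct outer {
  holder h;
};

// The member can still be initialized explicitly via a user-written constructor.
struct explicit_holder {
  explicit explicit_holder(stream_handle s) : buf{s} {}
  stream_bound_buffer buf;
};

static_assert(!std::is_default_constructible<stream_bound_buffer>::value, "");
static_assert(!std::is_default_constructible<holder>::value, "");
static_assert(!std::is_default_constructible<outer>::value, "");
static_assert(std::is_constructible<explicit_holder, stream_handle>::value, "");

// Small runtime check that explicit initialization forwards the stream.
inline int demo() {
  explicit_holder h{stream_handle{3}};
  assert(h.buf.stream().id == 3);
  return h.buf.stream().id;
}
```

The static assertions show both sides of the trade-off: the restriction does chain through containing classes, but each class can opt back in by taking the stream in its own constructor.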
Breaking this compatibility is intentional. There are crucial differences between
vs.
This is why we require the stream to be explicit to make it more likely for users to be aware of this potential error.
Not quite. For example:
Sure, I agree the default stream is often what people want. The point is that it Now then, I could be swayed by an argument that we could allow |
So I agree what you're saying from your point of view, but on the other hand there's also a need (at least by a single person, me) for an uninitialized vector type that's a [Sure in your example above, I can explicitly initialize a class member. But that essentially stops working when the class member is a template type that could be either I might argue, though, that maybe the issue is more with the naming -- when seeing |
Sure, a synchronous, uninitialized vector would be useful, but that's in Thrust's wheelhouse, not RMM's. Thrust is a higher level library of (mostly) synchronous data structures and algorithms that are relatively easy to use. RMM is a library for high performance, asynchronous memory allocation. That performance comes at the cost of some ease of use. This is the classic balance between abstraction vs. performance. We try and keep RMM as high level and easy to use as we can, but there is no denying that RMM sits on a point in the spectrum closer to performance than abstraction compared to Thrust.
I can understand the confusion, but I think it stems from confusion about RMM itself as opposed to the particular
Everything is built around this core interface; as such, everything* is stream aware. Asynchrony and stream awareness are part and parcel of using RMM. (*) This is a bit of a white lie since |
I do agree with this -- but I don't think this is ever going to happen in thrust, since I think it's been asked for for many years. Obviously that doesn't mean that it's RMM's job to fill the gap, though it would have been convenient, since you already have a uvector, and you generally seem to care about thrust interoperability.
If you consider RMM's main feature to be asynchronous allocations/deallocations, I agree. I've found RMM because it was recommended to me as a way to avoid the cost of repeated device memory allocations/deallocations, the issue not being that they're synchronizing but just plain slow. You may want to put more emphasis on asynchronous behavior in your README.md, as the overview there really doesn't talk about streams at all, and mentions asynchronicity only with regard to memcpy, not allocation itself.
Your example, though, is yet another case where if no stream is specified, it defaults to the default stream, just as elsewhere in CUDA ;). In any case, it's obviously up to you guys to define what RMM is aimed at and what you prioritize in your interfaces. I just meant to bring up an argument for keeping uvector default constructible, so it's (easily) usable both for people who don't care about streams at all (or just per-thread) as well as for people who do more elaborate stream-based design, but of course I acknowledge that this does make it easier to misuse for people who do use streams directly, so it is a trade-off. |
It's a small team with a lot on their plate. I've talked with them in the past and they'd support adding a It's less work than you might think. I've looked into it and all it would really require is defining a new type based on So something like:
Not a lot of work, somebody just has to do it and test it.
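For comparison, the same general idea (a vector that skips value-initialization of its elements) can be sketched on the host side with only the standard library. This is an analogue under stated assumptions, not Thrust code and not the snippet referred to in this thread: `default_init_allocator` and `uninit_int_vector` are hypothetical names, and the adaptor simply overrides the no-argument `construct` so that elements are default-initialized rather than value-initialized.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor: elements created without arguments are
// default-initialized (left indeterminate for trivial types) instead of zeroed.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
  using traits = std::allocator_traits<A>;

 public:
  template <typename U>
  struct rebind {
    using other =
        default_init_allocator<U, typename traits::template rebind_alloc<U>>;
  };

  using A::A;  // inherit the base allocator's constructors

  // No-argument construction: default-initialize, i.e. do not zero-fill.
  template <typename U>
  void construct(U* ptr) noexcept(
      std::is_nothrow_default_constructible<U>::value) {
    ::new (static_cast<void*>(ptr)) U;
  }

  // Construction with arguments: defer to the base allocator as usual.
  template <typename U, typename... Args>
  void construct(U* ptr, Args&&... args) {
    traits::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
  }
};

// Hypothetical convenience alias for illustration.
using uninit_int_vector = std::vector<int, default_init_allocator<int>>;

// Sized construction allocates storage but does not write the elements.
inline uninit_int_vector make_uninitialized(std::size_t n) {
  return uninit_int_vector(n);
}
```

Reading an element of such a vector before writing it is undefined behavior for trivial types, which is the price of skipping the fill.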
A large part of what makes
I definitely appreciate having external feedback on usability. It definitely helps to expose biases and blindspots. I think I'm okay with making |
Okay, thanks, I might try to get around to doing this some time. Based on googling I had thought they're kinda philosophically opposed to this (I think it's not really C++ standard conforming to have objects in not-initialized state), but apparently not.
Maybe. I don't know, in my application, I'd guess cudaMalloc is > 100x slower than a
I'd say you should do whatever you think is best for the uvector you're envisioning. I'll admit that it doesn't fit what I was hoping to use it for, and that's perfectly okay. (FWIW, the |
@jrhemstad can you update the status of this? Moving to 0.17. |
Based on conversation, I think we can leave the default ctor and just require all operations to take an explicit stream. |
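The resolution above (keep the default constructor, but make every operation that touches memory take an explicit stream) could look roughly like the following sketch. It is an illustration under loud assumptions, not RMM's implementation: `stream_handle` and `stream_ordered_vector` are hypothetical names, and host `malloc`/`free` stand in for stream-ordered device allocation so the example runs without CUDA.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for a CUDA stream handle.
struct stream_handle { int id; };

// Hypothetical vector: default-constructible and empty, but any operation
// that allocates requires the caller to name a stream.
template <typename T>
class stream_ordered_vector {
  static_assert(std::is_trivially_copyable<T>::value,
                "sketch only supports trivially copyable element types");

 public:
  stream_ordered_vector() = default;  // empty state, no stream needed yet
  stream_ordered_vector(std::size_t n, stream_handle s) { resize(n, s); }
  stream_ordered_vector(const stream_ordered_vector&) = delete;
  stream_ordered_vector& operator=(const stream_ordered_vector&) = delete;
  ~stream_ordered_vector() { std::free(data_); }

  // Allocation happens here, so the stream is a mandatory argument.
  // New elements are left uninitialized.
  void resize(std::size_t n, stream_handle s) {
    last_stream_ = s;  // remembered only for illustration
    if (n == size_) return;
    T* fresh = nullptr;
    if (n > 0) {
      fresh = static_cast<T*>(std::malloc(n * sizeof(T)));
      if (fresh == nullptr) throw std::bad_alloc{};
    }
    const std::size_t keep = std::min(size_, n);
    if (keep > 0) std::memcpy(fresh, data_, keep * sizeof(T));
    std::free(data_);
    data_ = fresh;
    size_ = n;
  }

  std::size_t size() const { return size_; }
  bool empty() const { return size_ == 0; }
  T* data() { return data_; }
  stream_handle last_stream() const { return last_stream_; }

 private:
  T* data_{nullptr};
  std::size_t size_{0};
  stream_handle last_stream_{-1};  // -1 marks "no stream used yet"
};
```

With this shape, a class can hold a `stream_ordered_vector` member and stay default-constructible, while a call such as `v.resize(4, stream_handle{1})` still forces the caller to state which stream the allocation is ordered on.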
Closes #422
Deletes the default ctor for device_buffer and removes all uses.