[FEA] Scalar views/Non-owning scalar type #6558
Comments
What's wrong with |
We did try using a const reference. The issue ended up being with Cython. If I understand correctly, the way it parses the code means it ends up declaring the reference and then assigning it later, which I believe is illegal in C++. To fix it they'd have to change the way they handle a number of scoping rules, and they've opted to throw an error in this case instead. |
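For readers less familiar with the C++ rule being referred to, here is a minimal sketch of it. The `scalar` struct and `read_value` function are hypothetical stand-ins, not libcudf types, and the pointer variant is only one possible way to survive a declare-then-assign pattern, not something the thread settled on.

```cpp
#include <cassert>

// Hypothetical stand-in for cudf::scalar; not the real libcudf type.
struct scalar {
    int value;
};

// Stand-in for a libcudf-style API that takes its argument by const reference.
int read_value(scalar const& s) { return s.value; }

// Legal: the reference is bound in the same statement that declares it.
int bind_at_declaration(scalar const& owner) {
    scalar const& ref = owner;
    return read_value(ref);
}

// The declare-then-assign pattern is rejected by the compiler:
//
//     scalar const& ref;   // error: a reference must be initialized
//     ref = owner;
//
// A pointer may be declared first and assigned later, so it tolerates
// that pattern and can be dereferenced at the call site.
int declare_then_assign(scalar const& owner) {
    scalar const* ptr = nullptr;  // declaration without a target
    ptr = &owner;                 // later assignment
    assert(ptr != nullptr);
    return read_value(*ptr);      // still passed by const reference
}
```

Both functions end up handing `read_value` the same const reference; only the second form matches code that separates declaration from assignment.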
We pass |
We can't write this in cython at all:
It will actually throw |
Why do you need to create a reference variable on the stack? Why not just pass the object to the function directly? I'm still not following why this can be done for Making an explicit |
Passing the scalar objects directly is the current approach and it works. However now we want to persist those scalars and maintain ownership of them through a python object. If we just pass the object directly we don't know what happens to it after that point, so if we need it in python again something might go wrong. |
It would be a shame to have to build and maintain a scalar_view in C++ if it is only needed because Cython can't handle reference variables correctly. Surely there must be a Cython / Python workaround. |
It's a constant reference. Nothing should happen to it. Cython would be fundamentally broken if it couldn't truly pass objects by reference into C++ APIs. I presume that the Python
|
Or in Cython:
Or something like that. Cython is weird. |
Apologies if this is incorrect. But if that is the case why have column views instead of just passing columns by const reference? I had assumed it was to clearly communicate that the object in question was owned by something else and to make it impossible to modify or release the data. Even if all the libcudf functions accept const references, and thus nothing can harm the scalar from within that function, I'd still be creating an API that produces something that can be used unsafely by someone/something else, which I have to assume will happen. The problem with the python scalar holding a unique_ptr, pointing to the cudf::scalar, which is then dereferenced, is that to return that unique pointer from a function, we have to move it to the caller, which then invalidates the python scalar. |
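The unique_ptr concern at the end of the comment above can be sketched roughly as follows. `scalar` and `scalar_owner` are hypothetical names playing the roles of `cudf::scalar` and the Python-side wrapper; this is an illustration of the ownership transfer, not the actual cuDF code.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for cudf::scalar.
struct scalar {
    int value;
};

// Hypothetical owner playing the role of the Python-side scalar object.
class scalar_owner {
public:
    explicit scalar_owner(int v) : ptr_(std::make_unique<scalar>(scalar{v})) {}

    // Hands the unique_ptr to the caller; the owner is left holding nullptr.
    std::unique_ptr<scalar> release() { return std::move(ptr_); }

    bool has_value() const { return ptr_ != nullptr; }

private:
    std::unique_ptr<scalar> ptr_;
};

// Returns true only if the owner still holds its scalar after handing it out.
bool owner_survives_release() {
    scalar_owner owner(7);
    std::unique_ptr<scalar> taken = owner.release();
    assert(taken && taken->value == 7);
    return owner.has_value();
}
```

Under these assumptions `owner_survives_release()` returns `false`: once the pointer has been moved out, the owner has nothing left to reuse.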
I don't understand what you mean by this.
Why return the |
To clear up some of the confusion re: Cython What doesn't work right now is writing a
The above doesn't compile because there is something broken in Cython which results in the generated C++ code declaring a reference without initializing it. Suppose we had a libcudf function
So why don't we just do that? I believe the reason is currently, our
This is one of the reasons why I previously proposed a level of indirection, where we break up the implementation into two classes. A
And now we call our libcudf function as follows:
|
Couldn't you also just return the object by a Then you could do:
|
Yes, we could totally do that. |
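On the C++ side, the shape being agreed on here might look something like the sketch below. The names `scalar_owner`, `get`, and `consume` are made up for illustration and are not the actual cuDF or libcudf API; the point is only that the reference goes straight into the call and is never stored in a named local.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for cudf::scalar.
struct scalar {
    int value;
};

// Stand-in for a libcudf function taking a scalar by const reference.
int consume(scalar const& s) { return s.value * 2; }

// Hypothetical owner that keeps the unique_ptr and only lends out a reference.
class scalar_owner {
public:
    explicit scalar_owner(int v) : ptr_(std::make_unique<scalar>(scalar{v})) {}

    // Borrow the scalar; ownership stays with this object.
    scalar const& get() const {
        assert(ptr_ != nullptr);
        return *ptr_;
    }

private:
    std::unique_ptr<scalar> ptr_;
};

// The same owner can be used for repeated calls because nothing is moved out.
int call_twice(scalar_owner const& owner) {
    int first = consume(owner.get());
    int second = consume(owner.get());
    return first + second;
}
```

Because `get()` is used as an argument expression, no reference variable ever needs to be declared, which is the part Cython reportedly struggles with.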
The reason I want to return the unique pointer from a function is that I want to potentially do things with the object before returning. In the current scheme, when we construct the Python scalar we don't immediately set the device pointer with the value, because there might be other reasons the Python code needs to use or work with that value before it goes to libcudf. We don't want to keep going back and forth to the device to do these things. Instead it would execute lazily and only copy the value to the device when needed by libcudf. To illustrate the problem with the const reference approach (and frankly to convince myself why it would not work) I put together a basic example of what Cython tries to compile the resulting function to, and why it won't work. Suppose we had this basic function to return an int in Cython:
When compiled, this becomes the following. I've removed a lot of the fluff that cython generates as part of, presumably, the optimization:
So now, if you modify this to be
Cython would now write
Which won't compile. |
Additionally, after giving it some thought, I realized that a |
@brandon-b-miller why does the generated Cython-C++ code from your example never assign the literal Also, that example is fundamentally different. It is returning a reference to a stack variable, which clearly shouldn't compile. The real example we are talking about is a method on an object that returns a |
Apologies. It is the code cython generates, minus a bunch of reference counting machinery, and with the variable names demangled. I cut out a bunch of stuff including the assignment. I think the below example might be better.
running
Which when you trim down, as far as I can tell, basically says
When the compiler hits then we get
For completeness here's the full compiled code for the simple function I had previously written
Which has the missing line I think you might be looking for. All this said, |
Has anyone filed a bug against Cython for this? |
There's cython/cython#1695, where the case is made that supporting this would require them to change how the scoping rules between C++ and Python are woven together, so my impression is that it's a won't-fix. |
Interesting. OK, well I think on the C++ side we feel you should go with the |
The developer guide references a `scalar_view` class that does not exist. This PR removes that reference. See #6558 for the rationale of why no such class exists.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- https://github.com/nvdbaranec
URL: #11132
Is your feature request related to a problem? Please describe.
Column objects support `view`s, which let us own those column objects with Python objects and, when we want to, safely convert the data to a view and pass it to libcudf functions without giving up that ownership. But we currently can't do the same thing for scalars. Until recently it didn't really matter, because we only ever created scalars immediately before passing them to libcudf and didn't need to persist them thereafter, so it was OK if they were released. But now we'd like to hold onto those scalars and reuse them.
Describe the solution you'd like
A non-owning scalar class similar to column views.
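To make the request concrete, a rough sketch of what such a type could look like is below. libcudf does not provide a `scalar_view`; both `owning_scalar` and `scalar_view` here are hypothetical, host-only toys that only illustrate the owning versus non-owning split.

```cpp
#include <cassert>

// Hypothetical owning scalar: it holds the value and a validity flag.
class owning_scalar {
public:
    explicit owning_scalar(int v) : value_(v), valid_(true) {}
    int const* data() const { return &value_; }
    bool is_valid() const { return valid_; }

private:
    int value_;
    bool valid_;
};

// Hypothetical non-owning view: it copies only a pointer and a flag,
// exposes no mutators, and frees nothing when it goes out of scope.
class scalar_view {
public:
    explicit scalar_view(owning_scalar const& s)
        : data_(s.data()), valid_(s.is_valid()) {}

    int value() const {
        assert(valid_);
        return *data_;
    }
    bool is_valid() const { return valid_; }

private:
    int const* data_;
    bool valid_;
};

// Views are cheap to pass by value; the owners are left untouched.
int sum_of_views(scalar_view a, scalar_view b) { return a.value() + b.value(); }
```

As with column views, the view is only valid while the owning object is alive, so the owner has to outlive every view created from it.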
Describe alternatives you've considered
Passing around a raw pointer to a scalar
Hacking a wrapper around a unique pointer and passing that around
Using a shared pointer
Additional context
cc @shwina @cwharris
xref #6297