Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Scalar views/Non-owning scalar type #6558

Closed
brandon-b-miller opened this issue Oct 19, 2020 · 21 comments
Closed

[FEA] Scalar views/Non-owning scalar type #6558

brandon-b-miller opened this issue Oct 19, 2020 · 21 comments
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. wontfix This will not be worked on

Comments

@brandon-b-miller
Copy link
Contributor

Is your feature request related to a problem? Please describe.
Column objects support views which make it so that we can own those column objects with python objects, and when we want to, safely convert the data to a view and pass it libcudf functions without giving up that ownership. But we currently can't do the same thing for scalars. Up until recently it didn't really matter because we only ever created scalars immediately before passing them to libcudf, and didn't need to persist them thereafter - so it was OK if they were released. But now, we'd like to hold onto those scalars and reuse them.

Describe the solution you'd like
A non-owning scalar class similar to column views.

Describe alternatives you've considered
Passing around a raw pointer to a scalar
Hacking a wrapper around a unique pointer and passing that around
Using a shared pointer

Additional context
cc @shwina @cwharris
xref #6297

@brandon-b-miller brandon-b-miller added feature request New feature or request Needs Triage Need team to review and classify labels Oct 19, 2020
@brandon-b-miller brandon-b-miller added libcudf Affects libcudf (C++/CUDA) code. and removed feature request New feature or request labels Oct 19, 2020
@jrhemstad
Copy link
Contributor

What's wrong with scalar const&?

@brandon-b-miller
Copy link
Contributor Author

We did try using a const reference. The issue ended up being with cython. If I understand correctly the way it parses the code it ends up trying to declare the reference and then assign it later which I believe is illegal. To fix it they'd have to change the way they handle a bunch of scoping rules and they've opted to throw in this case instead.

@jrhemstad
Copy link
Contributor

jrhemstad commented Oct 19, 2020

We did try using a const reference. The issue ended up being with cython. If I understand correctly the way it parses the code it ends up trying to declare the reference and then assign it later which I believe is illegal. To fix it they'd have to change the way they handle a bunch of scoping rules and they've opted to throw in this case instead.

We pass column_view const& all over the place in libcudf public apis. If that works then surely scalar const& works.

@kkraus14 kkraus14 added feature request New feature or request and removed Needs Triage Need team to review and classify labels Oct 19, 2020
@brandon-b-miller
Copy link
Contributor Author

brandon-b-miller commented Oct 20, 2020

We can't write this in cython at all:

cdef int var = 2
cdef const int& other_var = var

It will actually throw C++ references cannot be declared; use a pointer instead. I believe the reason this isn't a problem for column objects is we just create column_view objects and pass those, which is fine.

@jrhemstad
Copy link
Contributor

jrhemstad commented Oct 20, 2020

Why do you need to create a reference variable on the stack? Why not just pass the object to the function directly? I'm still not following why this can be done for column_view but not scalar.

Making an explicit scalar_view object would be a lot of extra machinery that I'd really like to avoid.

@brandon-b-miller
Copy link
Contributor Author

Passing the scalar objects directly is the current approach and it works. However now we want to persist those scalars and maintain ownership of them through a python object. If we just pass the object directly we don't know what happens to it after that point, so if we need it in python again something might go wrong.

@harrism
Copy link
Member

harrism commented Oct 20, 2020

It would be a shame to have to build and maintain in scalar_view in C++ if it is only needed because Cython can't handle reference variables correctly. Surely there must be a Cython / Python workaround.

@jrhemstad
Copy link
Contributor

If we just pass the object directly we don't know what happens to it after that point, so if we need it in python again something might go wrong.

It's a constant reference. Nothing should happen to it. Cython would be fundamentally broken if it couldn't truly pass objects by reference into C++ APIs.

I presume that the Python scalar object would hold a unique_ptr<scalar> and to pass it into a libcudf API you'd just de-reference the scalar object:

void cudf::foo(scalar const& s);
...
unique_ptr<scalar> s{...};

cudf::foo(*s);

@harrism
Copy link
Member

harrism commented Oct 20, 2020

Or in Cython:

unique_ptr[scalar] s(...)

foo(dereference(s))

Or something like that. Cython is weird.

@brandon-b-miller
Copy link
Contributor Author

Apologies if this is incorrect. But if that is the case why have column views instead of just passing columns by const reference? I had assumed it was to clearly communicate that the object in question was owned by something else and to make it impossible to modify or release the data. Even if all the libcudf functions accept const references, and thus nothing can harm the scalar from within that function, I'd still be creating an API that produces something that can be used unsafely by someone/something else, which I have to assume will happen.

The problem with the python scalar holding a unique_ptr, pointing to the cudf::scalar, which is then dereferenced, is that to return that unique pointer from a function, we have to move it to the caller, which then invalidates the python scalar.

@jrhemstad
Copy link
Contributor

I'd still be creating an API that produces something that can be used unsafely by someone/something else, which I have to assume will happen.

I don't understand what you mean by this.

The problem with the python scalar holding a unique_ptr, pointing to the cudf::scalar, which is then dereferenced, is that to return that unique pointer from a function, we have to move it to the caller, which then invalidates the python scalar.

Why return the unique_ptr from a function?

@shwina
Copy link
Contributor

shwina commented Oct 20, 2020

To clear up some of the confusion re: Cython

What doesn't work right now is writing a cdef function in Cython that returns a reference. Say we have a class Scalar that encapsulates a unique_ptr[scalar:

cdef class Scalar:
    unique_ptr[scalar] c_obj

    cdef scalar& get_scalar_ref(self):  # this won't compile
        return dereference(self.c_obj.get())

The above doesn't compile because there is something broken in Cython which results in the generated C++ code declaring a reference without initializing it.

Suppose we had a libcudf function libcudf_func that accepted a const scalar&, we are still able to call it simply by doing:

f = Scalar()
libcudf_func(dereference(f.c_obj.get()) # this works just fine

So why don't we just do that?

I believe the reason is currently, our Scalar class initializes c_obj on first use. Which means get_scalar_ref would actually need to do something like this:

    cdef scalar& get_scalar_ref(self):  # this won't compile
        if not self.device_initialized:  
            self.c_obj = make_unique[scalar](...)
            self.device_initialized = True
        return dereference(self.c_obj.get())

This is one of the reasons why I previously proposed a level of indirection, where we break up the implementation into two classes. A DeviceScalar that wraps a unique_ptr[scalar] and a Scalar class that is composed of that:

class DeviceScalar:
    cdef unique_ptr[scalar] c_obj # c_obj will be initialized in the ctor of DeviceScalar
class Scalar:
    cdef DeviceScalar get_device_scalar(self):
        if not self.device_initialized:
            self._device_scalar =DeviceScalar(...)
            self.device_initialized = True
        return self._device_scalar

And now we call our libcudf function as follows:

f = Scalar(...)
libcudf_func(dereference(f.get_device_scalar().c_obj.get()))

@jrhemstad
Copy link
Contributor

cdef class Scalar:
unique_ptr[scalar] c_obj

cdef scalar const* get_scalar_ptr(self): 
    return self.c_obj.get()

Couldn't you also just return the object by a scalar const*?

Then you could do:

libcudf_func(dereference(f.get_scalar_ptr()) 

@shwina
Copy link
Contributor

shwina commented Oct 20, 2020

Yes, we could totally do that.

@brandon-b-miller
Copy link
Contributor Author

The reason I want to return the unique pointer from a function, is because I want to potentially do things with the object before returning. In the current scheme when we construct the python scalar, we don't immediately set the device pointer with the value, because there might be other reasons the python code needs to use/work with that value before it goes to libcudf. We don't want to keep having to go back and forth to the device to do these things. Instead it would execute lazily and only copy the value to the device when needed by libcudf.

To illustrate the problem with the const reference approach ( and frankly to convince myself why it would not work ) I put together a basic example of what cython tries to compile the resulting function to, and why it won't work. Suppose we had this basic function to return an int in cython:

cdef int return_an_int():
    cdef int the_int = 1
    return the_int

When compiled, this becomes the following. I've removed a lot of the fluff that cython generates as part of, presumably, the optimization:

static int __pyx_return_an_int(void) {
  int __pyx_v_the_int;
  int __pyx_r;
 
  return __pyx_r;
}

So now, if you modify this to be

cdef int& return_an_int():
    cdef int the_int = 1
    return the_int

Cython would now write

static int &__pyx_return_an_int(void) {
  int __pyx_v_the_int;
  int &__pyx_r;
  return __pyx_r;
}

Which won't compile.

@shwina
Copy link
Contributor

shwina commented Oct 20, 2020

Additionally, after giving it some thought, I realized that a scalar_view would only be useful if we are interested in constructing non-owning scalars whose underlying memory is owned by some external library. (cc: @cwharris )

@harrism
Copy link
Member

harrism commented Oct 20, 2020

@brandon-b-miller why does the generated Cython-C++ code from your example never assign the literal 1 as in the Cython code? Makes me think this isn't the actual generated code.

Also, that example is fundamentally different. It is returning a reference to a stack variable, which clearly shouldn't compile. The real example we are talking about is a method on an object that returns a const& to a data member of the object. These are two very different things.

@brandon-b-miller
Copy link
Contributor Author

Apologies. It is the code cython generates, minus a bunch of reference counting machinery, and with the variable names demangled. I cut out a bunch of stuff including the assignment. I think the below example might be better.

# example.pyx
# distutils: language = c++

cdef class ExampleClass:
    cdef const int& get_member(self):
        return self.member
# example.pxd
cdef class ExampleClass:
    cdef int member
    cdef const int& get_member(self)
# setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("example.pyx")
)

running python setup.py build_ext --inplace this compiles to a 4687 line .cpp file. The relevant part is:

# example.cpp
static int const &__pyx_f_7example_12ExampleClass_get_member(struct __pyx_obj_7example_ExampleClass *__pyx_v_self) {
  int const &__pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("get_member", 0);

  /* "example.pyx":5
 * cdef class ExampleClass:
 *     cdef const int& get_member(self):
 *         return self.member             # <<<<<<<<<<<<<<
 */
  __pyx_r = __pyx_v_self->member;
  goto __pyx_L0;

  /* "example.pyx":4
 * 
 * cdef class ExampleClass:
 *     cdef const int& get_member(self):             # <<<<<<<<<<<<<<
 *         return self.member
 */

  /* function exit code */
  __pyx_L0:;
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

Which when you trim down, as far as I can tell, basically says

static int const &ExampleClass_get_member(struct __pyx_obj_ExampleClass *__pyx_v_self) {
  int const &__pyx_r;

  __pyx_r = __pyx_v_self->member;
  return __pyx_r;
}

When the compiler hits then we get

example.cpp: In function ‘const int& __pyx_f_7example_12ExampleClass_get_member(__pyx_obj_7example_ExampleClass*)’:
example.cpp:1302:14: error: ‘__pyx_r’ declared as reference but not initialized
   int const &__pyx_r;
              ^~~~~~~

For completeness here's the full compiled code for the simple function I had previously written

static int __pyx_f_7example_return_an_int(void) {
  int __pyx_v_the_int;
  int __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("return_an_int", 0);

  /* "example.pyx":3
 * # distutils: language = c++
 * cdef int return_an_int():
 *     cdef int the_int = 1             # <<<<<<<<<<<<<<
 *     return the_int
 */
  __pyx_v_the_int = 1;

  /* "example.pyx":4
 * cdef int return_an_int():
 *     cdef int the_int = 1
 *     return the_int             # <<<<<<<<<<<<<<
 */
  __pyx_r = __pyx_v_the_int;
  goto __pyx_L0;

  /* "example.pyx":2
 * # distutils: language = c++
 * cdef int return_an_int():             # <<<<<<<<<<<<<<
 *     cdef int the_int = 1
 *     return the_int
 */

  /* function exit code */
  __pyx_L0:;
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

Which has the missing line I think you might be looking for.

All this said, scalar const * seems to work just fine modulo a few tweaks in the scatter API. So if we feel a conclusion has been reached here feel free to close.

@harrism
Copy link
Member

harrism commented Oct 20, 2020

Has anyone filed a bug against Cython for this?

@brandon-b-miller
Copy link
Contributor Author

There's cython/cython#1695 where the case is basically made that supporting this would basically require them to change how the scoping rules between c++ and python are woven together, so my impression is that it's a wont-fix.

@harrism harrism added the wontfix This will not be worked on label Oct 21, 2020
@harrism
Copy link
Member

harrism commented Oct 21, 2020

Interesting. OK, well I think on the C++ side we feel you should go with the scalar const* approach. Closing.

@harrism harrism closed this as completed Oct 21, 2020
rapids-bot bot pushed a commit that referenced this issue Jun 22, 2022
The developer guide references a `scalar_view` class that does not exist. This PR removes that reference.

See #6558 for the rationale of why no such class exists.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - https://github.com/nvdbaranec

URL: #11132
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

5 participants