Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Cython doesn't handle mixed imports and cimports of enums cleanly #5609

Open
vyasr opened this issue Aug 10, 2023 · 8 comments
Open

[BUG] Cython doesn't handle mixed imports and cimports of enums cleanly #5609

vyasr opened this issue Aug 10, 2023 · 8 comments

Comments

@vyasr
Copy link
Contributor

vyasr commented Aug 10, 2023

Describe the bug

I would like to use Cython to make a C++ scoped enum accessible in Cython and Python code. I'm using a cpdef enum class declaration to accomplish this in one pxd file. In another file I would like to define a cpdef function that accepts this enum as a parameter. The cpdef function should use the C++ enum directly when invoked from typed Cython code, while it should use the Python wrapper enum when called in pure Python code. In order to support this, the enum needs to be both imported and cimported.

What I observe is that imports in the second module are confused when there is both a pyx and a pxd file, leading to conflicts that are not easily resolved.

Code to reproduce the behaviour:

This MWE requires four separate files:

# declarations.pxd
# distutils: language = c++
cdef extern from *:
    """
    enum class my_enum : int
    {
        a = 1,
        b = 0
    };
    """
    cpdef enum class my_enum(int):
        a
        b
# declarations.pyx
# distutils: language = c++
# definitions.pxd
# distutils: language = c++

# Doesn't work
# from declarations cimport my_enum
# cpdef void f(my_enum x)

# Works
cimport declarations
cpdef void f(declarations.my_enum x)
# definitions.pyx
# distutils: language = c++

from declarations import my_enum
from declarations cimport my_enum


cpdef void f(my_enum x):
    pass

If I run cythonize -3 -i declarations.pyx && cythonize -3 -i -f definitions.pyx with these files, everything compiles successfully. However, if I comment out the two lines below "Works" in definitions.pxd and instead include the two lines below "Doesn't work", I get the following error:

------------------------------------------------------------                                                                                                                                                                        [39/421]
...
# distutils: language = c++

from declarations import my_enum
                         ^
------------------------------------------------------------

definitions.pyx:3:25: Assignment to non-lvalue 'my_enum'

Note that the error is coming from the definitions.pyx file, not the pxd file. What appears to be happening is that the line from declarations cimport my_enum in definitions.pxd makes Cython treat my_enum as a strongly typed C object, and therefore in Python it forbids the import because my_enum is no longer a valid PyObject name.

The workaround above isn't too bad, but it does mean that my_enum is not cimportable from definitions, only pure Python importable. It can be cimported from declarations in this example, but in my production use case declarations is an internal module not intended for public consumption and I'm looking to export the enum into the public API (trying to solve the same problem as intended in #5602). I could equivalently remove the import from definitions.pyx and only include the cimport, which would create the inverse problem where my_enum is only visible in Cython, not Python.

Expected behaviour

I expect to be able to cimport the module into the pxd file with the same name as in the pyx file without any name collisions.

OS

Linux

Python version

3.11.4

Cython version

3.0.0

Additional context

An aside: declarations.pyx is an empty file that's required to trigger compilation of the Python version of my_enum. It makes sense why this file is necessary, but it's a bit of a footgun that simply changing my_enum from a cdef enum class to a cpdef triggers the requirement that this class exist. It would be nice if the Cython compiler could detect that a pxd file contains a cpdef and errors if a corresponding pyx file doesn't exists. That's orthogonal to the current issue though, and not something new in Cython, just a bit extra surprising in this case IMHO.

@vyasr
Copy link
Contributor Author

vyasr commented Aug 10, 2023

CC @shwina

@vyasr
Copy link
Contributor Author

vyasr commented Aug 26, 2023

The issue here seems to be an incorrect determination of whether or not the enum is an lvalue, which then blocks assignment. I'm not sure what the correct change to make here is, though, because it's not clear to me exactly how to access the necessary information. AFAICT what we need is to update NameNode.is_lvalue to also return True for a cpdef enum class. IOW I would like to apply the following diff:

(main) vyasramasubramani@mac2020 compile % git diff
diff --git a/Cython/Compiler/ExprNodes.py b/Cython/Compiler/ExprNodes.py
index b73bb6734..997ea77e7 100644
--- a/Cython/Compiler/ExprNodes.py
+++ b/Cython/Compiler/ExprNodes.py
@@ -2403,6 +2403,9 @@ class NameNode(AtomicExprNode):
         ) or (
             self.entry.is_cfunction and
             self.entry.is_overridable
+        ) or (
+            self.entry.type.is_cpp_enum and
+            self.entry.create_wrapper
         )

     def is_addressable(self):

However, it seems like even though my_enum is an scoped enumeration in the declarations module, at the point at which we reach this import the definitions module doesn't know that yet, so is_cpp_enum returns False and the above condition fails. I could use some advice to find an alternative solution here.

@scoder
Copy link
Contributor

scoder commented Sep 1, 2023

an incorrect determination of whether or not the enum is an lvalue

Well, an enum is not an lvalue. It's a type. It cannot be assigned to.

the enum needs to be both imported and cimported.

This sounds wrong, but I'm not sure I've fully understood the problem. (Besides, if you really need both, you can rename one of them on import with import … as … to avoid naming collisions).

So, you are trying to write a cpdef function that receives an enum value as argument, and you want it to accept both the bare C++ enum value and the Python wrapper value. Right?

That sounds like you should use the C++ enum type as argument type and that's it. You should not need the Python wrapper type at all in your code. Also because from the point of view of Cython, that's not actually a special enum type but an arbitrary Python type like any other.

If you try to use only the C++ enum as argument type, what doesn't work then?

@vyasr
Copy link
Contributor Author

vyasr commented Sep 1, 2023

This sounds wrong, but I'm not sure I've fully understood the problem

Let me try to explain a bit better. We can first focus on what I see as the inconsistency in Cython's current behavior, then go back to my use case (which is a bit more complex, but I'll explain that too to avoid the XY problem here in case there's a different solution to that problem).

Consider the following example:

# declarations.pxd
# distutils: language = c++

cdef class A:
    pass

cpdef void f(A a)


# declarations.pyx
# distutils: language = c++

cpdef void f(A a):
    pass


# definitions.pyx
# distutils: language = c++

cimport declarations
import declarations

from declarations cimport A
from declarations import A

from declarations cimport f
from declarations import f

f = 10

# This is not allowed
# A = 20

cdef void g():
    cdef A a = A()

    # This compiles, but will fail at runtime because f is now an int
    f(a)

In this example, I am able to both cimport and import (using the same name) a Cython module, a cdef class, and a cpdef function from the declarations module into the definitions module. My first expectation is therefore that I should be able to do the same thing with a cpdef enum. However, that triggers the error originally noted in issue. Therefore, my conclusion was that it should be possible to treat the enum as an lvalue, at least in the context of an import being considered an "assignment" to a name that is already reserved by a cimport.

Well, an enum is not an lvalue. It's a type. It cannot be assigned to.

This appears to be true as shown in the example above since the class A cannot be assigned to. It is interesting to note that the function object f is assignable, although doing so apparently overrides both the C and Python types so that f is no longer callable. I can't imagine a sane use case for this type of code, though, so I think Cython is doing enough of the right thing here. The main issue is with consistency since the class A, the function f, and an enum (or enum class) get treated differently for the purpose of importing.

(Besides, if you really need both, you can rename one of them on import with import … as … to avoid naming collisions)

That is absolutely true, and it is how I'm working around this issue at present.

My actual use case is roughly the following. I have one pxd file declarations.pxd that exports C++ definitions to Cython. One of those definitions is an enum class E. I have another Cython module (definitions.pyx + definitions.pxd) that establishes my public API. In definitions I have a cpdef function f(E e). I would like the definitions module to be usable from both pure Cython (i.e. fully typed) and pure Python contexts. IOW, I want both of the following to be possible:

# Pure Python
import definitions
definitions.f(definitions.E.A)

# Pure Cython
cimport definitions
definitions.f(definitions.E.A)  # This should call the typed version of the cpdef function with a suitably typed argument

In order for this to be possible, I need E to be part of the public API of definitions, which means it needs to be both imported in definitions.pyx and cimported in definitions.pxd so that client code can either import or cimport E.

Does this make sense/seem possible?

@scoder
Copy link
Contributor

scoder commented Sep 1, 2023

So, basically, what you are saying is that you want to write from declarations cimport E in definitions.pxd and from declarations import E in definitions.pyx. Is that right? And that is rejected with an error?

The cimport makes declarations.pxd part of your public API as well, so I wonder why you can't move the cpdef enum declaration to definitions.pxd instead. It seems reasonable to (c)import API modules from implementation modules. More than the other way round.

@vyasr
Copy link
Contributor Author

vyasr commented Sep 1, 2023

Is that right? And that is rejected with an error?

Yup that is exactly correct (see the original example, specifically the comments "Works" and "Doesn't work" in the definitions.pxd).

so I wonder why you can't move the cpdef enum declaration to definitions.pxd instead. It seems reasonable to (c)import API modules from implementation modules. More than the other way round.

I hadn't considered this option at all. Since declarations.pxd is currently exporting functions defined in C++ that accept the enum as a parameter, it seemed natural to export the enum in that file as well, and when I wanted to make it available in Python I just changed it from a cdef to a cpdef. I think exporting the enum in definitions.pyx would work, though, since in that case I would only need to cimport it into declarations.pxd and I wouldn't need declarations.pyx at all. That seems like a pretty reasonable workaround for my use case, let me give it a shot.

@vyasr
Copy link
Contributor Author

vyasr commented Sep 1, 2023

That does indeed work for the toy example above. Unfortunately it wouldn't translate into my real use case because switching around the dependency structure like that would introduce a circular import dependency. I'm not sure that there is a way to resolve that since in my real use case all the functions in definitions.pyx are just wrappers around C++ functions. Perhaps this example will help illustrate the real use case more clearly (apologies for cluttering this issue with so many code snippets, let me know if there's different information that you think would help more):

# declarations.pxd
# distutils: language = c++

# In my real use case the C++ was actually defined in a separately compiled C++ library's header, not inline
cdef extern from *:
    """
    enum class my_enum : int
    {
        a = 1,
        b = 0
    };

    void f(my_enum e) {}
    """

    cpdef enum class my_enum(int):
        a
        b

    cdef void f(my_enum e)
# declarations.pyx
# distutils: language = c++
# This file only exists to compile the Python my_enum 
# definitions.pxd
# distutils: language = c++

# Needed to support passing typed my_enum arguments to the function f below if it is cimported 
# into other Cython modules. To your point, maybe this indicates that users of the Cython interface
# have to cimport declarations instead of definitions; that's not ideal IMO, but is an option.
from declarations cimport my_enum

# Needed so that the function is cimportable
cpdef void f(my_enum e)
# definitions.pyx
# distutils: language = c++

# Needed to call the C function f
cimport declarations

# Needed to type the argument of f
from declarations cimport my_enum

# Needed so users can access my_enum from Python
from declarations import my_enum

cpdef void f(my_enum e):
    # The real function has a bit more preprocessing of other arguments, but this example
    # captures the core requirement for the enum.
    declarations.f(e)

@vyasr
Copy link
Contributor Author

vyasr commented Feb 17, 2024

@scoder given that #5657 was closed, I'd like to reopen this discussion. I didn't do a great job of clarifying my intent in the various messages above, so let me try and start over with a simpler example. I think this statement above is the clearest argument:

In this example, I am able to both cimport and import (using the same name) a Cython module, a cdef class, and a cpdef function from the declarations module into the definitions module. My first expectation is therefore that I should be able to do the same thing with a cpdef enum. However, that triggers the error originally noted in issue. Therefore, my conclusion was that it should be possible to treat the enum as an lvalue, at least in the context of an import being considered an "assignment" to a name that is already reserved by a cimport.

Let's illustrate that with this simplified example using only two files (assume declarations.pyx exists and defines f; also, note that I am now using a C-style enum to keep things simple):

# declarations.pxd
cpdef enum my_enum:
    a = 1
    b = 2

cpdef void f()
# definitions.pyx
from declarations cimport f
from declarations import f

from declarations cimport my_enum
from declarations import my_enum

If we run cythonize -3 -i -f definitions.pyx, we'll see this error:

Error compiling Cython file:
------------------------------------------------------------
...
# definitions.pyx
from declarations cimport f
from declarations import f

from declarations cimport my_enum
from declarations import my_enum
                         ^
------------------------------------------------------------

definitions.pyx:6:25: Assignment to non-lvalue 'my_enum'

I see two issues with consistency here:

  1. We are able to cimport and then import the function f, but we cannot do so for the enum. As you mention above, an enum is not an lvalue, so from that perspective it makes sense not to allow assignment, but the same is true of a cdef function and yet we do allow the "assignment" to f that occurs in the import on the third line in definitions.pyx.
  2. If I change the order of the imports in definitions.pyx to import my_enum before cimporting it, the code compiles. That indicates that Cython only thinks this code is erroneous because by parsing my_enum as a C type first it thinks it is an lvalue and therefore disallows reassigning to it, while parsing it as a Python type first then allows creating the C type later. IOW this code compiles fine:
# definitions.pyx
from declarations cimport f
from declarations import f

# Now importing before cimporting
from declarations import my_enum
from declarations cimport my_enum

Therefore, I would argue that the Cython compiler is producing unexpected behavior for the two reasons above. Would you agree?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants