-
Notifications
You must be signed in to change notification settings - Fork 782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduce trampolines for methods #2705
Conversation
32f27ac
to
8a9b25a
Compare
It is not really related, but I had a look while reviewing: Doesn't |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great code bloat reduction! A nit and an unrelated question, but this looks good to me.
Ahh, yes, when working on PanicTrap was one of the moments when I was thinking a refactor like this could be in order... Then I forgot to install a PanicTrap here. Let me push 😆 |
@@ -571,21 +558,66 @@ impl<'a> FnSpec<'a> { | |||
CallingConvention::Noargs => quote! { | |||
_pyo3::impl_::pymethods::PyMethodDef::noargs( | |||
#python_name, | |||
_pyo3::impl_::pymethods::PyCFunction(#wrapper), | |||
_pyo3::impl_::pymethods::PyCFunction({ | |||
unsafe extern "C" fn trampoline( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be cool not to have to repeat the (almost) identical signature here and in the trampoline
fn. But the required macro trickery isn't worth it, likely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this, however I'd like to defer it to a follow-up PR sometime (there's a fair bit of refactoring which could improve the macro code, and I'd like it not to distract readers of this PR from the behavioural change).
9032f75
to
a3f59c8
Compare
3029: use dynamic trampoline for all getters and setters r=adamreichold a=davidhewitt This is an extension to the "trampoline" changes made in #2705 to re-use a single trampoline for all `#[getter]`s (and similar for all `#[setters]`). It works by setting the currently-unused `closure` member of the `ffi::PyGetSetDef` structure to point at a new `struct GetSetDefClosure` which contains function pointers to the `getter` / `setter` implementations. A universal trampoline for all `getter`, for example, then works by reading the actual getter implementation out of the `GetSetDefClosure`. Advantages of doing this: - Very minimal simplification to the macro code / generated code size. It made a 4.4% reduction to `test_getter_setter` generated size, which is an exaggerated result as most code will probably have lots of bulk that isn't just the macro code. Disadvantages: - Additional level of complexity in the `getter` and `setter` trampolines and accompanying code. - To keep the `GetSetDefClosure` objects alive, I've added them to the static `LazyTypeObject` inner. - Very slight performance overhead at runtime (shouldn't be more than an additional pointer read). It's so slight I couldn't measure it. Overall I'm happy to either merge or close this based on what reviewers think! Co-authored-by: David Hewitt <[email protected]>
This is a refactoring to the generated
#[pyfunction]
and#[pymethods]
with the intention of reducing compile time and binary size.The motivation for the change is that
cargo llvm-lines
currently reports many monomorphizations ofstd::panicking::try::do_call
and friends being instantiated - in fact, it's one per#[pyfunction]
and every method in#[pymethods]
, which is quite a lot of instantiations.The change follows a similar idea to #2478 changing
extract_argument
to make use of "dynamic dispatch". Here, we create "trampoline" wrappers for all possible C-API function pointers we expose to Python. These wrappers take a function pointer to the real implementation of the function in question, and call it after starting thestd::panic::catch_unwind
machiner and creating aGILPool
.Much like in #2704 I suspect LLVM can do a pretty good job at optimizing out the dynamic dispatch. I saw no change at all in benchmarks for the
pytests
crate. So overall I believe this should have a compile-time win with little-to-no impact on runtime performance.For our
pytests
crate I get a 10% reduction incargo llvm-lines
count, and for real-world use cases potentially with many small function wrappers (e.g. getters / setters) I suspect the reduction could be even greater.Detail-wise, for some method
MyClass::foo
, the generated code change from a single C-ABI symbol:to an C-ABI symbol which just calls the trampoline wrapper with a pointer to the actual implementation.