[performance] Equivalent for CPython's _Py_Identifier #2223

alex · 2022-03-13T18:31:27Z

CPython has an internal API for caching PyObject * for string literals, for example:

https://github.com/python/cpython/blob/d87f1b787ed38dfd307d82452f2efe9dc5b93942/Modules/_json.c#L326-L327

_Py_IDENTIFIER declares a static variable, and _PyObject_GetAttrId knows how to translate that back into a PyObject *. This importantly avoids the overhead of allocating a new str object and copying a value into it, each time the function is called.

pyo3 has the same issue, a standard py_any.getattr("attribute") will call <&str as ToPyObject>::to_object which will allocate a new str object. Inefficient!

The goal of this issue is to brainstorm on what can be done to improve this. To start the conversation I did an implementation that reused the CPython implementation:

#![feature(test)]
extern crate test;

#[repr(C)]
struct CPythonIdentifier {
    next: *mut CPythonIdentifier,
    string: *const std::os::raw::c_char,
    object: *mut pyo3::ffi::PyObject,
}

struct PyIdentifier {
    raw: std::cell::UnsafeCell<CPythonIdentifier>,
}

unsafe impl Sync for PyIdentifier {}

impl PyIdentifier {
    /// SAFETY: `val` must have a trailing NUL.
    pub const unsafe fn new(val: &'static str) -> PyIdentifier {
        PyIdentifier {
            raw: std::cell::UnsafeCell::new(CPythonIdentifier {
                next: std::ptr::null_mut(),
                string: val.as_ptr().cast(),
                object: std::ptr::null_mut(),
            }),
        }
    }
}

impl pyo3::ToPyObject for PyIdentifier {
    fn to_object(&self, py: pyo3::Python) -> pyo3::Py<pyo3::PyAny> {
        extern "C" {
            fn _PyUnicode_FromId(id: *mut CPythonIdentifier) -> *mut pyo3::ffi::PyObject;
        }
        let ptr = self.raw.get();
        let val = unsafe { py.from_borrowed_ptr::<pyo3::PyAny>(_PyUnicode_FromId(ptr)) };
        val.to_object(py)
    }
}

macro_rules! pyid {
    ($var:ident) => {{
        static ID: $crate::PyIdentifier =
            unsafe { $crate::PyIdentifier::new(concat!(stringify!($var), "\0")) };
        &ID
    }};
}

#[cfg(test)]
mod tests {
    #[bench]
    fn bench_normal_getattr(b: &mut test::Bencher) {
        pyo3::prepare_freethreaded_python();

        pyo3::Python::with_gil(|py| {
            let sys = py.import("sys").unwrap();

            b.iter(|| sys.getattr("version").unwrap());
        });
    }

    #[bench]
    fn test_bench_id_getattr(b: &mut test::Bencher) {
        pyo3::prepare_freethreaded_python();

        pyo3::Python::with_gil(|py| {
            let sys = py.import("sys").unwrap();

            b.iter(|| sys.getattr(pyid!(version)).unwrap());
        });
    }
}

You can see the only difference between the benchmarks is sys.getattr("version") vs sys.getattr(pyid!(version)). This achieves roughly a 4x speedup:

~/p/h/pyo3-identifier ❯❯❯ cargo +nightly bench
    Finished bench [optimized] target(s) in 0.01s
     Running unittests src/lib.rs (target/release/deps/pyo3_identifier-0ace08c40c3c7437)

running 2 tests
test tests::bench_normal_getattr  ... bench:          60 ns/iter (+/- 0)
test tests::test_bench_id_getattr ... bench:          15 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 0.75s

Unfortunately, this digs deep into the bowels of CPython implementation details -- _Py_Identifier is not part of the public API (much less the stable ABI). The implementation I have here works with CPython 3.9, but not 3.10 (I didn't test older versions).

Hopefully this can be a jumping off point to thinking about whether there's a way pyo3 can offer a similarly better performing API that doesn't rely on a CPython implementation detail.

The text was updated successfully, but these errors were encountered:

birkenfeld · 2022-03-13T18:47:57Z

I also remember a recent thread on python-dev about replacing the _Py_Identifier API: https://mail.python.org/archives/list/[email protected]/thread/DNMZAMB4M6RVR76RDZMUK2WRLI6KAAYS/ (the linked PR seems to have been merged).

alex · 2022-03-13T19:01:19Z

This seems to use module state, if I'm reading correctly. A similar approach could work for pyo3, though I don't believe there's currently a Rust API for it.

mejrs mentioned this issue Apr 3, 2022

RFC: Add intern! macro which can be used to amortize the cost of creating Python objects by storing them inside a GILOnceCell. #2269

Merged

birkenfeld changed the title ~~[performance] Equivilant for CPython's _Py_Identifier~~ [performance] Equivalent for CPython's _Py_Identifier Apr 3, 2022

davidhewitt closed this as completed in #2269 Apr 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[performance] Equivalent for CPython's _Py_Identifier #2223

[performance] Equivalent for CPython's _Py_Identifier #2223

alex commented Mar 13, 2022 •

edited

Loading

birkenfeld commented Mar 13, 2022

alex commented Mar 13, 2022

[performance] Equivalent for CPython's _Py_Identifier #2223

[performance] Equivalent for CPython's _Py_Identifier #2223

Comments

alex commented Mar 13, 2022 • edited Loading

birkenfeld commented Mar 13, 2022

alex commented Mar 13, 2022

alex commented Mar 13, 2022 •

edited

Loading