Performance of FromPyObject::extract #2968
I'm working on a Python package which uses numpy arrays as its primary input/output objects. I need to support generic numpy float arrays, so I implemented a simple enum:

#[derive(FromPyObject)]
pub enum GenericFloatArray1<'a> {
    #[pyo3(transparent, annotation = "np.ndarray[float32]")]
    Float32(Arr<'a, f32>),
    #[pyo3(transparent, annotation = "np.ndarray[float64]")]
    Float64(Arr<'a, f64>),
}

https://github.com/light-curve/light-curve-python/blob/abaa7d0970ce83dd955c0992a1bafddb768629b8/light-curve/src/np_array.rs#L7-L13

However, I've found that the interop overhead of my Python package is pretty large. To debug this issue I created a repository where I benchmark the following enum:

#[derive(FromPyObject)]
pub enum Collection<'a> {
    #[pyo3(transparent, annotation = "list")]
    List(&'a PyList),
    #[pyo3(transparent, annotation = "tuple")]
    Tuple(&'a PyTuple),
    #[pyo3(transparent, annotation = "set")]
    Set(&'a PySet),
}

In both cases I've found that extract is much slower than a plain downcast:

// x is a PyObject
// x is a list
let y: &PyList = x.downcast(py)?; // 961.09 ps
let y: Collection = x.extract(py)?; // 5.1340 ns
// x is a set
let y: &PySet = x.downcast(py)?; // 1.2817 ns
let y: Collection = x.extract(py)?; // 650.84 ns
// manual if-let chain instead of the derived FromPyObject impl
let y = if let Ok(x) = x.extract(py) {
    Collection::List(x)
} else if let Ok(x) = x.extract(py) {
    Collection::Tuple(x)
} else if let Ok(x) = x.extract(py) {
    Collection::Set(x)
} else {
    return Err(PyErr::new::<PyTypeError, _>(
        "Expected a list, tuple or set",
    ));
}; // 48.850 ns

See full code here, you could run it with

As you can see, even a manual implementation of the dynamic dispatch of this

I've found a related discussion in #2278, but it is still unclear what I can do to make my code faster.
Replies: 1 comment 6 replies
This is a known foot gun of how the FromPyObject trait is defined, insofar as it has to return a PyResult whereas downcast returns a Result<_, PyDowncastError>. Wrapping up the PyDowncastError into a PyErr is the cost you are seeing.

The best workaround at the moment is to manually call downcast in if-let-else-if-let chains. Note that PyReadonlyArray is not a valid target for downcast though, i.e. you would need to downcast into PyArray first and then manually call readonly on it.
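To make the shape of that workaround concrete, here is a minimal std-only sketch. It deliberately does not use PyO3: Object, CheapError, HeavyError and the downcast_* helpers are hypothetical stand-ins, assumed only for illustration, for a Python object, a downcast error, PyErr and downcast respectively. It models the idea described above: try cheap, allocation-free checks in an if-let chain and build the expensive error once, at the very end.

```rust
/// Hypothetical stand-in for a dynamically typed object (not a PyO3 type).
#[derive(Debug)]
enum Object {
    List(Vec<i64>),
    Tuple(Vec<i64>),
    Set(Vec<i64>),
    None,
}

/// Cheap, allocation-free error, playing the role of a downcast error.
#[derive(Debug, PartialEq)]
struct CheapError;

/// Heavy error that owns a formatted message, playing the role of PyErr.
#[derive(Debug, PartialEq)]
struct HeavyError(String);

impl Object {
    fn type_name(&self) -> &'static str {
        match self {
            Object::List(_) => "list",
            Object::Tuple(_) => "tuple",
            Object::Set(_) => "set",
            Object::None => "NoneType",
        }
    }

    // Each helper only inspects the variant; failure costs nothing to construct.
    fn downcast_list(&self) -> Result<&[i64], CheapError> {
        match self {
            Object::List(v) => Ok(v.as_slice()),
            _ => Err(CheapError),
        }
    }

    fn downcast_tuple(&self) -> Result<&[i64], CheapError> {
        match self {
            Object::Tuple(v) => Ok(v.as_slice()),
            _ => Err(CheapError),
        }
    }

    fn downcast_set(&self) -> Result<&[i64], CheapError> {
        match self {
            Object::Set(v) => Ok(v.as_slice()),
            _ => Err(CheapError),
        }
    }
}

#[derive(Debug, PartialEq)]
enum Collection<'a> {
    List(&'a [i64]),
    Tuple(&'a [i64]),
    Set(&'a [i64]),
}

/// Slow pattern: every failed attempt formats a heavy error that the
/// caller of a try-each-variant chain would immediately throw away.
fn extract_list_slow(obj: &Object) -> Result<&[i64], HeavyError> {
    obj.downcast_list()
        .map_err(|_| HeavyError(format!("'{}' is not a list", obj.type_name())))
}

/// Workaround pattern: chain the cheap checks and build the heavy error
/// exactly once, only after every variant has failed.
fn extract_collection(obj: &Object) -> Result<Collection<'_>, HeavyError> {
    if let Ok(v) = obj.downcast_list() {
        Ok(Collection::List(v))
    } else if let Ok(v) = obj.downcast_tuple() {
        Ok(Collection::Tuple(v))
    } else if let Ok(v) = obj.downcast_set() {
        Ok(Collection::Set(v))
    } else {
        Err(HeavyError(format!(
            "Expected a list, tuple or set, got {}",
            obj.type_name()
        )))
    }
}
```

In this model, calling extract_collection on an Object::Set reaches the third branch without ever allocating an error message, whereas a chain built from helpers like extract_list_slow would format and discard one message per failed variant. With real PyO3 types the same chain would be written with downcast calls, and for numpy arrays with a downcast into PyArray followed by readonly, as noted above.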