concatenate_inferences is slow #57

Open
mortonjt opened this issue Jul 31, 2021 · 0 comments
I'm noticing that the `concatenate_inferences` method is sometimes the slowest part of the computation, even slower than MCMC sampling.

It looks like this can be sped up with dask -- the trick is to rechunk your `az.InferenceData` object, and I believe dask will do the rest (so no need to implement anything here, I think). However, it does become very problematic when concatenating `az.InferenceData` objects with tens of thousands of features, possibly because the dask scheduler gets overwhelmed and all operations become single-threaded. I've raised this issue on the xarray discussions.
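
A minimal sketch of the rechunking idea, assuming arviz with a dask-enabled xarray install; the helper name `rechunk_inference` and the chunk sizes are illustrative, not part of this repo:

```python
import arviz as az

def rechunk_inference(idata, chunks):
    """Return a copy of `idata` whose groups are dask-backed xarray Datasets."""
    rechunked = {}
    for group in idata.groups():        # e.g. "posterior", "sample_stats"
        ds = getattr(idata, group)      # each group is an xarray.Dataset
        rechunked[group] = ds.chunk(chunks)
    return az.InferenceData(**rechunked)

# Example: split the draw dimension into blocks of 100 draws
# idata = rechunk_inference(idata, {"draw": 100})
```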

In that case, the workaround is to turn `concatenate_inferences` into a reduction operation (i.e. merge only 1000 datasets at a time, then merge the partial results together), as in the sketch below.
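
A hedged sketch of that batched reduction, assuming all objects share a common feature dimension to concatenate along with `az.concat`; the function name, dimension name, and batch size are all assumptions for illustration:

```python
import arviz as az

def concat_in_batches(inferences, concat_dim="feature", batch_size=1000):
    """Tree-reduce a list of InferenceData objects instead of merging all at once."""
    while len(inferences) > 1:
        merged = []
        for i in range(0, len(inferences), batch_size):
            batch = inferences[i:i + batch_size]
            if len(batch) == 1:
                merged.append(batch[0])
            else:
                merged.append(az.concat(*batch, dim=concat_dim))
        inferences = merged
    return inferences[0]
```

Each `az.concat` call then only touches `batch_size` objects at a time, which should keep the task graph the scheduler sees at any one moment small.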

Mainly raising this as an issue because this method is going to be problematic for larger datasets, and a reduce version of this function may be necessary.
