concatenate_inferences is slow #57

Open
mortonjt opened this issue Jul 31, 2021 · 0 comments
I'm noticing that the `concatenate_inferences` method is sometimes the slowest part of the computation, even slower than MCMC sampling.

It looks like this can be sped up with dask -- the trick is to rechunk your `az.InferenceData` object, and I believe dask will do the rest (so no need to implement anything here, I think). However, it does become very problematic when concatenating `az.InferenceData` objects with tens of thousands of features, possibly because the dask scheduler gets overwhelmed and all operations become single-threaded. I've raised this issue on the xarray discussions.
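
A minimal sketch of the rechunking idea, assuming arviz with a dask-enabled xarray install; the helper name `rechunk_inference` and the chunk sizes are illustrative, not part of this repo:

```python
import arviz as az

def rechunk_inference(idata, chunks):
    """Return a copy of `idata` whose groups are dask-backed xarray Datasets."""
    rechunked = {}
    for group in idata.groups():        # e.g. "posterior", "sample_stats"
        ds = getattr(idata, group)      # each group is an xarray.Dataset
        rechunked[group] = ds.chunk(chunks)
    return az.InferenceData(**rechunked)

# Example: split the draw dimension into blocks of 100 draws
# idata = rechunk_inference(idata, {"draw": 100})
```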

In that case, the workaround is to turn `concatenate_inferences` into a reduction operation (i.e. merge only 1000 datasets at a time, then merge the partial results together), as in the sketch below.
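
A hedged sketch of that batched reduction, assuming all objects share a common feature dimension to concatenate along with `az.concat`; the function name, dimension name, and batch size are all assumptions for illustration:

```python
import arviz as az

def concat_in_batches(inferences, concat_dim="feature", batch_size=1000):
    """Tree-reduce a list of InferenceData objects instead of merging all at once."""
    while len(inferences) > 1:
        merged = []
        for i in range(0, len(inferences), batch_size):
            batch = inferences[i:i + batch_size]
            if len(batch) == 1:
                merged.append(batch[0])
            else:
                merged.append(az.concat(*batch, dim=concat_dim))
        inferences = merged
    return inferences[0]
```

Each `az.concat` call then only touches `batch_size` objects at a time, which should keep the task graph the scheduler sees at any one moment small.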

Mainly raising this as an issue because this method is going to be problematic for larger datasets, and a reduce version of this function may be necessary.
