Slow performance of concat() #7833

Closed
alimanfoo opened this issue May 11, 2023 · 3 comments · Fixed by #7824

@alimanfoo
Contributor

What is your issue?

In attempting to concatenate many datasets along a large dimension (total size ~100,000,000) I'm finding very slow performance, e.g., tens of seconds just to concatenate two datasets.

With some profiling, I find all the time is being spent in this list comprehension:

var_idx = [

I don't know exactly what's going on here, but it doesn't look right - e.g., if the size of the dimension to be concatenated is large, this list comprehension can run millions of loops, which doesn't seem related to the intended behaviour.

Sorry I don't have an MRE for this yet but please let me know if I can help further.
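
For illustration only, here is a minimal sketch of how profiling can localize this kind of hot spot. It is not xarray's code and not an MRE for this issue: `index_per_element`, `index_per_dataset` and `toy_concat` are hypothetical stand-ins. They assume one comprehension whose loop count scales with the length of the concatenated dimension and another whose loop count scales with the number of datasets.

```python
import cProfile
import pstats


def index_per_element(dim_size):
    # Hypothetical stand-in: the comprehension runs once per element
    # of the concatenated dimension, so cost grows with dim_size.
    return [i % 2 for i in range(dim_size)]


def index_per_dataset(n_datasets):
    # Hypothetical stand-in: the comprehension runs once per dataset,
    # so cost is independent of the dimension length.
    return [i for i in range(n_datasets)]


def toy_concat(dim_size, n_datasets):
    # Toy driver that calls both helpers, standing in for a concat call.
    index_per_element(dim_size)
    index_per_dataset(n_datasets)


def cumulative_times(dim_size=200_000, n_datasets=2):
    # Profile one toy_concat call and return cumulative seconds per helper.
    profiler = cProfile.Profile()
    profiler.runcall(toy_concat, dim_size, n_datasets)
    stats = pstats.Stats(profiler)
    times = {}
    for (_file, _line, name), (_cc, _nc, _tt, cumtime, _callers) in stats.stats.items():
        if name in ("index_per_element", "index_per_dataset"):
            times[name] = cumtime
    return times


if __name__ == "__main__":
    ranked = sorted(cumulative_times().items(), key=lambda kv: -kv[1])
    for name, seconds in ranked:
        print(f"{name}: {seconds:.6f} s")
```

In this toy setup the per-element helper dominates the cumulative time even though only two "datasets" are involved, which is the same shape of symptom as described above.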

@alimanfoo alimanfoo added the needs triage Issue that has not been reviewed by xarray team member label May 11, 2023
@Illviljan
Contributor

I've noticed this as well, see #7824.

@kmuehlbauer
Contributor

@alimanfoo The slow code stems from my changes in #7400. Evidently the performance drop did not show up in the tests or benchmarks.

In #7824 @Illviljan is tackling concat performance.

@alimanfoo
Contributor Author

Awesome, thanks @kmuehlbauer and @Illviljan 🙏🏻

@dcherian dcherian added topic-performance topic-combine combine/concat/merge and removed needs triage Issue that has not been reviewed by xarray team member labels May 11, 2023