Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

average() indentation error #1048

Closed
painter1 opened this issue Feb 18, 2015 · 8 comments
Closed

average() indentation error #1048

painter1 opened this issue Feb 18, 2015 · 8 comments
Assignees
Milestone

Comments

@painter1
Copy link
Contributor

In cdutil/times.py, the function average(), the block beginning "if len(sub)>1:" is indented too far. The effect is to repeat a calculation more than needed. This has major performance implications.

@painter1 painter1 self-assigned this Feb 18, 2015
@doutriaux1
Copy link
Contributor

@painter1 can you push a PR for this since you already have it on your computer? Thanks!

@painter1
Copy link
Contributor Author

Yes, I'll do a PR but want to test some more first.

@doutriaux1
Copy link
Contributor

That's awesome, even better would be to include the test in the test suite. If you don't remember how do it, just put the tests in testing/genutil and I will integrate them.

@painter1
Copy link
Contributor Author

What I did was to run some diagnostics which print out scalars. I happen to have some old values in my notes so I could eyeball them and see that they aren't changed. It will be another job to turn this into an automated test. Note that a few gigabytes of data are required. Here's what I ran:
diags --path ~/metrics_data/cam35_data/ --path2 /Users/painter1/metrics_data/obs_data_5.6/ --package AMWG --set 1 --seasons ANN

@durack1
Copy link
Member

durack1 commented Feb 18, 2015

@painter1 out of curiosity what's the speed up by removing this unintentional loop?

painter1 added a commit that referenced this issue Feb 18, 2015
…verager().

This will dramatically improve performance because some major calculations are done only once now.
This fixes issue #1048.
@painter1
Copy link
Contributor Author

I'm only profiling on Rhea (at Oak Ridge) which is abysmally slow for reasons which I haven't identified, but must be about the OS and file system. I'm not sure that you will notice the difference on any other machine, maybe you will. Anyway, on my test which is probably typical, the core average calculation was repeated 12 times before and is being done once now. On Rhea, the overall speedup is 3 times on the largish test data I've been using.

@painter1
Copy link
Contributor Author

The test I've been profiling is climatology.py in the UV-CDAT diagnostics.

@doutriaux1
Copy link
Contributor

fixed by #1050

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants