🩹 Fix Performance Regressions #740

Merged

Conversation

@joernweissenborn (Member) commented Jul 8, 2021

This PR addresses performance regressions introduced in the staging branch.

  • Adds a pytest benchmark for the analysis.Problem class
  • Adds numerous performance tweaks
  • Regressions: some internal objects are no longer available via the public API, namely the reduced clps.

Checklist

  • ✔️ Passing the tests (mandatory for all PRs)
  • 👌 Closes issue (mandatory for ✨ feature and 🩹 bug fix PRs)

Closes issues

None

@joernweissenborn joernweissenborn requested a review from jsnel as a code owner July 8, 2021 13:36
github-actions bot (Contributor) commented Jul 8, 2021

Binder 👈 Launch a binder notebook on branch joernweissenborn/pyglotaran/performance/fixregressions

codecov bot commented Jul 8, 2021

Codecov Report

Merging #740 (3287157) into staging (cf16572) will increase coverage by 0.0%.
The diff coverage is 96.0%.


@@           Coverage Diff           @@
##           staging    #740   +/-   ##
=======================================
  Coverage     80.1%   80.2%           
=======================================
  Files           70      70           
  Lines         3904    3950   +46     
  Branches       676     692   +16     
=======================================
+ Hits          3130    3169   +39     
- Misses         653     660    +7     
  Partials       121     121           
| Impacted Files | Coverage Δ |
|----------------|------------|
| glotaran/examples/sequential.py | 0.0% <0.0%> (ø) |
| glotaran/analysis/util.py | 90.9% <91.6%> (-0.6%) ⬇️ |
| glotaran/analysis/problem_grouped.py | 95.9% <97.7%> (+0.5%) ⬆️ |
| glotaran/analysis/problem.py | 90.0% <100.0%> (-1.8%) ⬇️ |
| glotaran/analysis/problem_ungrouped.py | 98.0% <100.0%> (-0.3%) ⬇️ |
| glotaran/analysis/simulation.py | 82.9% <100.0%> (+0.4%) ⬆️ |
| ...tin/megacomplexes/baseline/baseline_megacomplex.py | 88.2% <100.0%> (-0.7%) ⬇️ |
| ...coherent_artifact/coherent_artifact_megacomplex.py | 73.4% <100.0%> (-0.6%) ⬇️ |
| ...n/builtin/megacomplexes/decay/decay_megacomplex.py | 82.2% <100.0%> (-0.3%) ⬇️ |
| ...tin/megacomplexes/spectral/spectral_megacomplex.py | 81.5% <100.0%> (-0.5%) ⬇️ |
| ... and 6 more | |

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf16572...3287157.

@s-weigand (Member) commented Jul 8, 2021

I think those 4 functions could be one, since benchmark is explicitly called.
Never mind: the benchmark fixture can only be used once per test function, or you will get a pytest_benchmark.fixture.FixtureAlreadyUsed error.
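A minimal sketch of that constraint (the benchmarked workload is a hypothetical stand-in, not pyglotaran code; only the fixture behavior is from pytest-benchmark):

```python
import pytest
from pytest_benchmark.fixture import FixtureAlreadyUsed


def test_benchmark_once(benchmark):
    # First (and only allowed) use of the fixture in this test: fine.
    result = benchmark(sum, range(10_000))
    assert result == sum(range(10_000))


def test_benchmark_twice_raises(benchmark):
    benchmark(sum, range(10_000))
    # A second use in the same test function raises, which is why the
    # four benchmark functions cannot simply be merged into one.
    with pytest.raises(FixtureAlreadyUsed):
        benchmark(sum, range(10_000))
```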

joernweissenborn and others added 4 commits July 9, 2021 22:35
@s-weigand s-weigand force-pushed the performance/fixregressions branch from acc295e to bd05752 July 9, 2021 20:35
@joernweissenborn joernweissenborn changed the title from "Added benchmark for Problem class" to "Fix Perfomance Regrssions" Jul 10, 2021
@s-weigand s-weigand force-pushed the performance/fixregressions branch from 3876441 to 20a021a July 10, 2021 13:07
@s-weigand s-weigand force-pushed the performance/fixregressions branch from 20a021a to bc4e951 July 10, 2021 13:14
@s-weigand s-weigand changed the title from "Fix Perfomance Regrssions" to "🩹 Fix Performance Regressions" Jul 10, 2021
@jsnel (Member) commented Jul 14, 2021

A note on the spectral model validation

Results for paramGUI case 05

Compare the estimated parameters from pyglotaran

  * __spectral__:
    | _Label_   |      _Value_ |    _StdErr_ |
    |-----------|--------------|-------------|
    | amp1      |     1        | 0           |
    | loc1      | 14282        | 0.0193176   |
    | width1    |   808.004    | 0.0343336   |
    | skew1     |     0.400081 | 8.92553e-05 |
    | amp2      |     1        | 0           |
    | loc2      | 13707        | 0.0331115   |
    | width2    |   654.987    | 0.0556831   |
    | skew2     |    -0.299748 | 0.000158642 |

with those from paramGUI as recorded in specFitSummary.txt:
Parameters:

     Estimate Std. Error t value Pr(>|t|)    
t1  1.428e+04  1.971e-02  724710   <2e-16 ***
t2  8.080e+02  3.503e-02   23069   <2e-16 ***
t3  4.001e-01  9.106e-05    4394   <2e-16 ***
t4  1.371e+04  3.378e-02  405740   <2e-16 ***
t5  6.550e+02  5.681e-02   11529   <2e-16 ***
t6 -2.997e-01  1.619e-04   -1852   <2e-16 ***
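For reference, a minimal sketch of a skewed-Gaussian band shape consistent with the parameter names above (amp, loc, width, skew); this is an assumption for illustration, the authoritative definition lives in pyglotaran's spectral megacomplex:

```python
import numpy as np


def skewed_gaussian(
    x: np.ndarray, amp: float, loc: float, width: float, skew: float
) -> np.ndarray:
    """Skewed Gaussian over spectral axis x; reduces to a plain Gaussian as skew -> 0."""
    if abs(skew) < 1e-10:
        return amp * np.exp(-np.log(2) * (2 * (x - loc) / width) ** 2)
    arg = 1 + 2 * skew * (x - loc) / width
    result = np.zeros_like(x, dtype=float)
    valid = arg > 0  # the log term is only defined where arg > 0
    result[valid] = amp * np.exp(-np.log(2) * (np.log(arg[valid]) / skew) ** 2)
    return result
```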

Also note the similarity in the plotted results: paramGUI vs pyglotaran.
So you see the estimated values are essentially the same (when units are compatible).

By compatible I mean that the units of the spectral model parameters are the same as those of the spectral axis of the dataset (in this case both cm^-1); this is not possible in all cases (even with the option energy_spectrum: True).
Some mechanism is needed that not only allows one to specify whether a megacomplex is 'energy-spectrum' but also whether the dataset is 'energy-spectrum', so the conversion can be 'cancelled out' when the two are already compatible and applied only when needed. In the future, some kind of type annotation of units (nm, cm^-1, eV, etc.) is needed for all data types: dataset axis, model axis, parameters.
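A minimal sketch (an assumption, not code from this PR) of the nm <-> cm^-1 conversion such a mechanism would have to apply, using wavenumber [cm^-1] = 1e7 / wavelength [nm]:

```python
import numpy as np


def nm_to_wavenumber(wavelength_nm: np.ndarray) -> np.ndarray:
    """Convert a wavelength axis in nm to a wavenumber axis in cm^-1."""
    return 1e7 / np.asarray(wavelength_nm, dtype=float)


def wavenumber_to_nm(wavenumber_cm1: np.ndarray) -> np.ndarray:
    """Inverse conversion; the relation is its own inverse."""
    return 1e7 / np.asarray(wavenumber_cm1, dtype=float)
```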

For now the results are good enough to merge into staging, at least from this perspective.

@jsnel (Member) left a comment

Taking note of my comment w.r.t. the spectral model validation, this PR is good enough to merge to staging, pending possibly some more detailed general comments by @s-weigand.

@s-weigand (Member) left a comment

Additional issues I found.

indices: dict[str, int] | None,

indices should have type `dict[str, int]`, not `dict[str, int] | None`, since the builtin megacomplexes have that signature and don't defend against None.
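A minimal sketch of the suggested change (the surrounding signature is hypothetical; only the `indices` annotation is the point):

```python
from __future__ import annotations

import numpy as np


def calculate_matrix(
    dataset_model,
    # Was `indices: dict[str, int] | None`; the builtin megacomplexes never
    # defend against None, so the annotation should not allow it.
    indices: dict[str, int],
    **kwargs,
) -> np.ndarray:
    ...
```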

@joernweissenborn (Member, Author) replied

> indices should have type `dict[str, int]`, not `dict[str, int] | None`, since the builtin megacomplexes have that signature and don't defend against None.

Right, I will fix that.

joernweissenborn and others added 2 commits July 14, 2021 21:08
sourcery-ai bot (Contributor) commented Jul 14, 2021

Sourcery Code Quality Report

✅  Merging this PR will increase code quality in the affected files by 0.14%.

| Quality metrics | Before | After | Change |
|-----------------|--------|-------|--------|
| Complexity | 4.20 ⭐ | 4.70 ⭐ | 0.50 👎 |
| Method Length | 82.52 🙂 | 81.56 🙂 | -0.96 👍 |
| Working memory | 9.06 🙂 | 8.99 🙂 | -0.07 👍 |
| Quality | 63.93% 🙂 | 64.07% 🙂 | 0.14% 👍 |

| Other metrics | Before | After | Change |
|---------------|--------|-------|--------|
| Lines | 2571 | 2544 | -27 |

| Changed files | Quality Before | Quality After | Quality Change |
|---------------|----------------|---------------|----------------|
| glotaran/analysis/problem.py | 81.72% ⭐ | 80.73% ⭐ | -0.99% 👎 |
| glotaran/analysis/problem_grouped.py | 55.74% 🙂 | 57.43% 🙂 | 1.69% 👍 |
| glotaran/analysis/problem_ungrouped.py | 62.94% 🙂 | 66.84% 🙂 | 3.90% 👍 |
| glotaran/analysis/simulation.py | 64.72% 🙂 | 63.93% 🙂 | -0.79% 👎 |
| glotaran/analysis/util.py | 63.96% 🙂 | 62.39% 🙂 | -1.57% 👎 |
| glotaran/analysis/test/models.py | 65.96% 🙂 | 66.66% 🙂 | 0.70% 👍 |
| glotaran/analysis/test/test_optimization.py | 25.03% 😞 | 25.08% 😞 | 0.05% 👍 |
| glotaran/analysis/test/test_problem.py | 61.90% 🙂 | 62.58% 🙂 | 0.68% 👍 |
| glotaran/examples/sequential.py | 45.29% 😞 | 46.88% 😞 | 1.59% 👍 |
| glotaran/model/test/test_model.py | 69.28% 🙂 | 69.21% 🙂 | -0.07% 👎 |

Here are some functions in these files that still need a tune-up:

| File | Function | Complexity | Length | Working Memory | Quality | Recommendation |
|------|----------|------------|--------|----------------|---------|----------------|
| glotaran/analysis/test/test_optimization.py | test_optimization | 19 😞 | 520 ⛔ | 15 😞 | 25.08% 😞 | Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions |
| glotaran/analysis/problem_grouped.py | GroupedProblem._append_to_grouped_bag | 12 🙂 | 288 ⛔ | 23 ⛔ | 28.33% 😞 | Try splitting into smaller methods. Extract out complex expressions |
| glotaran/analysis/util.py | calculate_matrix | 19 😞 | 152 😞 | 13 😞 | 40.33% 😞 | Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions |
| glotaran/analysis/problem_ungrouped.py | UngroupedProblem._calculate_residual | 13 🙂 | 228 ⛔ | 12 😞 | 40.50% 😞 | Try splitting into smaller methods. Extract out complex expressions |
| glotaran/analysis/problem_grouped.py | GroupedProblem.init_bag | 10 🙂 | 176 😞 | 16 ⛔ | 41.68% 😞 | Try splitting into smaller methods. Extract out complex expressions |

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see the Sourcery documentation for details on how these metrics are calculated.

sonarqubecloud bot commented

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)

No coverage information
0.0% duplication

@s-weigand (Member) commented

Last benchmark before the review suggestions got applied.

All benchmarks:

       before           after         ratio
     [dc00e6da]       [8cc6e6e0]
     <v0.4.0>                   
-      58.2±0.5ms       39.7±0.9ms     0.68  BenchmarkOptimize.time_optimize(False, False, False)
-         275±2ms         128±40ms     0.46  BenchmarkOptimize.time_optimize(False, False, True)
         82.9±3ms         67.2±2ms    ~0.81  BenchmarkOptimize.time_optimize(False, True, False)
+      82.4±0.9ms         128±20ms     1.55  BenchmarkOptimize.time_optimize(False, True, True)
-      58.6±0.9ms         48.5±1ms     0.83  BenchmarkOptimize.time_optimize(True, False, False)
-         267±5ms         109±30ms     0.41  BenchmarkOptimize.time_optimize(True, False, True)
         83.1±1ms         78.3±2ms     0.94  BenchmarkOptimize.time_optimize(True, True, False)
+        84.2±2ms         144±10ms     1.71  BenchmarkOptimize.time_optimize(True, True, True)
             183M             180M     0.98  IntegrationTwoDatasets.peakmem_create_result
             202M             198M     0.98  IntegrationTwoDatasets.peakmem_optimize
-         255±3ms          200±4ms     0.78  IntegrationTwoDatasets.time_create_result
-      5.43±0.02s       1.50±0.08s     0.28  IntegrationTwoDatasets.time_optimize

github-actions bot (Contributor) commented

Benchmark is done. Check out the benchmark result page.
Benchmark differences below 5% might be due to CI noise.

Benchmark diff

Parametrized benchmark signatures:

BenchmarkOptimize.time_optimize(index_dependent, grouped, weight)

All benchmarks:

       before           after         ratio
     [dc00e6da]       [dc14f88a]
     <v0.4.0>                   
-        61.6±2ms         39.3±1ms     0.64  BenchmarkOptimize.time_optimize(False, False, False)
-         281±2ms         124±20ms     0.44  BenchmarkOptimize.time_optimize(False, False, True)
-        82.1±5ms         66.0±4ms     0.80  BenchmarkOptimize.time_optimize(False, True, False)
         87.1±5ms         127±20ms    ~1.45  BenchmarkOptimize.time_optimize(False, True, True)
         61.1±2ms         52.1±3ms    ~0.85  BenchmarkOptimize.time_optimize(True, False, False)
-         299±4ms         123±40ms     0.41  BenchmarkOptimize.time_optimize(True, False, True)
         89.7±2ms         78.9±2ms    ~0.88  BenchmarkOptimize.time_optimize(True, True, False)
         87.4±4ms         146±20ms    ~1.67  BenchmarkOptimize.time_optimize(True, True, True)
             181M             184M     1.02  IntegrationTwoDatasets.peakmem_create_result
             201M             202M     1.01  IntegrationTwoDatasets.peakmem_optimize
-         255±4ms          200±3ms     0.78  IntegrationTwoDatasets.time_create_result
       5.70±0.05s       1.34±0.08s    ~0.23  IntegrationTwoDatasets.time_optimize
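For context, a minimal sketch of an asv benchmark with this parametrized shape; the class body is a hypothetical stand-in with a toy workload, not the real BenchmarkOptimize from the pyglotaran benchmark suite:

```python
import numpy as np


class BenchmarkOptimize:
    # asv expands `params` into the (index_dependent, grouped, weight)
    # signatures shown in the report above.
    params = ([True, False], [True, False], [True, False])
    param_names = ["index_dependent", "grouped", "weight"]

    def setup(self, index_dependent, grouped, weight):
        # Toy workload standing in for building a model and scheme.
        rng = np.random.default_rng(0)
        self.matrix = rng.random((500, 50))
        self.data = rng.random(500)

    def time_optimize(self, index_dependent, grouped, weight):
        # asv times the body of time_* methods.
        np.linalg.lstsq(self.matrix, self.data, rcond=None)
```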

@s-weigand s-weigand merged commit f792747 into glotaran:staging Jul 15, 2021
jsnel pushed a commit that referenced this pull request Aug 14, 2021
* Added benchmark for Problem class

* removed print

* ♻️ 'Refactored by Sourcery'

* 🧹 Moved glotaran/analysis/test/test_relations.py

to benchmark/pytest/analysis/test_problem.py

Use 'pytest benchmark/pytest/' to run the benchmarks

* Numerous performance tweaks

* Don't weight data with ones if no weight supplied

* Fix duplicate call to create result_dataset

* Removed dead code

* Switched back to pure numpy in problem

* Fixed example

* Cleanup

* Update glotaran/analysis/problem_ungrouped.py

Co-authored-by: Sebastian Weigand <[email protected]>

* Update glotaran/analysis/problem_ungrouped.py

Co-authored-by: Sebastian Weigand <[email protected]>

* Update glotaran/analysis/util.py

Co-authored-by: Sebastian Weigand <[email protected]>

Co-authored-by: Sourcery AI <>
Co-authored-by: s-weigand <[email protected]>
jsnel pushed a commit to jsnel/pyglotaran that referenced this pull request Aug 14, 2021

jsnel pushed a commit that referenced this pull request Sep 16, 2021