♻️ Refactor Result and Scheme loading to use 'file' fields #903

s-weigand · 2021-11-15T22:24:12Z

This PR removes the file representation fields from the augmented dataclasses completely and thus simplifies the API
from:

pyglotaran/glotaran/builtin/io/yml/test/test_save_scheme.py

Lines 38 to 45 in ded0711

    
    scheme = Scheme(
        model,
        parameter,
        {"dataset_1": dataset},
        model_file="m.yml",
        parameters_file="p.csv",
        data_files={"dataset_1": "d.nc"},
    )

to

    scheme = Scheme(
        model,
        parameter,
        {"dataset_1": dataset},
    )

Additional side effects and improvements:

There is now a glotaran.typing module.
FileLoadable classes (Model, ParameterGroup, Scheme, Result, DatasetMapping) know their own file origin.
There is a new convenience io function, load_datasets, which loads datasets in bulk; the result can then be consumed by Scheme.

Change summary

♻️👌 Removed file fields in ProjectIo like classes and used unified field
♻️🔌 Refactored load_dataset to always return xr.Dataset
♻️ Added type 'StrOrPath' and refactored io plugins with new type
✨ Implemented convenience function 'load_datasets'
♻️👌 Made 'DatasetMapping.source_path' a property accessing the dataset
♻️👌 Replaced all file_representation_field with file_loadable_field
♻️✨ Factored making paths relative and posix style out and added support for Sequence like FileLoadable classes
♻️ Refactored bool_str_repr after sourcery suggested a different change
♻️🩹 Changed implementation of relative_posix_path to use os.path.relpath

Checklist

✔️ Passing the tests (mandatory for all PR's)
👌 Closes issue (mandatory for ✨ feature and 🩹 bug fix PR's)
🧪 Adds new tests for the feature (mandatory for ✨ feature and 🩹 bug fix PR's)

Closes issues

closes #858

github-actions · 2021-11-15T22:24:26Z

👈 Launch a binder notebook on branch s-weigand/pyglotaran/remove-file-fields

codecov · 2021-11-15T22:31:42Z

Codecov Report

Merging #903 (ecdb930) into main (9865243) will increase coverage by 0.3%.
The diff coverage is 95.1%.

@@           Coverage Diff           @@
##            main    #903     +/-   ##
=======================================
+ Coverage   84.8%   85.1%   +0.3%     
=======================================
  Files         81      85      +4     
  Lines       4610    4761    +151     
  Branches     851     880     +29     
=======================================
+ Hits        3910    4053    +143     
- Misses       558     561      +3     
- Partials     142     147      +5

Impacted Files	Coverage Δ
glotaran/builtin/io/yml/yml.py	`90.5% <80.0%> (-0.2%)`	⬇️
glotaran/project/dataclass_helpers.py	`83.8% <81.4%> (+3.2%)`	⬆️
glotaran/plugin_system/data_io_registration.py	`97.2% <88.8%> (-2.8%)`	⬇️
glotaran/builtin/io/folder/folder_plugin.py	`97.6% <100.0%> (ø)`
glotaran/io/__init__.py	`100.0% <100.0%> (ø)`
glotaran/model/model.py	`85.6% <100.0%> (+0.2%)`	⬆️
glotaran/parameter/parameter_group.py	`89.3% <100.0%> (+0.2%)`	⬆️
glotaran/parameter/parameter_history.py	`79.5% <100.0%> (+2.8%)`	⬆️
glotaran/plugin_system/io_plugin_utils.py	`100.0% <100.0%> (ø)`
glotaran/plugin_system/project_io_registration.py	`100.0% <100.0%> (ø)`
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9865243...ecdb930. Read the comment docs.

jsnel

Impressive piece of refactoring ♻️. LGTM.

If the field isn't an instance of the target class, it will try to load the class instance from file. This also allows classes like Scheme to be initialized with file paths directly. When dataclasses with file loadable fields are serialized, the objects will be replaced with their source path. In addition, the ProjectIo load and save functions will set the ``source_path`` attribute of the class instances.
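The behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use pyglotaran's actual helpers; `ToyModel`, `ToyScheme` and `load_toy_model` are hypothetical stand-ins, and the real implementation may differ in detail.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class ToyModel:
    """Hypothetical stand-in for a file loadable class such as Model."""

    name: str
    source_path: str | None = None  # set by the loader, mirrors ``source_path``


def load_toy_model(path: str) -> ToyModel:
    """Hypothetical loader; a real one would parse the file content."""
    return ToyModel(name=Path(path).stem, source_path=Path(path).as_posix())


@dataclass
class ToyScheme:
    """Accepts either a ToyModel instance or a path to load it from."""

    model: ToyModel | str

    def __post_init__(self) -> None:
        # If the field is not an instance of the target class, load it from file.
        if not isinstance(self.model, ToyModel):
            self.model = load_toy_model(self.model)

    def asdict(self) -> dict[str, Any]:
        # On serialization the loaded object is replaced with its source path.
        return {"model": self.model.source_path}


scheme = ToyScheme("models/m.yml")
print(scheme.model.name)
print(scheme.asdict())
```

Passing an already constructed `ToyModel` skips the loading step, so both call styles end up with the same field type.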


sourcery-ai · 2021-11-17T09:17:50Z

Sourcery Code Quality Report

❌ Merging this PR will decrease code quality in the affected files by 0.73%.

Quality metrics	Before	After	Change
Complexity	3.96 ⭐	4.11 ⭐	0.15 👎
Method Length	41.47 ⭐	43.76 ⭐	2.29 👎
Working memory	6.55 🙂	6.65 🙂	0.10 👎
Quality	78.30% ⭐	77.57% ⭐	-0.73% 👎

Other metrics	Before	After	Change
Lines	4035	4266	231

Changed files	Quality Before	Quality After	Quality Change
glotaran/builtin/io/folder/folder_plugin.py	55.31% 🙂	59.69% 🙂	4.38% 👍
glotaran/builtin/io/yml/yml.py	79.57% ⭐	78.43% ⭐	-1.14% 👎
glotaran/builtin/io/yml/test/test_save_result.py	88.79% ⭐	83.82% ⭐	-4.97% 👎
glotaran/builtin/io/yml/test/test_save_scheme.py	78.73% ⭐	86.35% ⭐	7.62% 👍
glotaran/deprecation/modules/test/test_project_scheme.py	75.33% ⭐	75.33% ⭐	0.00%
glotaran/io/__init__.py	87.99% ⭐	87.99% ⭐	0.00%
glotaran/model/model.py	70.82% 🙂	70.82% 🙂	0.00%
glotaran/parameter/parameter_group.py	69.41% 🙂	69.42% 🙂	0.01% 👍
glotaran/parameter/parameter_history.py	94.86% ⭐	94.62% ⭐	-0.24% 👎
glotaran/plugin_system/data_io_registration.py	92.43% ⭐	84.42% ⭐	-8.01% 👎
glotaran/plugin_system/io_plugin_utils.py	84.60% ⭐	85.22% ⭐	0.62% 👍
glotaran/plugin_system/project_io_registration.py	87.08% ⭐	83.86% ⭐	-3.22% 👎
glotaran/plugin_system/test/test_data_io_registration.py	92.24% ⭐	90.24% ⭐	-2.00% 👎
glotaran/plugin_system/test/test_project_io_registration.py	91.22% ⭐	90.24% ⭐	-0.98% 👎
glotaran/project/dataclass_helpers.py	61.09% 🙂	59.78% 🙂	-1.31% 👎
glotaran/project/result.py	76.37% ⭐	77.52% ⭐	1.15% 👍
glotaran/project/scheme.py	72.83% 🙂	72.07% 🙂	-0.76% 👎
glotaran/project/test/test_dataclass_helpers.py	84.03% ⭐	83.83% ⭐	-0.20% 👎
glotaran/project/test/test_result.py	80.28% ⭐	80.10% ⭐	-0.18% 👎
glotaran/project/test/test_scheme.py	80.83% ⭐	80.83% ⭐	0.00%

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
glotaran/parameter/parameter_group.py	ParameterGroup.from_dataframe	28 😞	267 ⛔	13 😞	26.41% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
glotaran/project/dataclass_helpers.py	asdict	18 🙂	122 😞	13 😞	44.50% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/model/model.py	Model.markdown	13 🙂	168 😞	10 😞	48.74% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/project/scheme.py	Scheme.__post_init__	16 🙂	127 😞	11 😞	48.75% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/builtin/io/yml/yml.py	YmlProjectIo.save_model	15 🙂	105 🙂	13 😞	49.06% 😞	Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

sonarqubecloud · 2021-11-17T09:18:37Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
11 Code Smells

No Coverage information
1.8% Duplication

github-actions · 2021-11-17T09:21:55Z

Benchmark is done. Checkout the benchmark result page.
Benchmark differences below 5% might be due to CI noise.

Benchmark diff v0.5.0rc1 vs. main

Parametrized benchmark signatures:

BenchmarkOptimize.time_optimize(index_dependent, grouped, weight)


All benchmarks:

       before           after         ratio
     [d05c042a]       [ecdb9304]
     <v0.5.0rc1>                 
         72.7±1ms         75.0±2ms     1.03  BenchmarkOptimize.time_optimize(False, False, False)
        99.8±30ms         154±20ms    ~1.54  BenchmarkOptimize.time_optimize(False, False, True)
       71.5±0.9ms         74.9±2ms     1.05  BenchmarkOptimize.time_optimize(False, True, False)
        86.6±30ms         147±40ms    ~1.70  BenchmarkOptimize.time_optimize(False, True, True)
         90.1±2ms         93.2±1ms     1.03  BenchmarkOptimize.time_optimize(True, False, False)
         97.8±4ms        99.9±30ms     1.02  BenchmarkOptimize.time_optimize(True, False, True)
         89.0±3ms         91.8±2ms     1.03  BenchmarkOptimize.time_optimize(True, True, False)
         101±20ms        99.2±30ms     0.98  BenchmarkOptimize.time_optimize(True, True, True)
             192M             196M     1.02  IntegrationTwoDatasets.peakmem_optimize
        1.93±0.1s       2.20±0.05s    ~1.14  IntegrationTwoDatasets.time_optimize

Benchmark diff main vs. PR

Parametrized benchmark signatures:

BenchmarkOptimize.time_optimize(index_dependent, grouped, weight)


All benchmarks:

       before           after         ratio
     [98652436]       [ecdb9304]
         73.9±1ms         75.0±2ms     1.02  BenchmarkOptimize.time_optimize(False, False, False)
         124±40ms         154±20ms    ~1.25  BenchmarkOptimize.time_optimize(False, False, True)
         73.8±1ms         74.9±2ms     1.01  BenchmarkOptimize.time_optimize(False, True, False)
         119±30ms         147±40ms    ~1.24  BenchmarkOptimize.time_optimize(False, True, True)
         92.0±1ms         93.2±1ms     1.01  BenchmarkOptimize.time_optimize(True, False, False)
        98.1±20ms        99.9±30ms     1.02  BenchmarkOptimize.time_optimize(True, False, True)
         91.2±1ms         91.8±2ms     1.01  BenchmarkOptimize.time_optimize(True, True, False)
         99.3±3ms        99.2±30ms     1.00  BenchmarkOptimize.time_optimize(True, True, True)
             197M             196M     1.00  IntegrationTwoDatasets.peakmem_optimize
       2.11±0.08s       2.20±0.05s     1.05  IntegrationTwoDatasets.time_optimize

jsnel

Changes (ecdb930) after the last review (changed implementation of relative_posix_path to use os.path.relpath) reviewed as OK.

joernweissenborn

Lgtm

s-weigand requested review from joernweissenborn, jsnel and a team as code owners November 15, 2021 22:24

sourcery-ai bot mentioned this pull request Nov 15, 2021

♻️ Refactor Result and Scheme loading to use 'file' fields (Sourcery refactored) #904

Closed

jsnel previously approved these changes Nov 16, 2021


s-weigand added this to the v0.5.0 milestone Nov 16, 2021

s-weigand dismissed jsnel’s stale review via e5ccd12 November 17, 2021 08:14

s-weigand force-pushed the remove-file-fields branch from 65586ac to e5ccd12 Compare November 17, 2021 08:14

s-weigand added 9 commits November 17, 2021 10:17

♻️🔌 Refactored load_dataset to always return xr.Dataset

4bc15da

♻️ Added type 'StrOrPath' and refactored io plugins with new type

ac31db2

Also ensure that paths are passed as posix formatted path
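As a rough illustration of what such a type alias and path normalization could look like: the alias definition below is an assumption (the actual one in glotaran.typing may differ), and `to_posix` is a hypothetical helper, not part of the library.

```python
from __future__ import annotations

from os import PathLike, fspath
from pathlib import Path
from typing import Union

# Assumed definition of the alias; the real one may be declared differently.
StrOrPath = Union[str, "PathLike[str]"]


def to_posix(path: StrOrPath) -> str:
    """Return ``path`` as a string that uses forward slashes as separators."""
    return Path(fspath(path)).as_posix()


print(to_posix(Path("data") / "d.nc"))
```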

✨ Implemented convenience function 'load_datasets'

4616c3f

and file loadable wrapper class 'DatasetMapping'. This also allows loading all datasets needed for an optimization directly from file paths by passing a dict with the dataset names as keys and the paths as values.
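The bulk loading idea can be sketched without the library. `load_toy_dataset` and `load_toy_datasets` are hypothetical names used only for illustration; the real load_datasets reads actual data files and its signature is not shown in this PR.

```python
from __future__ import annotations

from collections.abc import Mapping
from pathlib import Path


def load_toy_dataset(path: str) -> dict:
    """Hypothetical single file loader; a real one would read the file."""
    return {"source_path": Path(path).as_posix()}


def load_toy_datasets(dataset_paths: Mapping[str, str]) -> dict[str, dict]:
    """Load every path in ``dataset_paths`` and keep the caller's names as keys."""
    return {name: load_toy_dataset(path) for name, path in dataset_paths.items()}


datasets = load_toy_datasets({"dataset_1": "d1.nc", "dataset_2": "d2.nc"})
print(sorted(datasets))
```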

♻️👌 Made 'DatasetMapping.source_path' a property accessing the dataset

b349f31

This way it will update when the source_path of the dataset is updated e.g. by calling 'save_dataset'.
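Why a property helps here can be shown with a small sketch. `ToyDataset` and `ToyDatasetMapping` are hypothetical classes, and the shape of the returned value (a dict of names to paths) is an assumption made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical dataset that only remembers where it was saved."""

    source_path: str


@dataclass
class ToyDatasetMapping:
    """Hypothetical mapping of dataset names to datasets."""

    datasets: dict[str, ToyDataset] = field(default_factory=dict)

    @property
    def source_path(self) -> dict[str, str]:
        # Computed on every access, so it always reflects the current paths.
        return {name: ds.source_path for name, ds in self.datasets.items()}


mapping = ToyDatasetMapping({"dataset_1": ToyDataset("d.nc")})
mapping.datasets["dataset_1"].source_path = "saved/d.nc"  # e.g. after saving again
print(mapping.source_path)
```

A stored attribute would keep the stale "d.nc" value in this situation, while the property picks up the change without any extra bookkeeping.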

♻️👌 Replaced all file_representation_field with file_loadable_field

ead6db8

Also, removed file_representation_field completely.

♻️✨ Factored making paths relative and posix style out

2bddb1b

and added support for Sequence like FileLoadable classes

♻️ Refactored bool_str_repr after sourcery suggested a different change

0465fd8

♻️🩹 Changed implementation of relative_posix_path to use os.path.relpath

ecdb930

instead of pathlib.Path.relative_to. This prevents crashes as long as the files are on the same drive.
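The difference between the two approaches can be reproduced with the standard library alone. `relative_posix_path_sketch` is a hypothetical reimplementation for illustration; the signature of the real relative_posix_path is not shown in this PR and may differ.

```python
import os
from pathlib import Path


def relative_posix_path_sketch(source_path: str, base_path: str) -> str:
    """Return ``source_path`` relative to ``base_path`` with forward slashes."""
    return Path(os.path.relpath(source_path, base_path)).as_posix()


# Path.relative_to raises if the source is not located below the base folder.
try:
    Path("/a/b/c.nc").relative_to("/a/d")
except ValueError:
    print("relative_to failed")

# os.path.relpath walks up the tree instead.
print(relative_posix_path_sketch("/a/b/c.nc", "/a/d"))
```

On Windows, os.path.relpath still raises a ValueError when the two paths are on different drives, which matches the "same drive" caveat above.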

s-weigand force-pushed the remove-file-fields branch from e5ccd12 to ecdb930 Compare November 17, 2021 09:17

jsnel approved these changes Nov 17, 2021


joernweissenborn approved these changes Nov 18, 2021


jsnel merged commit 2d44c75 into glotaran:main Nov 18, 2021

jsnel deleted the remove-file-fields branch November 18, 2021 04:11
