👌 Allow more natural column names in pandas parameters file reading #1174

s-weigand · 2022-11-10T22:21:48Z

PR #1135 changed the column names from 'non-negative' -> 'non_negative' and 'standard-error' -> 'standard_error' which causes crashes in load_parameters when reading parameters files that were created before the merge (e.g. 'optimized_parameters.csv').

In addition to reenabling the reading of old parameter files saved with the pandas plugins (csv, tsv, xlsx, ods), this PR allows the same more natural/shorter column names which are allowed in yaml files for the pandas plugins and casts the column names to lower strings to counter capitalization problems caused by auto-capitalization from excel and the likes.

pyglotaran/glotaran/parameter/parameter.py

Lines 39 to 45 in e70f760

    
           OPTION_NAMES_SERIALIZED = { 
        
               "expression": "expr", 
        
               "maximum": "max", 
        
               "minimum": "min", 
        
               "non_negative": "non-negative", 
        
               "standard_error": "standard-error", 
        
           }

And it removed the unintended requirement to define the non_negative and vary columns, which are optional and can be filled with default values.

Change summary

Checklist

✔️ Passing the tests (mandatory for all PR's)
🚧 Added changes to changelog (mandatory for all PR's)
🧪 Adds new tests for the feature (mandatory for ✨ feature and 🩹 bug fix PR's)

sourcery-ai · 2022-11-10T22:21:58Z

Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 0.85%.

Quality metrics	Before	After	Change
Complexity	3.32 ⭐	2.96 ⭐	-0.36 👍
Method Length	63.08 🙂	62.89 🙂	-0.19 👍
Working memory	6.83 🙂	6.66 🙂	-0.17 👍
Quality	74.79% 🙂	75.64% ⭐	0.85% 👍

Other metrics	Before	After	Change
Lines	918	971	53

Changed files	Quality Before	Quality After	Quality Change
glotaran/builtin/io/pandas/csv.py	79.27% ⭐	78.41% ⭐	-0.86% 👎
glotaran/builtin/io/pandas/xlsx.py	91.00% ⭐	87.20% ⭐	-3.80% 👎
glotaran/builtin/io/pandas/test/test_pandas_parameters.py	74.11% 🙂	78.35% ⭐	4.24% 👍
glotaran/parameter/parameters.py	71.42% 🙂	71.26% 🙂	-0.16% 👎
glotaran/parameter/test/test_parameters.py	78.01% ⭐	78.77% ⭐	0.76% 👍

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
glotaran/parameter/parameters.py	Parameters.from_dataframe	14 🙂	177 😞	11 😞	45.17% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/parameter/parameters.py	param_dict_to_markdown	6 ⭐	168 😞	13 😞	49.98% 😞	Try splitting into smaller methods. Extract out complex expressions
glotaran/parameter/test/test_parameters.py	test_parameters_array_conversion	1 ⭐	228 ⛔	9 🙂	57.56% 🙂	Try splitting into smaller methods
glotaran/builtin/io/pandas/test/test_pandas_parameters.py	test_roundtrips	4 ⭐	145 😞	8 🙂	63.75% 🙂	Try splitting into smaller methods
glotaran/parameter/parameters.py	Parameters.get_label_value_and_bounds_arrays	4 ⭐	94 🙂	11 😞	63.96% 🙂	Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

github-actions · 2022-11-10T22:22:01Z

👈 Launch a binder notebook on branch s-weigand/pyglotaran/fix-parameters-io

codecov · 2022-11-10T22:27:58Z

Codecov Report

Base: 87.6% // Head: 87.7% // Increases project coverage by +0.1% 🎉

Coverage data is based on head (2397bf2) compared to base (e70f760).
Patch coverage: 100.0% of modified lines in pull request are covered.

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #1174     +/-   ##
=======================================
+ Coverage   87.6%   87.7%   +0.1%     
=======================================
  Files        103     103             
  Lines       4898    4904      +6     
  Branches     806     808      +2     
=======================================
+ Hits        4291    4304     +13     
+ Misses       489     486      -3     
+ Partials     118     114      -4

Impacted Files	Coverage Δ
glotaran/builtin/io/pandas/csv.py	`100.0% <100.0%> (ø)`
glotaran/builtin/io/pandas/xlsx.py	`100.0% <100.0%> (ø)`
glotaran/parameter/parameters.py	`96.3% <100.0%> (+4.2%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

glotaran/parameter/parameters.py

jsnel

Agree, let's move the column renaming out of the parameters.py, validity of column names should be checked at parsing time (it's part of the user-facing API and should be validated, and errors (e.g. typos) reported back to the user.
In this case, csv.py and xlsx.py can handle the renaming.

Some other remarks while reviewing related code:

from_dataframe implicitly introduces required column names (e.g. minimum, maximum, non_negative, vary) which are not required in other input formats (e.g. yaml) because they have default values. This should not behave differently in that regard.
please allow for shorthand notation as well (expr | expression, min | minimum, etc)

jsnel · 2022-11-11T23:52:44Z

Also please, if possible, check for (accidental) capitalization and provide feedback to the user (error in yaml, warning in csv/xlsx). However, in the case of xlsx format just ignore it (lowercase it) (e.g. LibreOffice does auto capitalization of string labels).

PR glotaran#1135 changed the colmn names from 'non-negative' -> 'non_negative' and 'standard-error' -> 'standard_error' which causes crashes in load_parameters when reading parameters files that were created before the merge (e.g. 'optimized_parameters.csv'). This change allows reading of those old files, while saving still uses the new version

👌 Allow alternate notation used in yaml for pandas plugins 👌 Make column names case insensitive for pandas plugins

sonarqubecloud · 2022-11-12T21:19:09Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

No Coverage information
0.0% Duplication

The requested changes has been implemented. parameters.py no longer does df.rename, its has moved to csv.py and xlsx.py

s-weigand requested review from jsnel, joernweissenborn and a team as code owners November 10, 2022 22:21

joernweissenborn previously requested changes Nov 10, 2022

View reviewed changes

glotaran/parameter/parameters.py Outdated Show resolved Hide resolved

jsnel requested changes Nov 11, 2022

View reviewed changes

s-weigand added 2 commits November 12, 2022 02:14

🚧📚 Added change to change log

45f34b8

s-weigand force-pushed the fix-parameters-io branch from e47be7e to 6aa60e6 Compare November 12, 2022 13:45

s-weigand requested review from jsnel and joernweissenborn November 12, 2022 13:46

🧹 Moved column renaming resposibility to plugin implementation

cf33e1e

👌 Allow alternate notation used in yaml for pandas plugins 👌 Make column names case insensitive for pandas plugins

s-weigand marked this pull request as draft November 12, 2022 14:15

s-weigand force-pushed the fix-parameters-io branch 2 times, most recently from bb9696c to 1d196c3 Compare November 12, 2022 15:51

🧪 Added unittest for Parameters.from_dataframe column validation

2af0143

s-weigand force-pushed the fix-parameters-io branch from 1d196c3 to 2af0143 Compare November 12, 2022 15:56

s-weigand changed the title ~~🩹 Fix reading of old pandas saved parameters files~~ 👌 Allow more natural column names in pandas parameters file reading Nov 12, 2022

🚧📚 Updated changelog with changed PR title

88e85a8

s-weigand marked this pull request as ready for review November 12, 2022 16:16

🧪Added parameters subset csv test

2397bf2

jsnel approved these changes Nov 12, 2022

View reviewed changes

jsnel merged commit 8116af0 into glotaran:main Nov 12, 2022

jsnel deleted the fix-parameters-io branch November 12, 2022 21:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

👌 Allow more natural column names in pandas parameters file reading #1174

👌 Allow more natural column names in pandas parameters file reading #1174

s-weigand commented Nov 10, 2022 •

edited

Loading

sourcery-ai bot commented Nov 10, 2022 •

edited

Loading

github-actions bot commented Nov 10, 2022

codecov bot commented Nov 10, 2022 •

edited

Loading

jsnel left a comment

jsnel commented Nov 11, 2022

sonarqubecloud bot commented Nov 12, 2022

	OPTION_NAMES_SERIALIZED = {
	"expression": "expr",
	"maximum": "max",
	"minimum": "min",
	"non_negative": "non-negative",
	"standard_error": "standard-error",
	}

👌 Allow more natural column names in pandas parameters file reading #1174

👌 Allow more natural column names in pandas parameters file reading #1174

Conversation

s-weigand commented Nov 10, 2022 • edited Loading

Change summary

Checklist

sourcery-ai bot commented Nov 10, 2022 • edited Loading

Sourcery Code Quality Report

Legend and Explanation

github-actions bot commented Nov 10, 2022

codecov bot commented Nov 10, 2022 • edited Loading

Codecov Report

jsnel left a comment

Choose a reason for hiding this comment

jsnel commented Nov 11, 2022

sonarqubecloud bot commented Nov 12, 2022

s-weigand commented Nov 10, 2022 •

edited

Loading

sourcery-ai bot commented Nov 10, 2022 •

edited

Loading

codecov bot commented Nov 10, 2022 •

edited

Loading