Add unit validation functionalities to validate() function #84

Merged
merged 14 commits into from
Nov 25, 2020
30 changes: 29 additions & 1 deletion nomenclature/__init__.py
@@ -179,7 +179,24 @@ def validate(df):

     # validate all (other) columns
     for col, codelist, ext in cols:
-        invalid = [c for c in df.data[col].unique() if c not in codelist]
+        invalid = []
+
+        # check variables for name and unit
+        if col == 'variable':
+            for c in df.data[col].unique():
+                # check if name is in codelist
+                # and unit in the .yaml file description
+                if (c not in codelist) or not(
+                        all(_s in variables[c][
+                            'unit'] for _s in df.data.loc[
+                            df.data['variable'] == c]['unit'].values)):
+                    invalid.append(c)
+                    success = False
+            # check if only unit is not valid
+            invalid = _validate_unit(invalid)
+        else:
+            invalid = [c for c in df.data[col].unique() if c not in codelist]
+

         # check if entries in the invalid list are related to directional data
         if col == 'region' and invalid:
@@ -265,3 +282,14 @@ def _validate_directional(x):
     """Utility function to check whether region-to-region code is valid"""
     x = x.split('>')
     return len(x) == 2 and all([i in regions for i in x])
+
+
+def _validate_unit(x):
+    # sub function to filter out variables with valid name
+    for i in reversed(x):  # iterate list reversely due to 'remove' method
+        if i in variables.keys():
+            logger.warning(
+                'Unit for variable %s is not given in %s.', i, variables[
+                    i]['unit'])
+            x.remove(i)
+    return x
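
A note on the `remove`-while-iterating pattern in `_validate_unit`: the same filtering can be written as a partition that leaves the input list untouched. The following is only a minimal sketch under stated assumptions; the `split_invalid` helper and the sample `variables` dictionary are hypothetical and not part of this PR. Only the idea of separating unknown names from unit mismatches is taken from the diff.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-in for the package-level `variables` dictionary
variables = {"Primary Energy": {"unit": "EJ/yr"}}


def split_invalid(invalid, variables):
    """Split invalid entries into unknown names and unit-only mismatches."""
    # entries whose name is not defined at all
    unknown_name = [v for v in invalid if v not in variables]
    # entries whose name is defined, so only the unit can be at fault
    bad_unit = [v for v in invalid if v in variables]
    for v in bad_unit:
        logger.warning(
            "Unit for variable %s must be one of: %s", v, variables[v]["unit"]
        )
    return unknown_name, bad_unit


unknown, bad = split_invalid(["Primary Energy", "Foo"], variables)
```

Returning both lists keeps the unit mismatches available to the caller instead of discarding them after the warning, which may or may not be what the maintainers want here.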
2 changes: 1 addition & 1 deletion nomenclature/definitions/variable/economy/economy.yaml
@@ -184,7 +184,7 @@ Price|Final Energy|Residential|Electricity:
     Prices should include the effect of carbon prices
     Mean price should reflect the variability of different prices that are accessible
     to end-users (including regulated prices, prices proposed by different competiting retailers...)
-  unit: US$2010/GJ
+  unit: [euro/kWh, US$2010/GJ]

 Price|Final Energy|Residential|Gases|Natural Gas:
   description: Mean Natural gas price at the final level in the residential sector.
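
With this change a `unit` entry can be either a single string or a list of strings. In Python, `x in some_string` is a substring test, so a membership check against a string-valued `unit` would accept fragments such as `GJ` for `US$2010/GJ`. The sketch below shows one way to normalize both forms before comparing; the helper names `allowed_units` and `units_valid` and the sample definitions are assumptions for illustration, not code from this PR.

```python
def allowed_units(definition):
    """Return the permitted units of a variable definition as a list."""
    unit = definition.get("unit", [])
    # a scalar YAML value arrives as str, a YAML sequence as list
    return [unit] if isinstance(unit, str) else list(unit)


def units_valid(definition, used_units):
    """Return True if every unit in `used_units` is permitted."""
    allowed = allowed_units(definition)
    return all(u in allowed for u in used_units)


# Hypothetical excerpts mirroring the two forms a `unit` entry can take
single = {"unit": "US$2010/GJ"}
multiple = {"unit": ["euro/kWh", "US$2010/GJ"]}

# substring pitfall: a plain `in` on the string form accepts a fragment
fragment_accepted = "GJ" in single["unit"]
```

After normalization, `units_valid(single, ["GJ"])` is rejected while `units_valid(multiple, ["euro/kWh"])` passes.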
4 changes: 4 additions & 0 deletions nomenclature/tests/test_validate.py
@@ -56,3 +56,7 @@ def test_validate_time_entry():
         replace([2005, 2010], value=['2005-06-17 00:00+01:00',
                                      '2010-07-21 12:00+01:00'])
     assert validate(IamDataFrame(df_sub))
+
+
+def test_validate_unit_entry():
+    assert not (validate(df.rename(unit={'EJ/yr': 'MWh'})))