-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add seasonal statistics file to copernicus module #156
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Dahyann, I finished reviewing your PR, great work!
I have made a few comments, you can have a look at them separatly for more details. Here are my main points:
- The code and its structure looks good to me overall. However, I think it would benefit from some refactoring and more details in the documentation to improve its user-friednliness.
- For instance, I really like the class
index_definitions
, but I feel like it would be beneficial to use it more extensively in the other modules e.g. consistently use the variable names that are defined there throughout your code, or use the fact that you defined there the variables required for the computation of each indices in thecalcule_heat_indices_metrics function
- Many lines are not covered by tests (see Jenkins). I am not an expert in testing but maybe you could check if the tests you wrote are sufficient or not. Apart from this the tests I saw looked good to me.
- I was also a bit confused about the fact that for most functions of the
heat_index
module, you require an input in Kelvin, but you immediatly convert it to Celsius and eventually returns the value to Celsius. Maybe you could clarify on that. - In the
index_definitions
there are also some lines that appear duplicated, so you might want to check into that.
|
||
--- | ||
|
||
File to calculate different seasonal forecast indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative wording: "File to calculate different indices from seasonal forecasts"
|
||
|
||
def kelvin_to_fahrenheit(kelvin): | ||
return (kelvin - 273.15) * 9 / 5 + 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not covered by tests. One could also think of checking that the input type is numeric and that the range is within the Kelvin scale. But maybe it is not so necessary. It also depends on how much user friendly you want to be or not..
|
||
|
||
def fahrenheit_to_kelvin(fahrenheit): | ||
return (fahrenheit - 32) * 5 / 9 + 273.15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as for the kelvin_to_fahrenheit
function. The comment also applies to celsius_to_kelvin
, kelvin_to_celsius
.
float or array-like | ||
Relative humidity as a percentage (0-100) or as a decimal value (0-1), depending on the `as_percentage` setting. | ||
""" | ||
t2c = t2k - 273.15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here why not use the functions that you defined just before i.e. kelvin_to_celsius
?
Relative humidity as a percentage (0-100) or as a decimal value (0-1), depending on the `as_percentage` setting. | ||
""" | ||
t2c = t2k - 273.15 | ||
tdc = tdk - 273.15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also not sure I understand why you specifically ask for an input value in Kelvin if it is to convert it to Celsius..
with xr.open_dataset(input_file_name, engine=engine) as daily_ds: | ||
# Handling various indices | ||
if index_metric == "Tmean": | ||
da_index = daily_ds["t2m_mean"] - 273.15 # Kelvin to Celsius |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here why not used the function defined in heat_index.py to convert Kelvin to Celsius?
Parameters | ||
---------- | ||
input_file_name : str | ||
Path to the input data file containing temperature and dewpoint information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this it is not really clear to me what should the format of the input file (.csv, netcdf?), or what the input file should actually contain. Later I see that you also ask for e.g. windspeeds so I would suggest to write:
Path to the input data file containing the variables required for the computation of the desired index. Information on the required variables can be found in the index_definitions class.
daily_ds["t2m_mean"], daily_ds["d2m_mean"] | ||
) | ||
elif index_metric == "AT": | ||
u10 = daily_ds.get("u10_max", daily_ds.get("10m_u_component_of_wind")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I see you only test if the required variables are present in the case of wind. Maybe you could make it more consistent by e.g. writing a function test_if_variables
that would check if the required variables are in the input file, using the variables
field in the index_definitions
. Then you would only need to call this function once instead of doing a check for each index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you could also collect the units from the index_definitions and add it as an attribute of your array once at the end instead of repeating it in each condition.
monthly = da_index.groupby("forecast_month").sum(dim="step") | ||
else: | ||
raise ValueError( | ||
f"Unknown method {method} to compute monthly data. Please use 'mean' or 'sum'." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be written "count" instead of "sum"
label = f"ensemble_p{int(level * 100)}" | ||
ds_stats[label] = ensemble_percentiles.sel(quantile=level) | ||
|
||
return ds_stats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder here if it could be helpful compute everything by default but still let the user define what statistics they want to compute. It might be worth if some calculations are costly (e.g. the quantiles). However it might make the functions a bit less compact.
This section is part of the ongoing development of the "Copernicus Forecast Module" (see PR #142).
Changes introduced in this PR:
• Added seasonal_statistics.py to the climada_petals.hazard.copernicus_interface module. It processes climate data from NetCDF/GRIB files, computes daily and monthly statistics, and generates ensemble statistics. The module assigns forecast periods, aggregates data, and derives statistical insights for climate risk assessments. Functions include heat index calculations, time grouping, and statistical analysis.
• Added test_seasonal_statistics.py to validate seasonal forecast indices, ensuring accurate calculations for Tmean, Tmin, Tmax, and ensemble statistics. It includes tests for monthly datasets, error handling, and robustness, specifically testing calculate_heat_indices_metrics, calculate_statistics_from_index, calculate_monthly_dataset, and monthly_periods_from_valid_times.
Thank you, Luca. Please review seasonal_statistics.py and test_seasonal_statistics.py. Let me know if you need more information.
PR Author Checklist
develop
)PR Reviewer Checklist