Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in Climatic > Prepare > Transform > Water Balance #9231

Open
N-thony opened this issue Nov 5, 2024 · 11 comments
Open

Bug in Climatic > Prepare > Transform > Water Balance #9231

N-thony opened this issue Nov 5, 2024 · 11 comments
Assignees

Comments

@N-thony
Copy link
Collaborator

N-thony commented Nov 5, 2024

@lilyclements @MeSophie there is this error below reported by Lisa when testing R-Instat test version 0.8.0
image
image
I suspect in the Evaporation option the Variable receiver is not setting the value correctly to the associated R argument?

@MeSophie
Copy link
Contributor

MeSophie commented Nov 6, 2024

@N-thony Please is it possible to obtain the data using by Lisa? I used dodoma version then I created a dummy evaporation variable (prepare>Data Frame>Insert Column) and everything works well on R-Instant version 0.7.20 and on the developer version so I couldn't reproduce the error. Also Can I find version 0.8.0 in the wiki? Even though the download link doesn't open at the moment.

@N-thony
Copy link
Collaborator Author

N-thony commented Nov 6, 2024

@N-thony Please is it possible to obtain the data using by Lisa? I used dodoma version then I created a dummy evaporation variable (prepare>Data Frame>Insert Column) and everything works well on R-Instant version 0.7.20 and on the developer version so I couldn't reproduce the error. Also Can I find version 0.8.0 in the wiki? Even though the download link doesn't open at the moment.

@MeSophie you get this error if you have a missing value in your data.

@N-thony
Copy link
Collaborator Author

N-thony commented Nov 6, 2024

@N-thony Please is it possible to obtain the data using by Lisa? I used dodoma version then I created a dummy evaporation variable (prepare>Data Frame>Insert Column) and everything works well on R-Instant version 0.7.20 and on the developer version so I couldn't reproduce the error. Also Can I find version 0.8.0 in the wiki? Even though the download link doesn't open at the moment.

@MeSophie you get this error if you have a missing value in your data.

@rdstern when I ignore this row containing NA, it works fine. The R message "missing value where TRUE/FALSE needed" is generally informative, I wonder on how we can catch this type of message, improve it to inform the user in simple terms, such as "Some data fields are missing values. Please check your data for completeness."

@MeSophie
Copy link
Contributor

MeSophie commented Nov 6, 2024

@N-thony Please is it possible to obtain the data using by Lisa? I used dodoma version then I created a dummy evaporation variable (prepare>Data Frame>Insert Column) and everything works well on R-Instant version 0.7.20 and on the developer version so I couldn't reproduce the error. Also Can I find version 0.8.0 in the wiki? Even though the download link doesn't open at the moment.

@MeSophie you get this error if you have a missing value in your data.

Okay I will try to introduce missing value in my data.

@lilyclements
Copy link
Contributor

lilyclements commented Nov 6, 2024

We should have a discussion on how we want to handle it when there are NA values in the evapotranspiration variable. We discussed it briefly here, and in person, and my memory was that we left it for now, and I meanwhile suggested to CIMH to replace NAs with a value of their choice using a different dialog first (Replace Values dialog).

How do we want to handle it?
Perhaps we should offer here an option to "Replace Missing Values With: ____________" in the dialog for the evapotranspiration variable?
If checked, we can just run a function which replaces NAs

E.g.,

df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
df %>% mutate(x = replace_na(x, 3))

# or

df$x <- replace_na(df$x, 3)

If they want more advanced options (E.g., locf), then they can do that in the Replace Values dialog.

@rdstern @MeSophie @N-thony what do you think?

###################

Changes to our R Code:

# Dialog: Transform

# add in a new line to rename the evapotranspiration variable. Here, this variable is `evapo`, and the value it is being replaced with is `5`
evapotranspiration <- instat_calculation$new(type="calculation", function_exp="replace_na(evapo, 5)", result_name="evapo", calculated_from=list("dodoma"="evapo"))

rain_min <- instat_calculation$new(type="calculation", function_exp="ifelse(test=is.na(x=rain), yes=0, no=rain)", result_name="rain_min", calculated_from=list("dodoma"="rain"))

# in wb_min, we need to add our new evapotranspiration calculation to our sub_calculations. This just means changing list(rain_min) to now be list(rain_min, evapotranspiration)
wb_min <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate2(.f= ~ pmin(pmax(..1 + ..2 - WB_evaporation(..1, 0.5, 90, ..3, ..2), 0), 90), .y=tail(x=evapo, n=-1), .x=tail(x=rain_min, n=-1), .init=0)", result_name="wb_min", sub_calculations=list(rain_min, evapotranspiration))

rain_max <- instat_calculation$new(type="calculation", function_exp="ifelse(yes=100, test=is.na(x=rain), no=rain)", result_name="rain_max", calculated_from=list("dodoma"="rain"))

# similarly, we need to add our new evapotranspiration calculation to our sub_calculations for wb_max:
wb_max <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate2(.f= ~ pmin(pmax(..1 + ..2 - WB_evaporation(..1, 0.5, 90, ..3, ..2), 0), 90), .y=tail(x=evapo, n=-1), .x=tail(x=rain_max, n=-1), .init=0)", result_name="wb_max", sub_calculations=list(rain_max, evapotranspiration))

wb <- instat_calculation$new(type="calculation", function_exp="round(ifelse(test=(wb_min != wb_max) | is.na(x=rain), yes=NA, no=wb_min), 1)", result_name="water", adjacent_column="rain", save=2, sub_calculations=list(wb_min, wb_max), before=FALSE)

@rdstern
Copy link
Collaborator

rdstern commented Nov 6, 2024

I suggest we continue to not allow missing values in the evapotranspiration variable. We can have missing values in the rainfall variable and have gone to a lot of trouble there. So we just need to check first and give a message. Sorry, missing values are not allowed in the evaporation variable. I suggest it isn't worth trying to get cleverer on different methods of estimating them.

@N-thony
Copy link
Collaborator Author

N-thony commented Nov 7, 2024

I suggest we continue to not allow missing values in the evapotranspiration variable. We can have missing values in the rainfall variable and have gone to a lot of trouble there. So we just need to check first and give a message. Sorry, missing values are not allowed in the evaporation variable. I suggest it isn't worth trying to get cleverer on different methods of estimating them.

@rdstern yes, and probably have the OK button disable in such cases. @MeSophie can you make the change?
Also I noticed, when I intentionally introduced single NA value in the first row of the evaporation variable, I don't get any error message but I get it anywhere else.

@rdstern
Copy link
Collaborator

rdstern commented Nov 10, 2024

@MeSophie or @lilyclements or @Vitalis95 I have just run an example again, with a missing value in the evap column. It worked, but then gave me an unusable water column, because the values were all missing from then on.
We need a little routine (a new function?) that simply checks the evap variable (or any variable) for missing values and then gives a sensible message. Perhaps something like: Sorry, missing values are not permitted in this variable here. They have to be estimated first.

I hope this will be easy to do? I assume it should also be added to the Climatic > Prepare > End of Rains (end of season option)

@lilyclements
Copy link
Contributor

To check for missing values in a variable, the function would be:

missing_variable <- function(variable){
  any(is.na(variable))
}

Then to run it, the r code would be:

calc <- data_book$get_columns_from_data(data_name = "dodoma", col_names = "calc")
missing_variable(calc)

Where calc is our column in the dodoma data.
(To do get_columns_from_data, we just run like we do on other dialogs -- ucrReceiverNAME.SetParameterIsRFunction(), you don't need to explicitly write that line out).

@N-thony
Copy link
Collaborator Author

N-thony commented Nov 12, 2024

To check for missing values in a variable, the function would be:

missing_variable <- function(variable){
  any(is.na(variable))
}

Then to run it, the r code would be:

calc <- data_book$get_columns_from_data(data_name = "dodoma", col_names = "calc")
missing_variable(calc)

Where calc is our column in the dodoma data. (To do get_columns_from_data, we just run like we do on other dialogs -- ucrReceiverNAME.SetParameterIsRFunction(), you don't need to explicitly write that line out).

@Vitalis95 can you make the change?

@lilyclements
Copy link
Contributor

@Vitalis95 the R code can alternatively be

calc <- data_book$get_columns_from_data(data_name = "dodoma", col_names = "calc")
any(is.na(calc ))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants