Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checks in readers/writers to enforce correct types for event_name and event_id container and elements #951

Open
wants to merge 19 commits into
base: develop
Choose a base branch
from

Conversation

spjuhel
Copy link
Collaborator

@spjuhel spjuhel commented Sep 26, 2024

Changes proposed in this PR:

  • Update all writers in Impact and Hazard to fail if event_name attribute contains elements that are not strings.
  • In Impact readers (from_excel and from_csv), check if event_name elements are all string, and warn user and convert to string otherwise, as reading relies on pandas here, checking and casting also use pandas. (note, that changes to from_hdf5 were done in Explicitly convert event_name to strings in Impact.from_hdf5 #894)
  • Add a _check_and_cast_attr method in HazardIO with the purpose of checking the type of both containers and elements of hazard attributes read by the different reader, and casting when required (warning the user in this case).
  • Use this method in from_raster, from_xarray_raster, from_excel and from_csv to effectively enforce correct types.
  • Update the test to be consistent with the changes

The idea behind implementing _check_and_cast_attr is to factorize the checking/casting and allow for possible additional test on the type of container/elements of other attributes in the future.

Currently, it only checks that (when read, i.e., before being set as an attribute of the object):

  • event_name is a list of str
  • event_id is a ndarray (not element type check)

This should be considered as a "medium term" temporary solution, before handling attributes such as event_id event_name in a DataFrame, which would greatly simplify reading, writing, type checking, etc.

This PR fixes #910 #950

PR Author Checklist

PR Reviewer Checklist

spjuhel added 4 commits September 25, 2024 09:42
The rationale is to factorize the checking of hazard attribute in readers, for
possible future checks than event_name == str
`assert_almost_equals` tries to convert to float which failed for some reason.
In any case we want the event_names to be completely equal not almost.
@spjuhel spjuhel changed the base branch from main to develop September 26, 2024 09:07
@spjuhel spjuhel self-assigned this Sep 26, 2024
@spjuhel
Copy link
Collaborator Author

spjuhel commented Sep 26, 2024

Note that I still need to check/decide if additional tests that make sure a warning is raised, or writing fails in the corresponding case is required.

Any discussion on this will be appreciated :)

@chahank
Copy link
Member

chahank commented Dec 10, 2024

@spjuhel Any plan with this PR? Is this ready for review? In this case, please assign a reviewer. Thanks!

@spjuhel
Copy link
Collaborator Author

spjuhel commented Dec 12, 2024

Just asked @emanuel-schmid to review as it is slightly technical.

I did not add any tests, still unsure whether this would be useful...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure proper handling of event_name attribute in Hazard and Impact
3 participants