Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Set all Hazard attributes in __init__ #546

Closed
wants to merge 5 commits into from

Conversation

peanutfun
Copy link
Member

@peanutfun peanutfun commented Oct 14, 2022

WIP: This serves as an example draft

Changes proposed in this PR:

  • Update Hazard.__init__ to take all data as arguments, skipping the null-initialization.
  • Default-initialize all of the attributes that do not require user input.
  • Perform a self-check after initialization by default. Users can opt to skip the check by setting fast=True.

To Do

  • Remove superfluous checks from Hazard.from_raster_xarray

PR Author Checklist

PR Reviewer Checklist

* Update Hazard.__init__ to take all data as arguments, skipping the
  null-initialization.
* Update some of the unit tests.

This is an incomplete update, as most of the code base still needs to be
updated to the new __init__.
"""
Initialize values.
def __init__(self,
haz_type: str,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only (?) major change of the interface so far. Here, I'm requiring users to specify the hazard type. I opted for this because I feel like the hazard type is a very important bit of information. I could give it an empty string as default value, but then I would need to move the argument behind the parameters without default types. That would make it look "less important" to me.

@chahank, what do you think about this?

Copy link
Member

@chahank chahank Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, to be consistent with the previous API, all attributes must have the same defaults as before. I think changing the defaults (and removing some) is a major change, and should not be done lightly. I would, although it is still sub-optimal, for this update, just keep all the defaults.

See for instance the example of the Impact class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I have a follow-up issue: The haz_type is actually an attribute of the Tag, not the Hazard. So for keeping the previous defaults, we could simply remove the haz_type from the argument list. It will then become part of the tag_kwargs. However, that obscures this very important variable. What to do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand. Currently, the default is haz_type='', and it is a variable of the __init__... why would that be a problem now? You can just leave it as it is right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same default as in Tag. Such double-defaults are dangerous because the default in Hazard would ignore any changes to the default in Tag.

Another viable option I see is to have users insert a proper Tag object as optional parameter. If none is passed, we initialize a default one (this also default-inits the haz_type).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but this is a problem not to be solved here. The whole Tag needs a complete overhaul. This is also in the pipeline ;).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 06c4065

Check if default-initialization of optional attributes succeeds
event_name: Optional[List[str]] = None,
fraction: Optional[sparse.csr_matrix] = None,
pool: Optional[Pool] = None,
fast: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced by the addition of this variable. We have this no-where in the code, and I also have not seen that in other Python modules.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your issue the existence of the variable or its name? With the full initialization of the object we now have the chance to perform an automated check of the resulting object. I think this is very important and should only be skipped if a user is absolutely sure that the data is consistent.

I don't think the current checks are very expensive. So if you opt for removing this variable, I would be in favor of always performing the check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is the route we took so far with Impact.

Recover the default empty string value for `haz_type` argument of
Hazard initializer.
Update appropriate unit test
@chahank
Copy link
Member

chahank commented Oct 17, 2022

I think one question is, if in the __init__ some basic data consistency checks are, whether a failure of the tests leads to an error or a warning. The error has the advantage to help with consistent data, and helps writing stable code. The warning has the advantage to be more tolerant with the user, e.g., wenn loading data (for instance, if I get some external data, load it into a Hazard, but the dates are for some reason incomplete, I could still load the data and then correct for it, instead of having to manipulate the external data).

@peanutfun
Copy link
Member Author

Good point! We could make this a user decision via a string parameter check with three allowed values:

  • error: Run checks and throw errors
  • warn: Run checks and throw warnings
  • skip: Skip checks

I think the latter is important because depending on the data, these checks are potentially expensive and there should be an option for users to skip them altogether (the "I know what I'm doing" option)

@chahank
Copy link
Member

chahank commented Oct 17, 2022

I do not dislike the idea per se, but I do not know of other larger Python packages that do this. Is there a reason maybe?

@peanutfun
Copy link
Member Author

Superseded by #549, which is a simpler implementation. We'll come back to this

@peanutfun peanutfun closed this Oct 25, 2022
@emanuel-schmid emanuel-schmid deleted the feature/hazard-init branch January 12, 2023 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants