[Proposal] Sense check: event.trigger #91

Closed
stufraser1 opened this issue Jun 9, 2023 · 25 comments · Fixed by #121
Labels
hazard Issues related to Hazard data proposal New feature or request

Comments

@stufraser1
Member

event.trigger is an object used to identify the event.trigger.hazard_type and event.trigger.process_type which triggered the main hazard included in the event.

We should sense-check whether this should sit at the event level as it does now, or at the event_set level instead.

There could be cases where we have an event_set of multiple (historical or synthetic) tsunami events, in which some are triggered by seismic activity, others by landslide, others by volcano. This supports the need to keep it at event level.

On the other hand, a storm surge event set will have the same trigger (ETC or TC) for all events and it would be convenient to identify the trigger once.

On balance, for the flexibility of assigning this to each event where this is a hazard map or historical, I would say keep event.trigger in the event object. For an event_set where we do not list every one of the 10,000 events in metadata, can we use event.trigger anyway to define the trigger applicable to all events?

@duncandewhurst
Contributor

For an event_set where we do not list every one of the 10,000 events in metadata, can we use event.trigger anyway to define the trigger applicable to all events?

If I understood correctly, you mean creating a single 'dummy' event in order to have somewhere to put the trigger, when in fact there are actually 10,000 events. To keep the semantics clear, I think that one event in RDLS should always represent one event in the underlying dataset.

Given the two scenarios that you outline:

  • an event_set containing events with different triggers
  • an event_set with an empty .events array

I think that points us towards having two fields:

| Field | Title | Description | Type |
| --- | --- | --- | --- |
| event.trigger | Trigger | The trigger for this event. | object (Trigger) |
| event_set.triggers | Triggers | The triggers for the events in this event set. You should only use this field when details of individual events are not included in the RDLS metadata. Otherwise, you should use event.trigger to provide the trigger for each event. | array (Trigger) |
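
As an illustration of how a consumer might read the two fields together, here is a minimal Python sketch. It assumes an event set is a plain dict with an `events` array and a `triggers` array as proposed above. The function name `triggers_for_event_set` is hypothetical, and the "Extratropical Cyclone" value is illustrative only and has not been checked against the RDLS codelists.

```python
def triggers_for_event_set(event_set):
    """Return the distinct triggers that apply to an event set.

    Assumption: each listed event may carry a `trigger` object, and
    `event_set["triggers"]` is only used when no events are listed.
    """
    events = event_set.get("events", [])
    if events:
        # Events are listed: read each event's own trigger.
        candidates = [event["trigger"] for event in events if event.get("trigger")]
    else:
        # No event-level metadata: fall back to the summary field.
        candidates = event_set.get("triggers", [])

    # De-duplicate while keeping first-seen order.
    distinct = []
    for trigger in candidates:
        if trigger not in distinct:
            distinct.append(trigger)
    return distinct


TC = {"hazard_type": "Strong Wind", "process_type": "Tropical Cyclone"}
ETC = {"hazard_type": "Strong Wind", "process_type": "Extratropical Cyclone"}

# Scenario 1: events are listed, each with its own trigger.
listed_events = {"events": [{"trigger": TC}, {"trigger": ETC}, {"trigger": TC}]}
# Scenario 2: the .events array is empty, so triggers are given once.
summary_only = {"events": [], "triggers": [TC, ETC]}

print(triggers_for_event_set(listed_events))
print(triggers_for_event_set(summary_only))
```

Both scenarios yield the same list of distinct triggers, which is the point of keeping the two fields mutually exclusive.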

@matamadio
Contributor

This seems the most straightforward and intuitive solution.

@matamadio matamadio changed the title Sense check: event.trigger [Proposal] Sense check: event.trigger Jun 12, 2023
@matamadio matamadio added proposal New feature or request hazard Issues related to Hazard data labels Jun 12, 2023
@johcarter
Collaborator

Can we also have multiple process types at the event set level? For instance, all events could be strong wind (ETC/TCY) potentially triggering storm surge (FSS). It's useful to know at a high level all of the covered process types.

@duncandewhurst
Contributor

The Trigger object has both .hazard_type and .process_type properties so the proposal in my previous comment covers both hazard types and process types. @johcarter does that answer your question?

@johcarter
Collaborator

Any chance of a quick example showing what event_set.triggers might look like when there are two process types (let's say Tropical Cyclone and Storm Surge) applying to the whole event_set, please?

@stufraser1
Member Author

stufraser1 commented Jun 15, 2023

In the case of having one event set for each of TC and SS (loss for TC, loss for SS) we might have the following, where the event set of TC losses doesn't have a trigger, but the event set for storm surge losses has the TC as the trigger. The use of trigger has been included to identify a triggering event or process, not to imply that the triggering event information was combined with that of the triggered event (i.e. to describe a combined loss, or a hazard map with wind speed and storm surge height). Including event sets changes the requirement a bit, because there is then potential to have a dataset which includes >1 hazard type in a single resource file.

This is also true of losses, in fact - where we can have a loss due to cyclone wind, a loss due to storm surge, and a loss due to both combined. It is also increasingly relevant for vulnerability functions which consider more than one hazard (see text image below).

{
  "event_set": [
    {
      "hazard_type": "Strong Wind",
      "process_type": "Tropical Cyclone",
      "trigger": 
        {
          "hazard_type": "",
          "process_type": ""
        }
    },
    {
      "hazard_type": "Coastal Flood",
      "process_type": "Storm Surge",
      "trigger": 
        {
          "hazard_type": "Strong Wind",
          "process_type": "Tropical Cyclone"
        }
    }
  ]
}

What is missing (@johcarter's point) is an event set where we've got losses combined for TC and SS. This is a feature that was included in an earlier version of RDL (in that case the focus was on ensuring this could be captured for vulnerability, but it applies equally to event sets):
image
However, this seems to have been dropped in subsequent developments.

To align with this, we could include primary and secondary perils in their own fields where an event set includes >1 hazard and/or >1 process, e.g.:

{
  "event_set": [
    {
      "hazard_type_primary": "Strong Wind",
      "process_type_primary": "Tropical Cyclone",
      "hazard_type_secondary": "Coastal Flood",
      "process_type_secondary": "Storm Surge",
      "trigger": 
        {
          "hazard_type": "",
          "process_type": ""
        }
    }
  ]
}

Thinking about interoperability, and mapping data between RDLS and, say, OED: OED accounts for multiple hazards in the perilcode field as a semicolon-separated list of codes, e.g., "Windstorm (ETC + TC) with Storm Surge" would be "WTC;WEC;WSS", so there could be an advantage in using a single field to capture multiple hazards.

Facilitating this interoperability would require hazard_type and process_type to accept an array:

{
  "event_set": [
    {
      "hazard_type": "Strong Wind; Coastal Flood",
      "process_type": "Tropical Cyclone; Storm Surge",
      "trigger": 
        {
          "hazard_type": "",
          "process_type": ""
        }
    }
  ]
}
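
To make the interoperability point concrete, the following is a minimal Python sketch of converting between a semicolon-separated string (the form used in the example above and in the OED perilcode field) and an array. The function names `split_multi_value` and `join_multi_value` are hypothetical, and no mapping between RDLS hazard types and OED peril codes is implied.

```python
def split_multi_value(value):
    """Split a semicolon-separated string into a list of trimmed values."""
    return [part.strip() for part in value.split(";") if part.strip()]


def join_multi_value(values):
    """Join a list of values into a semicolon-separated string."""
    return ";".join(values)


# Values taken from the examples in this comment.
hazard_types = split_multi_value("Strong Wind; Coastal Flood")
process_types = split_multi_value("Tropical Cyclone; Storm Surge")
peril_codes = split_multi_value("WTC;WEC;WSS")

print(hazard_types)
print(process_types)
print(join_multi_value(peril_codes))
```

Storing the values as an array and joining them only on export would avoid having to parse delimiters when reading RDLS metadata.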

@johcarter
Collaborator

Thank you Stu. I think I have a preference for the first example because you can support multiple process types (including more than two) as well as specify a trigger where appropriate. It includes all the information.

The last one is also flexible but I think the first is cleaner.

In our footprint resource file, for multi-peril models we would always have multiple processes in the same file because they occur within the same event, as opposed to separate event sets for each process.

I don't think I would go for 'Primary' and 'Secondary' because they are limiting and maybe you don't want to assign these labels to independent hazard processes which occur together.

@stufraser1
Member Author

stufraser1 commented Jun 15, 2023

I don't think I would go for 'Primary' and 'Secondary' because they are limiting and maybe you don't want to assign these labels to independent hazard processes which occur together.

I agree with the problems in terminology.

In our footprint resource file, for multi-peril models we would always have multiple processes in the same file because they occur within the same event, as opposed to separate event sets for each process.

This is the Oasis case, but the majority of data we deal with in the development sector are single-hazard hazard maps, so we need to handle both, which the first example can do too. When thinking about losses, we have many examples where a loss is single peril or combined perils.

If we used the first example, we would need to be clear that trigger could be used to describe that the trigger event might also be included in the same file as the main hazard/process OR that it is the trigger but doesn't occur in the same file.

Best way to do that @odscjen / @odscrachel / @duncandewhurst ?

@odscjen
Contributor

odscjen commented Jun 19, 2023

If we used the first example, we would need to be clear that trigger could be used to describe that the trigger event might also be included in the same file as the main hazard/process OR that it is the trigger but doesn't occur in the same file.

In the example given in #91 (comment) I read that as an example of the first option, so one event_set describing the trigger event and another event_set describing the event caused by the trigger. If the dataset doesn't contain an event_set that describes the trigger event then the user just wouldn't include that event_set information, e.g. they'd only include:

{
  "event_set": [
    {
      "hazard_type": "Coastal Flood",
      "process_type": "Storm Surge",
      "trigger": 
        {
          "hazard_type": "Strong Wind",
          "process_type": "Tropical Cyclone"
        }
    }
  ]
}

So I don't think there's any particular best way of doing this beyond ensuring the guidance states to only create an event_set for the events that you are providing data for, i.e. if the trigger event isn't being described then only state it as the trigger and not as a hazard_type or process_type at the event_set level.

EDIT: realised I'd put in the wrong part of the example! Now the example is what I meant

@stufraser1
Member Author

The question remains what to do if we have a combined event_set containing the data for the main event and the triggering events.
I think we would use

{
  "event_set": [
    {
      "hazard_type": "Coastal Flood",
      "process_type": "Storm Surge",
      "trigger": 
        {
          "hazard_type": "Strong Wind",
          "process_type": "Tropical Cyclone"
        }
    }
  ]
}

And include just one data file. (Having two data files would imply they are separate.)
This case would only occur if we have an event set file, I think, not individual events, which would more likely separate out the event types in any footprints.

@duncandewhurst
Contributor

I think this discussion points to the need to clearly define what an event set is. My understanding from the following diagram from https://docs.riskdatalibrary.org/hazard.html and from the examples given in the issue description was that an event set is a collection of events of the same hazard type:

image

Based on the recent discussion, it sounds like an event set is simply a collection of events, without the constraint of a shared hazard type, and that the purpose of modelling event sets in RDLS is to provide a place to put summary information about the hazard types, process types and triggers covered by the events in the event set when event-level metadata is not provided.

This issue was originally about how to model an event set that contains events with different triggers. That is why my proposal in #91 (comment) has event_set.triggers as an array, which seems to have been dropped from the JSON examples shared in later comments.

The recent discussion suggests that we also need to model event sets that contain events with different hazard and process types. To avoid the terminological issues flagged in #91 (comment) and to avoid limiting the number of different hazard and process types in an event set, I think that hazard_type and process_type should be arrays, as in the final JSON example in #91 (comment).

Taking into account the above, I think that the correct way to model a combined event set that contains the data for the main event and the triggering events is as follows. I've provided draft descriptions/definitions for each field to aid comprehension:

{
  "event_sets": [   // The collections of events described in the dataset.
    {
      "hazard_types": [ // The physical hazard phenomena covered by the event set
        "Coastal Flood",
        "Strong Wind"
      ],
      "process_types": [ // The hazard processes covered by the event set
        "Storm Surge",
        "Tropical Cyclone"
      ],
      "triggers": [ // The causes of the events in the event set
        {
          "hazard_type": "Strong Wind", // The physical hazard phenomena for the trigger
          "process_type": "Tropical Cyclone" // The hazard process for the trigger
        }
      ]
    }
  ]
}

This is in line with @odscjen's recommendation in #91 (comment):

So I don't think there's any particular best way of doing this beyond ensuring the guidance states to only create an event_set for the events that you are providing data for, i.e. if the trigger event isn't being described then only state it as the trigger and not as a hazard_type or process_type at the event_set level.

The problem with the modelling proposed in #91 (comment) is that it isn't possible to distinguish a combined event set that contains the data for the main event and the triggering events from an event set that contains only data for the main event, but for which the triggers are disclosed in event_sets.triggers.
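
With the array-based structure above, the distinction could be read off by checking whether a trigger's types also appear in hazard_types and process_types. A minimal Python sketch follows, assuming the field names from the JSON example above; the function name `trigger_data_included` is hypothetical. Because hazard_types and process_types are separate arrays, the check cannot confirm that the two values belong to the same pair, so it is only a heuristic.

```python
def trigger_data_included(event_set):
    """For each trigger, report whether its types are also listed as
    covered by the event set (i.e. the trigger data is likely included)."""
    hazard_types = set(event_set.get("hazard_types", []))
    process_types = set(event_set.get("process_types", []))
    return [
        trigger["hazard_type"] in hazard_types
        and trigger["process_type"] in process_types
        for trigger in event_set.get("triggers", [])
    ]


trigger = {"hazard_type": "Strong Wind", "process_type": "Tropical Cyclone"}

# Combined event set: data for the main events and the triggering events.
combined = {
    "hazard_types": ["Coastal Flood", "Strong Wind"],
    "process_types": ["Storm Surge", "Tropical Cyclone"],
    "triggers": [trigger],
}
# Event set with data for the main events only; the trigger is disclosed.
main_only = {
    "hazard_types": ["Coastal Flood"],
    "process_types": ["Storm Surge"],
    "triggers": [trigger],
}

print(trigger_data_included(combined))
print(trigger_data_included(main_only))
```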

Let me know if I'm barking up the wrong tree here!

@johcarter
Collaborator

The structure given immediately above by @duncandewhurst looks fine too from my perspective.

Regarding the problem of distinguishing a combined event set in the data, does this problem then move to the metadata describing the resource file containing the combined footprint? Am I right in saying that the resource file property "process_type" is a single string, where we would need an array representing more than one process type? And similarly, "imt" is a string whereas we might need an array of imts?

@odscjen
Contributor

odscjen commented Jun 26, 2023

@stufraser1 does Duncan's suggestion in #91 (comment) cover the cases it needs to? If so can we move it to the Agreed column?

@stufraser1
Member Author

Based on the recent discussion, it sounds like an event set is simply a collection of events, without the constraint of a shared hazard type, and that the purpose of modelling event sets in RDLS is to provide a place to put summary information about the hazard types, process types and triggers covered by the events in the event set when event-level metadata is not provided.

A shared hazard type constraint does exist, and an event set is needed to describe the events, even when event-level metadata is available, acting as a summary of those events.

In #91 (comment) the example does not define which hazard type the trigger relates to.
Perhaps it is enough, for a combined event set where we have the source and trigger together in the same event set, to list them in the hazard type and process type without the trigger, proposing to only use trigger where the event set contains the 'triggered' event.

@stufraser1
Member Author

stufraser1 commented Jun 26, 2023

And similarly, "imt" is a string whereas we might need an array of imts?

For combined event sets, this may be the case.

@odscjen
Contributor

odscjen commented Jun 27, 2023

So to summarize where I think we're at:

  • An event_set contains events with a shared hazard_type/process_type pair.
  • The event_set should provide a summary of the events it contains.
  • Each event should state which hazard_type/process_type pair it contains data from.
  • A hazard_type/process_type pair can be triggered by a different hazard_type/process_type pair. It can be useful to know what the trigger was and is important to ensure the link between them is clear.
  • Depending on the hazards and triggers it is possible that the event_set can contain data from both the main and the trigger pair types. @stufraser1 can you provide an example of this? From the descriptions so far I'm struggling to conceptualise how this works alongside your comment that "a shared hazard type constraint does exist"
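
As a rough illustration of the first three points, the summary held by an event_set could be derived from its events. This is a minimal Python sketch, assuming each event is a dict with `hazard_type` and `process_type` keys; that event structure and the function name `summarise_pairs` are assumptions for illustration, not agreed modelling.

```python
def summarise_pairs(events):
    """Return the distinct (hazard_type, process_type) pairs across the
    given events, in first-seen order."""
    pairs = []
    for event in events:
        pair = (event["hazard_type"], event["process_type"])
        if pair not in pairs:
            pairs.append(pair)
    return pairs


events = [
    {"hazard_type": "Coastal Flood", "process_type": "Storm Surge"},
    {"hazard_type": "Coastal Flood", "process_type": "Storm Surge"},
    {"hazard_type": "Strong Wind", "process_type": "Tropical Cyclone"},
]

pairs = summarise_pairs(events)
print(pairs)
# A single pair would mean all events share one hazard_type/process_type pair.
print(len(pairs) == 1)
```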

@stufraser1
Member Author

stufraser1 commented Jun 27, 2023

Please see these slides, in which I try to lay out the use cases for footprint data and event sets.
https://disasterriskuk-my.sharepoint.com/:p:/g/personal/stuart_disaster-risk_uk/EQ9WMker5wBMgq4DPUpeILcB1Pk4q3R0cMHfjiUlemAJtQ?e=93N5Qz

Depending on the hazards and triggers it is possible that the event_set can contain data from both the main and the trigger pair types.

The final slide gives possible variations of event sets, where two hazards are contained in 2 event set resource files, or both are contained in one.

@stufraser1
Member Author

"a shared hazard type constraint does exist"

In that an event set describing flood would only contain events relating to flood, not to earthquake, for example.

@duncandewhurst
Contributor

@stufraser1 how does that fit with the examples in #91 (comment), which have multiple hazard types per event set?

@stufraser1
Member Author

@stufraser1 how does that fit with the examples in #91 (comment), which have multiple hazard types per event set?

The event set should always describe the hazard type(s) contained within the event(s), that is what I mean by constraint, but we might not have the same interpretation of 'constraint'?

@duncandewhurst
Contributor

Ah, I see. Sorry, I should've been clearer. I meant the constraint that all events in an event set must share the same hazard type, i.e. you can only have one hazard type per event set. It sounds like that is not the case.

@stufraser1
Member Author

stufraser1 commented Jun 28, 2023 via email

@duncandewhurst
Contributor

@stufraser1 thanks for preparing the slides. If it is important to model the link between triggers and hazards at the event set level, then a more structured model would be necessary with triggers nested within hazards. This modelling also means that we can have multiple intensity measures per event set with a clear link between intensity measures and hazards.

I've prepared two examples based on the final two examples on your final slide. The first example is annotated with draft descriptions for each field. The only part I am not sure about is whether process should be an array, i.e. whether one hazard type can be related to more than one hazard process. Please take a look and let me know what you think.

If we decide to go with this approach at the event set level, we should re-use the same modelling at event level, although Event.hazard can be an object rather than an array.

Example 1

The event set includes only coastal flooding events. The coastal flooding events were triggered by strong wind events that are not included in the event set.

{
  "hazard": {
    "event_sets": [
      {
        "hazards": [ // The hazards included in this event set.
          {
            "type": "Coastal Flood", // The hazard type for this hazard, from the closed hazard type codelist.
            "process": "Storm Surge", // The process type for this hazard, from the closed hazard process type codelist.
            "intensity_measure": "fl_wd:m", // The metric and unit in which the intensity of this hazard is measured.
            "trigger": { // The trigger for this process
              "type": "Strong Wind", // The hazard type for this trigger, from the closed hazard type codelist.
              "process": "Tropical Cyclone" // The process type for this trigger, from the closed hazard process type codelist.
            }
          }
        ]
      }
    ]
  }
}

Example 2

The event set includes both coastal flooding events and strong wind events. The coastal flooding events were triggered by the strong wind events.

{
  "hazard": {
    "event_sets": [
      {
        "hazards": [
          {
            "type": "Coastal Flood",
            "process": "Storm Surge",
            "intensity_measure": "fl_wd:m",
            "trigger": {
              "type": "Strong Wind",
              "process": "Tropical Cyclone"
            }
          },
          {
            "type": "Strong Wind",
            "process": "Tropical Cyclone",
            "intensity_measure": "PGWS_tcy:km/h"
          }
        ]
      }
    ]
  }
}
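
To show how the nested modelling separates the two cases, here is a minimal Python sketch that reads each hazard in an event set and reports whether its trigger is itself included in the same event set. It assumes the field names from Examples 1 and 2; the function name `describe_event_set` and the status strings are hypothetical.

```python
def describe_event_set(event_set):
    """Summarise each hazard as (type, intensity_measure, trigger status)."""
    # Every (type, process) pair for which the event set holds data.
    included = {(h["type"], h["process"]) for h in event_set["hazards"]}
    summary = []
    for hazard in event_set["hazards"]:
        trigger = hazard.get("trigger")
        if trigger is None:
            status = "no trigger stated"
        elif (trigger["type"], trigger["process"]) in included:
            status = "trigger included in event set"
        else:
            status = "trigger not included in event set"
        summary.append((hazard["type"], hazard["intensity_measure"], status))
    return summary


surge = {
    "type": "Coastal Flood",
    "process": "Storm Surge",
    "intensity_measure": "fl_wd:m",
    "trigger": {"type": "Strong Wind", "process": "Tropical Cyclone"},
}
wind = {
    "type": "Strong Wind",
    "process": "Tropical Cyclone",
    "intensity_measure": "PGWS_tcy:km/h",
}

example_1 = {"hazards": [surge]}        # coastal flooding events only
example_2 = {"hazards": [surge, wind]}  # coastal flooding and strong wind events

print(describe_event_set(example_1))
print(describe_event_set(example_2))
```

Because each trigger is nested inside the hazard it caused, the link between trigger and hazard, and between intensity measure and hazard, is explicit rather than inferred from parallel arrays.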

@odscjen
Contributor

odscjen commented Jul 7, 2023

I see a thumbs up from @matamadio for @duncandewhurst's latest suggestion, @stufraser1 are you happy for us to go ahead with this modelling?

@stufraser1
Member Author

stufraser1 commented Jul 7, 2023 via email

5 participants