Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce geo-threshold alerts #76285

Merged
merged 92 commits into from
Oct 5, 2020
Merged

Introduce geo-threshold alerts #76285

merged 92 commits into from
Oct 5, 2020

Conversation

kindsun
Copy link
Contributor

@kindsun kindsun commented Aug 31, 2020

Adds the ability to create geo-threshold alerts from the alerting app. This functionality will be required for eventually creating & previewing alerts from the Maps app.

This PR introduces the following:

  • Geo-threshold alerts server-side functionality
  • Basic validation
  • UI for creating these alerts

There are several outstanding pieces which may be added in later working including:

  • API for geo threshold alert creation
  • More robust crossing detection
  • Additional UI filters
  • Maps app interaction for alert param selection and creation

Instructions for review

This is a challenging PR to review as it requires both simulating alerts and viewing them as they happen. Here are the steps you should take to test this functionality:

0. Running the ES/Kibana dev env with https enabled

  • Just run the commands you're used to with --ssl as an arg, i.e.:
    • yarn es snapshot --ssl
    • yarn start --ssl

1. Get data in system

  • You'll be using the script: https://github.com/thomasneirynck/faketracks to generate "live data" to populate the system. Clone the repo and follow the instructions in the readme to set up
  • Final command to run should look something like: node ./generate_tracks.js -t manhattan_tracks.json

2. Create a new alert index in dev tools

Execute the following two commands:

  • Create index
PUT /manhattan_index_alerts
  • Create mappings
PUT /manhattan_index_alerts/_mapping
{
  "properties": {
    "AlertInstanceId": {
      "type": "keyword"
    },
    "CrossingEventTimestamp": {
      "type": "date"
    },
    "CurrentBoundaryId": {
      "type": "keyword"
    },
    "CurrentLocation": {
      "type": "geo_point"
    },
    "PreviousBoundaryId": {
      "type": "keyword"
    },
    "PreviousLocation": {
      "type": "geo_point"
    }
  }
}

3 Start the map to index the boundaries

  • Go to the Maps app
  • Create a new map
  • Using GeoJSON Upload, upload the file from fake_tracks manhattan_boundaries.json located in the code cloned from https://github.com/thomasneirynck/faketracks. Default settings are all fine.
  • Keep the Maps tab open, you'll come back to this

5. Create index pattern for generated tracks

  • Set time field to default
  • Repeat for new index pattern: tracks
  • Leave tab open, you'll come back to this

6. Create alert

  • Go to Stack management > Alerts & Actions > Create Alert > Tracking threshold
  • Fill out top and middle sections. Should flow somewhat logically. Do set both Check every and Notify every to 2 seconds
  • Create index connector (any name is fine). For index name, use the one we created earlier in dev tools: manhattan_index_alerts
  • For document structure, use:
{
    "AlertInstanceId": "{{alertInstanceId}}",
    "CrossingEventTimestamp": "{{context.crossingEventTimeStamp}}",
    "CurrentBoundaryId": "{{context.currentBoundaryId}}",
    "CurrentLocation": "{{context.currentLocation}}",
    "PreviousBoundaryId": "{{context.previousBoundaryId}}",
    "PreviousLocation": "{{context.previousLocation}}"
}

7. Create another index pattern to track on maps for indexed alert data

  • Go to Stack Management > Index Patterns
  • Create new index pattern: manhattan_index_alerts

6. Create Map

  • You should already have the boundaries layer from the GeoJson upload earlier
  • Do top hits (quantity: 1) on the generate tracks script output docs (index: tracks). Set sorting on timestamp descending
  • Create another doc layer for the alerting index (not top hits) (manhattan_index_alerts: current location)
  • On time picker set: Last 30 seconds, <Apply> and Refresh every: 2 seconds, <start>

Nice to have layers

  • Do point to point on newly indexed alert docs (manhattan_index_alerts)
  • Line of tracks or previous tracks (no top hits)

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some comments, mostly stylistic - and primarily just reviewed the alert type itself.

My biggest concern is the calculation of instanceIds in the alert executor in x-pack/plugins/alerting_builtins/server/alert_types/geo_threshold/geo_threshold.ts - it looks like it the id could change over time for the same entities, which you probably don't want to do. But, hard for me to say. It's possible that's what you want, or it's possible currLocation.shapeId is constant over time and so the instance id won't change over time. Reach out if you need more explanation ...

const toBoundaryName = shapesIdsNamesMap[currLocation.shapeId] || '';
const fromBoundaryName = shapesIdsNamesMap[prevLocation.shapeId] || '';
services
.alertInstanceFactory(`${entityName}-${toBoundaryName || currLocation.shapeId}`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what currLocation.shapeId evaluates to, but you will want the value sent to alertInstanceFactory() to be consistent over time for the different instances you are dealing with (each movedEntity). Eg, say movedEntity contains one object, for two different invocations of the the alert executor (ie, separated by the alert interval). For both alert executor calls that entity is logically the same entity - but perhaps it "moved". In this case, you want the param to alertInstanceFactory() to NOT include the location, because that would end up generating a new instance.

It depends on how you want to handle this, but I'd think if this is tracking objects somehow, the alert instance id (the param to alertInstanceFactory()), each object should have it's own unique instance id that does not change over time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also worth noting the instance ids will show up in the alert details page - each instance will appear as row with a bit of status info (how long it's been active, etc), and the instance id should something identifiable to customers. As an example from o11y, the instance id is often a host name, a service name, maybe combo host-service name, etc (depending on the alert type).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each object should have it's own unique instance id that does not change over time.

Originally we did this. Believe it or not, the way it works currently is intended behavior 😬. This was a topic of heavy discussion along the way with this being the final result. We're aware of the pitfalls you've identified, but ultimately decided it provided the most useful version of the details page for now, acting as a sort of running manifest of points and their crossings into geographic regions.

I think ultimately this will evolve quite a bit in the near future. For threshold and other geo alert options, we'd be interested in possibly providing the user more information on the current state of tracked points/shapes. I'm not sure if what we're interested in will fit on this details page or perhaps in a separately generated dashboard that's something altogether different. Will definitely keep you and the team looped in thhough!

Copy link
Contributor

@gmmorris gmmorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't block merging this PR on my review, with @YulNaumenko & @pmuellr's reviews, I defer to them on this. :D

:elasticheart:

@kindsun
Copy link
Contributor Author

kindsun commented Oct 5, 2020

A few updates covering the latest round of changes:

  1. The issue where previous points outside of shapes weren't always picked up correctly has been resolved. We needed to set a higher aggregation limit on the inner aggregation than the default. The reason this was so much more noticeable on crossings from other to shape x was due to the large number of results in other (outside of shapes) vs. those in a shape.
  2. The functionality of this PR is now only accessible by using a config feature flag. @YulNaumenko the changes are introduced in this commit. In order to handle config values, we need at least a barebones server presence so they're properly attached to the initializerContext and made available to the UI, so I've added this.

To use Geo Threshold Tracking moving forward you need to add the following to your kibana.dev.yml: xpack.trigger_actions_ui.enableGeoTrackingThresholdAlert: true.

With it enabled:

WithTrackingThreshold

Disabled or absent from the config:

WithoutTrackingThreshold

  1. Addressed an issue identified offline where auto-complete would overwrite previously selected values on edit

  2. All feedback up to this point has been addressed but please feel free to double-check!

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

@kbn/optimizer bundle module count

id before after diff
triggers_actions_ui 275 285 +10

async chunks size

id before after diff
triggers_actions_ui 1.5MB 1.5MB +41.4KB

distributable file count

id before after diff
default 48094 48104 +10

page load bundle size

id before after diff
triggers_actions_ui 148.1KB 151.4KB +3.3KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The geo-alert threshold introduces a great feature to users who rely on the stack for storing and monitoring location-based sensor data. It broadens the stack's support for IoT-Observability use-cases.

  • On global or regional scale, consider monitoring vehicles like cars or ships with GPS and AIS sensor readings
  • In a closed environment, consider monitoring assets with RFID

Thanks for putting all the pieces together on this!

Thank you for introducing the feature flag. It will help to do future development in a more incremental fashion, and give an opportunity for motivated users to test the framework.


There are some outstanding gaps that imho we should address in follow-on PRs. Listing these out here for reference, but will create separate issues for these as well:

  • Feature completeness:

    • Introduce KQL filtering. Without it, users cannot narrow their entities or boundaries of interest. This is critical for operational use-cases (e.g. consider the analyst who would like to monitor this particular asset).
    • Enable "leaving" and generic "crossing"-conditions. These are similar to the entering-condition. Although not quite (e.g. The will require slight differences in how an AlertInstanceId should be represented).
  • Increase resiliency with real-world data: this PR has been validated with synthetic data (either completely manufactured, or synthetic data derived from real world recordings). However, some initial testing seems to indicate, that not all edge-cases that woudl arise in real-world data would be handled:

    • known knowns:
      • the detection resolution is too coarse:
        • The interval window introduces both a maximum bound on the number of crossing-events that can be detected in a single tick. It either detects 0 or 1 entering event per track. These limitations are somewhat artifical since they are largely due to the behaviour on the JS-side (so not due to the update-resolution of the real world tracks, nor the results Kibana gets back from Elasticsearch).
          - the algorithm will miss multiple different shape crossings if they occur in the same interval (e.g. consider a track going X->Y->Z. The algo will only detect X->Y, even though ES returns sufficient information to determine X-Y and Y->Z)
          - the algorithm will miss “blips” (e.g. consider a track going X->Y->X in an interval. The algo will not detect this because it only compares against the previous locations instead of all locations it retrieves from ES)
      • The delayed-offset parameter does double duty.
        • The delayed-offset parameter was introduced to deal with laggy real-world data, to handle the cases where real-world events get logged with some delay (e.g. consider updates be logged between 2-3 minutes late. If the interval runs every 30s, they would always get missed). However, the delayed-offset is used to pad the interval detection window in the past. While this padding helps to capture the laggy-data, it significantly excarcerbates the issue mentioned above.
    • known unknowns
      • It is unclear how pauses and interruptions are handled. For example, consider a pause in updates from an entity, or when an alert is disabled. This is untested. Given that the look-back parameter is hardcoded to 5-minutes, it's reasonable to expect some issues.
      • The number of assets that can be tracked is limited to Elasticsearch’s agg max-bucket settings. This is also dependent on the number of boundaries. These limitations are not precisely known, and are not communicated in any way to the user (e.g. an analyst cannot know if a real-world crossing event was missed due to age-size limitations, known limitations of the algorithm (see avove), or bugs in the algorithm… even if they would like to find out).
    • The unknown unknowns
      • … ? …
  • Polish

    • Icon doesn’t fit imho
    • event.CrossingTime is unix timestamp, which means it cannot be used in message-formatting (e.g. an analyst wants to generate an email-notification with “detected leaving of this asset at this time)

lgtm when green, super 🎉!

@kindsun kindsun merged commit bb37cc2 into elastic:master Oct 5, 2020
@kindsun kindsun deleted the maps-alerts branch October 5, 2020 23:01
gmmorris added a commit to gmmorris/kibana that referenced this pull request Oct 6, 2020
* master: (85 commits)
  Refactor attribute service (elastic#78414)
  [APM] Add default message to alerts. (elastic#78930)
  [Discover] Modify columns and sort when switching index pattern (elastic#74336)
  Document ts project references setup (elastic#78586)
  build all ts refs in single kbn:bootstrap (elastic#79438)
  [TSVB] Allow string fields on value count aggregation (elastic#79267)
  [SECURITY SOLUTION] Investigate EQL signal in timeline (elastic#79049)
  [Fleet] Add loading spinner to page headers (elastic#79568)
  [Security Solution][Resolver] Resolver query panel load more (elastic#79160)
  Add type row to monitor detail page. (elastic#79556)
  Remove license refresh from setup (elastic#79518)
  [docker] add reporting fonts (elastic#74806)
  [SECURITY_SOLUTION][ENDPOINT] Add info about trusted apps to the page subtitle + create flyout (elastic#79558)
  Trim Hash value before validating it (elastic#79545)
  Warn users when security is not configured (elastic#78545)
  update copy styling (elastic#79313)
  Update dependency @elastic/charts to v23.1.1 (elastic#78459)
  Introduce geo-threshold alerts (elastic#76285)
  elastic#76920 Show base breadcrumb when there is an error booting the app (elastic#79571)
  remove react-intl from kibana and keep it inside only i18n package (elastic#78956)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Maps] Tracking Alert by Containment
9 participants