[SIEM][Detection Engine][Lists] Adds additional data types to value based lists

## Summary

Adds these data types to the value based lists endpoints from [Elasticsearch field data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html):

Single value based list types:

* binary
* boolean
* byte
* date
* date_nanos
* date_range
* double
* float
* integer
* ip
* half_float
* keyword
* text
* long
* short

Range value based list types:

* double_range
* float_range
* integer_range
* ip_range
* long_range

Geo value based list types (the caveat is that you cannot query them using other geometry just yet; you can only import and export them):

* geo_point
* geo_shape
* shape

For importing and exporting different values such as ranges, geo, or single values, this introduces a `serializer` and `deserializer` option for the endpoints. For example, if you want to serialize in an `ip_range` such as `192.168.0.1,192.168.0.3`, which has a comma between the two addresses, you would use the following:

```ts
POST /api/lists
{
  "name": "List with an ip range",
  "serializer": "(?<gte>.+),(?<lte>.+)",
  "deserializer": "{{gte}},{{lte}}",
  "description": "This list has ip ranges",
  "type": "ip_range"
}
```

If you want to serialize in keywords from a list that _only_ match a particular value, you would use the following:

```ts
POST /api/lists
{
  "id": "keyword_custom_format_list",
  "name": "Simple list with a keyword using a custom format",
  "description": "This parses the first found ipv4 only",
  "serializer": "(?<value>((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))",
  "deserializer": "{{value}}",
  "type": "keyword"
}
```

The serializer is a regular expression with [named capturing groups](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match), while the deserializer is a [MustacheJS](https://github.com/janl/mustache.js/) template. The range types, single value types, and geo types all have default capture groups for serializing and default mustache templates for deserializing, which are used if none are configured on the list.
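The serializer and deserializer pair described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `captureLine` and `renderTemplate` are hypothetical names, and `renderTemplate` is a tiny stand-in for MustacheJS that only handles plain `{{name}}` and `{{{name}}}` tags (with a reduced set of HTML escapes for the two-brace form).

```typescript
// Hypothetical helpers for illustration; not the Kibana lists implementation.
type Captures = Record<string, string>;
type NamedGroups = Record<string, string | undefined>;

// Apply a serializer (a regex source with named capturing groups) to one
// line of an import file and return only the groups that actually matched.
function captureLine(serializer: string, line: string): Captures | null {
  const match = new RegExp(serializer).exec(line.trim());
  if (match == null) {
    return null;
  }
  const groups = (match as RegExpExecArray & { groups?: NamedGroups }).groups;
  if (groups == null) {
    return null;
  }
  const captures: Captures = {};
  for (const name of Object.keys(groups)) {
    const value = groups[name];
    if (value !== undefined) {
      captures[name] = value;
    }
  }
  return captures;
}

// Simplified stand-in for a mustache template: {{{name}}} inserts the raw
// value, {{name}} inserts the value with basic HTML escaping.
function renderTemplate(template: string, values: Captures): string {
  const escapeHtml = (s: string): string =>
    s
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;');
  return template.replace(
    /\{\{\{(\w+)\}\}\}|\{\{(\w+)\}\}/g,
    (_m: string, raw?: string, escaped?: string) =>
      raw !== undefined ? values[raw] ?? '' : escapeHtml(values[escaped ?? ''] ?? '')
  );
}

// Round trip for the comma-separated ip range example.
const captured = captureLine('(?<gte>.+),(?<lte>.+)', '192.168.0.1,192.168.0.3');
const exported = captured ? renderTemplate('{{gte}},{{lte}}', captured) : '';
```

Here `captured` holds the `gte` and `lte` values that would be indexed, and `exported` is the line that an export would write back out using the configured deserializer.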

The default serializer capture groups for each are:

* shape, geo_point, geo_shape: `(?<lat>.+),(?<lon>.+)`
* date_range: `(?<gte>.+),(?<lte>.+)|(?<value>.+)`
* other ranges: `(?<gte>.+)-(?<lte>.+)|(?<value>.+)`
* all single data types: `(?<value>.+)`

For ranges you can use `gte`, `lte`, and `value` together. If both `gte` _and_ `lte` match, they are used as the bounds of the Elasticsearch range and `value` is ignored, even if `value` also matched. If _only_ `value` matches and `gte`, `lte` do not, then `value` is used as _both_ the `gte` and the `lte`.

For example, if you are serializing in a list of ip ranges with the list data type `ip_range` and you have these 2 entries in the file:

```ts
127.0.0.1
127.0.0.2-127.0.0.5
```

The default `serializer` will use `(?<gte>.+)-(?<lte>.+)|(?<value>.+)` and you will get two Elasticsearch documents like so:

```ts
{
  "_source" : {
    "ip_range" : {
      "gte" : "127.0.0.1",
      "lte" : "127.0.0.1"
    }
  }
}
{
  "_source" : {
    "ip_range" : {
      "gte" : "127.0.0.2",
      "lte" : "127.0.0.5"
    }
  }
}
```

The default mustache handles for each are:

* shape, geo_point, geo_shape: `{{{lat}}},{{{lon}}}`
* date_range: `{{{gte}}},{{{lte}}}`
* other ranges: `{{{gte}}}-{{{lte}}}`
* all single data types: `{{{value}}}`

I use three curly braces instead of two (`{{{` vs. `{{`) so that HTML is not escaped for the lists. You can override and change it if you need or want the escaping.

If the deserializer detects that `gte` and `lte` are exactly the same, it will still output them as two items using the mustache deserializer template.
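
The precedence between `gte`/`lte` and `value` for range types can be sketched as below. This is only an illustration under stated assumptions: `toRange`, `fromRange`, and `RangeValue` are hypothetical names, and the export side hard-codes the shape of the default `{{{gte}}}-{{{lte}}}` template rather than calling MustacheJS.

```typescript
// Hypothetical sketch of the default range handling; not the plugin's code.
interface RangeValue {
  gte: string;
  lte: string;
}

type RangeGroups = Record<string, string | undefined>;

// Default serializer for range types other than date_range.
const DEFAULT_RANGE_SERIALIZER = '(?<gte>.+)-(?<lte>.+)|(?<value>.+)';

// Turn one imported line into a range. A gte/lte pair wins over `value`;
// a lone `value` becomes both bounds.
function toRange(
  line: string,
  serializer: string = DEFAULT_RANGE_SERIALIZER
): RangeValue | null {
  const match = new RegExp(serializer).exec(line.trim());
  if (match == null) {
    return null;
  }
  const groups = (match as RegExpExecArray & { groups?: RangeGroups }).groups;
  if (groups == null) {
    return null;
  }
  const { gte, lte, value } = groups;
  if (gte !== undefined && lte !== undefined) {
    return { gte, lte };
  }
  if (value !== undefined) {
    return { gte: value, lte: value };
  }
  return null;
}

// Mirror of the default `{{{gte}}}-{{{lte}}}` template: both bounds are
// always written, even when they are identical.
function fromRange(range: RangeValue): string {
  return `${range.gte}-${range.lte}`;
}

const ranges = ['127.0.0.1', '127.0.0.2-127.0.0.5']
  .map((line) => toRange(line))
  .filter((r): r is RangeValue => r !== null);
const exportedLines = ranges.map(fromRange);
```

Note that a single address comes back out as a two-sided range on export, because only `gte` and `lte` are stored for range types.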

Using the ip-range example above, the export looks like this, even though the `gte` and `lte` of the first entry are exactly the same value:

```ts
127.0.0.1-127.0.0.1
127.0.0.2-127.0.0.5
```

---

Interesting queries to run from the lists scripts folder for testing:

Load some small test files from `./lists/files`, for example:

```ts
./import_list_items_by_filename.sh ip_range ./lists/files/ip_range_cidr.txt
./import_list_items_by_filename.sh ip_range ./lists/files/ip_range.txt
./import_list_items_by_filename.sh date ./lists/files/date.txt
./import_list_items_by_filename.sh ip_range ./lists/files/ip_range_mixed.txt
...
```

Export them:

```ts
./export_list_items.sh ip_range_cidr.txt
./export_list_items.sh ip_range.txt
./export_list_items.sh date.txt
./export_list_items.sh ip_range_mixed.txt
...
```

Find on them:

```ts
./find_list_items.sh ip_range_cidr.txt
./find_list_items.sh ip_range.txt
./find_list_items.sh date.txt
./find_list_items.sh ip_range_mixed.txt
...
```

Find specific values such as:

```ts
./get_list_item_by_value.sh ip_range_mixed.txt 192.168.0.1
./get_list_item_by_value.sh date.txt 2020-08-25T17:57:01.978Z
...
```

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#cross-browser-compatibility) were updated or added to match the most common scenarios