ecs map

tomrade committed Jul 24, 2020
1 parent 3a00a06 commit 2483a8c
Showing 5 changed files with 165 additions and 20 deletions.
84 changes: 69 additions & 15 deletions README.md
@@ -2,6 +2,7 @@
EVTX log file ingestion (no Windows required) using the amazing [evtx-rs](https://github.com/omerbenamram/evtx) lib. The current target is Elasticsearch, with the hope of a modular output in the future.

* Elasticsearch output uses the ECS common schema (I've stayed close to winlogbeat, but I use lowercase field names under winlog, as I feel that is more in the spirit of ECS than the CamelCase used in winlogbeat)
* ECS mappings are done via a config file, so you can add your own maps


## Usage
@@ -46,7 +47,9 @@ You add your input paths to the directory input and then choose one or more outp
"es_pass" : "PASSWORD",
"es_api_key" : "APIKEY",
"es_scheme" : "http",
"index_template" : "es_stuff/index-template.json"
"index_template" : "es_stuff/index-template.json",
"ecs_map_file" : "es_stuff/ecs_map.json",
"ecs_mode" : true
},
{
"name" : "stdout_nom",
@@ -61,20 +64,24 @@ You add your input paths to the directory input and then choose one or more outp

### Elasticsearch "elastic_nom"

#### Config

``` json
{
    "name" : "elastic_nom",
    "enabled" : true,
    "es_host" : "localhost",
    "es_port" : "9200",
    "es_index" : "evtx_nom",
    "security" : "none",
    "es_user" : "USERNAME",
    "es_pass" : "PASSWORD",
    "es_api_key" : "APIKEY",
    "es_scheme" : "http",
    "index_template" : "es_stuff/index-template.json",
    "ecs_map_file" : "es_stuff/ecs_map.json",
    "ecs_mode" : true
}
```

| field | value type | notes |
|-------|------------|-------|
| es_pass | string | elasticsearch security password (for basic auth) |
| es_api_key | string | base64 encoded api key (for api auth) |
| es_scheme | string | http or https (for security you will be using https) |
| index_template | string | path to index template; I've included one under es_stuff/index-template.json. You do not need to edit this for a custom index name, as that is done by the plugin |
| ecs_map_file | string | path to the ECS map file |
| ecs_mode | boolean | if set to false, no ECS mapping is done (the logs are still ECS-structured, i.e. under winlog.*, just with no extra field mapping) |

#### ECS Mapping

The ECS map file is another JSON file. It is structured by channel, then provider, and finally the event ID:

``` json
{
    "Security" : {
        "Microsoft-Windows-Security-Auditing" : {
            "4624" : {
                "event.action" : "user-logon",
                "event.kind" : "event",
                "user.name" : "%%%%winlog.event_data.targetusername",
                "user.id" : "%%%%winlog.event_data.targetusersid"
            }
        }
    }
}
```

Nested fields can be expressed with dots, i.e. ```event.action``` means

``` json
{
    "event" : {
        "action" : "value"
    }
}
```

If a value starts with four percent signs, i.e. "%%%%", evtx-nom will look up the value at the path that follows. For example, the value ```%%%%winlog.event_data.targetusersid``` looks up the field ```winlog.event_data.targetusersid``` (again using dots for nested fields). Please note that the path must be the lowercase ECS path, as this lookup is done post-parsing; you can find these paths in Kibana/Elasticsearch after ingest.
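
For illustration, here is a minimal sketch of how such a dot-notated lookup resolves against a parsed event (the event values here are made up; the real logic lives in `dict_fetch` in `lib/nom.py`):

``` python
# Illustrative only: resolve a dot-notated path against a nested event dict
def fetch(source, key):
    # walk one level deeper for each dot-separated segment
    for part in key.split('.'):
        source = source[part]
    return source

event = {"winlog": {"event_data": {"targetusersid": "S-1-5-21-1004"}}}
print(fetch(event, "winlog.event_data.targetusersid"))  # -> S-1-5-21-1004
```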

When evtx-nom starts up it flattens these maps into a single dictionary, keyed by the lowercased concatenation of "channel + provider + eventid", i.e.

``` python
mapping_dict = {
    "securitymicrosoft-windows-security-auditing4624" : {"event.action" : "user-logon"}
}
```

This is so I can find a match with a simple "for X in mapping_dict" style lookup rather than walking a nested search tree. I'm not sure whether this is better/faster or not, but I feel a dictionary check in RAM will be better/faster than a DB, even with memcache.
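
As a rough sketch (names and values are illustrative), the per-event match then looks something like this:

``` python
# Illustrative only: match an event against the flattened map
mapping_dict = {
    "securitymicrosoft-windows-security-auditing4624" : {"event.action" : "user-logon"}
}

def lookup(channel, provider, event_id):
    # same key construction as make_key in lib/nom.py: concatenate and lowercase
    key = (channel + provider + event_id).lower()
    if key in mapping_dict:
        return mapping_dict[key]
    return {}

print(lookup("Security", "Microsoft-Windows-Security-Auditing", "4624"))
# -> {'event.action': 'user-logon'}
```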

4 changes: 3 additions & 1 deletion config.json
@@ -19,7 +19,9 @@
"es_pass" : "PASSWORD",
"es_api_key" : "APIKEY",
"es_scheme" : "http",
"index_template" : "es_stuff/index-template.json"
"index_template" : "es_stuff/index-template.json",
"ecs_map_file" : "es_stuff/ecs_map.json",
"ecs_mode" : true
},
{
"name" : "stdout_nom",
12 changes: 12 additions & 0 deletions es_stuff/ecs_map.json
@@ -0,0 +1,12 @@
{
    "Security" : {
        "Microsoft-Windows-Security-Auditing" : {
            "4624" : {
                "event.action" : "user-logon",
                "event.kind" : "event",
                "user.name" : "%%%%winlog.event_data.targetusername",
                "user.id" : "%%%%winlog.event_data.targetusersid"
            }
        }
    }
}
3 changes: 2 additions & 1 deletion es_stuff/index-template.json
@@ -12,10 +12,11 @@
}
],
"properties": {
"winlog.xml" : {"enabled" : false}
}
},
"settings" : {
"index.refresh_interval" : "10s",
"index.refresh_interval" : "20s",
"index.mapping.total_fields.limit" : 3000
}
}
82 changes: 79 additions & 3 deletions lib/nom.py
@@ -2,6 +2,7 @@
import json
import datetime
from elasticsearch import helpers, Elasticsearch
import sys

# This file parses the evtx file and any default modules

@@ -29,7 +30,22 @@ def __init__(self,config):
        self.es_api_key = config['es_api_key']
        self.scheme = config['es_scheme']
        self.index_template = config['index_template']
        self.ecs_map = self.load_ecs(config['ecs_map_file'])
        self.ecs_mode = config['ecs_mode']
        self.prep_es()
    def make_key(self,channel,provider,event_id):
        key = channel + provider + event_id
        return key.lower()
    def load_ecs(self,filename):
        with open(filename,'r') as in_file:
            data = json.load(in_file)
        # I think a flat dictionary is better for this sort of thing
        mapping_dict = {}
        for channel in data:
            for provider in data[channel]:
                for event_id in data[channel][provider]:
                    mapping_dict[self.make_key(channel,provider,event_id)] = data[channel][provider][event_id]
        return mapping_dict
    def get_es(self):
        if self.security == "basic":
            es = Elasticsearch(
@@ -89,19 +105,77 @@ def prepare_actions(self,filename):
        # This method is a wrapper around the base nom method to add each event as a bulk index action
        for event in nom_file(filename):
            source = {
                '@timestamp' : event['timecreated']['systemtime'],
                'winlog' : event,
                'os' : {"platform" : "windows"},
                'agent' : {"name" : "evtx-nom"}
            }
            # Process the ECS!
            action = {
                '_index': self.es_index,
                '_source': self.process_ecs(source)
            }
            yield action
    def parse_date(self,datestring):
        # Parse Date to Python object ISO 8601/ RFC3339
        output = datetime.datetime.fromisoformat(datestring.replace('Z','+00:00'))
        return output
    def process_ecs(self,source):
        # If we are not bothering, just skip all this horrible code
        if not self.ecs_mode:
            return source
        # Take the source document, check if we have an ECS map for it, and if so do the things
        key = self.make_key(
            source['winlog']['channel'],
            source['winlog']['provider']['name'],
            source['winlog']['eventid']
        )
        # check if we have a map
        if key in self.ecs_map:
            # for each ECS field key in the map, add it to the source
            for field in self.ecs_map[key]:
                if self.ecs_map[key][field].startswith('%%%%'):
                    value = self.dict_fetch(source,self.ecs_map[key][field].replace('%%%%',''))
                else:
                    value = self.ecs_map[key][field]
                source = self.dict_put(field,value,source)
            return source
        else:
            return source
    def dict_put(self,key,value,source):
        # Merge an ECS value back into the source document. I think this works but it's a bit mental to try and understand. YAY Recursion!
        # This should build the dictionary up, bringing existing paths along for the ride
        if '.' in key:
            # Our key is a dot-noted path, i.e. "object.subobject.subsubobject" etc.
            key_list = key.split('.')
            # get the leftmost subobject
            item = key_list[0]
            # remove this from the key for the next runs
            key_list.pop(0)
            # check if the subobject already exists in the source object
            if item not in source:
                # create an empty subobject and keep going deeper into inception
                source[item] = self.dict_put('.'.join(key_list),value,{})
                # Whatever comes back goes into our object
            else:
                # we need to merge into the existing subobject and keep going deeper into inception
                source[item] = self.dict_put('.'.join(key_list),value,source[item])
                # Whatever comes back goes into our object
        else:
            # key is just a field name now, so add the value finally
            source[key] = value
        # return what we have up to the previous level of inception, or the exit if we are back at the top
        return source

    def dict_fetch(self,source,key):
        # Fetch a value from the source document using a dot-notated key path
        if '.' in key:
            key_list = key.split('.')
            new_source = source[key_list[0]]
            key_list.pop(0)
            value = self.dict_fetch(new_source,'.'.join(key_list))
        else:
            value = source[key] or "unknown"
        return value


# Get values from EVTX-RS json which may be attributes from XML land
@@ -148,4 +222,6 @@ def nom_file(filename):
        event.update(get_section(data['Event']['System']))
        if data['Event'].get('EventData'):
            event['event_data'] = get_section(data['Event']['EventData'])
        # Raw Document
        event['xml'] = record['data']
        yield event
