Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing hash fields to process.parent: attempt 2 #739

Merged
merged 5 commits into from
Feb 4, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.next.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ Thanks, you're awesome :-) -->

* ECS scripts now use Python 3.6+. #674
* schema_reader.py now reliably supports chaining reusable fieldsets together. #722
* Add support for reusing fields in places other than the top level of the destination fieldset. #739

#### Deprecated

Expand Down
2 changes: 1 addition & 1 deletion docs/field-details.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -1967,7 +1967,7 @@ type: keyword

==== Field Reuse

The `hash` fields are expected to be nested at: `file.hash`, `process.hash`.
The `hash` fields are expected to be nested at: `file.hash`, `process.hash`, `process.parent.hash`.

Note also that the `hash` fields are not expected to be used directly at the top level.

Expand Down
24 changes: 24 additions & 0 deletions generated/beats/fields.ecs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2505,6 +2505,30 @@
start).'
example: 137
default_field: false
- name: parent.hash.md5
level: extended
type: keyword
ignore_above: 1024
description: MD5 hash.
default_field: false
- name: parent.hash.sha1
level: extended
type: keyword
ignore_above: 1024
description: SHA1 hash.
default_field: false
- name: parent.hash.sha256
level: extended
type: keyword
ignore_above: 1024
description: SHA256 hash.
default_field: false
- name: parent.hash.sha512
level: extended
type: keyword
ignore_above: 1024
description: SHA512 hash.
default_field: false
- name: parent.name
level: extended
type: keyword
Expand Down
4 changes: 4 additions & 0 deletions generated/csv/fields.csv
Original file line number Diff line number Diff line change
Expand Up @@ -318,6 +318,10 @@ ECS_Version,Indexed,Field_Set,Field,Type,Level,Example,Description
1.5.0-dev,true,process,process.parent.executable,keyword,extended,/usr/bin/ssh,Absolute path to the process executable.
1.5.0-dev,true,process,process.parent.executable.text,text,extended,/usr/bin/ssh,Absolute path to the process executable.
1.5.0-dev,true,process,process.parent.exit_code,long,extended,137,The exit code of the process.
1.5.0-dev,true,process,process.parent.hash.md5,keyword,extended,,MD5 hash.
1.5.0-dev,true,process,process.parent.hash.sha1,keyword,extended,,SHA1 hash.
1.5.0-dev,true,process,process.parent.hash.sha256,keyword,extended,,SHA256 hash.
1.5.0-dev,true,process,process.parent.hash.sha512,keyword,extended,,SHA512 hash.
1.5.0-dev,true,process,process.parent.name,keyword,extended,ssh,Process name.
1.5.0-dev,true,process,process.parent.name.text,text,extended,ssh,Process name.
1.5.0-dev,true,process,process.parent.pgid,long,extended,,Identifier of the group of processes the process belongs to.
Expand Down
44 changes: 44 additions & 0 deletions generated/ecs/ecs_flat.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4064,6 +4064,50 @@ process.parent.exit_code:
order: 29
short: The exit code of the process.
type: long
process.parent.hash.md5:
dashed_name: process-parent-hash-md5
description: MD5 hash.
flat_name: process.parent.hash.md5
ignore_above: 1024
level: extended
name: md5
order: 0
original_fieldset: hash
short: MD5 hash.
type: keyword
process.parent.hash.sha1:
dashed_name: process-parent-hash-sha1
description: SHA1 hash.
flat_name: process.parent.hash.sha1
ignore_above: 1024
level: extended
name: sha1
order: 1
original_fieldset: hash
short: SHA1 hash.
type: keyword
process.parent.hash.sha256:
dashed_name: process-parent-hash-sha256
description: SHA256 hash.
flat_name: process.parent.hash.sha256
ignore_above: 1024
level: extended
name: sha256
order: 2
original_fieldset: hash
short: SHA256 hash.
type: keyword
process.parent.hash.sha512:
dashed_name: process-parent-hash-sha512
description: SHA512 hash.
flat_name: process.parent.hash.sha512
ignore_above: 1024
level: extended
name: sha512
order: 3
original_fieldset: hash
short: SHA512 hash.
type: keyword
process.parent.name:
dashed_name: process-parent-name
description: 'Process name.
Expand Down
45 changes: 45 additions & 0 deletions generated/ecs/ecs_nested.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2698,6 +2698,7 @@ hash:
expected:
- file
- process
- process.parent
top_level: false
short: Hashes, usually file hashes.
title: Hash
Expand Down Expand Up @@ -4440,6 +4441,50 @@ process:
order: 29
short: The exit code of the process.
type: long
parent.hash.md5:
dashed_name: process-parent-hash-md5
description: MD5 hash.
flat_name: process.parent.hash.md5
ignore_above: 1024
level: extended
name: md5
order: 0
original_fieldset: hash
short: MD5 hash.
type: keyword
parent.hash.sha1:
dashed_name: process-parent-hash-sha1
description: SHA1 hash.
flat_name: process.parent.hash.sha1
ignore_above: 1024
level: extended
name: sha1
order: 1
original_fieldset: hash
short: SHA1 hash.
type: keyword
parent.hash.sha256:
dashed_name: process-parent-hash-sha256
description: SHA256 hash.
flat_name: process.parent.hash.sha256
ignore_above: 1024
level: extended
name: sha256
order: 2
original_fieldset: hash
short: SHA256 hash.
type: keyword
parent.hash.sha512:
dashed_name: process-parent-hash-sha512
description: SHA512 hash.
flat_name: process.parent.hash.sha512
ignore_above: 1024
level: extended
name: sha512
order: 3
original_fieldset: hash
short: SHA512 hash.
type: keyword
parent.name:
dashed_name: process-parent-name
description: 'Process name.
Expand Down
20 changes: 20 additions & 0 deletions generated/elasticsearch/6/template.json
Original file line number Diff line number Diff line change
Expand Up @@ -1520,6 +1520,26 @@
"exit_code": {
"type": "long"
},
"hash": {
"properties": {
"md5": {
"ignore_above": 1024,
"type": "keyword"
},
"sha1": {
"ignore_above": 1024,
"type": "keyword"
},
"sha256": {
"ignore_above": 1024,
"type": "keyword"
},
"sha512": {
"ignore_above": 1024,
"type": "keyword"
}
}
},
"name": {
"fields": {
"text": {
Expand Down
20 changes: 20 additions & 0 deletions generated/elasticsearch/7/template.json
Original file line number Diff line number Diff line change
Expand Up @@ -1519,6 +1519,26 @@
"exit_code": {
"type": "long"
},
"hash": {
"properties": {
"md5": {
"ignore_above": 1024,
"type": "keyword"
},
"sha1": {
"ignore_above": 1024,
"type": "keyword"
},
"sha256": {
"ignore_above": 1024,
"type": "keyword"
},
"sha512": {
"ignore_above": 1024,
"type": "keyword"
}
}
},
"name": {
"fields": {
"text": {
Expand Down
1 change: 1 addition & 0 deletions schemas/hash.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
expected:
- file
- process
- process.parent

fields:

Expand Down
18 changes: 15 additions & 3 deletions scripts/schema_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -128,12 +128,24 @@ def duplicate_reusable_fieldsets(schema, fields_nested):
# which is in turn reusable in a few places.
if 'reusable' in schema:
for new_nesting in schema['reusable']['expected']:
split_flat_name = new_nesting.split('.')
top_level = split_flat_name[0]
# List field set names expected under another field set.
# E.g. host.nestings = [ 'geo', 'os', 'user' ]
nestings = fields_nested[new_nesting].setdefault('nestings', [])
nestings.append(schema['name'])
nestings = fields_nested[top_level].setdefault('nestings', [])
if schema['name'] not in nestings:
nestings.append(schema['name'])
nestings.sort()
fields_nested[new_nesting]['fields'][schema['name']] = schema
nested_schema = fields_nested[top_level]['fields']
for level in split_flat_name[1:]:
nested_schema = nested_schema.get(level, None)
if not nested_schema:
raise ValueError('Field {} in path {} not found in schema'.format(level, new_nesting))
if nested_schema.get('reusable', None):
raise ValueError(
'Reusable fields cannot be put inside other reusable fields except when the destination reusable is at the top level')
nested_schema = nested_schema.setdefault('fields', {})
nested_schema[schema['name']] = schema

# Main

Expand Down
108 changes: 108 additions & 0 deletions scripts/tests/test_schema_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,114 @@ def test_cleanup_fields_recursive(self):
}
self.assertEqual(fields, expected)

def test_reusable_dot_notation(self):
fieldset = {
'reusable_fieldset1': {
'name': 'reusable_fieldset1',
'reusable': {
'top_level': False,
'expected': [
'test_fieldset.sub_field'
]
},
'fields': {
'reusable_field': {
'field_details': {
'name': 'reusable_field',
'type': 'keyword',
'description': 'A test field'
}
}
}
},
'test_fieldset': {
'name': 'test_fieldset',
'fields': {
'sub_field': {
'fields': {}
}
}
}
}
expected = {
'sub_field': {
'fields': {
'reusable_fieldset1': {
'name': 'reusable_fieldset1',
'reusable': {
'top_level': False,
'expected': [
'test_fieldset.sub_field'
]
},
'fields': {
'reusable_field': {
'field_details': {
'name': 'reusable_field',
'type': 'keyword',
'description': 'A test field'
}
}
}
}
}
}
}
schema_reader.duplicate_reusable_fieldsets(fieldset['reusable_fieldset1'], fieldset)
self.assertEqual(fieldset['test_fieldset']['fields'], expected)

def test_improper_reusable_fails(self):
fieldset = {
'reusable_fieldset1': {
'name': 'reusable_fieldset1',
'reusable': {
'top_level': False,
'expected': [
'test_fieldset'
]
},
'fields': {
'reusable_field': {
'field_details': {
'name': 'reusable_field',
'type': 'keyword',
'description': 'A test field'
}
}
}
},
'reusable_fieldset2': {
'name': 'reusable_fieldset2',
'reusable': {
'top_level': False,
'expected': [
'test_fieldset.reusable_fieldset1'
]
},
'fields': {
'reusable_field': {
'field_details': {
'name': 'reusable_field',
'type': 'keyword',
'description': 'A test field'
}
}
}
},
'test_fieldset': {
'name': 'test_fieldset',
'fields': {}
}
}
# This should fail because test_fieldset.reusable_fieldset1 doesn't exist yet
with self.assertRaises(ValueError):
schema_reader.duplicate_reusable_fieldsets(fieldset['reusable_fieldset2'], fieldset)
schema_reader.duplicate_reusable_fieldsets(fieldset['reusable_fieldset1'], fieldset)
# Then this should fail because even though test_fieldset.reusable_fieldset1 now exists, test_fieldset.reusable_fieldset1 is not
# an allowed reusable location (it's the destination of another reusable)
with self.assertRaises(ValueError):
schema_reader.duplicate_reusable_fieldsets(fieldset['reusable_fieldset2'], fieldset)


if __name__ == '__main__':
unittest.main()