Example of a Suricata datasource configuration #16496

Merged: 2 commits merged into elastic:feature-ingest from the agent/suricata-configuration-mixed branch on Feb 25, 2020

Conversation

@ph (Contributor) commented Feb 21, 2020

Suricata uses the logs input but creates multiple kinds of events, so it's a single input with mixed output. Let's try to see whether a type on the stream could work or not.
@ph added the discuss, review, [zube]: In Review, and Project:fleet labels on Feb 21, 2020
@ph requested a review from @ruflin on Feb 21, 2020
@ph self-assigned this on Feb 21, 2020
@elasticmachine (Collaborator):

Pinging @elastic/ingest (Project:fleet)

Review thread on x-pack/agent/docs/agent_configuration_example.yml:

- id?: {id}
  type: "typeX"
  dataset: suricata.logs
  path: /var/log/suricata/eve.json
@ph (Contributor Author):

@ruflin this is a follow-up to our discussion; I've looked at the current implementation of the Suricata module. It is indeed a single input type (logs) with mixed outputs (events, alerts, and metrics). All the generated events are extracted from a single source file, eve.json.

Now, I don't think we can express that difference at the stream level; the logic is heavily dependent on the ingest pipeline implementation. Is log the right datasource type here? Maybe event or *file would be more generic and appropriate, or could they be aliases to log?

I think your question is really how we target the right index for this kind of scenario, because the above example will use logs-{dataset}-{namespace} as the destination.

I think the actual solution is to make sure that all the fields we use (dataset, namespace, and type) are available to the ingest pipeline, and to assume that a pipeline can route events when the content is mixed. With our current permission model and final pipeline usage it should just work?

I am not sure the Suricata case is common.

PS: Beats also does this by sending a summary of its stats in the log.
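
To make that concrete, here is a minimal sketch of such a routing pipeline. It assumes the stream.* fields are shipped with every event and keys off the suricata.eve.event_type field that the Suricata module produces; the pipeline itself is hypothetical, not the current implementation:

{
  "description": "Hypothetical sketch: route mixed Suricata events by overwriting stream.type",
  "processors": [
    {
      "set": {
        "if": "ctx.suricata?.eve?.event_type == 'alert'",
        "field": "stream.type",
        "value": "alerts"
      }
    },
    {
      "set": {
        "if": "ctx.suricata?.eve?.event_type == 'stats'",
        "field": "stream.type",
        "value": "metrics"
      }
    },
    {
      "set": {
        "field": "_index",
        "value": "{{stream.type}}-{{stream.dataset}}-{{stream.namespace}}"
      }
    }
  ]
}

The last processor applies the {type}-{dataset}-{namespace} pattern, so events whose stream.type was not overwritten keep going to the logs index.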

@ruflin (Contributor):

+1 on having stream.dataset, stream.type, and stream.namespace available in all events, making it possible for the ingest pipeline to make decisions based on them and put events in different indices if needed.

@andrewkroh Would this make sense for Suricata?

@ph (Contributor Author):

@andrewkroh If this is OK with you, I am going to create the related issues to pass down the information required to generate the target index from an ingest pipeline.

@ruflin concerning the values, I presume we use the values from the input when stream.type or stream.namespace aren't defined on the stream?

@ruflin (Contributor):

Yes.

@andrewkroh (Member):

I'm probably missing some context about the current design. So a final pipeline will be installed to dynamically set the _index for all events based on stream.dataset, stream.type, and stream.namespace. Will those fields be present in all events? And then the suricata.logs dataset will overwrite stream.type to alerts or metrics when needed?

@ph (Contributor Author):

The current design is not exactly what you are describing; at the moment the agent generates the target index based on fields present in the datasource configuration.

If we take the following nginx datasource and concentrate only on the "error" stream:

datasources:
  # use the nginx package
  - id?: nginx-x1
    enabled?: true # default to true
    title?: "This is a nice title for humans"
    # Package this config group is coming from. On importing, we know where it belongs
    # The package tells the UI which application to link to
    package?:
      name: epm/nginx
      version: 1.7.0
    namespace?: prod
    constraints?:
      # Constraint syntax is not final
      - os.platform: { in: "windows" }
      - agent.version: { ">=": "8.0.0" }
    use_output: long_term_storage
    inputs:
      - type: logs
        processors?:
        streams:
          - id?: {id}
            enabled?: true # default to true
            dataset: nginx.access
            paths: /var/log/nginx/access.log
          - id?: {id}
            enabled?: true # default to true
            dataset: nginx.error
            paths: /var/log/nginx/error.log
      - type: nginx/metrics
        streams:
          - id?: {id}
            enabled?: true # default to true
            dataset: nginx.stub_status
            metricset: stub_status

The agent will take the input type logs, the namespace prod, and the dataset nginx.error, generate the target index "logs-nginx.error-prod", and send the data to that index. We cannot use the final pipeline to generate the index, because the usage contexts (Fleet vs standalone) are different and we cannot guarantee the pipeline would be installed beforehand.
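
For illustration: under the same pattern, and assuming the nginx/metrics input maps to a metrics type (my assumption, not stated in the config above), the three streams would land in logs-nginx.access-prod, logs-nginx.error-prod, and metrics-nginx.stub_status-prod.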

Now, if we look at the Suricata use case, it is the exception that proves the rule: logs, metrics, and alerts all come from the same source (logs), and we want to disambiguate them and route them to the right index. We see this as a more advanced use case where the logic to identify and route events is part of a pipeline definition.

So based on incoming data, and with the aid of the stream.* fields, the pipeline can make a rerouting decision and send the events to the appropriate index.

@ph (Contributor Author):

Note: it could be part of a final pipeline, but at the moment it's up to the specific pipeline to do it.
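
A minimal sketch of what that generic final-pipeline step could look like, assuming the stream.* fields are present on every event (hypothetical, not the current implementation):

{
  "description": "Hypothetical final pipeline deriving the target index from the stream fields",
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "{{stream.type}}-{{stream.dataset}}-{{stream.namespace}}"
      }
    }
  ]
}

A dataset-specific pipeline such as the Suricata one sketched earlier would then only need to overwrite stream.type before this final pipeline runs.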

@andrewkroh (Member):

Ok, thanks for the details. I don't see any issue with adding some extra ingest processors to handle modifying the index for logs and alerts.

@ph merged commit 3ef7ccb into elastic:feature-ingest on Feb 25, 2020
@ph (Contributor Author) commented Feb 25, 2020

Will create follow-up issues for the stream.* fields and their visibility to the ingest pipeline.

@ph (Contributor Author) commented Feb 25, 2020

Created #16562 for the stream.* discussion.

@ph deleted the agent/suricata-configuration-mixed branch on February 25, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
* Example of a Suricata datasource configuration

Suricata uses the logs input but creates multiple kinds of events, so it's a single input with mixed output. Let's try to see whether a type on the stream could work or not.

* Update x-pack/agent/docs/agent_configuration_example.yml

Co-authored-by: Andrew Kroh <[email protected]>