Add data_streams property
mtojek committed Feb 24, 2021
1 parent b1aff4e commit ed04c0f
Showing 3 changed files with 16 additions and 1 deletion.
2 changes: 1 addition & 1 deletion code/go/internal/spec/statik.go

Large diffs are not rendered by default.

5 changes: 5 additions & 0 deletions test/packages/input_groups/manifest.yml
@@ -34,6 +34,9 @@ policy_templates:
   - name: ec2
     title: AWS EC2
     description: Collect logs and metrics from EC2 service
+    data_streams:
+      - ec2_logs
+      - ec2_metrics
     inputs:
       - type: s3
         title: Collect logs from EC2 service
@@ -103,6 +106,8 @@ policy_templates:
   - name: barracuda
     title: Barracuda logs
     description: Collect Barracuda logs from syslog or a file.
+    data_streams:

ruflin Feb 24, 2021

Contributor

I initially expected this to be at the input level, since each input could then decide which data_streams to pull in. Could it be that two inputs use the same data_stream? Or the opposite, one input wants to use the data_stream the other not?

At the same time, I like the idea that we start to decouple input configs from data_stream potentially.

mtojek Feb 24, 2021

Contributor

> At the same time, I like the idea that we start to decouple input configs from data_stream potentially.

Yes, I had that in mind.

> Could it be that two inputs use the same data_stream?

Hmm... we may have a policy template that defines two inputs (local files vs S3 logs?).

> Or the opposite, one input wants to use the data_stream the other not?

I'm not sure about this one. Is there an input that doesn't use the data stream now?

I think we need more opinions about this. @sorantis @ycombinator any preference?

ycombinator Feb 24, 2021

Contributor

I'm trying to think about what it would mean if data_streams were defined at the input level, and I'm getting a bit confused by something else. Sorry to derail the conversation, but I need to get this confusion cleared up before having an opinion on the level at which the data_streams property should be defined.

Currently, if I look at a data_stream/foo/manifest.yml file, I see a streams property that specifies an array of inputs. To me this means all those inputs end up indexing data into the ES data stream <input type>-foo-<namespace>.

In this proposal, I'm seeing data stream folders named as ec2_logs, ec2_metrics, etc. Does that mean we'd end up with data stream names in ES like logs-aws.ec2_logs-<namespace>, metrics-aws.ec2_metrics-<namespace>, etc.?

Apologies if I'm missing something obvious here.
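
To make the naming question concrete, here is a minimal Go sketch of how the names quoted above would be composed. It assumes the `<type>-<dataset>-<namespace>` pattern with the dataset written as `<package>.<data stream folder>`, as in the `logs-aws.ec2_logs-<namespace>` example. The function `DataStreamName` and its parameter names are hypothetical and are not part of this commit or of the package-spec code.

```go
package main

import "fmt"

// DataStreamName composes an Elasticsearch data stream name using the
// <type>-<dataset>-<namespace> pattern from the discussion above.
// Assumption: the dataset is "<package>.<data stream folder>".
// This helper is illustrative only and does not exist in the repository.
func DataStreamName(streamType, pkg, folder, namespace string) string {
	return fmt.Sprintf("%s-%s.%s-%s", streamType, pkg, folder, namespace)
}
```

Under these assumptions, `DataStreamName("logs", "aws", "ec2_logs", "default")` yields `logs-aws.ec2_logs-default`, which shows the redundancy between the type prefix and the `_logs` suffix of the folder name.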

ycombinator Feb 24, 2021

Contributor

Ignore my last comment. I find the naming redundant but also just realized that this is not something new introduced in this proposal. We already have data stream folders named ec2_logs and ec2_metrics in the aws package. Sorry for the noise.

ycombinator Feb 25, 2021

Contributor

We sync'd up off-PR. The decision is to leave data_streams at the policy template level (not at the input level), as that is easier to reason about.

The default behavior (when the data_streams property is omitted) will be the same as if the data_streams property listed all of the package's data streams. This matches the current behavior (prior to this proposal), so we stay backwards compatible.
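
The default described above could be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the implementation in this repository: the `PolicyTemplate` struct and the `EffectiveDataStreams` function are hypothetical, an empty list is treated the same as an omitted property, and rejecting names that do not match an existing data stream is an extra assumption that the thread does not discuss.

```go
package main

import "fmt"

// PolicyTemplate is a hypothetical, trimmed-down model of one entry under
// policy_templates in a package manifest.
type PolicyTemplate struct {
	Name        string
	DataStreams []string // nil or empty when data_streams is omitted (assumption)
}

// EffectiveDataStreams returns the data streams a policy template applies to.
// When data_streams is omitted, every data stream in the package is returned,
// which mirrors the behavior before the property existed.
func EffectiveDataStreams(pt PolicyTemplate, all []string) ([]string, error) {
	if len(pt.DataStreams) == 0 {
		// Omitted property: fall back to all data streams of the package.
		return append([]string(nil), all...), nil
	}

	// Assumption: every listed name must match a data stream in the package.
	known := make(map[string]bool, len(all))
	for _, ds := range all {
		known[ds] = true
	}
	for _, ds := range pt.DataStreams {
		if !known[ds] {
			return nil, fmt.Errorf("policy template %q references unknown data stream %q", pt.Name, ds)
		}
	}
	return append([]string(nil), pt.DataStreams...), nil
}
```

With a package that contains `ec2_logs`, `ec2_metrics` and `spamfirewall`, a template without `data_streams` would resolve to all three, while the `ec2` template from this diff would resolve to `ec2_logs` and `ec2_metrics` only.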

+      - spamfirewall
     inputs:
       - type: udp
         title: Collect logs from Barracuda via UDP
10 changes: 10 additions & 0 deletions versions/1/manifest.spec.yml
@@ -222,6 +222,16 @@ spec:
             type: string
             examples:
               - Collect logs and metrics from Apache instances
+          data_streams:
+            description: List of data streams compatible with the policy template.
+            type: array
+            items:
+              type: string
+              description: Data stream name
+              examples:
+                - ec2_logs
+                - spamfirewall
+                - access
           inputs:
             description: List of inputs supported by policy template.
             type: array

0 comments on commit ed04c0f