Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Change Proposal] Reserve package keyword for data stream/dataset values #699

Closed
kpollich opened this issue Jan 24, 2024 · 4 comments · Fixed by #704
Closed

[Change Proposal] Reserve package keyword for data stream/dataset values #699

kpollich opened this issue Jan 24, 2024 · 4 comments · Fixed by #704
Labels
discuss Issue needs discussion Team:Ecosystem Label for the Packages Ecosystem team

Comments

@kpollich
Copy link
Member

Ref elastic/kibana#175254

As of 8.12.1, Fleet defines a custom ingest pipeline processor for all data streams of the form ${type}-${integration}.package@custom as well as a processor of the form ${type}-${integration}-${dataset}@custom. So, it is possible to create a datastream of the name package and incur a collision of these pipeline names, resulting in expected ingestion behavior in cases where a custom pipeline is in use.

I'm proposing we forbid the usage of package as the data stream directory name as well as the dataset property in a data stream manifest file. This will guarantee a package can't ship with this potential collision in place.

@kpollich
Copy link
Member Author

There is one instance of the name package being used for a data stream in the system audit package

https://github.com/elastic/integrations/blob/main/packages/system_audit/data_stream/package/manifest.yml

This means we do wind up with a collision when Fleet generates custom pipeline names, e.g.

{
  "pipeline": {
    "name": "logs-system_audit.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `logs` defined by the `system_audit` integration"
  }
},
{
  "pipeline": {
    "name": "logs-system_audit.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for the `system_audit.package` dataset"
  }
}

@elastic/security-external-integrations - Is there an explicit motivation for the naming of this data stream as logs-system_audit.package, or is there simply not a specific name relevant to this particular use case?

@kpollich
Copy link
Member Author

One way we can move forward on the Fleet side is to just use .integration for the suffix instead, and make this issue about reserving the word integration rather than package. We won't have to ship a change to that existing data stream this way.

@jsoriano
Copy link
Member

One way we can move forward on the Fleet side is to just use .integration for the suffix instead, and make this issue about reserving the word integration rather than package. We won't have to ship a change to that existing data stream this way.

Could we use other form for the name of these ingest pipelines so they don't collide? For example ${type}-${integration}-package@custom and ${type}-${integration}.${dataset}@custom, or ${type}-${integration}@package@custom and ${type}-${integration}.${dataset}@custom

kpollich added a commit to elastic/kibana that referenced this issue Jan 25, 2024
…ns + add descriptions to each pipeline (#175448)

## Summary

Closes #175254
Ref #168019
Ref #170270

In 8.12.0, Fleet unintentionally shipped a breaking change in
#170270 for APM users who make use
of a custom `traces-apm` data stream. If a user had previously defined
this ingest pipeline to customize documents ingested for the
`traces-apm` data stream (defined
[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),
then they would unexpectedly see that pipeline called when documents
were ingested to the `traces-apm.rum` and `traces-apm.sampled`
datastreams as well.

This PR addresses this collision by adding a `.package` suffix to the
"package level" ingest pipeline introduced in 8.12.0.

So, in 8.12.0 a processor would be defined as such on the
`traces-apm.rum` or `traces-apm.sampled` ingest pipeline

```
{
  "pipeline": {
    "name": "traces-apm@custom",
    "ignore_missing_pipeline": true,
  }
},
```

This PR replaces the pipeline with one that looks as follows:

```
{
  "pipeline": {
    "name": "traces-apm.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
  }
},
```

**To be clear: this is a breaking change if you have defined the
`traces-apm@custom` integration on 8.12. In 8.12.1, it will no longer be
called for documents ingested to the `traces-apm`, `traces-apm.rum`, or
`traces-apm.sampled` data streams. You will need to rename your pipeline
to `traces-apm.package@custom` to preserve this behavior.**

This change also applies to `logs-elastic_agent.*` ingest pipelines. See
[this
comment](#175254 (comment))
for more information.

There is still technically room for a collision, though it's unlikely,
if the data stream name is `package`. This will be handled by a package
spec validation proposed in
elastic/package-spec#699.

---------

Co-authored-by: Kibana Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Jan 25, 2024
…ns + add descriptions to each pipeline (elastic#175448)

## Summary

Closes elastic#175254
Ref elastic#168019
Ref elastic#170270

In 8.12.0, Fleet unintentionally shipped a breaking change in
elastic#170270 for APM users who make use
of a custom `traces-apm` data stream. If a user had previously defined
this ingest pipeline to customize documents ingested for the
`traces-apm` data stream (defined
[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),
then they would unexpectedly see that pipeline called when documents
were ingested to the `traces-apm.rum` and `traces-apm.sampled`
datastreams as well.

This PR addresses this collision by adding a `.package` suffix to the
"package level" ingest pipeline introduced in 8.12.0.

So, in 8.12.0 a processor would be defined as such on the
`traces-apm.rum` or `traces-apm.sampled` ingest pipeline

```
{
  "pipeline": {
    "name": "traces-apm@custom",
    "ignore_missing_pipeline": true,
  }
},
```

This PR replaces the pipeline with one that looks as follows:

```
{
  "pipeline": {
    "name": "traces-apm.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
  }
},
```

**To be clear: this is a breaking change if you have defined the
`traces-apm@custom` integration on 8.12. In 8.12.1, it will no longer be
called for documents ingested to the `traces-apm`, `traces-apm.rum`, or
`traces-apm.sampled` data streams. You will need to rename your pipeline
to `traces-apm.package@custom` to preserve this behavior.**

This change also applies to `logs-elastic_agent.*` ingest pipelines. See
[this
comment](elastic#175254 (comment))
for more information.

There is still technically room for a collision, though it's unlikely,
if the data stream name is `package`. This will be handled by a package
spec validation proposed in
elastic/package-spec#699.

---------

Co-authored-by: Kibana Machine <[email protected]>
(cherry picked from commit 9fe5a66)
kibanamachine added a commit to elastic/kibana that referenced this issue Jan 25, 2024
…oid collisions + add descriptions to each pipeline (#175448) (#175547)

# Backport

This will backport the following commits from `main` to `8.12`:
- [[Fleet] Update Fleet&#x27;s custom ingest pipeline names to avoid
collisions + add descriptions to each pipeline
(#175448)](#175448)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Kyle
Pollich","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-01-25T14:07:43Z","message":"[Fleet]
Update Fleet's custom ingest pipeline names to avoid collisions + add
descriptions to each pipeline (#175448)\n\n## Summary\r\n\r\nCloses
https://github.com/elastic/kibana/issues/175254\r\nRef
https://github.com/elastic/kibana/issues/168019\r\nRef
https://github.com/elastic/kibana/pull/170270\r\n\r\nIn 8.12.0, Fleet
unintentionally shipped a breaking change
in\r\nhttps://github.com//pull/170270 for APM users who
make use\r\nof a custom `traces-apm` data stream. If a user had
previously defined\r\nthis ingest pipeline to customize documents
ingested for the\r\n`traces-apm` data stream
(defined\r\n[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),\r\nthen
they would unexpectedly see that pipeline called when documents\r\nwere
ingested to the `traces-apm.rum` and `traces-apm.sampled`\r\ndatastreams
as well.\r\n\r\nThis PR addresses this collision by adding a `.package`
suffix to the\r\n\"package level\" ingest pipeline introduced in
8.12.0.\r\n\r\nSo, in 8.12.0 a processor would be defined as such on
the\r\n`traces-apm.rum` or `traces-apm.sampled` ingest
pipeline\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm@custom\",\r\n \"ignore_missing_pipeline\": true,\r\n
}\r\n},\r\n```\r\n\r\nThis PR replaces the pipeline with one that looks
as follows:\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm.package@custom\",\r\n \"ignore_missing_pipeline\":
true,\r\n \"description\": \"[Fleet] Pipeline for all data streams of
type `traces` defined by the `apm` integration\"\r\n
}\r\n},\r\n```\r\n\r\n**To be clear: this is a breaking change if you
have defined the\r\n`traces-apm@custom` integration on 8.12. In 8.12.1,
it will no longer be\r\ncalled for documents ingested to the
`traces-apm`, `traces-apm.rum`, or\r\n`traces-apm.sampled` data streams.
You will need to rename your pipeline\r\nto `traces-apm.package@custom`
to preserve this behavior.**\r\n\r\nThis change also applies to
`logs-elastic_agent.*` ingest pipelines.
See\r\n[this\r\ncomment](https://github.com/elastic/kibana/issues/175254#issuecomment-1906202137)\r\nfor
more information.\r\n\r\nThere is still technically room for a
collision, though it's unlikely,\r\nif the data stream name is
`package`. This will be handled by a package\r\nspec validation proposed
in\r\nhttps://github.com/elastic/package-spec/issues/699.\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<[email protected]>","sha":"9fe5a66faf4e06fc444c6078edafc29e91126f8d","branchLabelMapping":{"^v8.13.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:breaking","Team:Fleet","backport:prev-minor","v8.12.1","v8.13.0"],"title":"[Fleet]
Update Fleet's custom ingest pipeline names to avoid collisions + add
descriptions to each
pipeline","number":175448,"url":"https://github.com/elastic/kibana/pull/175448","mergeCommit":{"message":"[Fleet]
Update Fleet's custom ingest pipeline names to avoid collisions + add
descriptions to each pipeline (#175448)\n\n## Summary\r\n\r\nCloses
https://github.com/elastic/kibana/issues/175254\r\nRef
https://github.com/elastic/kibana/issues/168019\r\nRef
https://github.com/elastic/kibana/pull/170270\r\n\r\nIn 8.12.0, Fleet
unintentionally shipped a breaking change
in\r\nhttps://github.com//pull/170270 for APM users who
make use\r\nof a custom `traces-apm` data stream. If a user had
previously defined\r\nthis ingest pipeline to customize documents
ingested for the\r\n`traces-apm` data stream
(defined\r\n[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),\r\nthen
they would unexpectedly see that pipeline called when documents\r\nwere
ingested to the `traces-apm.rum` and `traces-apm.sampled`\r\ndatastreams
as well.\r\n\r\nThis PR addresses this collision by adding a `.package`
suffix to the\r\n\"package level\" ingest pipeline introduced in
8.12.0.\r\n\r\nSo, in 8.12.0 a processor would be defined as such on
the\r\n`traces-apm.rum` or `traces-apm.sampled` ingest
pipeline\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm@custom\",\r\n \"ignore_missing_pipeline\": true,\r\n
}\r\n},\r\n```\r\n\r\nThis PR replaces the pipeline with one that looks
as follows:\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm.package@custom\",\r\n \"ignore_missing_pipeline\":
true,\r\n \"description\": \"[Fleet] Pipeline for all data streams of
type `traces` defined by the `apm` integration\"\r\n
}\r\n},\r\n```\r\n\r\n**To be clear: this is a breaking change if you
have defined the\r\n`traces-apm@custom` integration on 8.12. In 8.12.1,
it will no longer be\r\ncalled for documents ingested to the
`traces-apm`, `traces-apm.rum`, or\r\n`traces-apm.sampled` data streams.
You will need to rename your pipeline\r\nto `traces-apm.package@custom`
to preserve this behavior.**\r\n\r\nThis change also applies to
`logs-elastic_agent.*` ingest pipelines.
See\r\n[this\r\ncomment](https://github.com/elastic/kibana/issues/175254#issuecomment-1906202137)\r\nfor
more information.\r\n\r\nThere is still technically room for a
collision, though it's unlikely,\r\nif the data stream name is
`package`. This will be handled by a package\r\nspec validation proposed
in\r\nhttps://github.com/elastic/package-spec/issues/699.\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<[email protected]>","sha":"9fe5a66faf4e06fc444c6078edafc29e91126f8d"}},"sourceBranch":"main","suggestedTargetBranches":["8.12"],"targetPullRequestStates":[{"branch":"8.12","label":"v8.12.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.13.0","branchLabelMappingKey":"^v8.13.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/175448","number":175448,"mergeCommit":{"message":"[Fleet]
Update Fleet's custom ingest pipeline names to avoid collisions + add
descriptions to each pipeline (#175448)\n\n## Summary\r\n\r\nCloses
https://github.com/elastic/kibana/issues/175254\r\nRef
https://github.com/elastic/kibana/issues/168019\r\nRef
https://github.com/elastic/kibana/pull/170270\r\n\r\nIn 8.12.0, Fleet
unintentionally shipped a breaking change
in\r\nhttps://github.com//pull/170270 for APM users who
make use\r\nof a custom `traces-apm` data stream. If a user had
previously defined\r\nthis ingest pipeline to customize documents
ingested for the\r\n`traces-apm` data stream
(defined\r\n[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),\r\nthen
they would unexpectedly see that pipeline called when documents\r\nwere
ingested to the `traces-apm.rum` and `traces-apm.sampled`\r\ndatastreams
as well.\r\n\r\nThis PR addresses this collision by adding a `.package`
suffix to the\r\n\"package level\" ingest pipeline introduced in
8.12.0.\r\n\r\nSo, in 8.12.0 a processor would be defined as such on
the\r\n`traces-apm.rum` or `traces-apm.sampled` ingest
pipeline\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm@custom\",\r\n \"ignore_missing_pipeline\": true,\r\n
}\r\n},\r\n```\r\n\r\nThis PR replaces the pipeline with one that looks
as follows:\r\n\r\n```\r\n{\r\n \"pipeline\": {\r\n \"name\":
\"traces-apm.package@custom\",\r\n \"ignore_missing_pipeline\":
true,\r\n \"description\": \"[Fleet] Pipeline for all data streams of
type `traces` defined by the `apm` integration\"\r\n
}\r\n},\r\n```\r\n\r\n**To be clear: this is a breaking change if you
have defined the\r\n`traces-apm@custom` integration on 8.12. In 8.12.1,
it will no longer be\r\ncalled for documents ingested to the
`traces-apm`, `traces-apm.rum`, or\r\n`traces-apm.sampled` data streams.
You will need to rename your pipeline\r\nto `traces-apm.package@custom`
to preserve this behavior.**\r\n\r\nThis change also applies to
`logs-elastic_agent.*` ingest pipelines.
See\r\n[this\r\ncomment](https://github.com/elastic/kibana/issues/175254#issuecomment-1906202137)\r\nfor
more information.\r\n\r\nThere is still technically room for a
collision, though it's unlikely,\r\nif the data stream name is
`package`. This will be handled by a package\r\nspec validation proposed
in\r\nhttps://github.com/elastic/package-spec/issues/699.\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<[email protected]>","sha":"9fe5a66faf4e06fc444c6078edafc29e91126f8d"}}]}]
BACKPORT-->

Co-authored-by: Kyle Pollich <[email protected]>
@kpollich
Copy link
Member Author

I think we should avoid adding a suffix to the dataset level pipelines, as they've been supported for quite a while now and we don't have a clean way to migrate users onto the newly named pipelines. We did wind up shipping the .integration suffix, so maybe we can explore forbidding that keyword instead.

It might be good to think about changing the pipeline names to guarantee no collisions as part of a 9.0 breaking change, though.

CoenWarmer pushed a commit to CoenWarmer/kibana that referenced this issue Feb 15, 2024
…ns + add descriptions to each pipeline (elastic#175448)

## Summary

Closes elastic#175254
Ref elastic#168019
Ref elastic#170270

In 8.12.0, Fleet unintentionally shipped a breaking change in
elastic#170270 for APM users who make use
of a custom `traces-apm` data stream. If a user had previously defined
this ingest pipeline to customize documents ingested for the
`traces-apm` data stream (defined
[here](https://github.com/elastic/integrations/blob/9a36183f0bd12e39a957d2f7bd65f3de4ee685b1/packages/apm/data_stream/traces/manifest.yml#L2-L3),
then they would unexpectedly see that pipeline called when documents
were ingested to the `traces-apm.rum` and `traces-apm.sampled`
datastreams as well.

This PR addresses this collision by adding a `.package` suffix to the
"package level" ingest pipeline introduced in 8.12.0.

So, in 8.12.0 a processor would be defined as such on the
`traces-apm.rum` or `traces-apm.sampled` ingest pipeline

```
{
  "pipeline": {
    "name": "traces-apm@custom",
    "ignore_missing_pipeline": true,
  }
},
```

This PR replaces the pipeline with one that looks as follows:

```
{
  "pipeline": {
    "name": "traces-apm.package@custom",
    "ignore_missing_pipeline": true,
    "description": "[Fleet] Pipeline for all data streams of type `traces` defined by the `apm` integration"
  }
},
```

**To be clear: this is a breaking change if you have defined the
`traces-apm@custom` integration on 8.12. In 8.12.1, it will no longer be
called for documents ingested to the `traces-apm`, `traces-apm.rum`, or
`traces-apm.sampled` data streams. You will need to rename your pipeline
to `traces-apm.package@custom` to preserve this behavior.**

This change also applies to `logs-elastic_agent.*` ingest pipelines. See
[this
comment](elastic#175254 (comment))
for more information.

There is still technically room for a collision, though it's unlikely,
if the data stream name is `package`. This will be handled by a package
spec validation proposed in
elastic/package-spec#699.

---------

Co-authored-by: Kibana Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs discussion Team:Ecosystem Label for the Packages Ecosystem team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants