Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow dynamic datasets and namespaces #11168

Merged
merged 7 commits into from
Jul 13, 2023

Conversation

dmathieu
Copy link
Member

@dmathieu dmathieu commented Jul 10, 2023

Motivation/summary

This change grants the permissions for agents running the apmpackage integration to send data to other data streams, and other namespaces, making the use of the reroute processor possible.

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

I ran this integration locally with tilt.

Related issues

Closes #10991

@dmathieu dmathieu added the backport-skip Skip notification from the automated backport with mergify label Jul 10, 2023
@dmathieu dmathieu force-pushed the dynamic-dataset-namespace branch from 45ea165 to 4a0c24e Compare July 10, 2023 12:30
@apmmachine
Copy link
Contributor

apmmachine commented Jul 10, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-07-13T07:33:04.713+0000

  • Duration: 7 min 41 sec

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate and publish the docker images.

  • /test windows : Build & tests on Windows.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@apmmachine
Copy link
Contributor

apmmachine commented Jul 10, 2023

📚 Go benchmark report

Diff with the main branch

goos: linux
goarch: amd64
pkg: github.com/elastic/apm-server/internal/agentcfg
cpu: 12th Gen Intel(R) Core(TM) i5-12500
                                  │ build/main/bench.out │             bench.out             │
                                  │        sec/op        │   sec/op     vs base              │
geomean                                     68.99n         69.24n       +0.37%

                                  │ build/main/bench.out │             bench.out              │
                                  │         B/op         │    B/op     vs base                │
geomean                                                ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                  │ build/main/bench.out │             bench.out              │
                                  │      allocs/op       │ allocs/op   vs base                │
geomean                                                ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/elastic/apm-server/internal/beater/request
                                             │ build/main/bench.out │             bench.out              │
                                             │        sec/op        │    sec/op     vs base              │
ContextResetContentEncoding/gzip_sniff-12              3.941µ ± 12%   4.046µ ±  5%  +2.65% (p=0.041 n=6)
geomean                                                921.1n         900.3n        -2.25%

                                             │ build/main/bench.out │              bench.out               │
                                             │         B/op         │     B/op      vs base                │
geomean                                                           ²                 +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                             │ build/main/bench.out │             bench.out              │
                                             │      allocs/op       │ allocs/op   vs base                │
geomean                                                           ²               +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pkg: github.com/elastic/apm-server/internal/publish
             │ build/main/bench.out │          bench.out          │
             │        sec/op        │   sec/op    vs base         │

             │ build/main/bench.out │           bench.out            │
             │         B/op         │     B/op       vs base         │

             │ build/main/bench.out │           bench.out           │
             │      allocs/op       │  allocs/op    vs base         │

pkg: github.com/elastic/apm-server/x-pack/apm-server/aggregation/spanmetrics
                 │ build/main/bench.out │          bench.out           │
                 │        sec/op        │   sec/op     vs base         │

                 │ build/main/bench.out │            bench.out            │
                 │         B/op         │     B/op      vs base           │
¹ all samples are equal

                 │ build/main/bench.out │           bench.out           │
                 │      allocs/op       │ allocs/op   vs base           │
¹ all samples are equal

pkg: github.com/elastic/apm-server/x-pack/apm-server/aggregation/txmetrics
                        │ build/main/bench.out │             bench.out             │
                        │        sec/op        │   sec/op     vs base              │
AggregateTransaction-12            88.64n ± 1%   87.39n ± 1%  -1.42% (p=0.026 n=6)

                        │ build/main/bench.out │           bench.out           │
                        │         B/op         │    B/op     vs base           │
¹ all samples are equal

                        │ build/main/bench.out │           bench.out           │
                        │      allocs/op       │ allocs/op   vs base           │
¹ all samples are equal

pkg: github.com/elastic/apm-server/x-pack/apm-server/sampling
               │ build/main/bench.out │             bench.out              │
               │        sec/op        │    sec/op     vs base              │
geomean                  421.1n         425.4n        +1.01%

               │ build/main/bench.out │              bench.out               │
               │         B/op         │     B/op      vs base                │
geomean                    381.4          381.3       -0.03%
¹ all samples are equal

               │ build/main/bench.out │             bench.out              │
               │      allocs/op       │ allocs/op   vs base                │
geomean                    3.742        3.742       +0.00%
¹ all samples are equal

pkg: github.com/elastic/apm-server/x-pack/apm-server/sampling/eventstorage
                                             │ build/main/bench.out │              bench.out              │
                                             │        sec/op        │    sec/op      vs base              │
geomean                                               13.09µ           13.36µ        +2.04%

                                             │ build/main/bench.out │               bench.out               │
                                             │         B/op         │     B/op       vs base                │
ReadEvents/proto_codec/1_events-12                    1.882Ki ±  0%   1.884Ki ±  0%  +0.08% (p=0.045 n=6)
ReadEvents/nop_codec/10_events-12                     8.526Ki ±  0%   8.541Ki ±  0%  +0.17% (p=0.015 n=6)
geomean                                               11.34Ki         11.39Ki        +0.48%
¹ all samples are equal

                                             │ build/main/bench.out │              bench.out              │
                                             │      allocs/op       │  allocs/op   vs base                │
geomean                                                  123.6         123.6       +0.00%
¹ all samples are equal

report generated with https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

@dmathieu dmathieu requested a review from a team July 10, 2023 12:52
Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the issue description:

to add elasticsearch.dynamic_dataset: true only to the app_metrics and app_logs data streams. That's because changing the data set for other data stream, such as internal_metrics could confuse the APM UI.

Was this overlooked or were there any follow up conversations/considerations that led to adding dynamic_dataset: true everywhere?

apmpackage/apm/changelog.yml Show resolved Hide resolved
@dmathieu
Copy link
Member Author

dmathieu commented Jul 11, 2023

Was this overlooked or were there any follow up conversations/considerations that led to adding dynamic_dataset: true everywhere?

Ah yes, my bad. I overlooked that. I've kept it only for app_metrics and app_logs.
I do wonder if it would be interesting to have it for traces as well?

@simitt
Copy link
Contributor

simitt commented Jul 11, 2023

I was wondering the same tbh. @felixbarny was there a special reason for keeping traces-* out of this? Any concerns for the APM UI?

@felixbarny
Copy link
Member

I don't have a strong opinion on whether to allow dynamic_dataset for traces. I suppose there's no harm in allowing that.

However, we should add dynamic_namespace to all APM data streams. This enables users to send all APM data to a dedicated namespace so that they can use custom retention and security policies.

@dmathieu
Copy link
Member Author

Thank you. I have updated this PR to set dynamic_namespace on every data stream, and I've added dynamic_dataset to traces.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. If we go ahead with elastic/elasticsearch#97546, then I expect we'll end up giving APM append-only privileges for traces-*, metrics-*, and logs-*. This seems pretty well aligned.

@dmathieu dmathieu enabled auto-merge (squash) July 12, 2023 07:54
@dmathieu dmathieu merged commit 2af344d into elastic:main Jul 13, 2023
@dmathieu dmathieu deleted the dynamic-dataset-namespace branch July 13, 2023 07:48
@axw axw self-assigned this Aug 28, 2023
@axw
Copy link
Member

axw commented Aug 28, 2023

Verified with 8.10.0-BC1.

I created @custom ingest pipelines with the reroute processor, setting dataset to a constant value ("apm.app.all") for app logs/metrics; and setting namespace to a template (some "{{agent.name}}", some "{{service.name}}")

Sending test data with apm-tool, I verified this with logs-apm.app.*, metrics-apm.app.*, traces-apm-*, logs-apm.error-*, and metrics-apm.service_destination.1m-*.

bmorelli25 pushed a commit to bmorelli25/apm-server that referenced this pull request Sep 5, 2023
* allow dynamic datasets and namespaces

* only keep dynamic datasets and namespaces for app metrics and app logs

* add dynamic namespace to every data stream

* add dynamic dataset to traces

* update changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify test-plan test-plan-ok v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable routing data to different data streams
6 participants