Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various integration package enhancements and fixes #6359

Merged
merged 6 commits into from
Oct 16, 2021

Conversation

axw
Copy link
Member

@axw axw commented Oct 15, 2021

Motivation/summary

  • Add client.geo to internal_metrics, so RUM breakdown metrics can be indexed. The internal metrics data stream is strictly
    mapped.
  • Remove our README generator, update to use the elastic-package generator instead. This means we no longer have ECS badges; we should enhance the elastic-package generator later if this is important. I have also removed the sample documents, as we have these in the elastic.co APM documentation.
  • Stop generating integration package fields and ingest pipelines. Prune unused fields per data stream, and introduce data stream specific ingest pipelines. This means that any changes to _meta/fields.yml and ingest/pipeline/definition.yml that should be reflected in the integration package need be made manually from now on.
  • Use constant_keyword for processor.name and processor.event fields where we can.
  • Use external: ecs for ECS fields, so we get the canonical ECS field descriptions and types.

Now that we're using external: ecs in fields/ecs.yml, we need to use the output of elastic-package build rather than using the integration package source directly. This goes for the system tests (i.e. what we mount into the package-registry container), and also for what we copy to package-storage.

Checklist

How to test these changes

  1. Install integration
  2. Ingest RUM breakdown metrics, check they are indexed with client.geo.location
  3. Manually inspect installed ingest pipelines for each data stream, checking that they look correct

Related issues

Closes #6344
Closes #6001
Closes #5602
Closes #3871

axw added 3 commits October 15, 2021 11:32
- Add client.geo to internal_metrics, so RUM breakdown metrics
  can be indexed. The internal metrics data stream is strictly
  mapped.

- Remove our README generator, update to use the elastic-package
  generator instead. This means we no longer have ECS badges; we
  should enhance the elastic-package generator later if this is
  important. I have also removed the sample documents, as we have
  these in the elastic.co APM documentation.

- Stop generating integration package fields and ingest pipelines.
  Prune unused fields per data stream, and introduce data stream
  specific ingest pipelines.

- Use `external: ecs` for ECS fields, so we get the canonical ECS
  field descriptions and types.

- Use `constant_keyword` for `processor.name` and `processor.event`
  fields where we can.
Now that we're using `external: ecs` in fields/ecs.yml,
we need to use the output of `elastic-package build`
rather than using the integration package source directly.
This goes for the system tests (i.e. what we mount into
the package-registry container), and also for what we
copy to package-storage.
@axw axw force-pushed the apmpackage-fields branch from d032da1 to 01fb47b Compare October 15, 2021 03:32
@apmmachine
Copy link
Contributor

apmmachine commented Oct 15, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Duration: 43 min 55 sec

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /hey-apm : Run the hey-apm benchmark.

  • /package : Generate and publish the docker images.

@axw
Copy link
Member Author

axw commented Oct 15, 2021

/test

@axw axw marked this pull request as ready for review October 15, 2021 07:16
@axw axw requested a review from a team October 15, 2021 07:16
Copy link
Contributor

@marclop marclop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Tested that RUM events had client.geo set.

dependencies:
ecs:
# TODO(axw) make sure this is kept in sync with the ECS version used in libbeat.
reference: [email protected]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way we can introduce some sort of check in the makefile or the build package process to error if it's not? Doesn't necessarily have to be done in this PR but may be a good candidate for a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could even automate that as part of the make update-beats command (in a follow up PR).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, opened #6362 to track this

Copy link
Contributor

@simitt simitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pipelines are so much easier to read now 👍

LGTM

dependencies:
ecs:
# TODO(axw) make sure this is kept in sync with the ECS version used in libbeat.
reference: [email protected]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could even automate that as part of the make update-beats command (in a follow up PR).

@@ -6,6 +6,18 @@
- description: version bump to align package version with stack version
type: enhancement
link: https://github.com/elastic/apm-server/issues/4898
- description: added client.geo fields to internal_metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the language we use is inconsistent with the first (already existing) item - could you update to for consistency:

version bumped to align package version with stack version

Actually, it is inconsistent also compared to prev. versions - since this is a big refactor PR, would you mind aligning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated all changelog descriptions in the package to use past tense

- name: metricset.period
type: long
description: Current data collection period for this event in milliseconds.
unit: ms
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not used anywhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed metricset.period in #6111, just forgot to update fields.yml.

@axw axw enabled auto-merge (squash) October 16, 2021 00:29
@axw axw merged commit 5abc7d0 into elastic:master Oct 16, 2021
mergify bot pushed a commit that referenced this pull request Oct 16, 2021
* apmpackage: various improvements and fixes

- Add client.geo to internal_metrics, so RUM breakdown metrics
  can be indexed. The internal metrics data stream is strictly
  mapped.

- Remove our README generator, update to use the elastic-package
  generator instead. This means we no longer have ECS badges; we
  should enhance the elastic-package generator later if this is
  important. I have also removed the sample documents, as we have
  these in the elastic.co APM documentation.

- Stop generating integration package fields and ingest pipelines.
  Prune unused fields per data stream, and introduce data stream
  specific ingest pipelines.

- Use `external: ecs` for ECS fields, so we get the canonical ECS
  field descriptions and types.

- Use `constant_keyword` for `processor.name` and `processor.event`
  fields where we can.

* systemtest: test geoIP enrichment with integration

* Use built integration package in system tests

Now that we're using `external: ecs` in fields/ecs.yml,
we need to use the output of `elastic-package build`
rather than using the integration package source directly.
This goes for the system tests (i.e. what we mount into
the package-registry container), and also for what we
copy to package-storage.

* testing/docker/package-registry: local first

* apmpackage: make changelog descriptions consistent

(cherry picked from commit 5abc7d0)

# Conflicts:
#	apmpackage/apm/data_stream/app_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/error_logs/fields/ecs.yml
#	apmpackage/apm/data_stream/internal_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/profile_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/traces/fields/ecs.yml
#	apmpackage/apm/docs/README.md
@axw axw deleted the apmpackage-fields branch October 16, 2021 02:16
axw added a commit that referenced this pull request Oct 17, 2021
) (#6363)

* Various integration package enhancements and fixes (#6359)

* apmpackage: various improvements and fixes

- Add client.geo to internal_metrics, so RUM breakdown metrics
  can be indexed. The internal metrics data stream is strictly
  mapped.

- Remove our README generator, update to use the elastic-package
  generator instead. This means we no longer have ECS badges; we
  should enhance the elastic-package generator later if this is
  important. I have also removed the sample documents, as we have
  these in the elastic.co APM documentation.

- Stop generating integration package fields and ingest pipelines.
  Prune unused fields per data stream, and introduce data stream
  specific ingest pipelines.

- Use `external: ecs` for ECS fields, so we get the canonical ECS
  field descriptions and types.

- Use `constant_keyword` for `processor.name` and `processor.event`
  fields where we can.

* systemtest: test geoIP enrichment with integration

* Use built integration package in system tests

Now that we're using `external: ecs` in fields/ecs.yml,
we need to use the output of `elastic-package build`
rather than using the integration package source directly.
This goes for the system tests (i.e. what we mount into
the package-registry container), and also for what we
copy to package-storage.

* testing/docker/package-registry: local first

* apmpackage: make changelog descriptions consistent

(cherry picked from commit 5abc7d0)

# Conflicts:
#	apmpackage/apm/data_stream/app_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/error_logs/fields/ecs.yml
#	apmpackage/apm/data_stream/internal_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/profile_metrics/fields/ecs.yml
#	apmpackage/apm/data_stream/traces/fields/ecs.yml
#	apmpackage/apm/docs/README.md

* Fix merge conflicts

* Fix merge conflicts

Co-authored-by: Andrew Wilkins <[email protected]>
@mergify
Copy link
Contributor

mergify bot commented Oct 25, 2021

This pull request does not have a backport label. Could you fix it @axw? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.x is the label to automatically backport to the 7.x branch.
  • backport-7./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Oct 25, 2021
@marclop marclop self-assigned this Oct 26, 2021
@marclop
Copy link
Contributor

marclop commented Oct 28, 2021

Using 7.16.0, verified that:

  • The ingest pipelines are set as defined.
  • The client.geo is set when ingesting the following document:
$ cat rum.ndjson
{"metadata":{"service":{"name":"rum-js-test","agent":{"name":"rum-js","version":"5.5.0"}}}}
{"transaction":{"trace_id":"611f4fa950f04631aaaaaaaaaaaaaaaa","id":"611f4fa950f04631","type":"page-load","duration":643,"span_count":{"started":0}}}
{"metricset":{"samples":{"transaction.breakdown.count":{"value":12},"transaction.duration.sum.us":{"value":12},"transaction.duration.count":{"value":2},"transaction.self_time.sum.us":{"value":10},"transaction.self_time.count":{"value":2},"span.self_time.count":{"value":1},"span.self_time.sum.us":{"value":633.288}},"transaction":{"type":"request","name":"GET /"},"span":{"type":"external","subtype":"http"},"timestamp": 1496170422281000}
$ curl -k -H "X-Forwarded-For: 220.244.41.16" -H "Content-type: application/x-ndjson" --data-binary @rum.ndjson "$ELASTIC_APM_SERVER_URL/intake/v2/rum/events"

Via Kibana dev tools:

GET .ds-traces*/_search
{
  "size": 1, 
  "_source": [false], 
  "fields": [
    "client.geo.*"
  ], 
  "query": {
    "match": {
      "service.name": "rum-js-test"
    }
  }
}
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0296195,
    "hits" : [
      {
        "_index" : ".ds-traces-apm-default-2021.10.28-000001",
        "_type" : "_doc",
        "_id" : "TZ8BxXwBAEMLo23_grMQ",
        "_score" : 1.0296195,
        "_source" : { },
        "fields" : {
          "client.geo.continent_name" : [
            "Oceania"
          ],
          "client.geo.region_name" : [
            "Western Australia"
          ],
          "client.geo.city_name" : [
            "Perth"
          ],
          "client.geo.location" : [
            {
              "coordinates" : [
                115.8648,
                -31.9474
              ],
              "type" : "Point"
            }
          ],
          "client.geo.region_iso_code" : [
            "AU-WA"
          ],
          "client.geo.country_iso_code" : [
            "AU"
          ],
          "client.geo.country_name" : [
            "Australia"
          ]
        }
      }
    ]
  }
}

marclop added a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
marclop added a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
marclop added a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
mergify bot pushed a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
(cherry picked from commit 381e594)
mergify bot pushed a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
(cherry picked from commit 381e594)
marclop pushed a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
(cherry picked from commit 381e594)
marclop pushed a commit that referenced this pull request Oct 28, 2021
Adds back the ecs base "message" field to as an external `ecs` field to
the `error_logs` data stream.

It causes the errors captured by APM to not have a `message` field which
in turn causes the Logs tab to not display the error message.

This bug was introduced by #6359.

Signed-off-by: Marc Lopez Rubio <[email protected]>
(cherry picked from commit 381e594)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify test-plan test-plan-ok v7.16.0
Projects
None yet
5 participants