Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x-pack/metricbeat/module/gcp: Add Organization ID and display name to cloud labels #40461

Merged
merged 22 commits into from
Sep 19, 2024

Conversation

Linu-Elias
Copy link
Contributor

@Linu-Elias Linu-Elias commented Aug 8, 2024

Proposed commit message

  • Set the project ID in the cloud.project.id field (currently missing)
  • Set the Google Cloud ORG ID in the cloud.account.id field (currently contains the project ID)
  • Set the Google Cloud ORG display name in the cloud.account.name field (currently contains the project ID)

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

  • With Project that belong to an Organization:

image

  • With Project that belong to "No Organization Category" and Compute Service
    NoOrgCompute

  • With Project that belong to "No Organization Category" and Spanner Service
    image

Logs

@Linu-Elias Linu-Elias requested review from a team as code owners August 8, 2024 11:28
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 8, 2024
@botelastic
Copy link

botelastic bot commented Aug 8, 2024

This pull request doesn't have a Team:<team> label.

Copy link
Contributor

mergify bot commented Aug 8, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @Linu-Elias? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

x-pack/metricbeat/module/gcp/metrics/metricset.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/gcp/metrics/metricset.go Outdated Show resolved Hide resolved
CHANGELOG.next.asciidoc Outdated Show resolved Hide resolved
Copy link
Contributor

mergify bot commented Aug 13, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp-cloud-label-fix upstream/gcp-cloud-label-fix
git merge upstream/main
git push upstream gcp-cloud-label-fix

Copy link
Contributor

mergify bot commented Aug 23, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp-cloud-label-fix upstream/gcp-cloud-label-fix
git merge upstream/main
git push upstream gcp-cloud-label-fix

@@ -176,6 +176,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Update beat module with apm-server monitoring metrics fields {pull}40127[40127]
- Fix Azure Monitor metric timespan to restore Storage Account PT1H metrics {issue}40376[40376] {pull}40367[40367]
- Remove excessive info-level logs in cgroups setup {pull}40491[40491]
- Add GCP organization ID and display name to ECS cloud fields. {issue}39203[39203] {pull}40461[40461]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a bug fix or enhancement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a challenging call. It feels both.

  • enhancement: we're adding project IDs and names that were unavailable.
  • bugfix: setting organization IDs and names with the correct values, according to ECS. This feels like an alignment with the spec and part of a fix.

IMO, this change is 75% an enhancement and 25% a fix. It's not a blocker; I'll leave the final call to you, @Linu-Elias.

Copy link
Contributor

@kaiyan-sheng kaiyan-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good. Can we add some documentation on the new fields and also change the existing field definition to reflect this change please? Thank you!

@Linu-Elias
Copy link
Contributor Author

Overall it looks good. Can we add some documentation on the new fields and also change the existing field definition to reflect this change please? Thank you!

@kaiyan-sheng I'm not sure if you’re referring to this Documentation, but we already have appropriate field definitions here.

Copy link
Contributor

@zmoog zmoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good; thank you for working on this, @Linu-Elias!

Besides the comments about the constant name and the additional API call to get the project name, we should pay attention to one aspect: the rollout to users.

This change is a breaking change (by design).

The cloud.project.id field is OK; we're adding a missing field.

However, the cloud.account.id and cloud.account.name fields now contain the correct values instead of the project ID. Even if wrong, users may have grown using them.

@lucabelluccini, I know it is not your responsibility, but you probably have experience with problematic rollouts, so you are my expert.

What's the best way to communicate a change in field availability and value?

In this PR, we are:

  • adding a new field cloud.project.id (and maybe cloud.project.name)
  • fixing the cloud.account.id and cloud.account.name fields, setting them with the actual organization ID and name instead of the project ID
  • cloud.account.id and cloud.account.name may not be available (the service account may lack the required permissions to get the information), while before this change, they were always present.

x-pack/metricbeat/module/gcp/metrics/metricset.go Outdated Show resolved Hide resolved
@@ -53,14 +57,19 @@ func (s *StackdriverTimeSeriesMetadataCollector) Metadata(ctx context.Context, i

ecs := mapstr.M{
ECSCloud: mapstr.M{
ECSCloudAccount: mapstr.M{
ECSCloudProject: mapstr.M{
ECSCloudAccountID: accountID,
ECSCloudAccountName: accountID,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to change the names of these constants; even if the values are correct ("id" and "name"), the name is misleading.

They contain "CloudAccount" while we use them for the ECSCloudProject:

	ECSCloudAccountID   = "id"
	ECSCloudAccountName = "name"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ECSCloudAccountID and ECSCloudAccountName is used in two places - for setting organization(id and name) and Project(id and name) hence the field name can be something generic like- ECSCloudID and ECSCloudName ?

@lucabelluccini
Copy link
Contributor

What's the best way to communicate a change in field availability and value?

In this PR, we are:

  • adding a new field cloud.project.id (and maybe cloud.project.name)
  • fixing the cloud.account.id and cloud.account.name fields, setting them with the actual organization ID and name instead of the project ID
  • cloud.account.id and cloud.account.name may not be available (the service account may lack the required permissions to get the information), while before this change, they were always present.

Thanks @zmoog on the callout!

I think:

  1. Make this clear in the changelog/release notes of both GCP Module & GCP Integration as much as possible
  2. This is a good enhancement/fix, but I see eventually some users who might complain of the change. As before all fields were identical, I think we might leverage one of the following:
    • A custom ingest pipeline to put the fields as they were (just processors to copy the cloud.project.id into cloud.account.id and cloud.account.name)
    • Runtime fields to shadow the real content of the fields with the "old" value (just define 2 runtime fields cloud.account.id and cloud.account.name, making them read the value from cloud.project.id)

It would be good to have those 2 workarounds tested (if needed I can help setting them up if you have an accessible cluster where you have the data) and documented in a KB or in the docs.

Copy link
Contributor

@zmoog zmoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

We need to add two follow-up actions to the #39203 issue to implement the suggestions from @lucabelluccini.

I think:

  1. Make this clear in the changelog/release notes of both GCP Module & GCP Integration as much as possible
  2. This is a good enhancement/fix, but I see eventually some users who might complain of the change. As before all fields were identical, I think we might leverage one of the following:
    • A custom ingest pipeline to put the fields as they were (just processors to copy the cloud.project.id into cloud.account.id and cloud.account.name)
    • Runtime fields to shadow the real content of the fields with the "old" value (just define 2 runtime fields cloud.account.id and cloud.account.name, making them read the value from cloud.project.id)

It would be good to have those 2 workarounds tested (if needed I can help setting them up if you have an accessible cluster where you have the data) and documented in a KB or in the docs.

  1. We need to update the release notes for the GCP module
  2. We need to update the GCP integration docs
  3. Provide a tested copy-and-paste solution to revert this change for those users who may rely on the previous behavior (I lean towards the custom pipeline, mainly because it feels easier to understand, but maybe I'm biased).

@Linu-Elias, we can work on these follow-up actions separately from this PR, which is ready.

@@ -176,6 +176,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Update beat module with apm-server monitoring metrics fields {pull}40127[40127]
- Fix Azure Monitor metric timespan to restore Storage Account PT1H metrics {issue}40376[40376] {pull}40367[40367]
- Remove excessive info-level logs in cgroups setup {pull}40491[40491]
- Add GCP organization ID and display name to ECS cloud fields. {issue}39203[39203] {pull}40461[40461]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a challenging call. It feels both.

  • enhancement: we're adding project IDs and names that were unavailable.
  • bugfix: setting organization IDs and names with the correct values, according to ECS. This feels like an alignment with the spec and part of a fix.

IMO, this change is 75% an enhancement and 25% a fix. It's not a blocker; I'll leave the final call to you, @Linu-Elias.

x-pack/metricbeat/module/gcp/metrics/metricset.go Outdated Show resolved Hide resolved
@zmoog
Copy link
Contributor

zmoog commented Sep 10, 2024

@kilfoyle, sorry for pinging you directly! 🙇

This PR introduces a breaking change in the GCP metricset. We are changing the content of the cloud.account.id and cloud.account.name fields to align them to ECS.

What do you think the best way is to communicate this change in the release notes? Do you have an example at hand we can imitate?

@kilfoyle
Copy link
Contributor

Hi @zmoog, good question! I haven't really worked with the Beats docs but it looks like we'd need:

  • A new page similar to Breaking changes in 8.0. The source for that one is here.

  • A new bullet and new include in this Breaking changes page for whichever release will introduce the change (8.16 I guess). The source file is here. So something like this:

See the following topics for a description of breaking changes between major versions:

* <<breaking-changes-8.0>>
* <<breaking-changes-8.16>>

For breaking changes between minor versions, see the <<release-notes,release notes>>.

include::breaking-8.0.asciidoc[]
include::breaking-8.16.asciidoc[]

Copy link
Contributor

mergify bot commented Sep 11, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Sep 11, 2024
@v1v v1v removed the backport-v8.x label Sep 11, 2024
@zmoog
Copy link
Contributor

zmoog commented Sep 13, 2024

@kilfoyle, pages like Breaking changes in 8.0 seem an excellent place for reporting breaking changes for the stack as a whole.

In this case, we want to report a breaking change in the Metricbeat's module for GCP, a humble and more limited scope.

Should we target the Metricbeat release notes? Do you think it's appropriate?

@zmoog
Copy link
Contributor

zmoog commented Sep 13, 2024

Something like the "Breaking changes" section in 8.15.0?

CleanShot 2024-09-13 at 15 53 30@2x

@kilfoyle
Copy link
Contributor

@zmoog Ah, yes! Sorry, I didn't realize that there are different versions of breaking changes depending on scope.

Definitely you could duplicate the breaking changes section in those 8.15 docs. So that would mean updating the 8.15.1 changelog with something like:

==== Breaking changes

*Metricbeat*

- {description}. {pull}40461[40461]

@zmoog
Copy link
Contributor

zmoog commented Sep 13, 2024

Definitely you could duplicate the breaking changes section in those 8.15 docs. So that would mean updating the 8.15.1 changelog with something like:

Great, we'll add this section to the target release!

Thank you @kilfoyle!

Copy link
Contributor

@zmoog zmoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Before merging, please add an example of how to restore the previous behavior.

After the merge, update the integration docs when we bump the min stack version to bring this to the Elastic Agent integration.

Comment on lines 7 to 14
=== Beats version 8.15.1
https://github.com/elastic/beats/compare/v8.15.0\...v8.15.1[View commits]

==== Breaking changes

*Metricbeat*

- Add GCP organization and project details to ECS cloud fields. {pull}40461[40461]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be in the 8.15.1 section of the release notes, but I guess we're shipping in 8.15.2 at the earliest, right?

Comment on lines +150 to +155
We have updated our field names to align with ECS semantics. As part of this change:

* `cloud.account.id` will now contain the Google Cloud Organization ID (previously, it contained the project ID).
* `cloud.account.name` will now contain the Google Cloud Organization Display Name (previously, it contained the project name).
* New fields `cloud.project.id` and `cloud.project.name` will be added to store the actual project ID and project name, respectively.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @lucabelluccini suggested, could you also explain how to revert to the previous behavior for users who may depend on having the project ID in the cloud.account.* fields?

I think adding a processor in the @Custom pipeline is good enough.

We need to do the same in the Elastic Agent integrations docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this?

[
  {
    "set": {
      "field": "cloud.account.id",
      "value": "{{cloud.project.id}}",
      "if": "ctx?.cloud?.project?.id != null"
    }
  },
  {
    "set": {
      "field": "cloud.account.name",
      "value": "{{cloud.project.name}}",
      "if": "ctx?.cloud?.project?.name != null"
    }
  },
  {
    "remove": {
      "field": [
        "cloud.project.id",
        "cloud.project.name"
      ],
      "ignore_missing": true
    }
  }
]

Labels with current implementation:

    "cloud": {
    "account": {
        "id": "992493199029"
      },
      "project": {
        "id": "elastic-observability",
        "name": "elastic-observability"
      },
    },

After pipeline:

          "cloud": {
            "project": {},
            "account": {
              "name": "elastic-observability",
              "id": "elastic-observability"
            }
          },

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep!

Copy link
Contributor

mergify bot commented Sep 17, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp-cloud-label-fix upstream/gcp-cloud-label-fix
git merge upstream/main
git push upstream gcp-cloud-label-fix

Copy link
Contributor

@zmoog zmoog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Linu-Elias Linu-Elias merged commit 355ed04 into elastic:main Sep 19, 2024
29 checks passed
mergify bot pushed a commit that referenced this pull request Sep 19, 2024
… cloud labels (#40461)

* issue fix

* add change.log

* formatting

* change.log

* error-fix

* resolved comments

* resolved comments

* context added

* resolve comments

* error handling

* doc changes

* change.log

* Update gcp.asciidoc

* Update CHANGELOG.asciidoc

* doc changes

(cherry picked from commit 355ed04)
Linu-Elias added a commit that referenced this pull request Dec 5, 2024
… ID and display name to cloud labels (#40899)

* x-pack/metricbeat/module/gcp: Add Organization ID and display name to cloud labels (#40461)

* issue fix

* add change.log

* formatting

* change.log

* error-fix

* resolved comments

* resolved comments

* context added

* resolve comments

* error handling

* doc changes

* change.log

* Update gcp.asciidoc

* Update CHANGELOG.asciidoc

* doc changes

(cherry picked from commit 355ed04)

* Update x-pack/metricbeat/module/gcp/timeseries_metadata_collector.go

Co-authored-by: Harnish Chavda <[email protected]>

* Update metadata.go

* Update metadata.go

* Update CHANGELOG.asciidoc

* Update CHANGELOG.asciidoc

* Update CHANGELOG.asciidoc

* Delete CHANGELOG.asciidoc

* Revert "Delete CHANGELOG.asciidoc"

This reverts commit c1bd932.

* fix changelog

* Update CHANGELOG.asciidoc

* Update CHANGELOG.next.asciidoc

---------

Co-authored-by: Linu-Elias <[email protected]>
Co-authored-by: Harnish Chavda <[email protected]>
Co-authored-by: Gabriel Pop <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.x Automated backport to the 8.x branch with mergify bugfix needs_team Indicates that the issue/PR needs a Team:* label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GCP metricset does not collect Google Cloud Org ID and display name
7 participants