Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/package_policies: package vars should be on input level #125625

Closed
mtojek opened this issue Feb 15, 2022 · 26 comments
Closed

/package_policies: package vars should be on input level #125625

mtojek opened this issue Feb 15, 2022 · 26 comments
Labels
Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@mtojek
Copy link
Contributor

mtojek commented Feb 15, 2022

Hi,

We're investigating an issue with AWS integration, system tests and policy vars. Everything works well with 7.17.0 branch, but when running system tests against 8.0, 8.1, Kibana fails with:

Error: error running package system tests: could not complete test run: could not add data stream config to policy: could not add package to policy; API status code = 500; response body = {"statusCode":500,"error":"Internal Server Error","message":"Cannot read properties of undefined (reading 'access_key_id')"}

I checked the behavior using UI and it worked (modified a package policy), so I went down the rabbit hole and compared JSON requests:

elastic-package:

{
  "name": "aws-ec2_metrics",
  "description": "",
  "namespace": "ep",
  "policy_id": "ad5e3840-8e4a-11ec-8b0a-59f6cbd7b323",
  "enabled": true,
  "output_id": "",
  "inputs": [
    {
      "type": "aws/metrics",
      "enabled": true,
      "streams": [
        {
          "id": "aws/metrics-aws.ec2_metrics",
          "enabled": true,
          "data_stream": {
            "type": "metrics",
            "dataset": "aws.ec2_metrics"
          },
          "vars": {
            "latency": {
              "value": "10m",
              "type": "text"
            },
            "period": {
              "value": "5m",
              "type": "text"
            },
            "regions": {
              "value": null,
              "type": "text"
            },
            "tags_filter": {
              "value": "- key: Name\n  value: \"elastic-package-test-33138\"",
              "type": "yaml"
            }
          }
        }
      ],
      "vars": {
        "access_key_id": {
          "value": "<redacted>",
          "type": "text"
        },
        "credential_profile_name": {
          "value": null,
          "type": "text"
        },
        "endpoint": {
          "value": "amazonaws.com",
          "type": "text"
        },
        "proxy_url": {
          "value": null,
          "type": "text"
        },
        "role_arn": {
          "value": null,
          "type": "text"
        },
        "secret_access_key": {
          "value": "<redacted>",
          "type": "text"
        },
        "session_token": {
          "value": null,
          "type": "text"
        },
        "shared_credential_file": {
          "value": null,
          "type": "text"
        }
      }
    }
  ],
  "package": {
    "name": "aws",
    "title": "AWS",
    "version": "999.999.999"
  }
}

Kibana UI:

{
  "name": "aws-1",
  "description": "",
  "namespace": "default",
  "policy_id": "elastic-agent-managed-ep",
  "enabled": true,
  "output_id": "",
  "inputs": [
    {
      "type": "aws/metrics",
      "policy_template": "ec2",
      "enabled": true,
      "streams": [
        {
          "enabled": true,
          "data_stream": {
            "type": "metrics",
            "dataset": "aws.ec2_metrics"
          },
          "vars": {
            "period": {
              "value": "5m",
              "type": "text"
            },
            "regions": {
              "type": "text",
              "value": []
            },
            "latency": {
              "type": "text"
            },
            "tags_filter": {
              "value": "# - key: \"created-by\"\n  # value: \"foo\"\n",
              "type": "yaml"
            }
          }
        }
      ]
    }
  ],
  "package": {
    "name": "aws",
    "title": "AWS",
    "version": "999.999.999"
  },
  "vars": {
    "shared_credential_file": {
      "type": "text"
    },
    "credential_profile_name": {
      "type": "text"
    },
    "access_key_id": {
      "type": "text",
      "value": "ACCESS_KEY_ID"
    },
    "secret_access_key": {
      "type": "text",
      "value": "SECRET_ACCESS_KEY"
    },
    "session_token": {
      "type": "text"
    },
    "role_arn": {
      "type": "text"
    },
    "endpoint": {
      "value": "amazonaws.com",
      "type": "text"
    },
    "proxy_url": {
      "type": "text"
    }
  }
}

As you can see Kibana UI placed vars under the root. Is it correct behavior? We haven't touched that part of logic of elastic-package for a long time and it worked well. Something is definitely inconsistent between 7.x and 8.x.

@mtojek mtojek added the Team:Fleet Team label for Observability Data Collection Fleet team label Feb 15, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@joshdover
Copy link
Contributor

@juliaElastic @criamico you all worked on recent changes to our API in 8.0. Was this one of the breaking changes we made to the package policies API? Possibly as part of #119739?

@joshdover
Copy link
Contributor

@mtojek Curious that we just found this failing test now. Do we know how long this test has been failing against the 8.0 branch? Maybe a recent change that hasn't been shipped is creating this problem

@mtojek
Copy link
Contributor Author

mtojek commented Feb 15, 2022

I can explain this. We used to test AWS integration against the 7.x stack and now we're defaulting to 8.x. Recently we had a lot of broken tests in both lines, so it's hard to figure out when it started failing. We also don't keep artifacts for a longer time, but based on CI statuses, I think it worked around ~2021-12-22.

@joshdover
Copy link
Contributor

I think this is a bug and not an intentional or unintentional breaking change because we still have vars under inputs[idx].streams[idx].vars in our validation schema. @mtojek are there any Kibana logs to share from this test run that would have a stack trace available to us?

@joshdover
Copy link
Contributor

Nevermind I found them, this definitely looks strange, see the inputs.undefined-aws name:

�[36mkibana_1                     |�[0m [2022-02-14T03:37:33.885+00:00][ERROR][plugins.fleet] Package policy is invalid: inputs.undefined-aws/metrics.vars.access_key_id: access_key_id var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.credential_profile_name: credential_profile_name var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.endpoint: endpoint var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.proxy_url: proxy_url var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.role_arn: role_arn var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.secret_access_key: secret_access_key var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.session_token: session_token var definition does not exist
�[36mkibana_1                     |�[0m inputs.undefined-aws/metrics.vars.shared_credential_file: shared_credential_file var definition does not exist

@joshdover
Copy link
Contributor

Seems related to #124215

@juliaElastic
Copy link
Contributor

juliaElastic commented Feb 15, 2022

@joshdover yes, we introduced stricter validation in that pr, meaning that the validation rejects package policies where there are unknown vars provided (missing from policy template)
@mtojek can the test be fixed to produce a valid package policy?

@mtojek
Copy link
Contributor Author

mtojek commented Feb 15, 2022

Hey @juliaElastic, could you please refer to the sample I posted above and adjust it to pass the validation? I will review it and see what we can do in elastic-package.

BTW in this case it looks like an unintentional breaking change.

EDIT:

Test config

EDIT2:

I think you might be talking about a different problem, as all vars are reported as missing, even ones that were passed (access key, secret key).

@juliaElastic
Copy link
Contributor

@mtojek so the validation checks against the package definition from the api call below:

GET /api/fleet/epm/packages/aws/1.11.4
{
  "name": "aws",
  "version": "1.11.4",
  "policy_templates": [...],
  "data_streams": [...],
  "vars": [
    {
       "name": "shared_credential_file",
       ...
    }, ....]
}

the problematic vars are on different level of the package definition, outside of inputs/streams.
the question is, is it valid to move the global vars inside an input? are those var values being persisted correctly?
if so, then the validation is not complete and Fleet should fix that.
If not, then perhaps the test should be fixed.

we debated whether this validation is breaking, and thought that it is an improvement not to allow invalid policies to be created: https://github.com/elastic/kibana/pull/124215/files#r796666631

cc @joshdover

@mtojek
Copy link
Contributor Author

mtojek commented Feb 15, 2022

the question is, is it valid to move the global vars inside an input? are those var values being persisted correctly?

I think, global vars are intended and were introduced as part of works on input groups? The manifest is compatible with Kibana since 7.15. cc @kaiyan-sheng

If not, then perhaps the test should be fixed.

The problem here is that we use this system test for a long time and it worked well :) Notice: if we decide to change its format, then we have to adjust the elastic-package tool.

EDIT:

I'm trying to understand why did our system test config work well with 7.x and it's invalid with 8.x? elastic-package passes only vars which are defined in manifests. I assume that the root cause is the wrong location?

@juliaElastic
Copy link
Contributor

the question is, is it valid to move the global vars inside an input? are those var values being persisted correctly?

I think, global vars are intended and were introduced as part of works on input groups? The manifest is compatible with Kibana since 7.15. cc @kaiyan-sheng

okay, if that's the case, it seems that Fleet validation should consider it valid.
I can revert the change in validation to unblock you and raise an issue to improve the validation to take into account input groups.

EDIT:

I'm trying to understand why did our system test config work well with 7.x and it's invalid with 8.x? elastic-package passes only vars which are defined in manifests. I assume that the root cause is the wrong location?

yes, the location is the cause I think

@mtojek
Copy link
Contributor Author

mtojek commented Feb 15, 2022

Thank @juliaElastic for taking care of this!

If this is only about the system tests of AWS integration, we can temporarily disable it. It looks like our end-users are safe. If you plan to fix it in a few days, we're fine with that schedule :)

@juliaElastic
Copy link
Contributor

juliaElastic commented Feb 15, 2022

Thank @juliaElastic for taking care of this!

If this is only about the system tests of AWS integration, we can temporarily disable it. It looks like our end-users are safe. If you plan to fix it in a few days, we're fine with that schedule :)

I raised the pr with the revert, should be merged soon once reviewed.

EDIT: the revert is merged

@juliaElastic
Copy link
Contributor

juliaElastic commented Feb 15, 2022

one thing I'm not sure about, I tried creating that package_policy in the description (after disabling the validation locally), and on the UI the values of those vars are not loading.
shouldn't fleet ui have been updated to support input groups? now it looks like those vars are not working properly
cc @nchaulet maybe you know about this?
image

strangely I see only the Endpoint value there, which seems to be set on the outermost vars too

image

@kaiyan-sheng
Copy link
Contributor

Thank you for looking into this!!

I think, global vars are intended and were introduced as part of works on input groups? The manifest is compatible with Kibana since 7.15. cc @kaiyan-sheng

Yes global vars are intended. Because these variables are shared between all data streams inside the AWS package.

@juliaElastic Endpoint variable is the only one that has a default value to start with. If you put some random string value into Access Key ID box, does it show up in the policy?

@juliaElastic
Copy link
Contributor

Thank you for looking into this!!

I think, global vars are intended and were introduced as part of works on input groups? The manifest is compatible with Kibana since 7.15. cc @kaiyan-sheng

Yes global vars are intended. Because these variables are shared between all data streams inside the AWS package.

@juliaElastic Endpoint variable is the only one that has a default value to start with. If you put some random string value into Access Key ID box, does it show up in the policy?

Yes, it works if I put it in global vars (outside of inputs), but my question is, the same vars can appear inside inputs/streams as well, should Fleet take the values any level (e.g. moved inside inputs) and load them on UI?

@kaiyan-sheng
Copy link
Contributor

@juliaElastic 👍 Thanks for confirming the global vars work. For AWS package, the same vars will not appear inside inputs/streams. @mtojek Do you know if there's any other use case with variables need to appear inside and outside the inputs?

@mtojek
Copy link
Contributor Author

mtojek commented Feb 15, 2022

AFAIK only the AWS package uses vars so extensively and maybe the GCP soon as @endorama is experimenting with system tests now.

@mtojek
Copy link
Contributor Author

mtojek commented Feb 17, 2022

@juliaElastic @kaiyan-sheng I can confirm that this issue will also affect GCP: elastic/elastic-package#701

@juliaElastic
Copy link
Contributor

@mtojek the unknown vars validation has been reverted, we would have to check where the GCP error comes from exactly. can you give the package policy object and the new package definition in JSON format?

@mtojek
Copy link
Contributor Author

mtojek commented Feb 21, 2022

It also depends on when the Kibana Docker image was published. I'm checking what's the status now and will come back with any feedback: pipeline job.

Keep in mind that the GCP package is under development, so I can't guarantee that it's 100% correct.

@mtojek
Copy link
Contributor Author

mtojek commented Feb 21, 2022

@juliaElastic Ok, I checked the status of AWS and GCP packages and both are failing with similar error:

GCP:

[2022-02-21T08:53:14.342Z] Error: error running package system tests: could not complete test run: could not add data stream config to policy: could not add package to policy; API status code = 500; response body = {"statusCode":500,"error":"Internal Server Error","message":"Cannot read properties of undefined (reading 'credentials_file')"}

AWS:

[2022-02-21T08:49:07.033Z] Error: error running package system tests: could not complete test run: could not add data stream config to policy: could not add package to policy; API status code = 500; response body = {"statusCode":500,"error":"Internal Server Error","message":"Cannot read properties of undefined (reading 'access_key_id')"}

Package sources:
GCP AWS

EDIT:

I checked the CI job for Integrations. It uses stack 8.1.0-SNAPSHOT and it looks like the issue is fixed. The elastic-package tool uses 8.0.0 (RELEASE), so I bet it might be the root cause.

@mtojek
Copy link
Contributor Author

mtojek commented Feb 21, 2022

.. and it was the root cause (confirmed with: elastic/elastic-package#708).

The revert went well, so if there is any other action required we can close this issue. I do appreciate your help and ownership, @juliaElastic. Thank you!

@endorama
Copy link
Member

Thanks everyone for your effort on this issue!

Given what I'm seeing here, is input level variables still missing from Fleet? The error was in the validation code, but is not clear to me if input level variables are correctly displayed in Fleet UI.

@jen-huang
Copy link
Contributor

Going to close this issue as it looks like everything has been resolved.

@jen-huang jen-huang closed this as not planned Won't fix, can't repro, duplicate, stale May 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

No branches or pull requests

7 participants