Credentials JSON string is mangled when rendered to YAML #178

Closed
endorama opened this issue Mar 7, 2022 · 14 comments
Labels
Team:Elastic-Agent Label for the Agent team


@endorama
Member

endorama commented Mar 7, 2022

Context: adding system tests for gcp integration package (elastic/elastic-package#701)
Observed behaviour: the gcp metrics collector does not initialise with the correct credentials (and reports various permission-related errors)

Worth mentioning:

  • we pass a JSON string containing credentials to the integration (via Fleet UI/YAML); this allows providing a JSON string instead of a file path and helps with testing (as credentials can be pulled from the environment)

  • after debugging I noticed that the JSON string is broken into 3 pieces (newlines are added), so it is no longer a single-line string but a multiline one. This breaks credential parsing in the GCP client library within metricbeat. Example of the relevant YAML config (this private key is not the real one) with the strange newlines:

       - credentials_json: '{"type":"service_account","project_id":"elastic-obs-integrations-dev","private_key_id":"y163g4s4cus0117kpdza25woy5nabr0nouobng9l","private_key":"-----BEGIN
                 PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC3HpuoGhPxowTM\n0AXSCi3ALyf6Jkax3y+UwotYYGhHyMyLPZuIDjDT4iW4CVE8Fs9rMk2YrOquDaCU\n8NUKjN42zwT36j8CK0STaiBc0UFCFXdYFTLbvABF/afqwcohgoqHLZpf/3PcnF0p\nOXL9q/PFYqwTr0BYJnqHEX5Lpq2d3FMyuH9OtAh7VKbkd4rYF9sYkSpWUqRPS4QM\nfvzohkmN8ZZXbFj9zE9pE3VYqNUk4KBjpP+ZneuEtZrqlh0wr95t3Aom/kiQqL10\n5j3n+Xi7q/zDwzA0Pukkhba7eyuXRiv7NL5ZquoKZP50lLbffG4EuMp3iVl2kxEv\nU8XlTuPhAgMBAAECggEAQo0DMSLZILaIZg8sLlu4qOH6e3UxuC9O0ZeqoOHYxE37\n2Jb5UYcmw7qqzqnENjxsAZ90iAo/+CXHuJmWM5FiqTSvr5IYeCdVcuXdAR6jwuqe\nRwrdQvKeftFjF3R6i5cv1VUDh+QFLaY+TV8tpXe6zn+/3h+RGPhTC4eWCNR4wKfW\nvEnGuMFOn/FqxmBoKL12+DRgw0Fx4SC+J9ClUFMF6kby/OBjNk94Yh2rHdEq5CmZ\nZSBzjsI1fShpIi1/ijkE3iHCZaKdx0UA+J4k1G5dGMfZG9RCVea33vF7Dvdc55A8\nnFCHlDMI1UNPland1aTd2llJf1iDQyTpF5udrPgj4QKBgQDeryH43mq+AIbNJHeq\n5dS4JhSu1QWD5i1lepiublcB1ICTH4856XOLx/Nj80a+zgdaUn/e3NRUGIVExqQk\nwdbGzWlkDwYTV54fr+g/RMvibo4wYdzMQiVgr0aCY8vwfu92Dq0qKRfgvqcd3vbP\nPYD4joLU7nTvfdQtf0em3Hkn/QKBgQDShCQyXFZeDKgH4iT+Q+5QpumAY4FOuqWI\nWPJ7m/OrEQeftqfqv/dbVWSgyD3mu2dDBzcibDM+TjlXH3eYQudKJvybTxjH+seW\njQDyE85dBc2IPhKFSOmDPO71ZXsKAkrtw/mpP5J84Oc/pv3kMtavGhYwbbx8w15N\nxD8NxEN2tQKBgH3oKd4r69CYPZ+58ct3/alNJr6fhWnJeHt7MN7XVmybeUM2QeYt\nn/41xOELiUGS/kdMhC4/T/JoltmHMwHxc32eYOuJLxc6oBYsgLVdMaZKeizS+GOp\nNrcPA1/wCzxkmQJ4U+KVr4GMarMSARy2Grju4vyAAy/yRkifQaUP3ZUFAoGAEAql\nv3it1CjevQsMipuek2LEtFXgyqEKcCNnBuhRXx3DGPaQQSEztjABpQbdQLHTIpZw\nKx1Xok3PrMXnFSE0AsCJy0PxvXtsrho8kjXUKd6BVPp16tYthSSliOmcwJyAHTIr\n2ivP+9gfhwgwnK0LEvjH7BTQoik5DHAB5gioo2kCgYEA0zGL2c/yEOPo69ZqsUuw\nXB81kyABKIQHtbBZESgxyCPUQOMn0uzP9mzVPvm8NP7zUda+g/MOV9WXgoiNpX0M\nmv6XQhWdrsOzdMBrAj8gr2zaN7m1Me/juxelqAbdxRCElKUryMcdnfgNCmdqBJ/a\nbqbKlFmbGlfzxxbHZ6+Lq04=\n-----END
                 PRIVATE KEY-----\n","client_email":"edoardo-testing@elastic-obs-integrations-dev.iam.gserviceaccount.com","client_id":"102132912542254405825","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"https://www.googleapis.com/robot/v1/metadata/x509/edoardo-testing%40elastic-obs-integrations-dev.iam.gserviceaccount.com"}'
    
  • the added newlines are consistent (I tried different keys and the newlines were always added in the same places)

  • we found a similar issue, which we thought was fixed, in "Add single quotes around the credentials_json var" (elastic/integrations#2712)

Steps taken so far:

  • tested metricbeat standalone with credentials_json configuration: works
  • inspected communications between Kibana and backend when saving credentials_json from Fleet UI: correct
  • tcpdumped communications between Kibana and enrolled Agent: correct
  • ran standalone agent with the configuration: works
    confirmed by the agent logs (excerpt): {"log.level":"info","message":"Non-zero metrics in the last 30s","monitoring":{"metrics":{"metricbeat":{"gcp":{"compute":{"events":8,"success":8}}}},"ecs.version":"1.6.0"}}
    but I then printed diagnostics info with elastic-agent diagnostics collect and inspected the config/elastic-agent-policy.yaml file: the newlines were present

I can provide further debug files, but since they contain sensitive credentials I will not upload them here.

/cc @joshdover @ruflin @mtojek

@endorama endorama added the Team:Elastic-Agent label Mar 7, 2022
@joshdover
Contributor

  • inspected communications between Kibana and backend when saving credentials_json from Fleet UI: correct

Could you clarify what was verified as correct here? Did we look at the fully compiled policy in the latest revision of the policy in .fleet-policies?

If I'm understanding correctly, there's some breakdown happening in the processing between .fleet-policies => Fleet Server => Elastic Agent => elastic-agent-policy.yaml on disk. Is that right?

Simple, isolated repro steps would be quite helpful to debug this, is that easy to produce?

@ruflin
Contributor

ruflin commented Mar 7, 2022

It is interesting that standalone with the config works but inspect still outputs an improper result. This indicates that the "problem" happens somewhere before the policy is picked up by Elastic Agent, but also that there is a post-processing "issue" in Elastic Agent, maybe just on the inspect side.

As @joshdover said, let's try to find a minimal example to reproduce it. My guess is that any input with \n should do?

@endorama
Member Author

endorama commented Mar 7, 2022

There are some examples that don't trigger this issue in elastic/integrations#2712, so I suspect line length is a factor.

I'll experiment with different strings to collect more details about this behaviour.

inspected communications between Kibana and backend when saving credentials_json from Fleet UI: correct
Could you clarify what was verified as correct here? Did we look at the fully compiled policy in the latest revision of the policy in .fleet-policies?

@joshdover I used tcpdump to inspect the network traffic, and from its analysis the JSON sent from Fleet to the Agent is correct (but my understanding of the policy update process is limited, so I may have misinterpreted the results). I have the dump file for further inspection if you are interested.

@mtojek
Contributor

mtojek commented Mar 7, 2022

I have a feeling that this is related to escaping quotes and new lines while configuring policy using Fleet UI. Agent picks it up from the Fleet Server and it's already mangled.

@endorama
Member Author

endorama commented Mar 7, 2022

I performed some tests and these are the results: the behaviour is observed when a string contains a space and is longer than roughly 101 characters (this is not the exact threshold; I'm not sure it makes sense to pinpoint the exact length at which this happens).

This string:

# 50 a + 1 + 50 b = 101 chars
 - credentials_json: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

is dumped (by diagnostics collect) as:

- credentials_json: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

This string (in elastic-agent.yaml config file):

# 60 a + 1 + 60 b = 121 chars
- credentials_json: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

is dumped (by diagnostics collect) as:

  - credentials_json: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
      bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

Note the hard newline in this case. This is consistent with the behaviour seen for the private key string, which is very long and contains spaces.
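The two dumps above are consistent with a width-limited YAML emitter that turns a space into a line break once the current column passes a limit. The sketch below emulates that rule in plain Python. It is an illustration only: the thread does not identify which component wraps the line, and the function name `wrap_plain_scalar`, the 80-column limit (the usual default in libyaml-style emitters), the 22-column key prefix, and the 6-space continuation indent are all assumptions chosen to match the dumps shown here.

```python
def wrap_plain_scalar(value: str, start_column: int,
                      best_width: int = 80, indent: int = 6) -> str:
    """Emulate a width-limited YAML emitter writing a plain scalar.

    Assumed rule (modelled on libyaml-style emitters): a single space
    becomes a line break once the current column exceeds best_width.
    """
    out = []
    column = start_column
    last = len(value) - 1
    for i, ch in enumerate(value):
        prev_is_space = i > 0 and value[i - 1] == " "
        next_is_space = i < last and value[i + 1] == " "
        # Only a lone space in the middle of the value may become a break.
        breakable = (ch == " " and 0 < i < last
                     and not prev_is_space and not next_is_space)
        if breakable and column > best_width:
            out.append("\n" + " " * indent)
            column = indent
        else:
            out.append(ch)
            column += 1
    return "".join(out)


key_prefix = "  - credentials_json: "  # 22 columns, as in the dumped policy
short = "a" * 50 + " " + "b" * 50      # space is reached at column 72
long = "a" * 60 + " " + "b" * 60       # space is reached at column 82

print(wrap_plain_scalar(short, len(key_prefix)))  # stays on one line
print(wrap_plain_scalar(long, len(key_prefix)))   # wraps at the space
```

Under these assumptions the 101-character string stays on one line and the 121-character string wraps at its only space, which is the pattern reported by diagnostics collect.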

@ph
Contributor

ph commented Mar 7, 2022

Looking at the above, I see that you are using an embedded certificate. @lykkin, since you are working on the LS output, have you seen this issue when working with certificates?

@endorama endorama changed the title from "[Agent] Credentials JSON string is mangled when rendered to YAML" to "Credentials JSON string is mangled when rendered to YAML" Mar 7, 2022
@endorama endorama transferred this issue from elastic/beats Mar 7, 2022
@andrewkroh
Member

andrewkroh commented Mar 7, 2022

after debugging I noticed that the JSON string is broken down in 3 pieces (newlines are added) so is not a single string but a multiline. This breaks credential parsing from GCP library client within metricbeat. Example of relevant YAML config (this private key is not the real one) with the strange newlines

I was trying to determine what is wrong with the example given. The example appears to be valid YAML and its contents appear to be valid JSON. It passes when I paste it into https://yamllint.com. And when I check it with pbpaste | yq eval '.[].credentials_json' | jq . the JSON looks like a typical GCP credentials file. But if you further inspect the private_key, you'll see that the key is malformed, because there should be an EOL marker separating the end of the base64 content from the END label, as per https://www.rfc-editor.org/rfc/rfc7468#section-3.

$ pbpaste  | yq eval '.[].credentials_json' | jq -r .private_key
mv6XQhWdrsOzdMBrAj8gr2zaN7m1Me/juxelqAbdxRCElKUryMcdnfgNCmdqBJ/a
bqbKlFmbGlfzxxbHZ6+Lq04=-----END PRIVATE KEY-----

So where does this newline get lost?
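The same check can be scripted. The following is a minimal sketch, assuming a simplified reading of RFC 7468 section 3 (base64 lines, each terminated by an EOL, before the END boundary); the regular expression and the helper name `private_key_is_well_formed` are illustrative, and the key bodies are short placeholders rather than real key material.

```python
import json
import re

# Simplified pattern: BEGIN line, one or more base64 lines that each end
# with an EOL, then the matching END line. Not a full RFC 7468 parser.
PEM_RE = re.compile(
    r"-----BEGIN (?P<label>[A-Z ]+)-----\r?\n"
    r"(?:[A-Za-z0-9+/=]+\r?\n)+"
    r"-----END (?P=label)-----\s*"
)


def private_key_is_well_formed(credentials_json: str) -> bool:
    """Return True if the private_key field looks like a well-formed PEM block."""
    key = json.loads(credentials_json)["private_key"]
    return PEM_RE.fullmatch(key) is not None


# Placeholder credentials: "QUJDREVG" stands in for the base64 body.
good = json.dumps(
    {"private_key": "-----BEGIN PRIVATE KEY-----\nQUJDREVG\n-----END PRIVATE KEY-----\n"}
)
bad = json.dumps(
    {"private_key": "-----BEGIN PRIVATE KEY-----\nQUJDREVG-----END PRIVATE KEY-----\n"}
)

print(private_key_is_well_formed(good))  # EOL present before the END label
print(private_key_is_well_formed(bad))   # EOL missing before the END label
```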

@endorama
Member Author

endorama commented Mar 8, 2022

@andrewkroh I fear that may be a copy paste error on my side, as the newline is missing in my local configuration. I amended the test string.

@mtojek
Contributor

mtojek commented Mar 8, 2022

I went the other way round and tried to remove the escaping problem from the equation.

  1. I cloned @endorama's branch.
  2. Then:
make clean build
cd test/packages/parallel/gcp
../../../../elastic-package build # (../../../../ is to use the custom build)
../../../../elastic-package stack up -v -d
`../../../../elastic-package stack shellinit`

export VAULT_ADDR=https://secrets.elastic.co:8200
export GOOGLE_CREDENTIALS=`vault read -field credentials secret/observability-team/ci/service-account/elastic-package-gcp | jq -c`
export GCP_PROJECT_ID=elastic-observability
export TF_VAR_GCP_PROJECT_ID=elastic-observability

../../../../elastic-package test system -v

Now you can see that elastic-package starts the terraform container to create a GCP compute machine. This proves that the credentials work: GOOGLE_CREDENTIALS is valid.

Then, elastic-package runs the system test runner (which fails):

2022/03/08 12:45:06 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:07 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:08 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:09 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:10 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:11 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:12 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:13 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:14 DEBUG found 0 hits in metrics-gcp.compute-ep data stream
2022/03/08 12:45:15 DEBUG reassigning original policy back to agent...
2022/03/08 12:45:15 DEBUG PUT http://127.0.0.1:5601/api/fleet/agents/22c1329c-2ada-4057-b2d4-b4df641ac2b2/reassign
2022/03/08 12:45:17 DEBUG GET http://127.0.0.1:5601/api/fleet/agents/22c1329c-2ada-4057-b2d4-b4df641ac2b2

(no metrics found)

When I jump into Metricbeat's logs, I can see:

{"log.level":"info","@timestamp":"2022-03-08T11:49:26.077Z","log.logger":"centralmgmt.fleet","log.origin":{"file.name":"management/manager.go","file.line":150},"message":"Status change to Configuring: Updating configuration","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-03-08T11:49:26.082Z","log.logger":"centralmgmt.fleet","log.origin":{"file.name":"management/manager.go","file.line":271},"message":"Applying settings for metricbeat.modules","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2022-03-08T11:49:26.098Z","log.logger":"cfgwarn","log.origin":{"file.name":"metrics/metricset.go","file.line":115},"message":"BETA: The gcp 'metrics' metricset is beta.","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-03-08T11:49:26.107Z","log.logger":"centralmgmt","log.origin":{"file.name":"cfgfile/list.go","file.line":99},"message":"Error creating runner from config: 1 error: error creating Stackdriver client: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-03-08T11:49:26.108Z","log.logger":"centralmgmt.fleet","log.origin":{"file.name":"management/manager.go","file.line":307},"message":"1 error: Error creating runner from config: 1 error: error creating Stackdriver client: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.","service.name":"metricbeat","ecs.version":"1.6.0"}

In short: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information

@endorama Does it mean that I misconfigured something?

BTW when I look into Kibana's UI, I can see this:

Screenshot 2022-03-8 at 12 47 04

Screenshot 2022-03-8 at 12 50 08

@endorama QQ: is the credentials_json correctly rendered? or is it a different issue we found here?

@mtojek
Contributor

mtojek commented Mar 8, 2022

@endorama QQ: is the credentials_json correctly rendered? or is it a different issue we found here?

I guess that we found the root cause. The elastic-package used 8.0.1-SNAPSHOT in this branch, but this feature (credentials_json) has only been available since 8.1.0. Most likely it wasn't backported to 8.0.0.

@endorama
Member Author

endorama commented Mar 8, 2022

As written by @mtojek, credentials_json support was added to the gcp metricbeat module targeting 8.1.0, while the CI was using 8.0.x.
It was intentionally not backported to 8.0 (not sure if that was the right decision or not, but I remember a discussion about which version to target for new features, and we decided to target 8.1.0).

Still: when using stack 8.1.0-SNAPSHOT and the Kibana UI I encountered issues. We were able to run the tests by fixing the escaping (thank you @andrewkroh), but those pass credentials through YAML files.
So I'll proceed to check whether the issue is present when using the UI.

To conclude:

  • it is necessary to use the raw escaping in handlebars to pass this variable correctly from the integrations system tests configuration (in test-*-config.yaml): credentials_json: '{{{GOOGLE_CREDENTIALS}}}' (note the 3 {)
  • configuring this field from the UI may still produce an issue (waiting for confirmation)
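To illustrate why the triple braces matter: in Handlebars, {{var}} HTML-escapes the value while {{{var}}} inserts it unchanged. The sketch below approximates that difference in Python, using `html.escape` as a stand-in for the template engine's escaping (an assumption; Handlebars escapes a few more characters, such as = and the backtick). The function names and the sample credentials are hypothetical.

```python
import html
import json


def render_double_brace(value: str) -> str:
    """Approximate {{var}}: the value is HTML-escaped before insertion."""
    return html.escape(value, quote=True)


def render_triple_brace(value: str) -> str:
    """Approximate {{{var}}}: the value is inserted unchanged."""
    return value


# Hypothetical, heavily shortened stand-in for GOOGLE_CREDENTIALS.
credentials = json.dumps({"type": "service_account", "project_id": "example-project"})

escaped = render_double_brace(credentials)
raw = render_triple_brace(credentials)

print(escaped)  # double quotes have become &quot; entities
print(json.loads(raw)["type"])  # the raw value still parses as JSON

try:
    json.loads(escaped)
except json.JSONDecodeError as err:
    print("escaped value is not valid JSON:", err)
```

With the escaped rendering the quotes around every key and value are replaced by entities, so the credentials can no longer be parsed as JSON; the raw rendering leaves them intact.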

@endorama
Member Author

endorama commented Mar 8, 2022

I confirmed that with Stack version 8.1.0 everything works, even though the string is rendered across multiple lines as described in this issue.

I confirm the findings from @andrewkroh: both YAML and JSON are correct so even if it looks weird it actually works when passed to the Agent and Metricbeat.
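The reason the wrapped form still works is YAML line folding: inside a quoted (or plain) scalar, a single line break plus the continuation line's indentation is read back as one space. A minimal sketch of that round trip follows, assuming a deliberately simplified folding rule (no blank lines, no '' escapes); `fold_single_quoted` is a hypothetical helper, not a real YAML parser, and the key body is a placeholder.

```python
import json


def fold_single_quoted(scalar_lines: list[str]) -> str:
    """Simplified YAML line folding for a multi-line single-quoted scalar.

    Handles only the case seen in this thread: each line break, together
    with the continuation line's indentation, becomes a single space.
    """
    return " ".join(line.strip() for line in scalar_lines)


# Content of the single-quoted scalar as it appears in the dumped policy,
# split at the same places (after BEGIN and END). "\\n" is a literal
# backslash followed by n, i.e. the JSON escape, not a real newline.
dumped = [
    '{"type":"service_account","private_key":"-----BEGIN',
    '          PRIVATE KEY-----\\nQUJDREVG\\n-----END',
    '          PRIVATE KEY-----\\n"}',
]

value = fold_single_quoted(dumped)
creds = json.loads(value)
print(repr(creds["private_key"]))
```

After folding, "-----BEGIN" and "PRIVATE KEY-----" are joined by a single space again, so the JSON parses and the PEM boundaries are intact.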

I'm going to close this as not a bug.

I'm also going to investigate @mtojek's finding about the empty Kibana UI, but separately.

Thank you all for the help with this issue, it has been invaluable 🙇

@endorama endorama closed this as completed Mar 8, 2022
@jlind23
Contributor

jlind23 commented Mar 9, 2022

ping @ph

@ph
Contributor

ph commented Mar 9, 2022

Followup issue #185
