Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base64 Secrets Decoding Design #504

Merged
merged 2 commits into from
Mar 28, 2023
Merged

Base64 Secrets Decoding Design #504

merged 2 commits into from
Mar 28, 2023

Conversation

gl-johnson
Copy link
Contributor

@gl-johnson gl-johnson commented Mar 2, 2023

Add design doc for adding configurable decoding of Conjur secrets stored in base64.

POC branch can be found here

@gl-johnson gl-johnson force-pushed the base64-secrets-design-doc branch 2 times, most recently from 23ed5c3 to bb803cd Compare March 2, 2023 20:16
@gl-johnson gl-johnson marked this pull request as ready for review March 2, 2023 20:17
@gl-johnson gl-johnson requested a review from a team as a code owner March 2, 2023 20:17
@gl-johnson gl-johnson mentioned this pull request Mar 2, 2023
13 tasks
@adamouamani
Copy link

adamouamani commented Mar 2, 2023

Hi @gl-johnson Thanks for putting this together. I had a couple of comments/questions regarding test scenarios. please see below..

  1. how do we handle case with \n etc is part of value - this could cause issue with base64 encoding

  2. uppercase or mixed case values for text and base64 - for example does Base64 work?

  3. for E2E tests - I see 4 happy path scenarios for text and base64 annotation with k8s secrets and p2f

  4. Do we have unit tests with original values containing special chars - such as umlaut

  5. Do we trim whitespace - for example if you have a string " h e l l o " - are leading and trailing spaces trimmed before converting to base64

  6. Base64 can include a slash '/' - does this cause any issue?

  7. How could this fail -what are some failure/error scenarios and how do we report these?

  8. how will customers use them - is there any size limit to data being encoded?

Copy link

@niteshtaneja niteshtaneja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make sure to test with -

  • binary secrets > 65K (65K is the limit for Vault apparently and since customer has secrets >65K, they are going to store them directly in Conjur)
  • rotated secrets (The secrets are expected to be rotated every 3 weeks by the customer)

andytinkham
andytinkham previously approved these changes Mar 3, 2023
Copy link
Contributor

@andytinkham andytinkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't a security concerns section in this, so I'll put a general review comment in for it. I forget if Secrets Provider does any memory clearing to make sure that secret values don't persist in memory after they're provided, but I think it does. We'll need to make sure that this cleanup mechanism (if it exists) cleans up both the encoded and the decoded secret values. I've got a note to check on this in the final security review.

Other than that, approved by security.

### Backwards compatibility
This enhancement will not modify the default behavior of Secrets Provider. It will continue to assume that secrets should be delivered "as-is" unless it specifically finds a content-type annotation which indicates that the secret value should be decoded.

## Test Plan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a correct assumption that if someone mistakenly puts the same secret twice in the annotation and gives them different types, the one defined later in the file would be the type used? Do we need any tests around that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would likely be the case if they are exactly the same alias/path and in the same manifest.

Although we should support the same Conjur secret with multiple content-types as long as it is in different secret groups (P2F) or under a different kubernetes secret.

For example:

conjur.org/conjur-secrets.decoded-group: |
  - mysecret: /path/to/secret
    content-type: base64
conjur.org/conjur-secrets.encoded-group: |
  - mysecret: /path/to/secret
    content-type: text

Both definitions can exist at the same time but the value for mysecret in decoded-group will be decoded while the value for mysecret in encoded-group will be provided as its original base64 value.

@gl-johnson
Copy link
Contributor Author

gl-johnson commented Mar 3, 2023

@adamouamani responses below, let me know if anything is unclear! Will get the document and testing-related ticket (CNJR-815) updated based on the questions/responses so far.

  1. how do we handle case with \n etc is part of value - this could cause issue with base64 encoding

We will be using the Golang base64 pkg to do the decoding so any valid base64 strings shouldn't pose an issue. If a newline exists in an encoded string, it is represented as \n in the output. It should also handle escape sequences fine.

  1. uppercase or mixed case values for text and base64 - for example does Base64 work?

In the POC, no. Any string that doesn't match exactly "base64" would be treated as a 'text' secret and not attempt decoding. I believe all the existing annotations in Secrets Provider are also case-sensitive.

  1. for E2E tests - I see 4 happy path scenarios for text and base64 annotation with k8s secrets and p2f

I agree that each mode needs its own e2e test. Although I don't believe we need to explicitly test the 'text' content type since it is the default cause which should be covered sufficiently covered by existing tests. We would basically be adding two extra e2e tests just to verify that the parsing works, which should be handled in unit tests.

It would probably be worthwhile to have two additional e2e tests for rotation of encoded secrets too per @niteshtaneja's comment

  1. Do we have unit tests with original values containing special chars - such as umlaut

Secrets Provider has unit tests for retrieving Conjur secrets containing special characters. I don't know that we need to test the same for secrets decoding, since we would just be validating that the Golang base64 pkg works as expected.

  1. Do we trim whitespace - for example if you have a string " h e l l o " - are leading and trailing spaces trimmed before converting to base64

Secrets Provider is simply going to provide the value output by base64.StdEncoding.DecodeString(secretValueFromConjur) with no additional processing on the value

  1. Base64 can include a slash '/' - does this cause any issue?

Good question - I see that / and + are part of the base64 library. A few sample test strings I've tried seem to be supported. Again I think this should be tested sufficiently in the base64 library and probably doesn't need explicit testing in Secrets Provider

  1. How could this fail -what are some failure/error scenarios and how do we report these?

Failure on configuration:

  • Fail to parse annotations
    • Malformed secret (alias/path)
      • Log YAML parsing error
      • Secrets Provider fails to start
    • Malformed content-type
      • If YAML can still be parsed, create a secret with default content type (text)
      • Log (warning)

Failure on providing secrets:

  • Fail to decode an encoded secret (invalid base64)
    • Log (warning)
    • Provide original value
    • Secrets provider continues to run
  1. how will customers use them - is there any size limit to data being encoded?

According to the docs, Conjur's maximum variable size is 10MB so that would be the upper limit as far as I'm aware.

- If YAML can still be parsed, create a secret with default content type (text)
- Log (warning)

Failure on providing secrets:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already log a request for a non-existent secret or a secret the API key used doesn't have access to? I'd assume so, but we definitely should if we don't already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both cases log the resulting 404 if I recall

@jvanderhoof
Copy link

I'd like to strongly we bring back a solution from Conjur v4 to augment the plan proposed in this Design. In Conjur v4, variables could include information about how they were to be sent back to the requestor.

- !variable
  id: secret-1
  annotations:
    content-encoding: text/base64

The above would send the secret back using text/base64 encoding in the header. This allows the client to understand the data format (and what it needs to do to handle the response).

As the design currently stands, the person creating the K8s manifests needs intimate knowledge of the secret format in order to correctly define the decode mechanism. Moving context to the variable allows us to leverage the HTTP protocol for handling a wide variety of formats (ex. text/json, application/octet-stream).

@gl-johnson
Copy link
Contributor Author

gl-johnson commented Mar 15, 2023

Thanks for chiming in on that @jvanderhoof. That actually addresses the open question I left in the solution design upon seeing that Conjur supports a mime_type attribute, along with custom annotations. It would allow us to implement this behavior in the SDK so it doesn't require any additional configuration by the Secrets Provider user.

Is that behavior specific to the v4 API? I still see it in the docs for Variables, but haven't done any verification.

Do you think there is still value in allowing the Secrets Provider manifest to dictate which variables are decoded, along with a separate addition to the upstream SDKs to handle decoding based on the Conjur API headers?

Edit - Another consideration about using the Conjur variable annotations is whether it would be compatible with the batch retrieval endpoint currently used by Secrets Provider. I assume we would have to make separate requests for each variable to receive the correct Content-Type header for each.

@jvanderhoof
Copy link

@gl-johnson It looks like mime-type support is part of Conjur already, which simplifies things. Good catch!

@jvanderhoof
Copy link

Do you think there is still value in allowing the Secrets Provider manifest to dictate which variables are decoded, along with a separate addition to the upstream SDKs to handle decoding based on the Conjur API headers?

Personally, I believe there's little value in defining decoding at the K8s manifest level.

In the best case scenario, the SDK decodes the value based on the provided header. The developer does not need to be concerned about the form of the variable value. It appears in the relevant K8s secret already decoded.

In the worst case scenario, the secret appears in K8s base64 encoded. In this case, it feels more logical for the application to be aware of the secret format, and handle that format appropriately (just like it would with environment variable). Managing format at the manifest level adds an additional layer which needs to kept in sync with Vault, Conjur, and target application. It feels like we're adding an opportunity to make the process more error prone for our customers.

@jvanderhoof
Copy link

Another consideration about using the Conjur variable annotations is whether it would be compatible with the batch retrieval endpoint currently used by Secrets Provider. I assume we would have to make separate requests for each variable to receive the correct Content-Type header for each.

Excellent catch! Batch retrieval is far more complex as a feature than most people give it credit. I'd strongly recommend we add a per-variable content type (the batch request would return text/json). This allows us to convert non-ascii content to a format appropriate for transfer via JSON (ex. binary data needs to be base64 encoded). For example, something like:

{
  // base64 encoded
  "variable-1": {
    "value": "Zm9vLWJhcg==\n",
    "encoding": "text/base64"
  },
  // no encoding, same result as 'text/plain'
  "variable-2": {
    "value": "foo-bar"
  },
  // none-text formats are always base64 encoded so they can be included in JSON
  "variable-3": {
    "value": "Zm9vLWJhcg==\n", // base64 encoded binary
    "encoding": "application/octet-stream"
  }
}

The above allows us to offload the decode to the target SDK.

@jvanderhoof
Copy link

@gl-johnson, this does raise a gap I see in the current proposal: How do we handle error cases (or more importantly, how do we let engineers know what goes wrong moving data from Conjur to Secrets Provider? Do software engineers typically have access to Secrets Provider logs?

@gl-johnson
Copy link
Contributor Author

{
  // base64 encoded
  "variable-1": {
    "value": "Zm9vLWJhcg==\n",
    "encoding": "text/base64"
  },
  // no encoding, same result as 'text/plain'
  "variable-2": {
    "value": "foo-bar"
  },
  // none-text formats are always base64 encoded so they can be included in JSON
  "variable-3": {
    "value": "Zm9vLWJhcg==\n", // base64 encoded binary
    "encoding": "application/octet-stream"
  }
}

The above allows us to offload the decode to the target SDK.

I tend to agree that this would be the better approach long-term, but I'm a little hesitant to suggest a breaking change to the Conjur API to accommodate what seems to be a fairly niche enhancement request. I suppose we could add an optional request param to indicate that the requestor wants the full response w/ content-types, and leave the default response as-is (JSON list of key/val pairs). But I'd be curious on your thoughts on how best to manage this in the Conjur API.

In either case I think this warrants a deeper discussion since it'll affect more than just Secrets Provider and likely delay delivery a little bit. (cc @niteshtaneja)

@niteshtaneja
Copy link

Will the above proposal of using the mime_type attribute have any compatibility issues with secrets synced via Synchronizer? Also, scope increase aside, this will also impact delivery mechanism? The original idea was to implement this in Secrets Provider and deliver it as part of monthly OSS release.

@gl-johnson
Copy link
Contributor Author

Will the above proposal of using the mime_type attribute have any compatibility issues with secrets synced via Synchronizer? Also, scope increase aside, this will also impact delivery mechanism? The original idea was to implement this in Secrets Provider and deliver it as part of monthly OSS release.

My understanding of Synchronizer isn't very deep, but I don't foresee an issue. It would require a minor change to the Conjur policy for synchronized secrets to include the mime_type (if needed) so that info exists in the Conjur DB.

We would need releases for Conjur, conjur-api-go, and Secrets Provider to implement the change. And the user would have to be running both the newly released Conjur and Secrets Provider to use the feature.

- <alias_value>: <path_value> # to be added
content-type: <content_type_value>
- <path_value>: # to be added
content-type: <content_type_value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this case with the implicit key the user will have to add a colon to the line, and the indenting is different that the case with the alias is specified which could lead to user errors. So since they would have to either add a colon on the end, or add the alias_value at the beginning, I think it's clearer to just support the one case which is the - <alias_value>: <path_value> and add the content-type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and updated in the doc

@gl-johnson gl-johnson force-pushed the base64-secrets-design-doc branch from 57f318f to 620400e Compare March 24, 2023 19:31
@codeclimate
Copy link

codeclimate bot commented Mar 24, 2023

Code Climate has analyzed commit 620400e and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 89.2% (0.0% change).

View more on Code Climate.

Copy link
Contributor

@rpothier rpothier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gl-johnson gl-johnson merged commit f78acb0 into main Mar 28, 2023
@gl-johnson gl-johnson deleted the base64-secrets-design-doc branch March 28, 2023 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants