Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add interface for client-side encryption #158

Open
chaudyg opened this issue Dec 16, 2021 · 10 comments
Open

Add interface for client-side encryption #158

chaudyg opened this issue Dec 16, 2021 · 10 comments
Labels
feature request Request for new feature or functionality

Comments

@chaudyg
Copy link

chaudyg commented Dec 16, 2021

What would you like to have changed?

Provide a mechanism for client-side encryption.

Why is this feature a useful, necessary, and/or important addition to this project?

Several storage implementations already support some form of client-side encryption, but each using a slightly different encryption mechanism:

Adding a mechanism for client-side envelope encryption to cert-magic would :

  • Improve security: Client-side encryption adds another layer of security on top of encryption that might happen server-side (for example with s3 or gcs). Encryption would occur before data is sent to the Storage backend. A potential attacker would need to compromise both the storage backend and acquire the encryption key to be able to use the data.
  • provide more flexibility to the users. Some users might want to use chose the encryption mechanism: SecretBox or AES, or maybe a KMS service. If implemented within cert-magic, encryption would become available to every storage backend currently not supporting it.

Here are some initial thoughts:

  • A naive first pass implementation could be to extend the Storage interface with Encrypt/Decrypt optional functions. For example:
    Encrypt func(ctx context.Context, data []byte) ([]byte, error)
    Decrypt func(ctx context.Context, data []byte) ([]byte, error)

This would delegate the responsibility to the caller to provide the encryption mechanisms.

  • Ideally, encryption should only happen when data is written to the store. After enabling encryption, only newly created or updated data should be encrypted when stored. Existing unencrypted data should still be readable. This is a bit harder to implement but it would allow users to transition from un-encrypted to encrypted data 💪 . One way to do this would be to store the encryption provider as metadata alongside with the data. This is similar to how kubernetes encrypts secrets in ETCD (see docs) and allows moving from one encryption to another seamlessly.

I would love to get the community's opinion this before doing a first pass implementation.

@chaudyg chaudyg added the feature request Request for new feature or functionality label Dec 16, 2021
@francislavoie
Copy link
Member

FWIW the current config structure for storage in Caddy wouldn't allow for this, because there's no sibling config to the storage modules that would allow the caller (Caddy/CertMagic) to have config about how it calls the storage module. Like, the config is like this:

"storage": {
	"module": "file_system",
	...
}

But to support this, it would need to be in a structure like this (imaginary):

"storage_config": {
	"encrypt_key": "foobar",
	"storage": {
		"module": "file_system",
		...
	}
}

So it would be a breaking change from a config standpoint, not only an implementation breakage (like the context thing is)

@mholt
Copy link
Member

mholt commented Dec 16, 2021

Thanks for proposing this, let's discuss:

Improve security: Client-side encryption adds another layer of security on top of encryption that might happen server-side (for example with s3 or gcs). Encryption would occur before data is sent to the Storage backend. A potential attacker would need to compromise both the storage backend and acquire the encryption key to be able to use the data.

Encryption already (usually?) occurs before data is sent to (remote) storage backends: that's what TLS is for.

Encrypting data at rest, however, is tricky for server applications. What are the specific threat models you are trying to defend against, and what is your use case for said threat models? Usually encrypting databases is a sign you don't trust the system or service your database is stored on, and that's a bigger problem than what encryption can solve.

@chaudyg
Copy link
Author

chaudyg commented Dec 16, 2021

Thanks @mholt and @francislavoie for your answers ☺️ !

Encryption already (usually?) occurs before data is sent to (remote) storage backends: that's what TLS is for.

First, sorry if this bit was confusing. Indeed I am only considering Encryption at Rest here and not Encryption in Transit. Some provides such as s3/gcs offer server-side encryption (the data is encrypted when reaching the storage backend), but here I am only focusing on client-side-encryption (the data is encrypted before being sent to the backend).

Encrypting data at rest, however, is tricky for server applications. What are the specific threat models you are trying to defend against, and what is your use case for said threat models? Usually encrypting databases is a sign you don't trust the system or service your database is stored on, and that's a bigger problem than what encryption can solve.

Certificates is very sensitive data, and it is critical to keep it safe from eavesdropping or tempering. Restricting least-privilege access to storage is essential to prevent exposing the raw data. But what if the access control mechanism fails? then a potential attacker would have access to the raw data in storage.

I don't think this is about "trusting the system" but more about providing multiple layers of protection as part of an defence-in-depth strategy. Encryption can provide an additional layer of protection and mitigate a weakness of the primary access control mechanism. Several cert-magic storage backend already seem to provide a way to encrypt the data, but the choice of encryption algorithm is set in stone which isn't great if you need stronger encryption (or for compliance).

So it would be a breaking change from a config standpoint, not only an implementation breakage

yeah, i figured that part one might be tricky 🤔. I haven't dig much into how modules are managed yet so sorry for my ignorance here. Indeed we would need an optional "encryption" config section. Would that necessarily be a breaking change? What prevents us to setup encryption directly under the "storage" section?

@francislavoie
Copy link
Member

francislavoie commented Dec 16, 2021

What prevents us to setup encryption directly under the "storage" section?

That config is controlled by the storage implementations themselves (unpacked into the struct from the storage module), there's nowhere for config that is controlled by Caddy/CertMagic, without breaking changes.

Personally I don't see the problem letting storage implementations manage encryption themselves. We could ask them to conform to some standard approach, like we could set up a encryption module namespace that storage modules could use, not sure the complexity is worth it though. Config for that would look like:

"storage": {
	"module": "file_system",
	// options for `file_system`...

	// optional encryption (omit the "encryption" field for none)
	"encryption": {
		"module": "aes",
		"key": "foobar"
	}
}

We'd have modules like encryption.aes or encryption.aws_kms or whatever, essentially modules under the encryption.* namespace.

Point is that Caddy/CertMagic wouldn't invoke the encryption, libs like caddy-tlsredis would accept an encryption module as config and use it if available.

Also, the above would effectively turn this into a Caddy issue, instead of a CertMagic issue, to be clear, because CertMagic wouldn't care, it would be only a concern for Caddy at the config level (and Caddy would define the encryption.* interface I guess) and the storage modules which offer a Caddy compatible module would accept in their config.

Relevant reading to understand how modules work: https://caddyserver.com/docs/architecture, and the basics of plugins: https://caddyserver.com/docs/extending-caddy

@chaudyg
Copy link
Author

chaudyg commented Dec 17, 2021

Thanks for the links, very helpful 💪 .

We'd have modules like encryption.aes or encryption.aws_kms or whatever, essentially modules under the encryption.* namespace.

I like the approach of having one module per EncryptionProvider. IIUC we would need a dedicated module namespace to ensure each encryption provider implements the same interface.

Personally I don't see the problem letting storage implementations manage encryption themselves. We could ask them to conform to some standard approach, like we could set up a encryption module namespace that storage modules could use, not sure the complexity is worth it though.

I am not certain this approach would bring much value. IIUC, each backend would only support its own set of encryptions, its own implementation. I feel that extending cert-magic itself with this capability would be a lot more powerful given the encryption provider would be completely independent from the storage provider.

To continue on that thought, I agree the storage section isn't the right place for this 🤔 . Would the AutomationConfig be a better place? given that's the one governing the tls cetificate management? Something along the lines of

// imaginary
{
    "on_demand": {...},
    "encryption": {
        //  optional encryption (omit the "encryption" field for none)
        "module": "aws_kms",
        "keyId": "123456"
    },
   "policies": [{
        "subjects": [""],
        "storage": {...},
        "encryption": {...}, // Optionally configure a separate encryption module.
        ...
    }],
    ...
}

@mholt
Copy link
Member

mholt commented Dec 17, 2021

@francislavoie

So it would be a breaking change from a config standpoint, not only an implementation breakage (like the context thing is)

Are you sure? Why can't a storage module simply be configured to enable encryption, along with its other configuration parameters? Just add some fields to the struct for the storage provider you want to support encryption with.

@francislavoie
Copy link
Member

@mholt

Are you sure? Why can't a storage module simply be configured to enable encryption, along with its other configuration parameters?

That's exactly what I was saying.

What was suggested here is that Caddy/CertMagic would encrypt before passing it to the storage module. There's nowhere in the config currently where it would be possible to configure encryption outside of the storage module, because the only configuration for storage is the storage module.

So either the config for it is in the storage module itself, or somewhere else altogether (like @chaudyg just suggested, in the automation config, so only certs/keys would be encrypted, not all storage).

@chaudyg

I like the approach of having one module per EncryptionProvider. IIUC we would need a dedicated module namespace to ensure each encryption provider implements the same interface.

Yep.

I am not certain this approach would bring much value. IIUC, each backend would only support its own set of encryptions, its own implementation. I feel that extending cert-magic itself with this capability would be a lot more powerful given the encryption provider would be completely independent from the storage provider.

Not quite, anything could define encryption modules, including dedicated separate plugins. The encryption module doesn't need to come from the storage plugin (probably shouldn't, really).

I'd suggest Caddy core could provide the AES module because that's super simple and easy, basically no knobs to turn there; that would mean there would always be at minimum "no encryption" and "aes" modes available for configuration, and any other modes could be provided by plugins that are more use-case specific (like loading from a KMS or whatever).

@mholt
Copy link
Member

mholt commented Dec 17, 2021

Ah, I see. You want to configure encryption independent of the storage module... that would add new config surface, but I don't think it would have to be a breaking change per-se.

I am not comfortable implementing or reviewing encryption code for this use case. I'd probably require a second set of eyes from an established cryptographer. (But don't have the budget to pay for it. It'd likely have to be a donated or volunteer review.)

@chaudyg
Copy link
Author

chaudyg commented Feb 9, 2022

I am not comfortable implementing or reviewing encryption code for this use case.

I agree. Encryption is never obvious and it's an area where we need to be careful. I've experimenting with Google Tink for Go lately (https://github.com/google/tink/blob/master/docs/GOLANG-HOWTO.md). Delegating encryption to something like Tink might reduce the risk of doing the wrong think:

  • Tink aims to provide simpler API that are harder to miss-use. For example, the library makes it harder to load en-encrypted clear-text keys (it's possible, but you have to import the insecurecleartextkeyset, making it obvious something isn't right).
  • The choice of the cryptographic primitive can be delegated to the user (AES, CHACHA20, ...). That's because tink provides a standard interface for every encryption algo.
  • Private key rotation is provided out-of-the box. Users can provide multiple keys for decryption, and chose which one is the default for encryption.
  • Native support for envelope encryption with integration with Google/AWS KMS and vault.

How do you feel about a dedicated "Tink Encryption module" ?

@mholt
Copy link
Member

mholt commented Feb 9, 2022

It doesn't look like a bad option; would still want some expert review though.

Before we get into specifics, I still think we should figure out first whether this would be storage-implementation-specific, or a single "pre-storage" point of configuration (i.e. storage plugins remain unchanged, they just read and write encrypted bytes instead). Because it looks like some storage implementations already offer encryption, and I'm still a bit fuzzy on the benefits of at-rest encryption for these materials when the key itself would be in plaintext, potentially also in storage of some sort (like when configs are persisted).

Or maybe a storage wrapper is a better solution, so a generic encrypting storage module that wraps any other underlying module.

I dunno. Feel like this still needs more thorough discussion / contemplation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Request for new feature or functionality
Projects
None yet
Development

No branches or pull requests

3 participants