AWS: Client Side Encryption Support #1805

Closed
johnclara opened this issue Nov 21, 2020 · 2 comments

johnclara commented Nov 21, 2020

TL;DR: Do you want to split out an aws-java-sdk-1 variant, or are there no plans to support SDK v1?

Not sure if you have already discussed this, but the AWS Java SDK v2 doesn't support client-side encryption:
aws/aws-encryption-sdk-java#58
aws/aws-sdk-java-v2#34

Existing tables that use client-side encryption wouldn't be able to switch over to this.

One idea would be to mimic the decryption inside Iceberg's application-level encryption using the AWS Encryption SDK, but it doesn't support objects encrypted by the S3 encryption client: https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/introduction.html. Right now the best option for users seems to be implementing it themselves with the EncryptionManager: AES-GCM for full reads, then switching to unauthenticated CTR on seeks.
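To make the "GCM, then unauthenticated CTR on seeks" idea concrete, here is a minimal sketch using only the JDK's `javax.crypto`. It assumes a 96-bit nonce and a ciphertext laid out as raw GCM output (ciphertext followed by the 16-byte tag); the class and method names are hypothetical, and it does not parse the S3 encryption client's object metadata or envelope format. With a 96-bit nonce, GCM encrypts data block `i` with counter value `i + 2`, so a range can be decrypted by starting plain CTR at that counter. The tag is not checked on this path, so ranged reads are unauthenticated.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of Iceberg or any AWS SDK.
final class GcmRangeReader {
  private static final int BLOCK = 16;      // AES block size in bytes
  private static final int TAG_BITS = 128;  // GCM tag length in bits

  /** Encrypts with AES-GCM; the JDK returns ciphertext followed by the 16-byte tag. */
  static byte[] encryptGcm(byte[] key, byte[] nonce12, byte[] plaintext) {
    try {
      Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
      gcm.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_BITS, nonce12));
      return gcm.doFinal(plaintext);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /**
   * Decrypts plaintext bytes [start, end) from GCM output WITHOUT verifying the tag.
   * The caller must keep `end` within the ciphertext, excluding the trailing tag.
   */
  static byte[] decryptRange(byte[] key, byte[] nonce12, byte[] gcmOutput, int start, int end) {
    try {
      int firstBlock = start / BLOCK;
      int alignedStart = firstBlock * BLOCK;
      // Assumption: 96-bit nonce. Counter value 1 is reserved for the tag,
      // so data block i is encrypted with counter value i + 2 (big-endian).
      byte[] counter = ByteBuffer.allocate(BLOCK).put(nonce12).putInt(firstBlock + 2).array();
      Cipher ctr = Cipher.getInstance("AES/CTR/NoPadding");
      ctr.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(counter));
      byte[] aligned = ctr.doFinal(gcmOutput, alignedStart, end - alignedStart);
      // Drop the bytes between the block boundary and the requested start.
      return Arrays.copyOfRange(aligned, start - alignedStart, aligned.length);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] key = new byte[16];    // demo key only; use a real random key in practice
    byte[] nonce = new byte[12];  // demo nonce only; never reuse a nonce with the same key
    byte[] plain = "client side encryption with seekable reads".getBytes();
    byte[] stored = encryptGcm(key, nonce, plain);
    System.out.println(new String(decryptRange(key, nonce, stored, 7, 22)));
  }
}
```

A seekable stream built on this would fetch the block-aligned byte range from S3 and feed it through `decryptRange`; a full sequential read can still use GCM and verify the tag.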

I'm wondering whether you would want to split out an SDK v1 variant to support client-side encryption, or whether you think it would be better to reimplement it higher up, in the EncryptionManager.

If you did want to split it out, would setting the multipart upload threads to 1 be enough to ensure serial uploads (which the client requires in order to compute the MAC in the last part)?
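As an illustration of why part order matters, here is a small sketch that encrypts parts with a single streaming AES-GCM cipher on a single worker thread, using only the JDK. It is not the SDK v1 encryption client and does not answer whether its thread setting is sufficient; `SerialPartEncryptor` and its methods are hypothetical names. The point it shows is that the cipher state carries across parts and the authentication tag only becomes available when the last part is finalized, so parts have to be processed strictly in order.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of Iceberg or any AWS SDK.
final class SerialPartEncryptor {

  /** Encrypts parts in submission order; the GCM tag is appended to the last part. */
  static List<byte[]> encryptParts(byte[] key, byte[] nonce12, List<byte[]> parts) {
    // A single-thread executor runs tasks sequentially, in the order submitted.
    ExecutorService single = Executors.newSingleThreadExecutor();
    try {
      Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
      gcm.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(128, nonce12));
      List<Future<byte[]>> pending = new ArrayList<>();
      for (int i = 0; i < parts.size(); i++) {
        byte[] part = parts.get(i);
        boolean last = i == parts.size() - 1;
        Callable<byte[]> task = () -> {
          // update() may return null when it produces no output yet.
          byte[] out = last ? gcm.doFinal(part) : gcm.update(part);
          return out == null ? new byte[0] : out;
        };
        pending.add(single.submit(task));
      }
      List<byte[]> encrypted = new ArrayList<>();
      for (Future<byte[]> f : pending) {
        encrypted.add(f.get());
      }
      return encrypted;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      single.shutdown();
    }
  }

  /** Concatenates the encrypted parts and decrypts them, verifying the tag. */
  static byte[] decryptAll(byte[] key, byte[] nonce12, List<byte[]> encryptedParts) {
    try {
      ByteArrayOutputStream joined = new ByteArrayOutputStream();
      for (byte[] p : encryptedParts) {
        joined.write(p, 0, p.length);
      }
      Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
      gcm.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(128, nonce12));
      return gcm.doFinal(joined.toByteArray());
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

If the parts were handed to several threads sharing one cipher, the updates could interleave and the final tag would no longer match the uploaded bytes, which is the concern behind the question above.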

@danielcweeks @jackye1995

@jackye1995
Contributor

As I remember it, the conclusion of the last discussion was to stick with the v2 client and not split out a separate v1 client, so that the dependencies stay clean. Please let me know if you would like to put the v1 client discussion back on the table.

I briefly considered client-side encryption when starting the aws module, and the existing Iceberg encryption interface looks good enough for implementing the features in the encryption libraries. There could be an S3EncryptionManager, and the data could be read and written by extending the current S3InputFile and S3OutputFile.

The MAC calculation is sometimes used by plain S3 IO as well, so I think it can be a separate discussion.

Let me also ask the team directly about their v2 client support and come back with a reply.

I have two questions for you:

  1. There are two S3 client-side encryption libraries, AmazonS3EncryptionV2 and the AWS Encryption SDK, and they are not compatible with each other. Just to make sure, which library (or both?) would you like to support?

  2. You mentioned that the "user seems to be implementing the AES GCM and then switching to unauthenticated CTR on seeks on their own". Could you elaborate on this so I fully understand the context of your use case?

Author

johnclara commented Nov 21, 2020

@jackye1995 I think our use case is still in flux, so I was mostly wondering what the current plan was. I think we were planning on using AmazonS3EncryptionV2 in certain areas. I'll ping you on Slack with specifics.
