Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix S3 PartialContentInputStream to be compliant to InputStream specification #2217

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

vierbergenlars
Copy link
Contributor

The PartialContentInputStream was a minimal implementation that worked
well when used directly with the spring byte-range support.

However, using the InputStream variant in any other place (like wrapping it for decryption),
the minimal implementation breaks down, requiring workarounds when
handling the encryption.
Using an implementation that is fully compliant with the InputStream
specifications makes implementation of a wrapping InputStream easier,
as no special provisions need to be made for a read method that
immediately reads the byte-range (instead of empty space before the start of the byte-range)

@paulcwarren
Copy link
Owner

Looks good. I just noticed that the tests are focused which might prevent tests from being run (now or later). Let's unfit and we can merge this. Side note; yes we probably should have something in CI that warns when tests are fitted but I have never gotten around to that.

@vierbergenlars vierbergenlars force-pushed the fix-partial-input-stream branch from ac57657 to 06ceaba Compare December 4, 2024 14:01
The PartialContentInputStream was a minimal implementation that worked
well when used directly with the spring byte-range support.

However, using the InputStream variant in any other place (like wrapping it for decryption),
the minimal implementation breaks down, requiring workarounds when
handling the encryption.
Using an implementation that is fully compliant with the InputStream
specifications makes implementation of a wrapping InputStream easier,
as no special provisions need to be made for a read method that
immediately reads the byte-range (instead of empty space before the
start)
Because an actual compliant InputStream is now returned from
the backing store, encryption can be handled simpler by just running the
cipher for the whole range.

Since encryption is not authenticated, and CTR mode is byte-addressable,
this just works. The zero-bytes before/after the actual data are
decrypted as garbage, but they will be cut out when streaming the
response back to the user.
Although the testcase works when aligned on a block boundary, the test is more comprehensive when it is not aligned
Skipping decrypting blocks that do not have data and will be skipped anyways is wasted work.
Reintroduce the technique to calculate IV for a certain block offset to save decryption work for larger offsets
Using an int limits the offset to about 2GB, which might not be sufficient for very large files.
Using a long results in an offset of 9EB, which should be sufficient.
@thijslemmens
Copy link

Looks good. I just noticed that the tests are focused which might prevent tests from being run (now or later). Let's unfit and we can merge this. Side note; yes we probably should have something in CI that warns when tests are fitted but I have never gotten around to that.

Hello @paulcwarren

I believe this issue has been addressed. Is there anything left here that blocks you from merging?

kind regards

Thijs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants