-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: BlobWriteChannelV2 - same throughput less GC #2110
Conversation
72f4a7f
to
04362be
Compare
public BlobWriteChannel writer(URL signedURL) { | ||
public StorageWriteChannel writer(URL signedURL) { | ||
// TODO: is it possible to know if a signed url is configured to have a constraint which makes | ||
// it idempotent? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signed URL in this case creates a resumable upload session which are idempotent per session.
Creating a signed URL that can only be once would require https://cloud.google.com/storage/docs/xml-api/reference-headers#xgoogifgenerationmatch with value 0 but it's possible that a session is created but not completed when this preconditions is checked which can't be helped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding correctly, that if-generation-match header would be part of the signed URL and not something our client could be aware of. Meaning, our client doesn't have visibility into any precondition which might apply to the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof, this is an interesting case; if a user generates a v4 signed url and adds this header as signed then our client tries to create a resumable upload; it will fail without this header in the request with the expected value.
Our client code could extract the header/values pairs from the signed URL, but sounds like a feature request.
...ud-storage/src/main/java/com/google/cloud/storage/HttpWritableByteChannelSessionBuilder.java
Outdated
Show resolved
Hide resolved
google-cloud-storage/src/main/java/com/google/cloud/storage/HttpUploadSessionBuilder.java
Outdated
Show resolved
Hide resolved
...ud-storage/src/main/java/com/google/cloud/storage/HttpWritableByteChannelSessionBuilder.java
Outdated
Show resolved
Hide resolved
ByteRangeSpec rangeSpec = ByteRangeSpec.explicit(cumulativeByteCount, newFinalByteOffset); | ||
if (available % ByteSizeConstants._256KiB == 0) { | ||
header = HttpContentRange.of(rangeSpec); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this cause issues for customers that write data as it comes in (might be less than 256KiB) so buffer would be filled over multiple writes to perform a final flush on falling out of writer scope or explicitly flushing or close()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially in standalone yet, however customers are not interacting with this class directly themselves.
We have the following pipeline to usage of this class (bash pipeline shorthand):
BlobWriteChannelV2 | BaseStorageWriteChannel | DefaultBufferedWritableByteChannel | ApiaryUnbufferedWritableByteChannel
^-- buffer allocation and alignment happens here
...loud-storage/src/main/java/com/google/cloud/storage/ApiaryUnbufferedWritableByteChannel.java
Show resolved
Hide resolved
byteBuffers(100, 1_000), | ||
byteBuffers(1_000, 10_000), | ||
byteBuffers(10_000, 100_000), | ||
byteBuffers(100_000, 1_000_000))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinged you through chat; but what's the rationale for using multiple buffers versus one? A few reasons come to mind (easier to allocate smaller contiguous memory buffers, individual buffers can be passed as-is to transport writes); and wondering if there are others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One big one for us specifically, is from the buffered to unbuffered layer, this enables the ability to reduce (and in some cases completely remove) copies between the ByteBuffer
provided to our write
method, and any underlying location it needs to be propagated to.
A codified test case that illustrates this minimized copying can be seen here:
Lines 316 to 321 in ed05232
* 0 61 | |
* data: |------------------------------------------------------------| | |
* 16 (16) 32 (16) 48 (13) 61 | |
* writes: |---------------|---------------|---------------|------------| | |
* 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 | |
* flushes: |--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|--|| |
In that test flushes will happen every 3 bytes, where we're doing 16 byte writes. In order to flush, we don't need to copy into our buffer, we can simply take a 3 byte slice of the provided ByteBuffer
and flush that. Then if there is any outstanding bytes they are enqueued. Then the next time write
is called, in a single call we can flush our buffer and the slice from the new ByteBuffer
.
Use stable buffer allocation with laziness. Leverage new JsonResumableSession to provide more robustness and easier separation of concerns compared to BlobWriteChannel * rename blobWriteChannel.ser.properties to the correct blobReadChannel.ser.properties ### Runtime improvments Throughput is on par with the existing v1 implementation, however GC impact has been lightened with the new implementation. Below is the summary of the GC improvement between v1 and v2. These GC numbers were collected while uploading 4096 randomly sized objects, from 128KiB..2GiB across 16 concurrent threads, using a default chunkSize of 16MiB. | metric | unit | v1 | v2 | % decrease | |---------------------------------|--------|-------------:|-------------:|-----------:| | gc.alloc.rate | MB/sec | 2240.056 | 1457.731 | 34.924 | | gc.alloc.rate.norm | B/op | 955796726217 | 638403730507 | 33.207 | | gc.churn.G1_Eden_Space | MB/sec | 1597.009 | 1454.304 | 8.936 | | gc.churn.G1_Eden_Space.norm | B/op | 681418424320 | 636902965248 | 6.533 | | gc.churn.G1_Old_Gen | MB/sec | 691.877 | 11.316 | 98.364 | | gc.churn.G1_Old_Gen.norm | B/op | 295213237398 | 4955944331 | 98.321 | | gc.churn.G1_Survivor_Space | MB/sec | 0.004 | 0.002 | 50.000 | | gc.churn.G1_Survivor_Space.norm | B/op | 1572864 | 786432 | 50.000 | | gc.count | counts | 1670 | 1319 | 21.018 | | gc.time | ms | 15936 | 9527 | 40.217 | Overall allocation rate is decreased, while Old_Gen use is almost entirely eliminated. ``` openjdk version "11.0.18" 2023-01-17 OpenJDK Runtime Environment (build 11.0.18+10-post-Debian-1deb11u1) OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Debian-1deb11u1, mixed mode, sharing) -Xms12g -Xmx12g ``` All other java parameters are defaults.
a1a01f8
to
5cc8745
Compare
Use stable buffer allocation with laziness.
Leverage new JsonResumableSession to provide more robustness and easier separation of concerns compared to BlobWriteChannel
Runtime improvments
Throughput is on par with the existing v1 implementation, however GC impact has been lightened with the new implementation.
Below is the summary of the GC improvement between v1 and v2.
These GC numbers were collected while uploading 4096 randomly sized objects, from 128KiB..2GiB across 16 concurrent threads, using a default chunkSize of 16MiB.
Overall allocation rate is decreased, while Old_Gen use is almost entirely eliminated.
All other java parameters are defaults.