Problem with Setting highWaterMark in AWS S3 GetObject Stream #6890
Comments
Hey @ujjwol05 , The |
@zshzbh Does that mean we have no control over the S3 stream, and if we need a fixed-size buffer, we must rebuffer it ourselves? |
It depends on what you are trying to achieve. S3 honors HTTP byte-range requests, so any object can be downloaded in fixed-size pieces using multiple GETs. You can also use the PartNumber parameter of GetObject. If this is about individual TCP packet sizes, then no, you cannot control that. |
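As a rough sketch of the byte-range approach described above: the helper below (the name `byteRanges` is made up for illustration, not part of the SDK) builds the `Range` header values needed to fetch an object in fixed-size pieces. It assumes you already know the object's total size, for example from the `ContentLength` returned by a `HeadObjectCommand`.

```javascript
// Hypothetical helper: list the HTTP Range header values that cover an
// object of `totalSize` bytes in pieces of at most `chunkSize` bytes.
function byteRanges(totalSize, chunkSize) {
  if (!Number.isInteger(totalSize) || totalSize < 0) {
    throw new RangeError("totalSize must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    // HTTP byte ranges are inclusive on both ends.
    const end = Math.min(start + chunkSize, totalSize) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}
```

Each returned string can then be passed as the `Range` field of a `GetObjectCommand` input, one GET per range. Every piece except possibly the last is exactly `chunkSize` bytes, at the cost of one request per piece.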
I can share the code if you want a consistent chunk size. To get consistent 32KB chunks, you'll need to use a custom Transform stream that rebuffers the data:
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { Transform } from 'stream';
const s3Client = new S3Client({
region: "us-east-1",
});
const params = {
Bucket: "test-s3-XXXX-mm",
Key: "large-file.txt",
};
const command = new GetObjectCommand(params);
const response = await s3Client.send(command);
const stream = response.Body;
// Create a custom transform stream that buffers data into 32KB chunks
const chunkSize = 32 * 1024; // 32KB
let buffer = Buffer.alloc(0);
const chunker = new Transform({
transform(chunk, encoding, callback) {
// Add new chunk to our buffer
buffer = Buffer.concat([buffer, chunk]);
// While we have enough data for a full chunk
while (buffer.length >= chunkSize) {
// Push a chunk of exactly 32KB
this.push(buffer.subarray(0, chunkSize));
buffer = buffer.subarray(chunkSize);
}
callback();
},
// Push any remaining data when the stream ends
flush(callback) {
if (buffer.length > 0) {
this.push(buffer);
}
callback();
}
});
const customStream = stream.pipe(chunker);
customStream.on("data", (chunk) => {
console.log(`Chunk size: ${chunk.length}`);
}); |
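If you would rather consume the stream with `for await` than with `pipe`, the same rebuffering idea can be written as an async generator. This is only a sketch of an alternative, not the approach from the comment above: `rechunk` is a made-up name, and it assumes the source yields Buffers or Uint8Arrays (Node.js Readable streams are async iterable, so the S3 response body should work in Node.js).

```javascript
// Hypothetical helper: re-emit the bytes of `source` as chunks of exactly
// `chunkSize` bytes; only the final chunk may be shorter.
async function* rechunk(source, chunkSize) {
  let pending = [];      // pieces received but not yet emitted
  let pendingBytes = 0;  // total length of everything in `pending`
  for await (const chunk of source) {
    pending.push(chunk);
    pendingBytes += chunk.length;
    if (pendingBytes < chunkSize) continue; // not enough for a full chunk yet

    // Join the pending pieces once, then cut off as many full chunks as fit.
    const data = Buffer.concat(pending, pendingBytes);
    let offset = 0;
    while (data.length - offset >= chunkSize) {
      yield data.subarray(offset, offset + chunkSize);
      offset += chunkSize;
    }
    // Carry the leftover bytes over to the next iteration.
    const rest = data.subarray(offset);
    pending = rest.length > 0 ? [rest] : [];
    pendingBytes = rest.length;
  }
  // Emit whatever is left when the source ends.
  if (pendingBytes > 0) yield Buffer.concat(pending, pendingBytes);
}
```

With the snippet above, usage would look like `for await (const chunk of rechunk(response.Body, 32 * 1024)) { ... }`. The state lives inside the generator, so each call to `rechunk` gets its own buffer instead of sharing one module-level variable.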
Checkboxes for prior research
Describe the bug
When using the AWS SDK for JavaScript v3 to stream data from S3 (via GetObjectCommand), I cannot set highWaterMark. I also tried setting a custom highWaterMark value for the stream buffers to see if that works, but the buffer size remains at 16 KB.
Regression Issue
SDK version number
@aws-sdk/package-name@version, ...
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
v20.17.0
Reproduction Steps
Observed Behavior
INFO Chunk size: 16384
INFO Chunk size: 389
Expected Behavior
INFO Chunk size: 30243
INFO Chunk size: 30243
Possible Solution
No response
Additional Information/Context
No response