Files above a certain size are uploaded via multipart upload by default for network resiliency, and for files uploaded via multipart upload the ETag is not the MD5 of the file. This means the current method of comparing the ETag in S3 to the MD5 of the local file is not a reliable way to tell whether the file has changed.
A potential solution could be to just compare filenames, since as far as I can tell Gatsby already outputs files with the MD5 sum in the filename. Another solution could be to add a custom tag containing the MD5 sum to each uploaded file and compare that rather than the ETag.
Good find! It's possible to determine the ETag for a multipart upload. But since this is undocumented by Amazon, we really shouldn't rely on it. I like your idea of adding a custom tag with our own hash. Maybe we can move to a better hashing algorithm like HMAC-SHA1 at the same time.
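For reference, here is a minimal sketch of how a multipart ETag is commonly observed to be derived: the MD5 of the concatenated binary MD5 digests of each part, followed by a dash and the part count. This is an assumption based on observed behaviour, not a documented contract, and it also assumes the object is not encrypted with SSE-KMS or SSE-C. The function name `expectedETag` is hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: S3 builds a multipart ETag as
//   hex(md5(md5(part_1) + md5(part_2) + ...)) + "-" + partCount
// This is undocumented, so treat the result as a best guess.
function expectedETag(body: Buffer, partSize: number): string {
  const md5 = (data: Buffer): Buffer => createHash("md5").update(data).digest();

  // Assumption: a body that fits in one part is sent as a single PUT,
  // in which case the ETag is the plain MD5 hex digest.
  if (body.length <= partSize) {
    return md5(body).toString("hex");
  }

  // Hash each part separately, then hash the concatenated digests.
  const partDigests: Buffer[] = [];
  for (let offset = 0; offset < body.length; offset += partSize) {
    partDigests.push(md5(body.subarray(offset, offset + partSize)));
  }
  return `${md5(Buffer.concat(partDigests)).toString("hex")}-${partDigests.length}`;
}
```

Note that the result depends on the part size used at upload time, which is one more reason not to rely on it for change detection.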
If we do this, it will require the GetObjectTagging and PutObjectTagging permissions. #39 will need to include these.
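A rough sketch of what the tag comparison could look like, independent of the actual S3 calls. The `Tag` shape matches the `TagSet` entries returned by GetObjectTagging; the tag key `content-hash` and the function names are hypothetical, and SHA-256 is used here purely as an example in place of the algorithm suggested above.

```typescript
import { createHash } from "node:crypto";

// One entry of the TagSet returned by S3's GetObjectTagging.
interface Tag {
  Key: string;
  Value: string;
}

// Hypothetical tag key; the thread does not settle on a name.
const HASH_TAG_KEY = "content-hash";

// Example choice of algorithm (SHA-256); any stable digest would do.
function contentHash(body: Buffer): string {
  return createHash("sha256").update(body).digest("hex");
}

// Upload when the object has no hash tag yet, or when the stored hash
// differs from the hash of the local file.
function needsUpload(body: Buffer, remoteTags: Tag[]): boolean {
  const remote = remoteTags.find((tag) => tag.Key === HASH_TAG_KEY);
  return remote === undefined || remote.Value !== contentHash(body);
}
```

The same `contentHash` value would be written as the tag at upload time, so the comparison no longer depends on how S3 computes the ETag.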
I forgot to update this, but disabling multipart upload works as a workaround. S3.ManagedUpload accepts a partSize parameter which sets the size of each part in a multipart upload (as documented here). By default it's 5 MB. I'm setting it to a size bigger than any of the files in my site, meaning multipart upload is never used and the ETag is always the MD5.
I'm not sure if this is a good production solution (presumably there's some advantage to multipart uploading?) but it works.
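A small sketch of how the part size for this workaround might be picked. The helper name `partSizeToAvoidMultipart` is hypothetical, and it assumes that a body no larger than `partSize` is sent as a single PUT.

```typescript
// ManagedUpload's default partSize (5 MB); smaller values are rejected.
const MIN_PART_SIZE = 5 * 1024 * 1024;

// Return a partSize at least as large as the biggest file, so that
// (under the assumption above) no file is split into multiple parts.
function partSizeToAvoidMultipart(fileSizes: number[]): number {
  return Math.max(MIN_PART_SIZE, ...fileSizes);
}

// Usage with the AWS SDK for JavaScript v2 (not executed here):
//   const partSize = partSizeToAvoidMultipart(sizesInBytes);
//   new AWS.S3.ManagedUpload({ partSize, params: { Bucket, Key, Body } });
```

The trade-off is that a failed upload of a large file has to restart from the beginning rather than retrying a single part, and a single PUT is limited to 5 GB.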