When files are very close to 1 GB (1,048,576 KB), only the first 1 GB of the file is uploaded when using S3FileSystem (I haven't tested with other file systems), whereas using boto3 directly works.
Reverting from version 2024.10.0 to 2024.6.1 fixes the issue (I haven't tested intermediate versions).
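A quick way to confirm which release is actually active in a given environment (a minimal sketch; it assumes the installed s3fs exposes `__version__`, which recent releases do):

```python
import s3fs

# Print the installed s3fs version to confirm which release is in use
print(s3fs.__version__)  # e.g. "2024.10.0" in the affected environment
```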
Here's a minimal example:
REMOTE_BUCKET="test_bucket"defcheck_fsspec_put(size=1096511984, dtype=np.float64):
"""Compare local and remote file sizes when objects are saved remotely via fsspec."""fs=s3fs.S3FileSystem()
s3=boto3.client('s3')
# Create fake data with the given sizedata=np.zeros((size//dtype().itemsize,), dtype=dtype)
# Save locally and to s3np.save(f"fake_data_{size}.npy", data)
fs.put_file(f"fake_data_{size}.npy", f"s3://{REMOTE_BUCKET}/fake_data_{size}.npy")
# Return local and remote file sizesoriginal_size=os.path.getsize(f"fake_data_{size}.npy")
fsspec_size=s3.head_object(Bucket=REMOTE_BUCKET, Key=f"fake_data_{size}.npy")["ContentLength"]
returnoriginal_size, fsspec_sizeforsizein [
1038576000, # works1096511984, # fails1106511984, # works
]:
original_size, fsspec_size=check_fsspec_put(size)
iforiginal_size!=fsspec_size:
print(f"fsspec only uploaded {original_size} / {fsspec_size} bytes")