ChunkedEncodingError when reading a file in parts #1761

Closed
annadcunningham opened this issue Oct 2, 2024 · 1 comment · Fixed by #1827
We get a ChunkedEncodingError when reading a file in parts from fake-gcs-server. The issue first appeared when we bumped the Python package urllib3 from v1 to v2. The same app works fine against real GCS, so we believe the problem can be fixed in this emulator.

  • fake-gcs-server version: 1.48.0 (also have checked 1.50.0)
    • command: fake-gcs-server -log-level=debug -scheme=both -port-http=8000 -port=8443 -external-url=http://my-name:8000
  • python version: 3.11.9
  • google-cloud-storage==2.18.2
  • urllib3==2.2.3

Minimal reproducible app code:

from google.cloud import storage


def read_file_fully_vs_partially(bucket_name: str, filename: str) -> None:
    client = storage.Client()
    # Use a separate name for the Bucket object instead of rebinding the str argument.
    blob = client.bucket(bucket_name).blob(filename)

    print("Reading whole file:")
    with blob.open("r") as ofh:
        print(ofh.read())  # this works fine

    print("Reading one line, then the rest:")
    with blob.open("r") as ofh:
        print(f"line 1: {next(ofh)}")
        print("Rest of file:")
        print(ofh.read())  # this line triggers the error

Example logs from an execution, with file contents of:

this is line 1
this is line 2
this is line 3

Logs:

[my-app] 2024-10-02 20:47:05,959 [INFO] Reading whole file:
[fake-gcs-server] time=2024-10-02T20:47:05.970Z level=INFO msg="10.244.1.8 - - [02/Oct/2024:20:47:05 +0000] \"GET /download/storage/v1/b/freenome-mock-gcp-orchid-metadata-extract/o/file.txt?alt=media HTTP/1.1\" 206 0\n"
[my-app] 2024-10-02 20:47:05,972 [INFO] this is line 1
[my-app] this is line 2
[my-app] this is line 3
[my-app] 2024-10-02 20:47:05,972 [INFO] Reading line by line:
[fake-gcs-server] time=2024-10-02T20:47:05.973Z level=INFO msg="10.244.1.8 - - [02/Oct/2024:20:47:05 +0000] \"GET /download/storage/v1/b/freenome-mock-gcp-orchid-metadata-extract/o/file.txt?alt=media&generation=1727902024815535 HTTP/1.1\" 206 0\n"
[my-app] 2024-10-02 20:47:05,974 [INFO] line 1: this is line 1
[my-app]
[my-app] 2024-10-02 20:47:05,974 [INFO] Rest of file:
[fake-gcs-server] time=2024-10-02T20:47:05.975Z level=INFO msg="10.244.1.8 - - [02/Oct/2024:20:47:05 +0000] \"GET /download/storage/v1/b/freenome-mock-gcp-orchid-metadata-extract/o/file.txt?alt=media&generation=1727902024815535 HTTP/1.1\" 416 0\n"
[my-app] Traceback (most recent call last):
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 4420, in _prep_and_do_download
[my-app]     self._do_download(
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 1041, in _do_download
[my-app]     response = download.consume(transport, timeout=timeout)
[my-app]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/resumable_media/requests/download.py", line 263, in consume
[my-app]     return _request_helpers.wait_and_retry(
[my-app]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/resumable_media/requests/_request_helpers.py", line 155, in wait_and_retry
[my-app]     response = func()
[my-app]                ^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/resumable_media/requests/download.py", line 245, in retriable_request
[my-app]     self._process_response(result)
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/resumable_media/_download.py", line 188, in _process_response
[my-app]     _helpers.require_status_code(
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/resumable_media/_helpers.py", line 108, in require_status_code
[my-app]     raise common.InvalidResponse(
[my-app] google.resumable_media.common.InvalidResponse: ('Request failed with status code', 416, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
[my-app]
[my-app] During handling of the above exception, another exception occurred:
[my-app]
[my-app] Traceback (most recent call last):
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 748, in _error_catcher
[my-app]     yield
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 894, in _raw_read
[my-app]     raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
[my-app] urllib3.exceptions.IncompleteRead: IncompleteRead(0 bytes read, 1 more expected)
[my-app]
[my-app] The above exception was the direct cause of the following exception:
[my-app]
[my-app] Traceback (most recent call last):
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/requests/models.py", line 820, in generate
[my-app]     yield from self.raw.stream(chunk_size, decode_content=True)
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 1060, in stream
[my-app]     data = self.read(amt=amt, decode_content=decode_content)
[my-app]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 949, in read
[my-app]     data = self._raw_read(amt)
[my-app]            ^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 872, in _raw_read
[my-app]     with self._error_catcher():
[my-app]   File "/fn/lib/python3.11/contextlib.py", line 158, in __exit__
[my-app]     self.gen.throw(typ, value, traceback)
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/urllib3/response.py", line 772, in _error_catcher
[my-app]     raise ProtocolError(arg, e) from e
[my-app] urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read, 1 more expected)', IncompleteRead(0 bytes read, 1 more expected))
[my-app]
[my-app] During handling of the above exception, another exception occurred:
[my-app]
[my-app] Traceback (most recent call last):
[my-app]   File "/fn/lib/venv/bin/orchid_metadata_extract", line 6, in <module>
[my-app]     sys.exit(main())
[my-app]              ^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
[my-app]     return self.main(*args, **kwargs)
[my-app]            ^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
[my-app]     rv = self.invoke(ctx)
[my-app]          ^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
[my-app]     return ctx.invoke(self.callback, **ctx.params)
[my-app]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
[my-app]     return __callback(*args, **kwargs)
[my-app]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/usr/src/app/orchid_metadata_extract/cli.py", line 21, in main
[my-app]     extract_orchid_metadata(output_bucket, output_prefix)
[my-app]   File "/usr/src/app/orchid_metadata_extract/extract.py", line 39, in extract_orchid_metadata
[my-app]     logger.info(ofh.read())
[my-app]                 ^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/fileio.py", line 146, in read
[my-app]     result += self._blob.download_as_bytes(
[my-app]               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/python3.11/contextlib.py", line 81, in inner
[my-app]     return func(*args, **kwds)
[my-app]            ^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 1469, in download_as_bytes
[my-app]     self._prep_and_do_download(
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 4433, in _prep_and_do_download
[my-app]     _raise_from_invalid_response(exc)
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 4898, in _raise_from_invalid_response
[my-app]     if response.text:
[my-app]        ^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/requests/models.py", line 926, in text
[my-app]     if not self.content:
[my-app]            ^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/requests/models.py", line 902, in content
[my-app]     self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
[my-app]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[my-app]   File "/fn/lib/venv/lib/python3.11/site-packages/requests/models.py", line 822, in generate
[my-app]     raise ChunkedEncodingError(e)
[my-app] requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 1 more expected)', IncompleteRead(0 bytes read, 1 more expected))
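
One plausible reading of the three requests in the logs, offered as an assumption rather than something confirmed in this thread: the first partial read buffers the whole small file, so the follow-up read asks for a range that starts exactly at the end of the object, and a 416 is the expected status for that. The sketch below only illustrates that range arithmetic. The helper name `status_for_open_ended_range` is hypothetical, and the trailing newline in the sample content is assumed.

```python
from http import HTTPStatus


def status_for_open_ended_range(object_size: int, first_byte: int) -> HTTPStatus:
    """Status a server is expected to return for `Range: bytes=<first_byte>-`."""
    if first_byte >= object_size:
        # Nothing left to serve: the range starts at or past the end of the object.
        return HTTPStatus.REQUESTED_RANGE_NOT_SATISFIABLE  # 416
    return HTTPStatus.PARTIAL_CONTENT  # 206


# Assumption: the example file ends with a trailing newline.
content = b"this is line 1\nthis is line 2\nthis is line 3\n"
size = len(content)

# Reading from the start of the object is satisfiable.
print(status_for_open_ended_range(size, 0))
# Reading again after the whole object was already consumed is not.
print(status_for_open_ended_range(size, size))
```

If this reading is right, the 416 itself is not the bug. The failure comes later, when the client tries to read the body of that 416 response.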
fcanovai added a commit to fcanovai/fake-gcs-server that referenced this issue Nov 26, 2024
Fix a bug in which the server would return a 416 reply with an
empty body declaring a Content-Length of 1.

Closes fsouza#1761

Signed-off-by: Francesco Canovai <[email protected]>
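
The condition described in the commit message above (a 416 reply with an empty body but a declared Content-Length of 1) can be reproduced with the Python standard library alone. This is a minimal sketch, not the fake-gcs-server code: the raw socket server and the function names are hypothetical stand-ins, and `http.client` is used in place of urllib3. urllib3 v2 appears to enforce the declared length by default, which would explain why the error surfaced only after the upgrade, but that is an assumption.

```python
import http.client
import socket
import threading


def serve_one_bad_416(listener: socket.socket) -> None:
    """Accept one connection and reply 416 with Content-Length: 1 but no body."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(65536)  # read and ignore the request
        conn.sendall(
            b"HTTP/1.1 416 Requested Range Not Satisfiable\r\n"
            b"Content-Length: 1\r\n"
            b"Connection: close\r\n"
            b"\r\n"
        )  # the connection closes here without sending the promised byte


def fetch_status_and_error() -> tuple[int, str]:
    """Return the response status and the name of the error raised while reading the body."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_one_bad_416, args=(listener,))
    thread.start()
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/file.txt", headers={"Range": "bytes=45-"})
        response = client.getresponse()
        try:
            response.read()
            error = ""
        except http.client.IncompleteRead as exc:
            error = type(exc).__name__
        return response.status, error
    finally:
        client.close()
        thread.join()
        listener.close()


if __name__ == "__main__":
    print(fetch_status_and_error())
```

The status line arrives intact, so the client sees the 416. The error is raised only when something reads the body, which matches the traceback, where `response.text` is accessed while building the exception for the 416.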
@shutootaki

A similar error occurs.

fsouza added a commit that referenced this issue Jan 3, 2025
* fix: avoid IncompleteRead on completed partial reads

Fix a bug in which the server would return a 416 reply with an
empty body declaring a Content-Length of 1.

Closes #1761

Signed-off-by: Francesco Canovai <[email protected]>

* fakestorage: add a test for invalid range

---------

Signed-off-by: Francesco Canovai <[email protected]>
Co-authored-by: francisco souza <[email protected]>