smart_open does not recover from connection errors when reading from S3 #551

Closed
mpenkov opened this issue Oct 24, 2020 · 0 comments · Fixed by #552
mpenkov (Collaborator) commented Oct 24, 2020

Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/var/lib/cloud/instance/scripts/part-001", line 113, in <module>
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     do_work()
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/var/lib/cloud/instance/scripts/part-001", line 87, in do_work
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     datawelder.partition.partition(reader, destination, numpartitions)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/datawelder/partition.py", line 357, in partition
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     for i, record in enumerate(reader, 1):
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/datawelder/readwrite.py", line 163, in __next__
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     line = next(self._fin)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3.8/gzip.py", line 390, in readline
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     return self._buffer.readline(size)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3.8/_compression.py", line 68, in readinto
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     data = self.read(len(byte_view))
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3.8/gzip.py", line 485, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     buf = self._fp.read(io.DEFAULT_BUFFER_SIZE)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3.8/gzip.py", line 96, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     self.file.read(size-self._length+read)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/smart_open/s3.py", line 511, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     self._fill_buffer(size)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/smart_open/s3.py", line 622, in _fill_buffer
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     bytes_read = self._buffer.fill(self._raw_reader)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/smart_open/bytebuffer.py", line 152, in fill
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     new_bytes = source.read(size)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/smart_open/s3.py", line 423, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     binary = self._read_from_body(size)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/smart_open/s3.py", line 411, in _read_from_body
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     binary = self._body.read(size)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/local/lib/python3.8/dist-packages/botocore/response.py", line 77, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     chunk = self._raw_stream.read(amt)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3/dist-packages/urllib3/response.py", line 529, in read
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     self.gen.throw(type, value, traceback)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:   File "/usr/lib/python3/dist-packages/urllib3/response.py", line 443, in _error_catcher
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]:     raise ProtocolError("Connection broken: %r" % e, e)
Oct 23 18:03:28 ip-172-31-36-170 cloud-init[1800]: urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
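
For illustration only, here is a minimal sketch of one way a reader could recover from this kind of failure: remember how many bytes have been delivered so far and, when a read fails, reopen the stream at that offset and try again. This is not necessarily what #552 does. `ResumingReader`, `open_body`, `retry_on`, and `max_attempts` are hypothetical names invented for this sketch and are not part of smart_open. For S3, `open_body(start)` could plausibly issue a ranged GetObject request (`Range: bytes=<start>-`) and return the response body. Note that `urllib3.exceptions.ProtocolError` from the traceback above is not a subclass of the built-in `ConnectionError`, so it would have to be passed in explicitly via `retry_on`.

```python
import io


class ResumingReader:
    """Read from a stream, reopening it at the current offset after an error.

    Hypothetical helper for illustration; not part of smart_open.
    """

    def __init__(self, open_body, retry_on=(ConnectionError,), max_attempts=3):
        # open_body(start) is an assumed callable that returns a binary
        # file-like object whose first byte is byte `start` of the object.
        self._open_body = open_body
        self._retry_on = retry_on
        self._max_attempts = max_attempts
        self._position = 0  # number of bytes handed to the caller so far
        self._body = open_body(0)

    def read(self, size=-1):
        for attempt in range(1, self._max_attempts + 1):
            try:
                data = self._body.read(size)
            except self._retry_on:
                if attempt == self._max_attempts:
                    raise  # out of attempts: surface the original error
                # Drop the broken stream and resume from the last byte
                # that was successfully returned to the caller.
                self._body = self._open_body(self._position)
            else:
                self._position += len(data)
                return data


# Usage with an in-memory stand-in for a remote object.
payload = b"hello world"
reader = ResumingReader(lambda start: io.BytesIO(payload[start:]))
first = reader.read(5)
rest = reader.read()
```

Because a failed `read` returns nothing to the caller, `_position` only ever counts bytes that were actually delivered, so resuming from it neither skips nor repeats data. A wrapper such as `gzip` layered on top would then see one uninterrupted byte stream.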