
Fix ClientPayloadError while reading #787

Merged: 2 commits into fsspec:main on Sep 12, 2023

Conversation

@fjetter (Contributor) commented on Sep 11, 2023

Closes #786

The _call_and_read pattern was already implemented for _cat_file in #601, so I'm just applying this pattern again. I don't know the library well enough to judge whether this could be refactored some other way, but I would prefer to separate the fix from the refactor.
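
For readers unfamiliar with the pattern, here is a minimal sketch of the idea being applied: retry the S3 call and the body read as a single unit, so that a ClientPayloadError raised while streaming the body triggers a fresh request rather than an attempt to re-read a broken body. This is not the actual s3fs code; call_s3 and read_body are hypothetical stand-ins.

```python
import asyncio

from aiohttp import ClientPayloadError


async def call_and_read(call_s3, read_body, retries=5):
    """Retry the request *and* the body read together, since a response
    body cannot be re-read once a payload error has occurred."""
    for attempt in range(retries):
        try:
            resp = await call_s3()        # issue the GET (may have its own internal retries)
            return await read_body(resp)  # streaming read; ClientPayloadError surfaces here
        except ClientPayloadError:
            if attempt == retries - 1:
                raise
            # simple exponential backoff, mirroring the schedule discussed later in the thread
            await asyncio.sleep(1.7 ** attempt * 0.1)
```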

@fjetter (Contributor, Author) commented on Sep 11, 2023

There were no tests added in #601, so I don't know how to proceed. If there are any suggestions on how to implement tests, I'm happy to follow up.

@martindurant (Member) commented:

> I would prefer to separate the fix from the refactor

That is fine

@martindurant (Member) commented:

> If there are any suggestions on how to implement tests, I'm happy to follow up

Unfortunately, there isn't a way I know of to get the moto server or any other mock to trigger the case you were seeing. So, as long as this doesn't break anything and it fixes your workflow, we should be good.

@fjetter (Contributor, Author) commented on Sep 11, 2023

@martindurant can you enable the workflow for CI, please?

@martindurant (Member) commented:

I wonder if there's anything we can do about a retry wrapping around _call_s3, which itself has retries. Once the read on the body fails, I don't know whether it's possible/useful to try to repeat just that part.

@fjetter (Contributor, Author) commented on Sep 12, 2023

As it is right now, if the read on the body fails, the entire request is repeated. I think this is a requirement since I doubt the body is still valid after such a failure.

The fact that there is a duplicated retry in case of an exception in _call_s3 is unfortunate, but I don't think it is a big deal. By default, this retries 5 times with a backoff of 1.7**i * 0.1, i.e. a full retry cycle takes ~1.89 s plus request time. Doing this again before failing is fine. I would also like to point out that this is exactly how _cat_file does it.
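
For reference, the total sleep implied by that schedule, assuming five attempts with i running from 0 to 4:

```python
# Backoff delays for i = 0..4: 0.1, 0.17, 0.289, 0.4913, 0.83521 seconds
total = sum(1.7 ** i * 0.1 for i in range(5))  # ~1.89 s in total, on top of the request time itself
```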

I see three options

  1. Keep it as is and merge, since this pattern is already used in _cat_file.
  2. Add a toggle to _call_s3 that allows one to disable retries. Simple, and I would be fine doing this in this PR. However, I don't think we'd be doing the code base a favor.
  3. Refactor the entire retry mechanism, likely removing it from _call_s3 and finding a different abstraction/entry point that works better.

Option 3 is definitely out of scope for this PR since I don't know the code base well enough to do it. It would likely require a holistic view of the code base, and we'd need to review every _call_s3 invocation to decide where and why a retry is appropriate. I suspect there are more patterns that should be retried as a whole, but I don't know. I encourage this kind of refactoring, but I am the wrong person for it.

@fjetter (Contributor, Author) commented on Sep 12, 2023

@martindurant is it possible to get a patch release with this once merged?

@martindurant (Member) commented:

OK, I will merge this as it is, and consider refactoring in the future.

Thanks for contributing!

A release should be out in the next few days.

@martindurant merged commit 24ad2b4 into fsspec:main on Sep 12, 2023. 5 checks passed.
@fjetter deleted the ensure_inner_fetch_retried branch on September 12, 2023 at 13:21.