-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: retry and retry_async support streaming rpcs #495
Conversation
Can the breaking change be avoided by deprecating functionality instead of changing it? |
Hmm yeah I guess it could, if we want to keep two versions of the AsyncRetry class hanging around for a while As I mentioned above though, IMO this doesn't have to be a breaking change. The only breaking part is changing an undocumented behaviour, which would only manifest as a bug when used with our gapic libraries, which have their own timeout logic. But I'll let you make that call. Let me know which option you prefer |
I would avoid having two copies of the Async Retry. As to whether we should flag this as a breaking change.... I want to say it's just a bug fix to the behavior but does not change the surface, so maybe we can call it non-breaking..... Not a 100% confident about this yet. |
assert retry_._deadline == 4 | ||
assert retry_._on_error is _some_function | ||
@pytest.mark.asyncio | ||
async def test_retry_streaming_target_bad_sleep_generator(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still in the unary_asyn test file.
|
||
with pytest.raises(exceptions.RetryError) as exc_info: | ||
await retry_async.retry_target(target, predicate, range(10), deadline=10) | ||
await retry_async.retry_target(target, predicate, range(10), **timout_kwarg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
await retry_async.retry_target(target, predicate, range(10), **timout_kwarg) | |
await retry_async.retry_target(target, predicate, range(10), **timeout_kwarg) |
|
||
timeout_val = 10 | ||
# support "deadline" as an alias for "timeout" | ||
timout_kwarg = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timout_kwarg = ( | |
timeout_kwarg = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, looking at this again to try to settle the breaking change determination. I have two additional code comments in the files.
On the breaking change determination:
- this PR makes streaming retries more reliable, so any code users previously had to manually enforce retries on streams will now be needed less or not at all, but it won't be broken by this change. Correct?
- The other change os that exceptions are propagated through the retry block, so users may now be seeing exceptions that they did not see before. Correct?
Item 1 by itself would suggest marking this non-breaking, though it may be good to call attention to the change so users can simplify parts of their code that may no longer be needed.
Item 2 seems like a breaking behavioral change that may surprise users.
I think we should flag this as a breaking change, and make it clear at the top of the PR description/merge-commit message that the only expected behavioral changes that may require action on the user's side are the new exceptions they may now receive from failing retries.
Thoughts? @daniel-sanche @parthea
google/api_core/retry/__init__.py
Outdated
"exponential_sleep_generator", | ||
"if_exception_type", | ||
"if_transient_error", | ||
"_build_retry_error", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're exposing this, should we remove the leading private underscore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, removed
[List[Exception], RetryFailureReason, Optional[float]], | ||
Tuple[Exception, Optional[Exception]], | ||
] = _build_retry_error, | ||
init_args: _P.args = (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do the the sync and async versions of retry_target_stream
have init_args
and init_kwargs
but the unary retry_target
s do not? Shouldn't we be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to explain that in the comments here
Basically, every time you retry a unary rpc, it will use the same arguments, so it makes sense to lock them in to a partial before calling retry_target
With streaming rpcs, retries are more complicated, because we'd like to be able to recover the stream at the point it broke, instead of from the beginning. But the way to do that is different for each rpc. It's out of scope for this PR, but I can imagine at some point wanting to add a ResumptionStrategy
class, and provide some kind of call-back that allows you to modify requests between attempts. For that, we need to keep the method and it's call args separate when starting retry_target_stream
We could change retry_target
to work the same way to be consistent, but that would definitely be a breaking change. And it's unnecessary, because unary retries don't have the same problems as streams
TL;DR: Using a partial for retry_target_stream
would paint us into a corner, so I'm trying to make things flexible for future improvements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, makes sense. I wonder whether we should a brief note to that effect in the documentation for the two parameters ("these two parameters are provided in case the user needs to change the arguments in subsequent retry attempts"). But we can defer that for now.
I don't believe anything should change for users for either of these points. This PR is adding the new RetryStreaming and RetryStreamingAsync classes to gapic_generator currently uses the unary retries.Retry object for all rpcs, and that won't change after this PR is merged. Once this functionality is in place, the next step would be looking into bringing it to gapic (but that will actually be a whole separate project on its own, because streaming rpcs will require different semi-custom resumption strategies. I'm not currently planning on taking this on myself, but it ties in with work Leah Cole has been doing in Node) The main motivation of this change is making the retry streaming functionality available for veneer developers, until we figure out how to automate it for streaming rpcs in gapic libraries The part that I think could be considered breaking is
Our options are: a. mark this as a breaking change I'm fine with any option, just let me know which you both prefer |
OK, I see.
If this is an accurate summary, then I think calling this PR non-breaking is fine, which would result in this Separately, I'll file an issue to have |
One point of clarification: the current GAPICs do use |
31ade5b
to
34ec401
Compare
34ec401
to
f62439a
Compare
AsyncGenerator["_Y", None], target_iterator | ||
).athrow( | ||
sys.exception() # type: ignore | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like we could just use sys.exc_info()[1] to be compatible with all versions?
sys.exception()
looks cleaner, but I don't think it's worth the extra noise for the next few years. (And there's a performance consideration to the version check too, although I don't think athrow
should be called often)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's a great idea. Let me try that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks. That worked. Merging.
From #485:
This PR addresses the issue by adding
retry_target_generator
functions to both the async and sync retry modules, which yield through the target rather than call it directly. Generator mode can be enabled using theis_generator
argument on the decorator.Fixes #485