bigquery: retry idempotent RPCs #4148
Conversation
This is a first cut at adding retry to the BigQuery client. My thinking is that most retry parameters are not interesting to users and should not be exposed, but the timeout (a.k.a. deadline) should be. Another choice would be to allow an optional `google.gax.CallOptions`, as the generated code does.
@jba this is a good start, but needs to be slightly different (and I apologize for not documenting what the retry pattern should look like).
So:

```python
def get_dataset(self, dataset_ref, retry=DEFAULT_RETRY):
    api_call = functools.partial(
        self._connection.api_request,
        method='GET',
        path=dataset_ref.path)
    if retry:
        api_call = retry(api_call)
    api_response = api_call()
    return Dataset.from_api_repr(api_response)
```
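That pattern can be exercised end-to-end with a toy retry decorator. `SimpleRetry` and `flaky_request` below are illustrative stand-ins, not the real google-cloud-core API:

```python
import functools

class SimpleRetry:
    """Toy retry decorator: re-invoke on any exception, up to `attempts` tries."""
    def __init__(self, attempts=3):
        self.attempts = attempts

    def __call__(self, func):
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(self.attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
            raise last_exc
        return wrapper

calls = {'n': 0}

def flaky_request(method, path):
    """Fake Connection.api_request that fails twice, then succeeds."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise RuntimeError('transient')
    return {'id': path}

# Same shape as get_dataset above: build a partial, wrap it if a
# retry object was given, then invoke it.
api_call = functools.partial(flaky_request, method='GET', path='proj/ds')
retry = SimpleRetry()
if retry:
    api_call = retry(api_call)
print(api_call())  # {'id': 'proj/ds'} after two transient failures
```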
Thanks for the explanation. The one thing I'm not clear about is the exception subclass. BigQuery retry ignores the error code and is based solely on the "reason" field of the error.
Yikes. Do the error codes at least match up with idempotent http statuses?
Yeah, that could be reasonable, something like:
Possibly, if the default predicate doesn't work (e.g. if not all BackendErrors are retryable, just the subset that ends up being BigQueryTransientErrors). If that's the case, you can make it global: `google.cloud.bigquery.default_retry_predicate`. That way custom retries are a bit easier:

```python
myretry = retry.Retry(bigquery.default_retry_predicate, deadline=60)
```

You can also make the default retry object itself a global or class constant, so users can do:

```python
myretry = bigquery_client.DEFAULT_RETRY.with_deadline(60)
```
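The shape of that API can be sketched with a minimal stand-in. This `Retry` class, its default deadline, and the predicate are assumptions for illustration, not the real `google.api_core` implementation:

```python
class Retry:
    """Toy retry-policy object (illustrative only)."""
    def __init__(self, predicate, deadline=120):
        self.predicate = predicate
        self.deadline = deadline

    def with_deadline(self, deadline):
        # Return a modified copy, leaving the shared default untouched.
        return Retry(self.predicate, deadline=deadline)

def default_retry_predicate(exc):
    """Placeholder module-level predicate users could reuse."""
    return isinstance(exc, RuntimeError)

DEFAULT_RETRY = Retry(default_retry_predicate)

# Derive a custom retry from the shared default:
my_retry = DEFAULT_RETRY.with_deadline(60)
print(my_retry.deadline)       # 60
print(DEFAULT_RETRY.deadline)  # 120
```

Returning a copy from `with_deadline` keeps the class-level default immutable, so one user's customization can't leak into another's.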
If you subclass one of the existing exception classes they should only need to call …
The error codes are irrelevant. Only the reason field matters. So I guess the answer is no.
The default predicate won't work, for the reason you said.
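A reason-based predicate might look roughly like this. The retryable-reason set and the attribute names are assumptions for illustration, not the PR's actual code:

```python
# Reasons treated as transient (illustrative subset).
_RETRYABLE_REASONS = frozenset(['backendError', 'rateLimitExceeded'])

def should_retry(exc):
    """Retry iff the first error's "reason" field is a known-transient one.

    Ignores the HTTP status code entirely, per the BigQuery team's guidance.
    """
    errors = getattr(exc, 'errors', None)
    if not errors:
        return False
    return errors[0].get('reason') in _RETRYABLE_REASONS
```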
Done.
I don't understand the error I'm getting. It happens only under 2.7. Am I holding it wrong?
```diff
-        api_response = self._connection.api_request(
-            method='GET', path=dataset_ref.path)
+        api_call = functools.partial(
+            self._connection.api_request,
```
Force-pushed from 4abd655 to e3e2a20.
PTAL. In the latest push I added retry to nearly all methods that can support it.
```diff
@@ -291,7 +293,8 @@ class HTTPIterator(Iterator):
     def __init__(self, client, api_request, path, item_to_value,
                  items_key=_DEFAULT_ITEMS_KEY,
                  page_token=None, max_results=None, extra_params=None,
-                 page_start=_do_nothing_page_start, next_token=_NEXT_TOKEN):
+                 page_start=_do_nothing_page_start, next_token=_NEXT_TOKEN,
+                 retry=None):
```
When you create a function on the fly with `functools.partial`, it doesn't carry the wrapped function's `__module__`. You can get around it with:

```python
# Make the partial that you were making before.
partial_func = functools.partial(func, *args, **kwargs)
# Assign it an appropriate __module__ property based on the function it wraps.
partial_func.__module__ = func.__module__
```

After that it should work with …
@lukesneeringer No need, we already fixed that in core. :)
retry stuff LGTM
Add retry logic to every RPC for which it makes sense. Following the BigQuery team, we ignore the error code and use the "reason" field of the error to determine whether to retry.

Outstanding issues:
- Resumable upload consists of an initial call to get a URL, followed by posts to that URL. Getting the retry right on that initial call requires modifying the ResumableUpload class. At the same time, the num_retries argument should be removed.
- Users can't modify the retry behavior of Job.result(), because PollingFuture.result() does not accept a retry argument.