[Spanner] Server randomly returns "ServerException: INTERNAL: Received RST_STREAM with error code 2". #5473
Comments
I've contacted support and was informed that the Spanner team is aware of the issue and is working towards a fix. Closing it for now. |
Unfortunately, this was not completely addressed, and we got a WONT FIX response from the Google support team, so I believe this error is here to stay. Would it be possible to add this error to the auto-retry mechanism below? google-cloud-php/Spanner/src/Database.php Lines 863 to 865 in e9ccbbb
Go's library already added a retry for all internal server errors recently (probably for the same reason). |
This seems to be a feature request to retry certain error responses. This behavior is in the works! |
Thank you! Hope to see that get merged soon! |
The Java client added a fix for this as well. |
I too hope this response is incorporated ASAP. |
We are clarifying internally whether simply adding a retry to the existing call is sufficient (an easy fix) or whether we need to re-create the connection (which will take longer). |
Here is a more detailed error.
We have been seeing this error for a few weeks now across various projects running various versions of google/cloud-spanner (including one that is running the latest, v1.51.2).
I have not been able to reproduce this error since it happens randomly.
When the error occurs, it shows up in bulk within a span of a few seconds across different pods on K8s.
This error seems to always occur at the first query within a transaction.
Would it be possible to add a retry for this specific error here?
I'm suggesting this because google-cloud-go seems to be doing something similar.
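To make the request concrete, here is a minimal, language-agnostic sketch of the kind of retry I have in mind, written in Python only for brevity. It is not the library's code: `ServerError`, `is_retryable_internal`, and `run_with_retry` are hypothetical names, and matching on the message text is an assumption based on the error string quoted in the title. Since the error seems to hit the first query of a transaction, the sketch assumes the wrapped operation is safe to run again from the start.

```python
import random
import time


class ServerError(Exception):
    """Hypothetical stand-in for the client library's server exception."""


def is_retryable_internal(exc: Exception) -> bool:
    # Assumption: identify the error by the message text quoted in this issue.
    return "Received RST_STREAM with error code 2" in str(exc)


def run_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying only on the RST_STREAM INTERNAL error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ServerError as exc:
            # Give up on the last attempt or on any other server error.
            if attempt == max_attempts or not is_retryable_internal(exc):
                raise
            # Exponential backoff with jitter before the next attempt.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

Any other server error is re-raised immediately, so the retry stays limited to this one failure mode. Whether a plain retry is enough, or the connection has to be re-created first, is something I cannot tell from the outside.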
I usually don't post issues until I have reproducible code, but this has been affecting production for weeks, so I am eager to get some kind of solution to mitigate the error.
Also, does anyone here know what "error code 2" is?
Understanding it might help to better understand the error.
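One possible reading, assuming the number is the HTTP/2 error code carried in the RST_STREAM frame (gRPC runs over HTTP/2): RFC 7540, section 7, defines code 0x2 as INTERNAL_ERROR, which would fit the "INTERNAL" in the message. The lookup below is a small sketch of that table; `describe_rst_stream` is a hypothetical helper name, not part of any client library.

```python
# HTTP/2 error codes as listed in RFC 7540, section 7.
HTTP2_ERROR_CODES = {
    0x0: "NO_ERROR",
    0x1: "PROTOCOL_ERROR",
    0x2: "INTERNAL_ERROR",
    0x3: "FLOW_CONTROL_ERROR",
    0x4: "SETTINGS_TIMEOUT",
    0x5: "STREAM_CLOSED",
    0x6: "FRAME_SIZE_ERROR",
    0x7: "REFUSED_STREAM",
    0x8: "CANCEL",
    0x9: "COMPRESSION_ERROR",
    0xA: "CONNECT_ERROR",
    0xB: "ENHANCE_YOUR_CALM",
    0xC: "INADEQUATE_SECURITY",
    0xD: "HTTP_1_1_REQUIRED",
}


def describe_rst_stream(code: int) -> str:
    """Return the RFC 7540 name for an RST_STREAM error code."""
    return HTTP2_ERROR_CODES.get(code, "UNKNOWN")


print(describe_rst_stream(2))
```

If that reading is right, the code only says that the peer hit an unexpected internal condition; it does not say which side of the connection caused it.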
Thanks.
Environment details