feat: add retry wrapper to emulator #37

Merged
merged 5 commits into from
Aug 16, 2021

Conversation

coryan
Contributor

@coryan coryan commented Aug 14, 2021

The emulator wraps each RPC with a "retry test", a mechanism to inject
failures and exercise the retry path in the client libraries. This
change adds the wrapper to the existing RPCs.

Part of the work for #24
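
To illustrate the idea of a "retry test" wrapper, here is a minimal sketch, not the emulator's actual code. The names `RetryTest`, `retry_test`, `InjectedFailure`, and the numeric `return-503` instruction format are assumptions made for this example. The wrapper consumes one queued instruction per call and fails the RPC instead of running the handler, so a client's retry loop can be exercised.

```python
import functools


class InjectedFailure(Exception):
    """Hypothetical error raised in place of the real RPC result."""

    def __init__(self, code):
        super().__init__(f"injected failure: {code}")
        self.code = code


class RetryTest:
    """Hypothetical holder for queued failure instructions, keyed by method."""

    def __init__(self, instructions):
        # Example shape (assumed): {"GetObject": ["return-503", "return-503"]}
        self._instructions = {k: list(v) for k, v in instructions.items()}

    def next_instruction(self, method):
        pending = self._instructions.get(method, [])
        return pending.pop(0) if pending else None


def retry_test(method, test):
    """Wrap an RPC handler so queued failures are injected before it runs."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            instruction = test.next_instruction(method)
            if instruction is not None and instruction.startswith("return-"):
                # This sketch only understands numeric codes such as return-503.
                raise InjectedFailure(int(instruction[len("return-"):]))
            return handler(*args, **kwargs)

        return wrapper

    return decorator


test = RetryTest({"GetObject": ["return-503", "return-503"]})


@retry_test("GetObject", test)
def get_object(name):
    return {"name": name}


def call_with_retries(fn, *args, attempts=3):
    """Toy client retry loop that counts how many failures it absorbed."""
    failures = 0
    for _ in range(attempts):
        try:
            return fn(*args), failures
        except InjectedFailure:
            failures += 1
    raise RuntimeError("retries exhausted")
```

Calling `call_with_retries(get_object, "o1")` fails twice with the injected error and succeeds on the third attempt, which is the path a client library's retry logic needs to cover.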

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 14, 2021
@codecov

codecov bot commented Aug 14, 2021

Codecov Report

Merging #37 (ddd845f) into main (28d3117) will increase coverage by 0.03%.
The diff coverage is 98.73%.

@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   97.06%   97.10%   +0.03%     
==========================================
  Files          28       28              
  Lines        2935     3003      +68     
==========================================
+ Hits         2849     2916      +67     
- Misses         86       87       +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 97.10% <98.73%> (+0.03%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| tests/test_emulator.py | 98.11% <98.03%> (-0.11%) ⬇️ |
| testbench/common.py | 95.79% <100.00%> (+0.34%) ⬆️ |
| testbench/error.py | 91.30% <100.00%> (ø) |
| testbench/generation.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@coryan coryan marked this pull request as ready for review August 14, 2021 16:49
@coryan coryan requested a review from a team August 14, 2021 16:49
coryan added 5 commits August 16, 2021 17:31
Contributor

@cojenco cojenco left a comment

Nice tests! Thanks for adding them!

@coryan coryan merged commit 3905db1 into googleapis:main Aug 16, 2021
@coryan coryan deleted the feat-add-retry-wrappers-to-emulator branch August 16, 2021 17:41
cojenco added a commit that referenced this pull request Jan 6, 2025
danielduhh pushed a commit that referenced this pull request Jan 7, 2025
* ci: disable codecov uploads (#3)

* impl: add bidi streaming read patches (#1)

Add the bi-di streaming read RPCs as patches. In follow up PRs we can
implement the bidi functions. These patches may need updates from time
to time, as the line number hints change.

* chore: update patches to meet C++ requirements (#4)

I would prefer that we use the same patches across all the SDKs. C++ has
stricter requirements for generating the GAPICs, gRPC, and proto
libraries.

* chore: update proto patches circa 2024-06-13 (#6)

* chore: update patches c.20240612

* generate storage/v2 files

* generate google/ files

* Revert "generate google/ files"

This reverts commit 51217d21f933892bfca252d6c3c6294f68f53d2e.

* Revert "generate storage/v2 files"

This reverts commit c57c13a4bb8a9f4b8bd8f2e54232b4befe2de742.

* fix venv and update protos

* update readme

* feat: BidiRead initial scaffolding n ranges in 1 stream (#7)

* feat: BidiRead initial scaffolding n ranges in 1 stream

* attempt using a priorityQueue with bidireads

* feat: add crc32c checksum to BidiRead (#9)

* test: BidiRead object not found error handling (#10)

* feat: BidiReadObjectError part 1 object not found

* use grpc.StatusCode.value

* for discussion, return in error details

* object resolution errors, not found and precondition failed

* remove extra line change

* feat: pack BidiReadObjectError in error details for out of range (#11)

* feat: pack BidiReadObjectError in error details for out of range

* review comments

* fix(grpc_server): Higher fidelity BidiReadObject. (#12)

* fix(grpc_server): Higher fidelity BidiReadObject.

The emulator behaved slightly differently than the real implementation.
This brings its behavior on first message closer to the spec.

* feat(appendable): Appendable object proto support (#13)

Patch the BidiWriteObject protos to support appendable object
BidiWriteObject RPCs.

While I'm here, update `setup.py` to pin the correct version of
grpcio-tools for proto compilation, and have `update-protos.sh` use the
already `.gitignore`'d directory `.googleapis` as the base if none is
specified.

* fix: Correctly handle aborts with multiple ranges. (#16)

We never polled the `responses` queue after an `"abort"` message was
dequeued. That can stop us from ever joining `gather_thread` if another
range enqueues a response first, because the queue capacity is fixed. We
want to keep the queue capacity fixed so that we can increase the
likelihood of interleaving responses from multiple ranges.

Fix by draining the queue in the `finally` block before reraising.

Also, correct `abort` to `abort_with_status`: if the error is a `grpc.Status`,
we're meant to call `context.abort_with_status` rather than
`context.abort`.
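
The drain-before-reraise fix can be sketched as follows. This is an illustration under stated assumptions rather than the testbench code: `serve` is a hypothetical function, the string markers `"abort"` and `"terminate"` stand in for the real queue messages, and a plain `RuntimeError` replaces the gRPC abort call.

```python
import queue
import threading


def serve(ranges, capacity=1):
    """Forward items from a producer thread; drain the queue on failure."""
    responses = queue.Queue(maxsize=capacity)  # fixed capacity, as in the text

    def gather():
        for item in ranges:
            responses.put(item)  # blocks while the queue is full
        responses.put("terminate")

    gather_thread = threading.Thread(target=gather)
    gather_thread.start()
    delivered = []
    done = False
    try:
        while True:
            item = responses.get()
            if item == "terminate":
                done = True
                break
            if item == "abort":
                raise RuntimeError("aborted")
            delivered.append(item)
    finally:
        # Keep polling so a producer blocked in put() can finish; without
        # this, join() below could wait forever on a full queue.
        while not done:
            done = responses.get() == "terminate"
        gather_thread.join()
    return delivered
```

With `capacity=1`, removing the `while not done` loop would leave the producer stuck in `put()` after an abort, which is the hang described above.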

* feat(grpc_retry): support retry bidi read object for conformance test. (#15)

* pull last changes

* add retry for bidi-read

* remove venv

* add abort

* push chris patch

* modify tests

* pull last changes

* remove venv

* remove lint issues

* remove lint issues1

* remove lint issues2

* bring broken_stream down for better readability

* remove venv

---------

Co-authored-by: Chris Carlon <[email protected]>

* fix: Support client cancellation in BidiReadObject (#17)

Client cancellation surfaces as an exception in
`next(request_iterator)`. That wasn't handled before, so gather_thread
simply failed and never put the `"terminate"` action on the response
queue.
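
A minimal sketch of the pattern, assuming a simplified `gather` function and tuple-shaped queue entries (both hypothetical): whatever happens while iterating the requests, the `"terminate"` action is always enqueued so the consumer can finish.

```python
import queue
import threading


def gather(request_iterator, responses):
    """Read requests; always enqueue "terminate", even if iteration fails."""
    try:
        for request in request_iterator:
            responses.put(("read", request))
    except Exception:
        # Assumption: client cancellation surfaces as an exception raised
        # while advancing the request iterator.
        pass
    finally:
        responses.put(("terminate", None))


def cancelled_requests():
    """Stand-in for a request iterator whose client cancels mid-stream."""
    yield "range-1"
    raise RuntimeError("client cancelled")
```

Running `gather` on `cancelled_requests()` in a thread leaves one `("read", "range-1")` entry followed by `("terminate", None)` on the queue.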

* feat: Handle appendable objects in BidiWriteObject (#18)

Appendable objects extend BidiWriteObject calls. This is a refactor and
a mostly-accurate implementation of appendable objects in the testbench.

There's a key difference - this simulates true appendable using
resumable uploads, which are not visible in ListObjects or similar calls
in the same way that appendable objects are. That's sufficient for
current testing.

* fix: Correct BidiReadObject early termination. (#19)

When a stream ends with early termination, but we still return some
bytes, the response should be consistent with the actual bytes returned.

Specifically, we adjust the read_limit to match the actual range
returned, and adjust the range_end bool to indicate that we know there
is more data to request.
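
As a rough illustration of that adjustment, the sketch below uses a hypothetical `RangeResponse` dataclass and `build_response` helper. The field names follow the text, but their exact semantics in the real proto are assumptions here.

```python
from dataclasses import dataclass


@dataclass
class RangeResponse:
    """Hypothetical stand-in for the per-range response fields named above."""

    read_offset: int
    read_limit: int
    range_end: bool
    data: bytes


def build_response(content, read_offset, read_limit, stop_after=None):
    """Return the requested range, optionally truncated to `stop_after` bytes."""
    requested = content[read_offset:read_offset + read_limit]
    if stop_after is None or stop_after >= len(requested):
        return RangeResponse(read_offset, len(requested), True, requested)
    returned = requested[:stop_after]
    # Early termination: report only what was sent, and leave range_end
    # False because more data remains to be requested.
    return RangeResponse(read_offset, len(returned), False, returned)
```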

* fix: Don't raise a bare string (#21)

This was a bug - I should have raised a RuntimeError. It doesn't come up
because it's meant to be an impossible codepath, but it's nicer to raise
a real error in that case.
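
For context, a small sketch of why a bare string is a problem in Python 3. The function names are made up for this example. Raising a string does not raise that string; it fails with a `TypeError`, which hides the intended message.

```python
def impossible_branch_bad():
    # Python 3 rejects this at runtime: exceptions must derive from
    # BaseException, so the caller sees a TypeError instead.
    raise "unreachable state"


def impossible_branch_good():
    raise RuntimeError("unreachable state")


def error_type(fn):
    """Return the name of the exception type that `fn` raises."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__
    return None
```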

* fix: Allow BidiWriteObjectRequest with no data. (#22)

BidiWriteObjectRequest is allowed to contain no checksummed_data.

* build: update cloudbuild for private images (#23)

* chore: merge from public circa 2024-10-24 (#25)

* chore: merge from public circa 2024-10-24

* chore: remove Notification, HMAC, SA grpc support (#693)

* chore: remove Notification, HMAC, SA grpc support

* fix lint

* feat: add write stall support  (#684)

* add code for write stall

* fix test

* remove unnecessary files

* remove unnecessary files

* write test

* undo test changes to remove unnecessary changes

* Update test_testbench_retry.py

* add test

* remove .idea files

* write stall changes

* remove .idea file

* Update test_testbench_retry.py

* test

* stall once for identical req

* add comment

* remove .idea file

* fix unit test

* fix unit test

* test changes

* test changes

* review comments

* remove .idea files

* lint fixes

* lint fixes

* lint fixes

* lint fixes

* lint fixes

* code patch fix

* support full uploads

* remove unnecessary things

* remove unnecessary things

* remove unnecessary things

* adding comment

* lint fix

* lint fix

* stall should not happen if uploaded less data than the stall size

* stall should not happen if uploaded less data than the stall size

* remove last two commit changes

* remove env files

* lint fix

* lint fix

* review comment and adding scenario where upload size is less than stall byte size in a single shot

* lint fix

* lint fix

* chore: remove Notification, HMAC, SA grpc support (#693)

---------

Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Tulsi Shah <[email protected]>

* test: enable GCS bucket logs (#26)

* use gcs for logs instead

* enable both gcs bucket and monitoring

* feat(append): Import new proto patch. (#29)

Internally generated the public proto as if all Bidi APIs were external,
and regenerated this patch against .googleapis.

Kept everything in one patch for simplicity.

* feat: Only RAPID supports appendable objects. (#30)

* feat: Only RAPID supports appendable objects.

This restriction is subject to change.

* address lint issues

* feat(append): Support instructions for redirect. (#31)

Appendable objects support a redirection protocol. Support instructions
to force redirection errors. The redirection string itself is opaque, so
for caller convenience we allow any hyphen-separated string of lowercase
alphabetic characters.
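
One way to validate such a redirection string is sketched below. The helper name and the regular expression are assumptions for this example; in particular, it reads "hyphen-separated" as also allowing a single word with no hyphen, which the text does not state explicitly.

```python
import re

# One or more lowercase alphabetic words joined by single hyphens.
_REDIRECT_TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def is_valid_redirect_token(token):
    """Return True if `token` matches the assumed redirect token grammar."""
    return _REDIRECT_TOKEN.fullmatch(token) is not None
```

Under this reading, `is_valid_redirect_token("some-token")` is accepted, while uppercase letters, digits, empty strings, and doubled hyphens are rejected.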

* feat(append): Redirect-with-handle instructions. (#32)

Redirection may or may not include a write_handle. This supports
instructions for returning a redirect that _does_ include the
write_handle.

* feat(append): Update Object with finalize_time. (#33)

finalize_time can be used to identify unfinalized objects, i.e.
appendable objects.

* feat(append): Appendable object creation. (#34)

Insert appendable objects into metadata immediately upon creation.

This is not quite full appendable object semantics, but it's sufficient
to test preconditions.

* fix: Correct finalize condition. (#35)

Uploads sometimes have nonzero metadata, e.g. when a test upload
specifies generation -1. We really wanted to finalize appendable
uploads, so fix the condition to be explicit.

* feat(append): Add tests for finalizing appends. (#36)

This surfaced a bug when checking that the live version is finalized
before allowing an overwrite: I had incorrectly applied De Morgan's Law.
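
The commit does not show the actual condition, so the sketch below uses a hypothetical one (block an overwrite only when a live version exists and is unfinalized) to show how this kind of De Morgan slip looks and how an exhaustive truth-table check catches it.

```python
from itertools import product


def overwrite_allowed(exists, finalized):
    """Reference form: block only when a live version exists and is unfinalized."""
    return not (exists and not finalized)


def overwrite_allowed_expanded(exists, finalized):
    """Correct De Morgan expansion: not (A and B) == (not A) or (not B)."""
    return (not exists) or finalized


def overwrite_allowed_buggy(exists, finalized):
    """Incorrect expansion that keeps `and` instead of switching to `or`."""
    return (not exists) and finalized


# Inputs where the buggy expansion disagrees with the reference form.
mismatches = [
    case
    for case in product([False, True], repeat=2)
    if overwrite_allowed_buggy(*case) != overwrite_allowed(*case)
]
```

With only two boolean inputs there are four cases, so comparing every expansion against the reference form in a unit test is cheap.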

* fix: signal terminate in BidiReadObject (#37)

* docs: how to merge from public upstream (#40)

* docs: how to merge from public upstream

* update contents

* feat: Support error code for single shot upload (#699) (#39)

* support error for single shot upload

* remove .idea files

* lint fix

* review comments

* review comments

* review comments

* lint fix

* lint fix

* lint fix

Co-authored-by: Tulsi Shah <[email protected]>

* chore: Timeout with cancellation in BidiReadObject (#42)

We want to limit the total amount of time any one request can monopolize
a thread on the gRPC server, so we have a 10 second timer to terminate
the request. However, the existing code ends the stream with OK, which
is misleading to clients.

By switching to cancellation, we no longer need to set a timeout on the
gather thread join (since it observes an error from
next(request_iterator)) and we return a CANCELLED error which correctly
indicates to client code that they probably wrote a bug.

Now, tests with bugs that keep the stream open for more than 10 seconds
look like:

```
=== RUN   TestRetryConformance/grpc-1-[return-reset-connection_return-reset-connection]-storage.objects.get-1
    retry_conformance_test.go:696: want success, got rpc error: code = Canceled desc = CANCELLED
```

Tests pass when code correctly cleans up channels/RPCs.

N.B. if we ever need longer than 10 second RPCs and we _don't_ want them
to end in a CANCELLED error, we will have to revisit this.
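
The timer-plus-cancellation approach can be sketched roughly as below. This is not the testbench implementation: `FakeContext`, `handle_stream`, and `slow_requests` are hypothetical, and the fake context only records that `cancel()` was called. In real gRPC Python the servicer context also offers a `cancel()` method, and cancellation would surface through the request iterator rather than through an event check.

```python
import threading


class FakeContext:
    """Test double that records cancellation instead of ending a real RPC."""

    def __init__(self):
        self.cancelled = threading.Event()

    def cancel(self):
        self.cancelled.set()


def handle_stream(context, requests, timeout_seconds=10.0):
    """Process requests until the stream ends or the timer cancels the call."""
    timer = threading.Timer(timeout_seconds, context.cancel)
    timer.start()
    handled = []
    try:
        for request in requests:
            if context.cancelled.is_set():
                break
            handled.append(request)
    finally:
        timer.cancel()  # stop the timer if the stream finished in time
    return handled


def slow_requests(context):
    """Simulate a client that sends one request and then keeps the stream open."""
    yield "first"
    context.cancelled.wait(timeout=5)
    yield "second"
```

A well-behaved stream finishes before the timer fires and is never cancelled, while `slow_requests` is cut off after its first request.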

* fix: Disable pipelining in BidiReadObject. (#43)

The real implementation of BidiReadObject may pipeline requests,
concurrently serving reads issued in separate messages. Unfortunately,
as best we can tell, gRPC in synchronous Python makes it hard to abort
the next(request_iterator) call when there's an error with a concurrent
read.

So we go back to a previous iteration of this code, which handled
messages in batches. We no longer need a response queue or a thread
pool.

* chore: Update bidi proto patch. (#44)

Deprecated fields have been removed from the proto we will upstream, and
I renamed read_limit in ReadRange to read_length (since it's a length!).

Verified with testbench unit tests and Go prelaunch SDK emulator tests.

* Revert "docs: how to merge from public upstream (#40)"

This reverts commit e96974d.

* Revert "test: enable GCS bucket logs (#26)"

This reverts commit cc704ab.

* Revert "build: update cloudbuild for private images (#23)"

This reverts commit 86db6be.

* Revert "feat: Only RAPID supports appendable objects. (#30)"

This reverts commit 47c63fa.

* update for public merge

---------

Co-authored-by: Carlos O'Ryan <[email protected]>
Co-authored-by: Chris Carlon <[email protected]>
Co-authored-by: shubham-diwakar <[email protected]>
Co-authored-by: Frank Natividad <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Tulsi Shah <[email protected]>
Co-authored-by: Frank Natividad <[email protected]>