feat: integrate acv2 feature branch #703

Merged
merged 41 commits into from
Jan 7, 2025
Conversation

@cojenco cojenco commented Jan 6, 2025

This integrates the pre-launch-acv2 feature branch:

  • BidiReadObject support
  • BidiWriteObject support that handles appendable objects
  • Forced Failures support: redirect-send-token-T
  • Forced Failures support: redirect-send-handle-and-token-T

coryan and others added 30 commits April 23, 2024 20:01
Add the bi-di streaming read RPCs as patches. In follow up PRs we can
implement the bidi functions. These patches may need updates from time
to time, as the line number hints change.
I would prefer that we use the same patches across all the SDKs. C++ has
stricter requirements for generating the GAPICs, gRPC, and proto
libraries.
* chore: update patches c.20240612

* generate storage/v2 files

* generate google/ files

* Revert "generate google/ files"

This reverts commit 51217d21f933892bfca252d6c3c6294f68f53d2e.

* Revert "generate storage/v2 files"

This reverts commit c57c13a4bb8a9f4b8bd8f2e54232b4befe2de742.

* fix venv and update protos

* update readme
* feat: BidiRead initial scaffolding n ranges in 1 stream

* attempt using a priorityQueue with bidireads
* feat: BidiReadObjectError part 1 object not found

* use grpc.StatusCode.value

* for discussion, return in error details

* object resolution errors, not found and precondition failed

* remove extra line change
* feat: pack BidiReadObjectError in error details for out of range

* review comments
* fix(grpc_server): Higher fidelity BidiReadObject.

The emulator behaved slightly differently than the real implementation.
This brings its behavior on first message closer to the spec.
Patch the BidiWriteObject protos to support appendable object
BidiWriteObject RPCs.

While I'm here, update `setup.py` to pin the correct version of
grpcio-tools for proto compilation, and have `update-protos.sh` use the
already `.gitignore`'d directory `.googleapis` as the base if none is
specified.
We never polled the `responses` queue after an `"abort"` message was
dequeued. That can stop us from ever joining `gather_thread` if another
range enqueues a response first, because the queue capacity is fixed. We
want to keep the queue capacity fixed so that we can increase the
likelihood of interleaving responses from multiple ranges.

Fix by draining the queue in the `finally` block before reraising.

Also, correct `abort` to `abort_with_status`: if the error is a `grpc.Status`,
we're meant to call `context.abort_with_status` rather than
`context.abort`.
#15)
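The draining fix described above can be sketched with a stand-in producer and consumer (a minimal sketch, not the testbench's actual code: `serve_ranges` and `consume_until_abort` are illustrative names, and the `RuntimeError` stands in for aborting the RPC):

```python
import queue
import threading

def serve_ranges(responses, num_items):
    # Stand-in for gather_thread: blocks whenever the bounded queue is
    # full, like a range worker enqueueing read responses.
    for i in range(num_items):
        responses.put(("data", i))

def consume_until_abort(responses, gather_thread):
    # Consume one response, then abort. The `finally` block drains the
    # queue so the blocked producer can finish and be joined.
    try:
        responses.get()
        raise RuntimeError("abort")  # stand-in for context.abort_with_status(...)
    finally:
        while gather_thread.is_alive() or not responses.empty():
            try:
                responses.get(timeout=0.05)
            except queue.Empty:
                pass
        gather_thread.join()

responses = queue.Queue(maxsize=2)  # fixed capacity encourages interleaving
t = threading.Thread(target=serve_ranges, args=(responses, 10))
t.start()
try:
    consume_until_abort(responses, t)
except RuntimeError:
    pass
assert not t.is_alive()
```

On the second half of the fix: in gRPC Python, `ServicerContext.abort_with_status` accepts a `grpc.Status` object, while `abort` takes a status code plus a details string, which is why the commit swaps the call when the error is already a `grpc.Status`.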

* pull last changes

* add retry for bidi-read

* remove venv

* add abort

* fix: Correctly handle aborts with multiple ranges. (#16)


* push chris patch

* modify tests

* pull last changes

* remove venv

* remove lint issues

* remove lint issues1

* remove lint issues2

* bring broken_stream down for better readability

* remove venv

---------

Co-authored-by: Chris Carlon <[email protected]>
Client cancellation surfaces as an exception in
`next(request_iterator)`. That wasn't handled before, so gather_thread
simply failed and never put the `"terminate"` action on the response
queue.
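A minimal sketch of the pattern described, assuming a gather loop that forwards client messages onto a response queue (`gather` and the tuple protocol are illustrative, not the testbench's exact code):

```python
import queue

def gather(request_iterator, responses):
    # Pull client messages and forward them. Client cancellation
    # surfaces as an exception from the iterator, so both the normal
    # end-of-stream and the exceptional exit must enqueue "terminate".
    try:
        for request in request_iterator:
            responses.put(("request", request))
    except Exception:
        pass  # cancelled mid-stream: fall through to terminate below
    responses.put(("terminate", None))
```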
Appendable objects extend BidiWriteObject calls. This is a refactor and
a mostly-accurate implementation of appendable objects in the testbench.

There's a key difference: this simulates true appendable objects using
resumable uploads, which are not visible in ListObjects or similar calls
in the same way that appendable objects are. That's sufficient for
current testing.
When a stream ends with early termination, but we still return some
bytes, the response should be consistent with the actual bytes returned.

Specifically, we adjust the read_limit to match the actual range
returned, and adjust the range_end bool to indicate that we know there
is more data to request.
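The adjustment can be illustrated with a small helper (a sketch with hypothetical names mirroring the fields mentioned above, not the testbench's code):

```python
def adjust_returned_range(read_offset, requested_limit, bytes_sent):
    # Report the range actually delivered, not the range requested, and
    # set range_end only when the request was fully satisfied; a False
    # range_end tells the client there is more data to request.
    return {
        "read_offset": read_offset,
        "read_limit": bytes_sent,
        "range_end": bytes_sent >= requested_limit,
    }
```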
This was a bug - I should have raised a RuntimeError. It doesn't come up
because it's meant to be an impossible codepath, but it's nicer to raise
a real error in that case.
BidiWriteObjectRequest is allowed to contain no checksummed_data.
* chore: merge from public circa 2024-10-24

* chore: remove Notification, HMAC, SA grpc support (#693)

* chore: remove Notification, HMAC, SA grpc support

* fix lint

* feat: add write stall support  (#684)

* add code for write stall

* fix test

* remove unnecessary files

* remove unnecessary files

* write test

* undo test changes to remove unnecessary changes

* Update test_testbench_retry.py

* add test

* remove .idea files

* write stall changes

* remove .idea file

* Update test_testbench_retry.py

* test

* stall once for identical req

* add comment

* remove .idea file

* fix unit test

* fix unit test

* test changes

* test changes

* review comments

* remove .idea files

* lint fixes

* lint fixes

* lint fixes

* lint fixes

* lint fixes

* code patch fix

* support full uploads

* remove unnecessary things

* remove unnecessary things

* remove unnecessary things

* adding comment

* lint fix

* lint fix

* stall should not happen if uploaded less data than stall size

* stall should not happen if uploaded less data than stall size

* remove last two commit changes

* remove env files

* lint fix

* lint fix

* review comment and adding scenario where upload size is less than stall byte size in single shot

* lint fix

* lint fix

* chore: remove Notification, HMAC, SA grpc support (#693)

---------

Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Tulsi Shah <[email protected]>
* use gcs for logs instead

* enable both gcs bucket and monitoring
Internally generated the public proto as if all Bidi APIs were external,
and regenerated this patch against .googleapis.

Kept everything in one patch for simplicity.
* feat: Only RAPID supports appendable objects.

This restriction is subject to change.

* address lint issues
Appendable objects support a redirection protocol. Support instructions
to force redirection errors. The redirection string itself is opaque, so
for caller convenience we allow any hyphen-separated string of lowercase
alphabetic characters.
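Since the token is opaque, a validator for the convention described (any hyphen-separated string of lowercase alphabetic characters) might look like this (an illustrative regex, not the testbench's exact pattern):

```python
import re

# Accepts one or more lowercase alphabetic segments joined by hyphens.
REDIRECT_TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")

def is_valid_redirect_token(token):
    return REDIRECT_TOKEN.fullmatch(token) is not None
```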
Redirection may or may not include a write_handle. This supports
instructions for returning a redirect that _does_ include the
write_handle.
finalize_time can be used to identify unfinalized objects, i.e.
appendable objects.
Insert appendable objects into metadata immediately upon creation.

This is not quite full appendable object semantics, but it's sufficient
to test preconditions.
Uploads sometimes have nonzero metadata, e.g. when a test upload
specifies generation -1. We really wanted to finalize appendable
uploads, so fix the condition to be explicit.
This surfaced a bug when checking that the live version is finalized
before allowing an overwrite: I had incorrectly applied De Morgan's Law.
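The De Morgan slip is easy to make; the intended check looks something like this (a sketch with hypothetical names, not the testbench's code):

```python
def overwrite_allowed(live_version):
    # Allow an overwrite when there is no live version, or the live
    # version is finalized. Note the correct negation:
    #   not (live and unfinalized) == (not live) or finalized
    # whereas the buggy form flips only one operand:
    #   (not live) and finalized
    if live_version is None:
        return True
    return live_version.get("finalize_time") is not None
```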
cojenco and others added 9 commits November 22, 2024 15:16
* docs: how to merge from public upstream

* update contents
* support error for single shot upload

* remove .idea files

* lint fix

* review comments

* review comments

* review comments

* lint fix

* lint fix

* lint fix

Co-authored-by: Tulsi Shah <[email protected]>
We want to limit the total amount of time any one request can monopolize
a thread on the gRPC server, so we have a 10-second timer to terminate
the request. However, the existing code ends the stream with OK, which
is misleading to clients.

By switching to cancellation, we no longer need to set a timeout on the
gather thread join (since it observes an error from
next(request_iterator)) and we return a CANCELLED error which correctly
indicates to client code that they probably wrote a bug.

Now, tests with bugs that keep the stream open for more than 10 seconds
look like:

```
=== RUN   TestRetryConformance/grpc-1-[return-reset-connection_return-reset-connection]-storage.objects.get-1
    retry_conformance_test.go:696: want success, got rpc error: code = Canceled desc = CANCELLED
```

Tests pass when code correctly cleans up channels/RPCs.

N.B. if we ever need longer than 10 second RPCs and we _don't_ want them
to end in a CANCELLED error, we will have to revisit this.
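A minimal sketch of such a per-RPC deadline (`context.cancel()` does exist on `grpc.ServicerContext`; the helper name and wiring are assumptions):

```python
import threading

def start_rpc_deadline(context, seconds=10.0):
    # After `seconds`, cancel the RPC so the client sees CANCELLED
    # instead of a misleading OK on a stream we gave up on.
    timer = threading.Timer(seconds, context.cancel)
    timer.daemon = True
    timer.start()
    return timer  # call timer.cancel() when the RPC completes in time
```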
The real implementation of BidiReadObject may pipeline requests,
concurrently serving reads issued in separate messages. Unfortunately,
as best we can tell, gRPC in synchronous Python makes it hard to abort
the next(request_iterator) call when there's an error with a concurrent
read.

So we go back to a previous iteration of this code, which handled
messages in batches. We no longer need a response queue or a thread
pool.
Deprecated fields have been removed from the proto we will upstream, and
I renamed read_limit in ReadRange to read_length (since it's a length!).

Verified with testbench unit tests and Go prelaunch SDK emulator tests.
@cojenco cojenco marked this pull request as ready for review January 6, 2025 22:36
@cojenco cojenco requested a review from a team as a code owner January 6, 2025 22:36
@@ -0,0 +1,526 @@
diff --git a/google/storage/v2/BUILD.bazel b/google/storage/v2/BUILD.bazel

Note that we can remove this once the new protos are synced in github.

@danielduhh danielduhh merged commit e48a569 into main Jan 7, 2025
12 checks passed
@danielduhh danielduhh deleted the acv2-merge-upstream branch January 7, 2025 18:53