Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: replace internal dictionaries with protos in gapic calls #875

Merged
merged 12 commits into from
Nov 22, 2023

Conversation

daniel-sanche
Copy link
Contributor

Replace dicts with protos in the following rpc calls:

  • mutate_row
  • check_and_mutate
  • read_modify_write
  • MutateRowsOperationAsync

This will help us catch errors with easier type checking, and should have minor performance benefits

Fixes #779

@daniel-sanche daniel-sanche requested review from a team as code owners October 16, 2023 23:36
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigtable Issues related to the googleapis/python-bigtable API. labels Oct 16, 2023
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Oct 17, 2023
@daniel-sanche daniel-sanche added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 19, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 19, 2023
Copy link
Contributor

@igorbernstein2 igorbernstein2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the model just wrap the proto directly? then you won't have to have _EntryWithProto to memoize the pb

google/cloud/bigtable/data/_async/client.py Show resolved Hide resolved
@@ -36,6 +39,12 @@ class Mutation(ABC):
def _to_dict(self) -> dict[str, Any]:
raise NotImplementedError

def _to_pb(self) -> data_pb.Mutation:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to double store all of the attributes? Can't we store all of the attributes in the proto directly?
Also can we drop _to_dict altogether?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main downside is we'd need a lot more boilerplate setters/getters (we lose the simplicity of the dataclasses). And marshaling to/from the protos is more expensive, but I guess that should'nt be too much of an issue here. I see this is how we handled it in the ReadRowsQuery class, which is more complicated than these would be.

What do you think of making these immutable? That would simplify the setters/getters, and then we wouldn't have to worry about making a static copy before starting a mutate_rows operation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we drop _to_dict altogether?

I think so. We do use row._to_dict in the test proxy, but I don't think we'd need it for mutations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally the logic for converting from models to protos for the test proxy is self contained in the proxy. A side goal of test proxy was to be a rosetta stone for expressing the same concepts across every cleint impl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I dont think you need to worry about someone mutating the model after passing it to the proxy...I dont think you need to be that defensive in your code

@@ -104,7 +116,7 @@ def __init__(
self.timeout_generator = _attempt_timeout_generator(
attempt_timeout, operation_timeout
)
self.mutations = mutation_entries
self.mutations = [_EntryWithProto(m, m._to_pb()) for m in mutation_entries]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make the mutation classes immutable, we can simplify this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making each individual mutation (ie SetCell, DeleteCell, etc) is a great idea
Howevr I dont think you can make the collection of mutations immutable (ie RowMutationEntry)

Copy link
Contributor

@igorbernstein2 igorbernstein2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the pr is an improvement in its current state. Feel free to address the remaining comments in a follow up pr

@daniel-sanche daniel-sanche added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 22, 2023
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 22, 2023
@daniel-sanche
Copy link
Contributor Author

daniel-sanche commented Nov 22, 2023

Sure, opened a new issue for this here: #887

@daniel-sanche daniel-sanche merged commit 3ac80a9 into googleapis:experimental_v3 Nov 22, 2023
6 of 10 checks passed
daniel-sanche added a commit that referenced this pull request Feb 5, 2024
* feat: add new v3.0.0 API skeleton (#745)

* feat: improve rows filters (#751)

* feat: read rows query model class (#752)

* feat: implement row and cell model classes (#753)

* feat: add pooled grpc transport (#748)

* feat: implement read_rows (#762)

* feat: implement mutate rows (#769)

* feat: literal value filter (#767)

* feat: row_exists and read_row (#778)

* feat: read_modify_write and check_and_mutate_row (#780)

* feat: sharded read rows (#766)

* feat: ping and warm with metadata (#810)

* feat: mutate rows batching (#770)

* chore: restructure module paths (#816)

* feat: improve timeout structure (#819)

* fix: api errors apply to all bulk mutations

* chore: reduce public api surface (#820)

* feat: improve error group tracebacks on < py11 (#825)

* feat: optimize read_rows (#852)

* chore: add user agent suffix (#842)

* feat: optimize retries (#854)

* feat: add test proxy (#836)

* chore(tests): add conformance tests to CI for v3 (#870)

* chore(tests): turn off fast fail for conformance tets (#882)

* feat: add TABLE_DEFAULTS enum for table method arguments (#880)

* fix: pass None for retry in gapic calls (#881)

* feat: replace internal dictionaries with protos in gapic calls (#875)

* chore: optimize gapic calls (#863)

* feat: expose retryable error codes to users (#879)

* chore: update api_core submodule (#897)

* chore: merge main into experimental_v3 (#900)

* chore: pin conformance tests to v0.0.2 (#903)

* fix: bulk mutation eventual success (#909)

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigtable Issues related to the googleapis/python-bigtable API. size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants