Implement Bigtable v2 API #1850

tseaver · 2016-06-10T16:35:32Z

@dhermes if you can do a braindump here of what you've learned, and then assign to me, that would be great.

dhermes · 2016-06-10T16:48:36Z

Only the data API is in v2, the table and cluster admins are still v1
The majority of the changes are in ReadRowsResponse.
google/bigtable/v2/data.proto is essentially equivalent to google/bigtable/v1/bigtable_data.proto
google/bigtable/v2/bigtable.proto is essentially equivalent to google/bigtable/v1/bigtable_service.proto and google/bigtable/v1/bigtable_service_messages.proto combined
Some of the inclusive / exclusive fields have changed, but that's somewhat easy to track
There is a new method in the v1 Table Admin API that we should implement

garye · 2016-06-21T22:29:46Z

Some additional information:

Every API is in v2, not just data. See https://github.com/GoogleCloudPlatform/cloud-bigtable-client/tree/master/bigtable-protos/src/main/proto/google/bigtable/admin/v2.
ReadRowsResponse has changed quite a bit. Let's talk offline about the implementation and testing.
You now connect with project/instance rather than project/zone/cluster. This change is pervasive.
All column family operations are consolidated into ModifyColumnFamilies.
GC rules can no longer be expressed as strings/expressions
MutateRows returns a streaming response
ReadRowsRequest allows repeating ranges/sets

sduskis · 2016-06-21T22:32:49Z

I'd guess that MutateRows, bulk delete, and "bulk read" (a set of random row keys) are not in the current python implementation, since they were not there from day one and we added them along the way. They are useful, but not required functionality.

sduskis · 2016-06-21T22:36:47Z

We have a java implementation which processes v2 ReadRowsResponse objects. Hopefully, it can help give some insight into how to use the new API. There's also a json file we use to drive automated tests; we can explain more about it offline.

garye · 2016-06-21T22:39:37Z

If we implement happybase's Table.batch then MutateRows would be helpful to leverage, but I agree with Solomon that it's not required.

sduskis · 2016-06-21T22:44:03Z

Yeah, the new MutateRows is pretty awesome. It has a significant performance improvement over MutateRow.

Anything related to v2 comes first due to time sensitivity. We have a few performance improvement (MutateRows, multiple grpc channels) and reliability (retries) improvements that aren't as time sensitive.

garye · 2016-06-21T22:51:39Z

Here's a link to the go read rows implementation.
https://gist.github.com/garye/4b2ce32f8b6245e1024c7ce0a37b681d

tseaver · 2016-06-23T18:43:12Z

I've set up a feature branch for this project: I intend to merge PRs to that branch, and then merge that branch to master.

tseaver · 2016-06-23T19:09:49Z

Reviewing open gcloud-python issues related to Bigtable:

Table.rename returns a "not implemented" error (Review Bigtable Table.rename() "not implemented" status when Bigtable API is updated #1289). Is that method actually implemented in V2?
Generating code from the new .proto files is a pain (Document / automate(?) building generated files from protobuf specs #1482). Will there be a project published to PyPI containing that code (Start project to publish pb2 generated modules for services we depend on. #1384). E.g., see grpc-google-logging-v2.

sduskis · 2016-06-23T19:14:41Z

RenameTable was removed in v2. While it existed in v1, it was never actually implemented. Good catch.

lesv · 2016-06-23T21:29:05Z

The protos should be at #1895

garye · 2016-06-23T21:33:30Z

#1895 is bogus because it's targeted at master and breaks everything anyways, but at least look at the Makefile change. I fixed the makefile to find grpc_python_plugin wherever it is in your path but there's also a python script that runs and assumes a location, so I ended up copying grpc_python_plugin to the gcloud-python dir before running make generate.

tseaver · 2016-06-23T21:38:19Z

Setup:

Package proto-generated Python code for releaes to PyPI (see Start project to publish pb2 generated modules for services we depend on. #1384). If not feasible, regenerate using the Makefile and check in (see @garye's work on PR Generated protos for bigtable v2. #1895). Generated in Generate code for Bigtable v2 protobufs #1903.
Alias V1 imports / factories / endpoints to mark V1 source. Alias Bigtable V1 imports / factories / entry point constants. #1912.

Port modules to V2 protos:

System testing

Update system tests to fit V2 patterns. Bigtable system tests passing against V2 endpoints #1930
Verify that Happybase system tests pass. Convert gcloud.bigtable.happybase to V2 patterns #1931

FInalize

Merge bigtable-v2 branch to master. Land Bigtable v2 #1932
Make a 0.17.0 release from master. Prepare 0.17.0 release. #1940

Open issues:

Regenerating correct code from V2 protos. Regenerate Bigtable V2 protos #1918, and also grpc_python_plugin generates an empty import line grpc/grpc#7101
~~Instance creation requires explicit cluster configuration. V2 instance creation requires explicit cluster information #1929~~

lesv · 2016-06-23T22:04:32Z

@tswast FYI

tswast · 2016-06-23T23:31:10Z

Question for the eng team CCed on this thread. I see there are some more streaming APIs. Would it be possible to implement these APIs in the client as iterators (and stream the underlying results)?

#1812

Right now for PartialRowsData, the client accumulates the results in a semi-hidden list.

tseaver · 2016-06-24T22:14:26Z

@sduskis

There's also a json file we use to drive automated tests; we can explain more about it offline.

I'm trying to parse the chunks entries in that file via ReadRowsResponse.CellChunk.FromString, but that raises an exception: am I holding it wrong?

garye · 2016-06-24T22:16:12Z

Sounds like it should work - what was the exception?

On Fri, Jun 24, 2016, 6:14 PM Tres Seaver [email protected] wrote:

@sduskis https://github.com/sduskis

There's also a json file we use to drive automated tests; we can explain
more about it offline.

I'm trying to parse the chunks entries in that file via
ReadRowsResponse.CellChunk.FromString, but that raises an exception: am I
holding it wrong?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1850 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AACTr5Fyy2LtFgy9gl2KbQnhynLIukRxks5qPFbJgaJpZM4IzHp2
.

tseaver · 2016-06-24T22:21:43Z

This session is based on the new generation-from-protos stuff in PR #1903.

>>> import json
>>> from gcloud.bigtable._generated_v2.bigtable_pb2 import ReadRowsResponse
>>> with open('gcloud/bigtable/read-rows-acceptance-test.json') as f:
...     test_json = json.load(f)
... 
>>> chunk_pb = test_json['tests'][0]['chunks'][0]
>>> ReadRowsResponse.CellChunk.FromString(chunk_pb)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 780, in FromString
    message.MergeFromString(s)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 1080, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 1106, in InternalParse
    new_pos = local_SkipField(buffer, new_pos, end, tag_bytes)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 850, in SkipField
    return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 799, in _SkipGroup
    new_pos = SkipField(buffer, pos, end, tag_bytes)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 850, in SkipField
    return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
  File "/home/tseaver/projects/agendaless/Google/src/gcloud-python/.tox/py27/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 820, in _RaiseInvalidWireType
    raise _DecodeError('Tag had invalid wire type.')
google.protobuf.message.DecodeError: Tag had invalid wire type.

tseaver · 2016-06-24T22:47:31Z

FWIW chunk_pb looks like a normal protobuf to me:

>>> chunk_pb
u'row_key: "RK"\nfamily_name: <\n  value: "A"\n>\nqualifier: <\n  value: "C"\n>\ntimestamp_micros: 100\nvalue: "value-VAL"\ncommit_row: false\n'

garye · 2016-06-24T23:16:17Z

I don't know the python protobuf APIs, but it looks like there are other
things like MergeFromString and
https://developers.google.com/protocol-buffers/docs/reference/python/google.protobuf.text_format-module
that
might be an alternative?

On Fri, Jun 24, 2016 at 6:47 PM Tres Seaver [email protected]
wrote:

FWIW chunk_pb looks like a normal protobuf to me:

chunk_pbu'row_key: "RK"\nfamily_name: <\n value: "A"\n>\nqualifier: <\n value: "C"\n>\ntimestamp_micros: 100\nvalue: "value-VAL"\ncommit_row: false\n'

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1850 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AACTrxVOI57u76RL9I8H0HuvO76ETezmks5qPF6JgaJpZM4IzHp2
.

tseaver · 2016-06-24T23:42:15Z

I missed that those chunks are rendered using text_format. The following works to parse one:

>>> from google.protobuf.text_format import Merge
>>> Merge(chunk_pb, chunk)

garye · 2016-06-25T00:09:45Z

Ah, sorry we should have mentioned. Glad it's working.

On Fri, Jun 24, 2016, 7:42 PM Tres Seaver [email protected] wrote:

I missed that those chunks are rendered using text_format. The following
works to parse one:

from google.protobuf.text_format import Merge>>> Merge(chunk-pb, chunk)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1850 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AACTr_aJj9nYMKbaxSozdpfXCoqTXtZnks5qPGtdgaJpZM4IzHp2
.

tseaver · 2016-06-27T20:50:13Z

@garye Am I meant to neglect the google.bigtable.v2.instance.Instance.cluster and related messages?

garye · 2016-06-27T20:55:24Z

@tseaver You mean for cluster administration?

tseaver · 2016-06-27T20:56:02Z

Yes -- am I supposed to be exposing the methods to manipulate clusters within an instance?

sduskis · 2016-06-27T20:58:02Z

We don't do that in go or Java. I would think Happybase doesn't have a way to manage clusters in v1 or instances in v2, since HBase doesn't have any similar concepts. If there isn't currently a way to do cluster management, we should not add new functionality.

tseaver · 2016-06-27T21:00:00Z

@sduskis I was just asking because gcloud.python exposed cluster managment APIs for V1: if they aren't relevant for V2, cool (they are defined right next to the V2 instance management APIs, which is which is what made me ask).

sduskis · 2016-06-27T21:08:00Z

Thanks for clarifying. I'd guess we need to expose operations that are similar to the way the GCP console operates. There is a ClusterService for some operations like resizing a cluster or changing the display name. I don't know enough about that. I'll ping the group to find out more about this.

The bottom line is that we need parity with the existing functionality in gcloud.python through the new APIs.

…atform/python-docs-samples#1850)

tseaver added the api: bigtable Issues related to the Bigtable API. label Jun 10, 2016

tseaver assigned dhermes Jun 10, 2016

tseaver mentioned this issue Jun 29, 2016

Land Bigtable v2 #1932

Merged

tseaver closed this as completed Jul 1, 2016

theacodes unassigned dhermes Sep 28, 2018

JustinBeckwith assigned tseaver Feb 1, 2021

parthea pushed a commit that referenced this issue Sep 22, 2023

Lib version update for Video Intelligence API [(#1850)](GoogleCloudPl…

7a5311a

…atform/python-docs-samples#1850)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement Bigtable v2 API #1850

Implement Bigtable v2 API #1850

tseaver commented Jun 10, 2016

dhermes commented Jun 10, 2016 •

edited

Loading

garye commented Jun 21, 2016

sduskis commented Jun 21, 2016

sduskis commented Jun 21, 2016 •

edited

Loading

garye commented Jun 21, 2016

sduskis commented Jun 21, 2016

garye commented Jun 21, 2016

tseaver commented Jun 23, 2016

tseaver commented Jun 23, 2016 •

edited by mbrukman

Loading

sduskis commented Jun 23, 2016

lesv commented Jun 23, 2016

garye commented Jun 23, 2016

tseaver commented Jun 23, 2016 •

edited

Loading

lesv commented Jun 23, 2016

tswast commented Jun 23, 2016

tseaver commented Jun 24, 2016

garye commented Jun 24, 2016

tseaver commented Jun 24, 2016

tseaver commented Jun 24, 2016

garye commented Jun 24, 2016

tseaver commented Jun 24, 2016 •

edited

Loading

garye commented Jun 25, 2016

tseaver commented Jun 27, 2016

garye commented Jun 27, 2016

tseaver commented Jun 27, 2016

sduskis commented Jun 27, 2016

tseaver commented Jun 27, 2016

sduskis commented Jun 27, 2016

Implement Bigtable v2 API #1850

Implement Bigtable v2 API #1850

Comments

tseaver commented Jun 10, 2016

dhermes commented Jun 10, 2016 • edited Loading

garye commented Jun 21, 2016

sduskis commented Jun 21, 2016

sduskis commented Jun 21, 2016 • edited Loading

garye commented Jun 21, 2016

sduskis commented Jun 21, 2016

garye commented Jun 21, 2016

tseaver commented Jun 23, 2016

tseaver commented Jun 23, 2016 • edited by mbrukman Loading

sduskis commented Jun 23, 2016

lesv commented Jun 23, 2016

garye commented Jun 23, 2016

tseaver commented Jun 23, 2016 • edited Loading

lesv commented Jun 23, 2016

tswast commented Jun 23, 2016

tseaver commented Jun 24, 2016

garye commented Jun 24, 2016

tseaver commented Jun 24, 2016

tseaver commented Jun 24, 2016

garye commented Jun 24, 2016

tseaver commented Jun 24, 2016 • edited Loading

garye commented Jun 25, 2016

tseaver commented Jun 27, 2016

garye commented Jun 27, 2016

tseaver commented Jun 27, 2016

sduskis commented Jun 27, 2016

tseaver commented Jun 27, 2016

sduskis commented Jun 27, 2016

dhermes commented Jun 10, 2016 •

edited

Loading

sduskis commented Jun 21, 2016 •

edited

Loading

tseaver commented Jun 23, 2016 •

edited by mbrukman

Loading

tseaver commented Jun 23, 2016 •

edited

Loading

tseaver commented Jun 24, 2016 •

edited

Loading