Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert GrpcSerializer to use buildbuf_es instead of timostamm-protobuf-ts #1876

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

hotpocket
Copy link
Contributor

@hotpocket hotpocket commented Dec 20, 2024

As I understand it buildbuf_es has the ability to use transports other than grpc. Looking forward to allowing this to run in the web we need a transport that can be used there, namely a connect transport. Every attempt has been made to preserve backwards compatibility but there are some unavoidable differences:

  • The transport needs to be constructed and needs a cert & key to communicate with a tls server. To this end the kernelServer.ts was modified as well to expose this data with the goal of avoiding code duplication.
  • A force cast was needed in saveCellExecution.ts, presumably due to graphql.ts, with the suspected data types it contains coming from the timostamm proto bindings. This file exists in a __generated-platform__ directory but I was unable to find how this file was generated and investigate further.

Let me know what you think.

Thanks!

@jlewi
Copy link
Contributor

jlewi commented Dec 21, 2024

The transport needs to be constructed and needs a cert & key to communicate with a tls server. To this end the kernelServer.ts was modified as well to expose this data with the goal of avoiding code duplication.

I think this is temporary because right now the Executor is still using timostamm-ts; so we need to construct two different transports 1) one using timostamm-ts and 2) one using es libraries.

This should go away as soon as we migrate the Executor to use the es libraries and then we can go back to the pattern of having the Server just return a transport.

@hotpocket
Copy link
Contributor Author

it seems these tests are failing for the same reason that my prior PR failed for. Is this your assessment as well? I ran the tests locally and got all green lights.

@pastuxso
Copy link
Contributor

Hi @hotpocket, the issue is caused by the required permissions needed to run the E2E tests for the GitHub integration.

I’ve opened this PR to skip those tests specifically for forked repositories.

Let's see what @sourishkrout thinks about this solution.

@sourishkrout
Copy link
Member

it seems these tests are failing for the same reason that my prior PR failed for. Is this your assessment as well? I ran the tests locally and got all green lights.

This is a false alarm. This one 2e2 test fails because branches outside the Stateful org won't have access to a secret required to make this tests pass. Safe to ignore.

@sourishkrout
Copy link
Member

Let's see what @sourishkrout thinks about this solution.

Merged into main which will definitely help avoid false positives. Thank you!

Copy link
Member

@sourishkrout sourishkrout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my question, please.

@@ -474,36 +478,35 @@ export class WasmSerializer extends SerializerBase {
}

export class GrpcSerializer extends SerializerBase {
Copy link
Member

@sourishkrout sourishkrout Dec 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hotpocket The existing abstraction that elect to run the serializer via GRPC or WASM (legacy; should be removed). Ideally we'd have a BufEsSerializer or something like that which makes it possible runing serializers using a feature flag.

I'm somewhat uncomfortable making a hard cutover because it requires front-loading of extensive testing which is very difficult to coordinate.

Copy link
Contributor Author

@hotpocket hotpocket Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sourishkrout This was done successfully here as ConnectSerializer but was reverted due to the amount of duplicated code. The serialization logic is very similar, but the structure of the types as they manifest with each proto binding required changes inside many serializer functions (off the top of my head: the marshal methods, save, revive, cache, ...). I made an intermediary base class (GrpcSerializerBase), but there was still significant code that lived in the concrete implementations. The initial decision was to duplicate code, hence this implementation, but it was deemed to be too much duplication, and a potential maintenance issue.

The timostamm bindings seem to be assumed throughout the code base, so using ES has been challenging here, and I expect will be elsewhere as we walk the dependencies. I have not been able to find a one size fits all approach so any suggestions are greatly welcomed.

The above commit link was work in progress leading us to where we are with this PR. If you'd like me to clean it up a bit let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional context.

I think in order to eventually support the Connect Protocol we need to support both the connect protocol and the gRPC protocol for some period of time. There are two reasons for this

  1. The Server currently only supports the gRPC protocol; so the connect protocol won't be functional until the server supports the connect protocol
  2. Doing a hard cutover to connect protocol in both the server and extension is going to be more difficult to coordinate
  3. We probably want to bake the connect protocol before dropping support for the gRPC protocol.

The original thinking was to support gRPC and Connect simultaneously we need to have two different Serializer classes using the timostamm and ES libraries respectively.

However, we realized that since the connect libraries support both the gRPC protocol and Connect Protocol we can support both protocols with a single class but letting that class use either a gRPC transport or a connect transport.

@sourishkrout regarding

I'm somewhat uncomfortable making a hard cutover because it requires front-loading of extensive testing

Are there particular risks you are worried about?

Since this PR is changing

  1. The Generated code used to represent the protos
  2. The library used to serialize the protos and gRPC

I can think of two main failure modes

  1. A bug in the conversion logic between VSCode data structures and the new proto classes
  2. A bug in the connect implementation of the proto & gRPC spec

For #1 the code diff is quite small. So between the small code diff and unittests, how big do you think the risk?

Are you worried about #2? I would have expected the connect libraries to be well tested and vetted since its kind of the whole point of the underlying company.

Is there some other risk you forsee? Does the executor have some dependency on the serializer that isn't being surfaced in the diff?

That said if you want to preserve the existing code path and let the new one bake for it. I think that's pretty straightforward. It should be fairly simple just to restore the existing GRPCSerializer class and introduce an option to select the serializer.

As @hotpocket mentioned the complexity with two classes arose because of an effort to avoid code duplication. I don't think there's any reason to avoid code duplication if we want to preserve the GRPCSerializer. The impetus for minimizing code duplication was that both classes would need to exist for a long time because we won't be able to remove GRPC support anytime soon. However, we only need to keep the GRPCSerializer around long enough to validate that the Connect class is working as intended with gRPC. If that's a reasonable short time (1 week? 2 weeks?) then the burden of supporting duplicate code paths is pretty low.

So @sourishkrout if we preserve GRPCSerializer, how much bake time would you want before you are comfortable removing it. If you and me use the ConnectSerializer for 1 week is that sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sourishkrout @jlewi I think this approach makes good sense. The changes are small enough to isolate issues should they arise and it avoids maintaining duplicate code paths.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @sourishkrout if we preserve GRPCSerializer, how much bake time would you want before you are comfortable removing it. If you and me use the ConnectSerializer for 1 week is that sufficient?

Looking at it again now. I'll get back shortly.

Copy link
Member

@sourishkrout sourishkrout Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For #1 the code diff is quite small. So between the small code diff and unittests, how big do you think the risk?

Are you worried about #2? I would have expected the connect libraries to be well tested and vetted since its kind of the whole point of the underlying company.

Not worried about 2.) at all. Relying on the same assumptions as you stated.

1.) is on my mind because Typescript-generated types aren't all the same due to their structural typing nature. It's more like the integration surface and how unit tests likely won't guard against small, impactful changes.

Here's my proposal. We address the following open issues:

  1. Cherry-pick my review PR into this one.
  2. Move the Connect transport into the KernelServer (change interface accordingly).
    (3. Sebastian will test integration with Cloud Product.)

Once ready to merge, we squashed this PR when it's not blocking any other pending PRs (e.g. early next week). We put out a pre-release and break it in for ~1wk. If it goes well, we can consider a stable release. If not, we can decide if it's fixable or revert accordingly and reconsider the path forward.

Wdyt @jlewi @hotpocket?

@@ -512,8 +515,45 @@ export class GrpcSerializer extends SerializerBase {
)
}

private async initParserClient(transport?: GrpcTransport) {
this.client = initParserClient(transport ?? (await this.server.transport()))
private async initClient(): Promise<void> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd likely be cleaner/easier to have the KernelServer supply transports for both GRPC and ES.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this appears to be a timostamm transport used by a timostamm HealthClient. Should I seek out an analogue in the ES space? How would you suggest making this modification to avoid the smallest cascade of changes due to the dependency graph?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hotpocket I think what @sourishkrout was suggesting is that right now the Kernel has a method Kernel.transport which returns a GRPCTransport for the timostamm library. I think he's suggesting using the same pattern to create the transport for the ES library rather than creating the transport inside your serializer.ts

e.g. add a method kenel.connectTransport() which returns the transport. Then all the code for initializing the transport can move out of the serializer into the kernel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the suggestion is to add a function to kernelServer to construct and return an es transport (likely identical to what was added in serializer in this PR) but not add any of the monitoring or integration that it does for the timostamm transport. If we cut over to es in the future then the monitoring and integration can be address then IIUC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @hotpocket, essentially, the idea is to modify the IServer to have both this.server.tcpTransport() and this.server.connectTransport(). We don't need the health check to use Connect.

@jlewi
Copy link
Contributor

jlewi commented Jan 7, 2025

@hotpocket @sourishkrout WDYT?

@sourishkrout
Copy link
Member

@hotpocket @jlewi did some simple refactoring to get closer to merging. Can we please cherry-pick this commit into this branch? I can't use @hotpocket's branch as a base without doing some forking limbo.

@hotpocket
Copy link
Contributor Author

I have merged your cherry-picked code. I also moved the transport method to kernelServer and followed the patterns found there for the pre-existing transport, minus the health status checks.

Some new code has caused a coverage failure at 78.93% of 79%. If you'd like me to write some tests let me know. I looked at the kernelServer.test.ts to see if I should add tests for the new connect transport method but didn't see any prior transport tests to model it on. Let me know how you want to proceed WRT testing coverage and anything else you find.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants