Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compatibility with golang/protobuf #386

Open
jellevandenhooff opened this issue Feb 20, 2018 · 14 comments
Open

compatibility with golang/protobuf #386

jellevandenhooff opened this issue Feb 20, 2018 · 14 comments

Comments

@jellevandenhooff
Copy link

jellevandenhooff commented Feb 20, 2018

I was recently bitten by the incompatibility between golang/protobuf and gogo/protobuf when unmarshaling oneof types (see eg #238).

I worry that the suggested solution of not mixing golang/protobuf and gogo/protobuf is going to be really difficult. For example, the gRPC codebase has a bunch of references to golang/protobuf, and I don't want to be in the business of maintaining a gRPC fork.

For the oneof issue, the problem seems to be that XXX_OneofFuncs interface is subtly different between golang/protobuf and gogo/protobuf because of the proto.Message type not being the same.

It would be great to have the two packages support living in the same binary. The goproto_registration flag is a helpful step in that direction. Maybe that is too difficult, in which case it would be great if the known incompatibilities were listed somewhere.

@johanbrandhorst
Copy link
Member

I've listed the known complications of using gogo with various parts of the gRPC ecosystem in my blog: https://jbrandhorst.com/post/gogoproto/. Hopefully this should answers any questions you have.

@awalterschulze
Copy link
Member

@jellevandenhooff thanks for opening this issue.
I am also quite disappointed in the current situation.

Please let us know if @johanbrandhorst post helps and if you still have questions.
Also if you can think of more issues that would also be helpful in maybe coming up with a solution.

@jellevandenhooff
Copy link
Author

Thanks for the replies, @johanbrandhorst and @awalterschulze.

The blog post is helpful. It doesn't mention the oneof issue from #238 I ran into, which I think is because it only causes an error when using gogoproto without any flags (ie without generating the Marshal and Unmarshal methods). I was trying to migrate our codebase step-by-step and seeing nothing break in the process with first gogoproto, then gogoslick, and then perhaps customizing further, but here the intermediate step is actually the troublesome one.

I wish I had a good mental model for the interaction between golang and gogo protobuf. Here's what I think is going on, and it would be great to hear if this fits your view of the world:

  • Both golang and gogo protobuf maintain a registry of proto messages that generated code registers itself with. This registry isn't critical when marshaling and unmarshaling, but is used by eg. gRPC gateway introspection.
  • The golang protobuf library cannot safely marshal or unmarshal gogo protobuf generated types because the gogo protobuf generated types might use extensions that golang protobuf does not know about.
  • The golang protobuf library can safely marshal gogo protobuf types that implement the Marshaler and Unmarshaler interface. It will not try to access gogo descriptors. There are no real guarantees here, but it works today.

While the golang protobuf cannot handle gogo generated structs, I don't know if the reverse case is also broken. It seems to me that a sufficiently up-to-date gogo protobuf runtime should be able to handle all golang protobuf generated types. Is that right? If so, the gogo/googleapis package shouldn't be necessary for marshal and unmarshal correctness, but maybe it's necessary to get a Size() handler on all methods. Does that sound right?

The reason I wonder the above is because it would guarantee correct marshaling and unmarshaling by always using the gogoproto runtime, and that would be comforting. Does that seem like it would work? If not, what is missing from the gogoproto runtime?

To prevent the issue from #238, it would be great if the runtime libraries could refuse to deal with protos they did not understand. Could the gogo generator add some stuff to its struct tags that would make the golang protobuf reflect code error out? Another idea might be to ensure compatibility by always generating Marshal and Unmarshal methods, even if they are only thin shims around gogoproto.Marshal and gogoproto.Unmarshal for users that don't want large generated binaries.

Another idea might be to get the golang protobuf library to somehow support custom field types so that the gogo protobuf library could register its custom types with the golang protobuf library and use that as the canonical library. But that would require a bunch of upstream cooperation.

@jellevandenhooff
Copy link
Author

jellevandenhooff commented Feb 21, 2018

Going a bit on a limb here, but after reading golang/protobuf#268 and golang/protobuf#280, I wish protobuf dropped all struct tags and instead generated structs implementing

type Message interface {
   Marshal() ([]byte, error)
   Unmarshal([]byte) error
   Size() int
}

maybe with optional support for Text() and CompactText(), and linked to a simple global registry. If that code used a shared runtime library behind the scenes, great, but individual applications consuming protobuf messages would not have to know about the runtime library.

That change seems forward-compatible enough that it just might be acceptable to the upstream golang protobuf library...

@johanbrandhorst
Copy link
Member

@jellevandenhooff thanks for that, really well summarized, and I wasn't aware of the oneof issue so I'll add that to my list-of-things-to-add-to-the-post. I'm currently working on a demo repo where I can explore all these use cases and workarounds.

@johanbrandhorst
Copy link
Member

johanbrandhorst commented Feb 21, 2018

How do you use gogoproto without generating marshal and unmarshalling methods? Does that mean golang/protobuf attempts to unmarshal the types? As far as I could tell when I looked at the gRPC proto codec, the only thing they used was the proto.Message and proto.Marshaler.

@dsnet
Copy link

dsnet commented Feb 22, 2018

Most of the incompatibility between golang/protobuf and gogo/protobuf can be traced to two issues:

  • proto: make the Message interface behaviorally complete golang/protobuf#364 A fundamental issue with the proto.Message interface, where a "message" is classified by the type system depending on the presence of a ProtoMessage method, but the behavior is determined by being a struct with a special layout of fields and struct tags. This goes against the Go type system and results in the mess described above. In order to fix this, proto.Message must be a behaviorally complete interface such that all basic functionality can be implemented in terms of its methods alone.
  • proto: remove or improve registration golang/protobuf#268 Registration is a form of global state, and is inconsistently stored between the golang/protobuf/proto and gogo/protobuf/proto packages. I have not done sufficient investigation into what to do with registration to reduce incompatibilities, but I am reasonably convinced that at least all forms of registration except type and extension registration can be eliminated.

It may seem like adding the Marshal, Unmarshal, and Size methods would be sufficient, but that is still not sufficient to describe the behavior of a message as it still can't be used to implement functionality like Clone or HasExtension. A protobuf message is functionally a set of fields (with various properties), and any complete API must reflect that nature in its interface. Thus, a proper proto.Message interface must provide the functionality to get and set various fields similar to the C++ protobuf reflection API.

It might seem like a Go protobuf reflection API would be very slow, and if everything was implemented according to it, and you would not be wrong. That is where special case functionality like the Marshal and Unmarshal come into place. You see, Marshal and Unmarshal are not the lowest-level operations on a Message, they just happen to be the most common operations that most people care about.

I have been spending a lot of time working on golang/protobuf#364, and it is a difficult endeavor coming up with an API that properly represents the behavior described in the protobuf specification. I believe resolving golang/protobuf#364 is the path forward to reconciling the interoperation issues between golang/protobuf and gogo/protobuf.

Yes, we could add special-case logic to golang/protobuf/proto to accept protoc-gen-gogo generated messages, and gogo/protobuf/proto could add special-case logic to support protoc-gen-go generated messages. However, this isn't going to scale adding an endless sequence of special cases. Let's solve this once and for all the correct way.

In fact, when the proto.Message interface is behaviorally complete, I can foresee the gogo/protobuf/proto package entirely going away (only removing the proto package; there is value in the continued existence of gogo/protobuf/protoc-gen-*). This will be possible because the behaviorally complete Message interface will make the proto package completely agnostic to what the underlying implementation is.

@jellevandenhooff
Copy link
Author

Hi @dsnet, thanks for pointer to golang/protobuf#364. That is exactly what I was thinking of, and it does seem like the one solution that would allow a universe of many different proto.Message implementations.

However, we might not need a universal solution. I think for the vast majority of protobuf users in Go, they either use golang protobuf or gogo protobuf. It used to be pretty easy to just use gogo in all your packages, but with grpc becoming more prevalent, the golang protobuf library becomes hard to avoid in large Go projects. So figuring out how gogo and golang protobuf can co-exist is what my goal is. I would love for the runtimes to merge, but I worry that will take quite a while.

If the gogo protobuf library can handle all code generated by golang protobuf, and the gogo protobuf generated code ensures that it will not be mishandled by the golang protobuf runtime, then for all intents and purposes the libraries can safely coexist. Until that is the case, I intend to use gogo protobuf's runtime library in my codebase, ban imports of golang protobuf except for the call sites in gRPC that I've audited and look safe, and then things my fine.

@johanbrandhorst
Copy link
Member

@dsnet I'm incredibly excited to see how you proceed with this, and it fills me with confidence that you are tackling this issue. Thank you!

@awalterschulze
Copy link
Member

@jellevandenhooff

The generated Marshalers and Unmarshalers of gogoprotobuf assume that the messages that are fields also have generated Marshalers and Unmarshalers.
That is the reason we need gogo/googleapis.

If you don't generate any methods and simply use gogo/protobuf/proto.Marshal to marshal a message generated with golang/protobuf then it should work.
golang/protobuf/proto.Marshal and gogo/protobuf/proto.Marshal will both work with messages that have a generated Marshaler.

For unmarshaling gogo/protobuf.proto.Unmarshal should work a message generatd with golang/protobuf, except in cases where you use extensions or Any and require the registerd protobufs to resolve an message based on field number or name.
golang/protobuf/proto.Unmarshal and gogo/protobuf/proto.Unmarshal will both work with messages that have a generated Unmarshaler.
But I the extensions and Any messages will still need to registered with the right proto library.

So its a bit more complex than one would hope, sorry :(

Luckily you can see @dsnet chiming in on a gogoprotobuf issue, so things upstream are looking up :)

And yes I also wish golang/protobuf generated the Marshal, Unmarshal and Size methods.

@awalterschulze
Copy link
Member

@johanbrandhorst I think @jellevandenhooff is using gogo_out , but without any code generation, marshalers and unmarshalers, etc. at least just to start off with.
Then somewhere (because gRPC) golang/protobuf/proto is used to marshal/unmarshal the oneof and then the problem occurs.
@jellevandenhooff my quick fix is to just generate the marshalers and unmarshalers or am I misunderstanding your problem?
@jellevandenhooff Would it be possible to create a small repo that reproduces your problem?

@awalterschulze
Copy link
Member

@dsnet golang/protobuf#364 seems like a really cool idea.
Marshal, Unmarshal, Size, Clone, Equal methods are really simple to generate.
For golang/protobuf#268 Unmarshaling extensions and Any can be done by injecting possible extension, Any messages, instead of resolving to a Register pattern that has package name conflicts anyway.
I hope this is also a way to optionally remove any dependency on any proto library.
But yes if I can't have zero proto libraries I will gladly settle for one :)

On a side note and I want to stress that I don't want to be destructive here, but I struggle not to mention it.
But technically if everyone imported gogo/protobuf/proto instead of golang/protobuf/proto we would only need one proto package.
Because gogo/protobuf/proto can handle all golang/protobuf/proto messages and more, if they are registered: bad registry, bad dog ;)
So the easiest / quickest way (for me at least) to resolve this would be to merge the gogo/protobuf/proto diff into golang/protobuf/proto.
But don't get me wrong, I am open to exploring other ways as well.

@jellevandenhooff
Copy link
Author

To follow-up: Yes, I started out by using gogo_out, which made things more complicated than by using gogofast_out. Little did I know when I embarked on this adventure :)

The solution for our codebase was as @awalterschulze mentioned to switch to gogofast, and to set up a linter to prevent any imports of golang/protobuf/proto (with an exception for vendored gRPC). That's what I would recommend to anyone else struggling with this issue.

I found in our codebase actually a mix of gogo/protobuf/proto and golang/protobuf/proto imports thanks to goimports and some downstream dependencies that made gogo/protobuf available. That was a rather frightening experience.

Here's some tests that break if you mix and match different proto libraries for either code generation, marshaling, or unmarshaling:

message TestRequest {
  int64 sleep_ms = 1;
}

message OptionA {
  int64 value = 1;
}

message OptionB {
  string value = 1;
}

message OneOfHolder {
  oneof oneof_holder {
    OptionA option_a = 1;
    OptionB option_b = 2;
  }
}
func TestStableFormat(t *testing.T) {
	testCases := []struct {
		title   string
		message proto.Message
	}{
		{
			title: "empty",
			message: &testproto.TestRequest{
				SleepMs: 0,
			},
		},
		{
			title: "non-empty",
			message: &testproto.TestRequest{
				SleepMs: 10,
			},
		},
		{
			title: "oneof a non-empty", message: &testproto.OneOfHolder{
				OneofHolder: &testproto.OneOfHolder_OptionA{
					OptionA: &testproto.OptionA{
						Value: 10,
					},
				},
			},
		},
		{
			title: "oneof a empty", message: &testproto.OneOfHolder{
				OneofHolder: &testproto.OneOfHolder_OptionA{
					OptionA: &testproto.OptionA{},
				},
			},
		},
		{
			title: "oneof b non-empty", message: &testproto.OneOfHolder{
				OneofHolder: &testproto.OneOfHolder_OptionB{
					OptionB: &testproto.OptionB{
						Value: "foo",
					},
				},
			},
		},
		{
			title: "oneof b empty", message: &testproto.OneOfHolder{
				OneofHolder: &testproto.OneOfHolder_OptionB{
					OptionB: &testproto.OptionB{},
				},
			},
		},
	}

	for _, testCase := range testCases {
		t.Run(testCase.title, func(t *testing.T) {
			snapshotter := testhelpers.NewSnapshotter(t)
			defer snapshotter.Verify()

			// Check that marshaling works.
			bytes, err := proto.Marshal(testCase.message)
			assert.NoError(t, err)

			// Ensure marshaled representation does not change.
			snapshotter.Snapshot("golden proto", testCase.message, bytes)

			// Check that unmarshal returns the same result.
			copy := reflect.New(reflect.TypeOf(testCase.message).Elem()).Interface().(proto.Message)
			err = proto.Unmarshal(bytes, copy)
			assert.NoError(t, err)
			assert.Equal(t, testCase.message, copy)
		})
	}
}

The oneof message doesn't get unmarshaled if you use the wrong decoder and, very surprising to me, the empty message gets the default values marshaled if you use the wrong encoder.

@awalterschulze
Copy link
Member

I am relieved to hear that the quick fix worked :D
If the Marshalers and Unmarshalers are generated and you don't use Any or Extensions, then its not so serious which proto package you are importing.

But a linter sounds like a good idea.
Back in the day, when I was lucky enough to use grpc, I used to do a sed inside grpc to only import gogo/protobuf/proto and totally removed golang/protobuf/proto from my repository.
Then you don't even need a linter :), but you are technically maintaining a fork :(

Could you provide some extra context in this test, what is testhelpers and Verify and Snapshot?
Which proto package is being used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants