Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unmarshal unknown extensions into XXX_unrecognized instead of into extenson map #386

Closed

Conversation

jhump
Copy link
Contributor

@jhump jhump commented Jun 27, 2017

This change would address #385

@jhump
Copy link
Contributor Author

jhump commented Jun 28, 2017

@matloob, you are the last person to make material changes in this stuff (I realize that was many months ago). Are you still working on protos? Could you take a look or perhaps tag whoever is the best person to look?

extmap[int32(tag)] = ext
if !regExtInit {
msgType := reflect.Zero(reflect.PtrTo(st)).Interface().(Message)
regExt = RegisteredExtensions(msgType)
Copy link
Contributor Author

@jhump jhump Jun 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's more clear to just do regExt := RegisteredExtensions(msgType) here and skip memoizing from one iteration of the loop to the next. But I'm leary of doing any more work than necessary in something like proto unmarshaling since that is likely to be a hot path in many servers. (Though, admittedly, extensions are likely used infrequently enough that this is not necessary, especially since the work we are saving is just a single map lookup for string key. Should I omit this? Thoughts, suggestions?)

@jhump
Copy link
Contributor Author

jhump commented Jun 30, 2017

Ping!

Do issues and pull requests go into a queue that someone will eventually look at? Or do they largely fall on deaf ears unless I tag the right person, and that person has time to take a look? (Based on previous attempts to contribute, it kind of feels like the latter.)

I know there might be low signal-to-noise ratio in the stuff that Github community sends Google's way. But if someone could at least triage and drop a brief comment -- like an ack or nack, not necessarily full review -- that would be super helpful. I'm hoping that this could bubble up onto someone's radar and be interpreted as signal, not noise.

@dsnet
Copy link
Member

dsnet commented Jun 30, 2017

My main concern with this change is backwards compatibility. There are users that clear out or use the XXX_unrecognized fields for various reasons (security, performance, etc). Some of them may have wanted the unknown extensions in there, while this would be breaking others. The API of message extensions in the Go proto implementation are not the most consistent, but plethora of existing codebases that depend on this makes this really difficult to change.

If I may ask, what is the major benefit to you that they are unmarshaled to XXX_unrecognized as opposed to some other mechnism? While XXX_unrecognized is exported, the XXX_ prefix was supposed to indicate that these are private fields.

Unless there is a good reason, I would rather keep the current behavior. Secondly, with the release of proto3, (hopefully) the use of proto2 and message extensions will diminish drastically.

@dsnet
Copy link
Member

dsnet commented Jun 30, 2017

But if someone could at least triage and drop a brief comment -- like an ack or nack, not necessarily full review -- that would be super helpful.

I apologize for the slow rate of response. I will be focusing on protobufs more after the Go1.9 release.

@jhump
Copy link
Contributor Author

jhump commented Jun 30, 2017

@dsnet: Thank you for the reply!

There are users that clear out or use the XXX_unrecognized fields for various reasons (security, performance, etc). Some of them may have wanted the unknown extensions in there, while this would be breaking others

I don't really follow this logic. I would expect users that care about unknown fields would be more interested in ones defined in the message itself vs. those defined as extensions.

I do understand it's a change in backwards compatibility. Another thought I had was to provide an accessor, like GetRawExtension() (analogous to existing GetExtension() but returns raw bytes and could be used for unknown extensions). What do you think about that?

If I may ask, what is the major benefit to you that they are unmarshaled to XXX_unrecognized as opposed to some other mechnism?

If you see the bug report (#385), the main thing I am trying to accomplish is that they be exposed somehow. This is kind of related to #364, if you consider handling of extensions and unrecognized fields as "behavior" of a protobuf message (like the APIs in Java and C++).

In particular, I have code to convert from a generated message to a dynamic message. I would like to be able to re-parse unrecognized fields during this step since the dynamic message's descriptor and registered extensions may be more exhaustive than what was linked into the binary actually doing the conversion. I could serialize the message to bytes and then de-serialize into the dynamic message, but that feels too inefficient. So I instead was hoping to only de-serialize the bytes for unrecognized fields, but trying to simplify awfulness in the current implementation (to work around the fact that there are no ways to access unrecognized extensions).

@dsnet
Copy link
Member

dsnet commented Jun 30, 2017

I don't really follow this logic. I would expect users that care about unknown fields would be more interested in ones defined in the message itself vs. those defined as extensions.

People use protos is all sorts of crazy ways that I'm no longer surprised by weird things people do with them. Subtle changes in behavior like this makes me very nervous.

That being said, I apologize for not seeing that you referenced #385, which seems a like legitimate need. Whether that does mean moving those unrecognized bytes to XXX_unrecognized or introducing new API surface will need to be thought through carefully. New API is certainly the least likely to break people, but it would be nice to not to have new API. Let me investigate more deeply the implications of unmarshalling into XXX_unrecognized.

@dsnet
Copy link
Member

dsnet commented Jan 11, 2018

I tested this change and too many targets broke since they were depending on extensions not being in XXX_unrecognized. I believe the proper fix for this is a behaviorally complete API for introspecting into all aspects of a proto message (#364). Access to unrecognized extensions would be part of that API and we can finally kill off the XXX_unrecognized field and friends.

I know it's been some time, but I've made good progress on an API for #364. I will probably publish a document in the near future.

@dsnet dsnet closed this Jan 11, 2018
@jhump
Copy link
Contributor Author

jhump commented Jan 11, 2018

@dsnet, thanks for the update. I eagerly await your document, describing a fully-featured message API!

@golang golang locked and limited conversation to collaborators Jun 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants