-
Notifications
You must be signed in to change notification settings - Fork 15.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing value/null support for scalar value types in proto 3 #1606
Comments
While proto3 mentions using proto2 Message Types
It might not be good application design to expect the .proto files and binary format to stay the same after switching to proto3. Even though it appears that using the |
@lostindark, workaround (2) using an oneof is a wire compatible change. |
The issue is it doesn't produce nice code, and it needs box/unbox when use (hurt performance). |
@lostindark what if we optimize that case (one primitive field in an oneof) to eliminate the box/unbox cost? |
@xfxyjwf That seems a good solution with minimum change required. However, I do wish I don't need to name the oneof, like below: |
I am thinking of a dedicated syntax for that. Something like: message Foo {
nullable int32 value = 1;
} which is basicaly a syntax sugar for: message Foo {
oneof value_oneof {
int32 value = 1;
}
} I'll bring this up with the team and see how others think about this. |
message Foo gets error on v2: "value" is already defined in "Foo". |
Oneof fields are just like regular fields and are in the parent message's naming scope. You need to define your message as: message Foo {
int x = 1;
oneof v1 { int32 value1 = 2; }
oneof v2 { int32 value2 = 3; }
} All these fields need different names and field numbers. |
The C#/.NET library already optimizes for the Even in my own application which used protobuf2, I relied on the HasXXX functions for field presence. This was used to broadcast updates, only filling in the fields that had changed. Not present didn't mean that it was the default value, but that it hadn't changed from the previous value. I think though, this was probably viewed as a mistake in the original design/spec since it conflicts with the eliding of fields with default values. Going forward, I don't see many realistic ways to say that v2 and v3 are binary compatible for applications/messages that relied on this behavior. |
@ngbrown: I don't want to start special-casing the situation where there's a oneof with only a single field in it, no. That feels like complexity for little benefit. The way proto3 has been designed to work is that if you want a primitive field with presence, you use the wrapper type. I wouldn't want to subvert that. |
In struggling (#1655) with this 'bug' I realized that I don't understand the rational behind having default values at all when you can't set the default value or make the field required, i.e., trust that you should use the default value if the field isn't set on the wire. Why not do away with defaults all together? If a field is set put it on the wire even it is a zero length string or a zero. Otherwise one can assume it is null. This problem is killing me because I have a situation where a int32 field may or may not be set and one of the valid data values is zero. As it stands I have to use oneof or do something hackish like +1 to all values zero or greater before serialization and -1 after parsing the message. In the current incarnation enums are particularly problematic. Given that "every enum definition must contain a constant that maps to zero as its first element" and fields are inherently optional it is impossible to use it in an application. Was the zeroth element explicitly set before serialization? Was it not set? There should be a big warning- don't use the zeroth element in an enum. You won't know if it has been set or not. It is best to set the first element to something that will never be selected, e.g., NOT_SET = 0; |
Ran across this in python_message.py and thought it may be pertinent. Just trying to help...
|
@jskeet The problem with wrapper type is it is not wire format compatible with existing proto 2 message. Assume you want to upgrade to proto 3 for some services in a large project, what is the option here if the project already leverage HasXXX feature in proto 2? Upgrade to a different wire format? The cost might be too high. |
@xfxyjwf Is there any update on the discussion? |
Thanks for replying @lostindark. I thought my comment may have fallen on deaf ears. I'm actually coming to this from a new project perspective. I was attracted to v3 vs v2 because of its tight integration with JSON. Since v3 doesn't support HasXXX I've so far worked around this issue by wrapping fields that may be set to the default value in oneof statements. This is very clumsy and brittle to me. Enums take the cake though. All of my enums have the necessary 0th enum element set to NOT_SET. I still don't get why defaults are needed at all in v3 since they are effectively not supported anymore. I know defaults aren't set over the wire. One open question that I have is whether fields set to a default should have some sort of has_been_set_bit flipped to true so that the receiver knows to set the value of a seemingly valueless field to the default? That way when you try to access it you get the default value instead of nothing. |
@christian-storm In my perspective proto 3 has some design issue. In the doc it says:
This make people think all fields are optional now. Does people need optional fields? Yes we do. We don't want wrapper fields (why extra space? and not compatible with proto 2), and we don't want ugly one of approach. I hope this will be fixed in proto 3, or else it will be a big problem for people who have similar requirements. |
We also need this. Adding oneof to every fields that should be nullable if pretty verbose and syntaxic sugar version would be fantastic. What do you think? |
message Foo {
oneof value_oneof {
int32 value = 1;
}
} so, this is the recommended/only way to handle a nullable field? this is quite a non-obvious usage for surely nearly everyone designing their first protobuf messages has this question, the docs could give more guidance I think |
another possibility I've stumbled on is using a wrapper type, since the default value for so you can:
the available wrappers correspond the "Well-Known Types" listed in the docs https://developers.google.com/protocol-buffers/docs/reference/google.protobuf although I didn't see it explained anywhere what they're intended to be used for or how to import them but it seems a bit nicer than the
...the compiler complains about re-use of whereas with the wrappers, the field is always just called it's annoying that, either way, your nullable fields therefore have different get/set code to other fields to be honest, if I am wondering if it makes sense to just use the wrappers for all my fields. |
A question to the protobuf team if they're around. Why not add a new wire type for null values? There are 3 bits reserved in the format allowing for 7 values, and only 6 are used (2 of which are deprecated). |
@ckamel, adding a new wire type could work, but I think there are a few big downsides to doing that:
|
That makes sense. I think null would be a worthy cause for one of those unused wire types left :) But if the older parsers can't ignore unknown wire types then rolling this out would be complicated. |
Hey, Right now, not having nullable field is a very annoying issue:
You will probably argue that this kind of use-case should clearly end in adding extra fields in order to explicitly encode meaning of null value - which is the cleanest way to go. @acozzette I understand the above point but let me point out a few things:
|
I think @dopuskh3 's suggestion would be interesting, adding a isXXXDefined (or isXXXnull) method will not hurt the existing code using protobuf 3. |
I concur with @dopuskh3, having nullable types in ProtoBuf is a very interesting option. I think people migrating from JSON to ProtoBuf are faced with that problem sonner or later, and I'd love to see a clean way to make the transition smoother. In some applications, null actually carries information. Consider for instance the time since last event: if the event actually occured, then you have an int value, otherwise null. |
@dopuskh3 Have you considered using proto2? Proto2 already has generated methods for checking presence like the one you suggested--if you have a field For proto3 I think currently the best approach is to either use the wrapper types or |
Five years of pain coming to an end! |
Frustrating but working. I hope this is not required in 3.13. |
Thanks for the effort on this! Does this got released (not experimental) any more on [https://github.com/protocolbuffers/protobuf/releases/tag/v3.13.0](v3.13.0]? I did not see it on the Release page and wanted to double check |
Funny, we just added very similar support for |
Will encoders be obliged to write every field they know about, regardless of whether it is equivalent to a zero? That is, will checking for presence of a field indicate whether the write-side knew of the existence of the field? |
@yulrizka Proto3 optional is still guarded by an experimental flag in the 3.13 release. @kriskowal No, encoders with proto3 optional fields will work the same way as optional fields in proto2. The field will be written only if it was present, and the notion of presence is unrelated to whether the value is 0 or not. |
Is there any targeted future release version for field presence feature to ship without the experimental flag? |
Well this issue was a rollercoaster to read. I'm glad that in the end, we got this feature. I can understand some of the pros of the original proto3 approach that were expressed, notably that it simplifies language generator implementations for languages like Go, and that it makes the wire protocol more efficient for folks who don't need to check for field presence. I like that the new I can't tell yet if I'll take advantage of |
Has* function is only generated for scalar value types but not nested message? |
@shwuhk As far as I know, there needs to be a way to check for presence of nested messages. Most often with generated code, I've seen this with "has" functions, but I've noticed that it isn't always given to you. For Go, the nested message type ends up being a pointer, which means you can check for presence without a "has" function. You'd check whether the pointer is nil: message MyMessage {
google.protobuf.BoolValue my_bool_value = 1;
} if myMessage.MyBoolValue != nil {
fmt.Printf("Data provided. It is %v.\n", myMessage.MyBoolValue.Value)
} else {
fmt.Printf("Data not provided.\n")
} I'd double check your language's Protobuf documentation and raise an issue in its repo if the ability to check for field presence for non-scalar fields is lacking. |
@mattwelke I think it is not consistency in the generated codes of different language. "has" functions are generated in Java codes but not C# and I am now checking the presence of nested messages just like you said, Thanks for your help, you make it very clearly! |
jsonstring:[{"Type":"String","tag":1,"value":"root"}] |
@lilorsarry: That appears to be unrelated to this issue. Please open a new issue, providing significantly more detail (ideally a short but complete program to reproduce the problem). |
From the protobuf wire format we can tell whether a specific field exists or not. And in protobuf 2 generated code, all fields have a "HasXXX" method to tell whether the field exists or not in code. However in proto 3, we lost that ability. E.g. now we can't tell if a int32 field is missing, or has a value of 0 (default for int32 type).
In many scenario, we need the ability to differentiate missing vs default value (basically nullable support for scalar types).
Feng suggest workarounds:
However, this requires change the message definition, which is a no go if you need to keep the message compatible while upgrade from proto2 to proto 3.
We need to add back the support to detect whether a scalar type field exist or not.
More discussion here: https://groups.google.com/forum/#!topic/protobuf/6eJKPXXoJ88
The text was updated successfully, but these errors were encountered: