-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
jsonpb: change Marshal/Unmarshal to return error if any required field is not set #472
Conversation
is not set. For Unmarshal, this means JSON is either missing any required field or has required field set to null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Marshaling code looks good, but the Unmarshal is a no-go. :(
jsonpb/jsonpb.go
Outdated
} | ||
|
||
// Marshal it back to check for missing required field error. | ||
// TODO: Parse through the message and check for any missing required field w/o having to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO will need to be resolved before this could be considered for merging. Marshaling the JSON just after Unmarshaling is going to build in a lot of overhead that wouldn’t be a good idea to pass around. (Especially for everyone using proto3, where required fields have been entirely removed.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this needs to be improved. Sorry if I wasn't clear in my initial posting. I didn't intend to merge this yet. Since I didn't have time this week to continue doing this, I decided to post this PR for others to try it out and see if anyone may be interested in improving it further.
Note though that a proto3 message can contain a proto2 message and vice-versa. So, as far as I know, the code will still need to traverse all the fields and subfields of the message looking for required ones that are not set.
Hm… you are correct, the best kind of correct. Though, I think anyone using proto3 might have entirely forgotten that fields were even allowed to be required… or maybe just those of us who got the Google “required considered harmful” meme drilled forcefully into our head by SREs… |
This change is to improve upon PR #472. Instead of simply calling Marshal within Unmarshal to check for required fields, traverse the proto message instead looking only for required fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated Unmarshal's logic for checking required fields to traverse the message looking for those instead of simply calling Marshal as latter is most likely more inefficient.
The code assumes the use of field tags in proto messages as with most of the existing codebase. I'm not sure if this is good enough though.
I've added more tests as well for the use cases that I'm aware of.
jsonpb/jsonpb.go
Outdated
// IsNil will panic on most value kinds. | ||
switch value.Kind() { | ||
case reflect.Chan, reflect.Func, reflect.Interface: | ||
if value.IsNil() { | ||
continue | ||
} | ||
case reflect.Ptr: | ||
if prop.Required && value.IsNil() { | ||
return fmt.Errorf("required field %q is not set", prop.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, for error messages like this, should I use Name or OrigName? Or is it dependent on whether message is for marshaling or unmarshaling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Properties.Name
is fine. It's even documented as such: "name of the field, for error messages"
jsonpb/jsonpb.go
Outdated
// This function is called from Unmarshal and assumes certain actions are done ahead. | ||
func checkRequiredFields(pb proto.Message) error { | ||
// Most well-known type messages do not contain any required field. The "Any" type may have | ||
// required fields, however the Unmarshaler should have invoked proto.Marshal on the value field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean proto.Unmarshal here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. During unmarshaling of an "Any" JSON to a Go proto message, after it resolves and constructs the embedded message, it will actual do a proto marshal to a serialized form and store the bytes into Any.Value field. See https://github.com/golang/protobuf/blob/dev/jsonpb/jsonpb.go#L751.
I've updated the comment to explain this better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better, but I still have concerns… they maybe just require a code comment or two explaining things though.
jsonpb/jsonpb.go
Outdated
|
||
func checkRequiredFieldsInValue(v reflect.Value) error { | ||
if pm, ok := v.Interface().(proto.Message); ok { | ||
if err := checkRequiredFields(pm); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this isn’t in a loop, then it’s not necessary to check the error value before returning it. i.e. you can just use return checkRequiredFields(pm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I previously have this block inlined above multiple times and then pulled it out to this helper func but forgot that this added check is no longer required. Fixed.
jsonpb/jsonpb.go
Outdated
|
||
// IsNil will panic on most value kinds. | ||
switch field.Kind() { | ||
case reflect.Map: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we discussed this before, and you told me how a proto3 could contain a proto2 message, but maps are exclusive to proto3, so do they really need to be checked?
Consider adding some code comments explaining the rational of checking maps despite their appearance being expected to be mutually exclusive with required fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, maps are allowed in proto2. I have a test case for this.
jsonpb/jsonpb.go
Outdated
return err | ||
} | ||
|
||
return checkRequiredFields(pb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to work the code into UnmarshalNext, rather than iterate over all the fields a second time? (albeit, yes, fairly fast code.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly. The current unmarshaling logic does rely on json.Unmarshal to populate fields. To embed the check for required in there means that it will still need to check for all fields that have been unmarshaled.
But you are correct that doing the check w/in the unmarshaling code may be more efficient. The unmarshaling code is quite complicated though that I prefer to avoid mixing this logic in right now. A refactor/rewrite of the unmarshaling code should account for this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The jsonpb
logic is pretty non-performant to begin with. We should aim for correctness in this PR.
This package is in need a major re-factoring (really, the entire protobuf repo). Let's revisit this when that refactoring occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though UnmarshalNext
is used by almost nobody, why is this check in Unmarshal
as opposed to UnmarshalNext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, thanks for catching. Fixed.
jsonpb/jsonpb.go
Outdated
// Oneof fields need special handling. | ||
if sfield.Tag.Get("protobuf_oneof") != "" { | ||
// field is an interface containing &T{real_value}. | ||
v := field.Elem().Elem() // interface -> *T -> T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered the case where a oneof contains a simple scalar (string, bytes, varint), which in proto3 is not a pointer?
What will this code do in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current proto compiler will always produce an interface field for a oneof field. It will also produce wrapper structs, one for each oneof field. And hence, this block will grab the oneof field inside the wrapper struct which can be any type.
Added more comments and clean up based on feedback from PR#472.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. I've updated the comments and did the clean up you suggested.
jsonpb/jsonpb.go
Outdated
// Oneof fields need special handling. | ||
if sfield.Tag.Get("protobuf_oneof") != "" { | ||
// field is an interface containing &T{real_value}. | ||
v := field.Elem().Elem() // interface -> *T -> T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current proto compiler will always produce an interface field for a oneof field. It will also produce wrapper structs, one for each oneof field. And hence, this block will grab the oneof field inside the wrapper struct which can be any type.
jsonpb/jsonpb.go
Outdated
|
||
// IsNil will panic on most value kinds. | ||
switch field.Kind() { | ||
case reflect.Map: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, maps are allowed in proto2. I have a test case for this.
jsonpb/jsonpb.go
Outdated
return err | ||
} | ||
|
||
return checkRequiredFields(pb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly. The current unmarshaling logic does rely on json.Unmarshal to populate fields. To embed the check for required in there means that it will still need to check for all fields that have been unmarshaled.
But you are correct that doing the check w/in the unmarshaling code may be more efficient. The unmarshaling code is quite complicated though that I prefer to avoid mixing this logic in right now. A refactor/rewrite of the unmarshaling code should account for this logic.
jsonpb/jsonpb.go
Outdated
// This function is called from Unmarshal and assumes certain actions are done ahead. | ||
func checkRequiredFields(pb proto.Message) error { | ||
// Most well-known type messages do not contain any required field. The "Any" type may have | ||
// required fields, however the Unmarshaler should have invoked proto.Marshal on the value field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. During unmarshaling of an "Any" JSON to a Go proto message, after it resolves and constructs the embedded message, it will actual do a proto marshal to a serialized form and store the bytes into Any.Value field. See https://github.com/golang/protobuf/blob/dev/jsonpb/jsonpb.go#L751.
I've updated the comment to explain this better.
jsonpb/jsonpb.go
Outdated
|
||
func checkRequiredFieldsInValue(v reflect.Value) error { | ||
if pm, ok := v.Interface().(proto.Message); ok { | ||
if err := checkRequiredFields(pm); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I previously have this block inlined above multiple times and then pulled it out to this helper func but forgot that this added check is no longer required. Fixed.
jsonpb/jsonpb.go
Outdated
// IsNil will panic on most value kinds. | ||
switch value.Kind() { | ||
case reflect.Chan, reflect.Func, reflect.Interface: | ||
if value.IsNil() { | ||
continue | ||
} | ||
case reflect.Ptr: | ||
if prop.Required && value.IsNil() { | ||
return fmt.Errorf("required field %q is not set", prop.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Properties.Name
is fine. It's even documented as such: "name of the field, for error messages"
jsonpb/jsonpb.go
Outdated
return err | ||
} | ||
|
||
return checkRequiredFields(pb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The jsonpb
logic is pretty non-performant to begin with. We should aim for correctness in this PR.
This package is in need a major re-factoring (really, the entire protobuf repo). Let's revisit this when that refactoring occurs.
jsonpb/jsonpb.go
Outdated
return err | ||
} | ||
|
||
return checkRequiredFields(pb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though UnmarshalNext
is used by almost nobody, why is this check in Unmarshal
as opposed to UnmarshalNext
?
jsonpb/jsonpb.go
Outdated
return nil | ||
} | ||
|
||
v := reflect.ValueOf(pb).Elem() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to verify that pb is a pointer to a struct. If it isn't, then return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I'm guessing we'll simply ignore message if not pointer to struct.
jsonpb/jsonpb.go
Outdated
// A oneof field is an interface implemented by wrapper structs containing the actual oneof | ||
// fields. Field is an interface containing &T{real_value}. | ||
if sfield.Tag.Get("protobuf_oneof") != "" { | ||
v := field.Elem().Elem() // interface -> *T -> T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to check:
- for nil on both the interface.
- that the interface element is a pointer to a struct
- that the pointer is not nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return err | ||
} | ||
} | ||
case reflect.Slice: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "bytes" scalar for proto will fall under this case and will need to be specially handled.
IIRC, []byte{}
means that a zero-length bytes is set, while []byte(nil)
means that it was never set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I forgot about that. Added handling of non-repeated slice.
Yes, I'm going with []byte{} to mean it is set.
Make checking in Marshal to be done up front in order to account for custom marshalers, i.e. do not rely on custom marshalers to do the checking. Reuse the same checkRequiredFields func on both Marshal and Unmarshal. Added more check logic per feedback from PR#472.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review.
Per conversation with @dsnet, I've also changed Marshal to call checkRequiredFields before it marshals the given message. This is for consistency in ensuring that the required field checking logic is done by jsonpb and not rely on any custom marshaler/unmarshaler.
jsonpb/jsonpb.go
Outdated
return err | ||
} | ||
|
||
return checkRequiredFields(pb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, thanks for catching. Fixed.
jsonpb/jsonpb.go
Outdated
return nil | ||
} | ||
|
||
v := reflect.ValueOf(pb).Elem() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I'm guessing we'll simply ignore message if not pointer to struct.
return err | ||
} | ||
} | ||
case reflect.Slice: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I forgot about that. Added handling of non-repeated slice.
Yes, I'm going with []byte{} to mean it is set.
jsonpb/jsonpb.go
Outdated
// A oneof field is an interface implemented by wrapper structs containing the actual oneof | ||
// fields. Field is an interface containing &T{real_value}. | ||
if sfield.Tag.Get("protobuf_oneof") != "" { | ||
v := field.Elem().Elem() // interface -> *T -> T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Thanks for the review, @dsnet. @puellanivis do you have more feedback/comments on this PR before I merged it in? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Entirely superficial/nitpicky concerns.
jsonpb/jsonpb.go
Outdated
continue | ||
} | ||
v = v.Elem() | ||
if v.Kind() != reflect.Struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&& v.NumFields() > 0
maybe, just to be sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant to check for NumField() == 0 so that Field(0) below won't panic. That would be a bug in protoc-gen-go, but I went ahead and added the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but any value of NumFields() less than 0 would be an error state. And checking for greater than zero, and “is not zero” is either: identical with unsigned ints, or just extra CYA for signed ints.
Sure NumField shouldn’t ever return a negative number, but iI find it to generally be a good idea to remember that technically it might… even if only erroneously. So, I typically prefer to avoid weak/narrow tests, when a stronger/broader test is available for the same cost. (I also usually test if len(array) < 1
rather than just len(array) == 0
for the same reasons.)
It could be viewed as an example of “overengineering” but I prefer to cover all load-scenarios rather than only the specific load-scenarios that one thinks might cause a problem.
I mean, if you can order 100 nuts for $100 in either 3 lbs load-hanlding, or 1 lb load-handling, even if you expect that you will only need at most 1 lb of load-handling, it doesn’t cost anything extra to get the higher load handling ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
jsonpb/jsonpb.go
Outdated
continue | ||
} | ||
v = v.Elem() | ||
if v.Kind() != reflect.Struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant to check for NumField() == 0 so that Field(0) below won't panic. That would be a bug in protoc-gen-go, but I went ahead and added the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'm going to merge this now. If there are other concerns, we can always create another PR to fix.
jsonpb/jsonpb.go
Outdated
continue | ||
} | ||
v = v.Elem() | ||
if v.Kind() != reflect.Struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
FYI, this change effectively breaks the support for dynamic/custom messages that was added in #325. The way in which required fields are checked herein assumes that the message is normal generated message, without providing a way to override/customize. In particular, if a given struct has any unexported field that contains a pointer value (including a slice or map with an element that contains a pointer value) and that pointer value is not nil, then this will cause a panic here:
I'll open an issue for it. |
Change Marshal/Unmarshal to return error if any required field is not set.
For Unmarshal, this means JSON is either missing any required field or has required field set to null.
This is an initial attempt at fixing #440. While I've modified Marshal to produce an error for required fields that are not set, I simply reuse this in Unmarshal to check for required fields which is probably not as efficient.
Anyways, I won't have time to improve on this this week and am simply putting this PR out in case someone would like to try it out and/or feel free to pick it up from here.
I've also yet to test the impact of this change on our existing codebase.