MessageToJson outputs the wrong type for uint64 and int64 in Python #2954
As per the proto3 JSON spec, uint64/int64 fields should be printed as decimal strings. The reason is that uint64/int64 is not part of the JSON spec and many JSON libraries only support double precision. To prevent precision loss, our proto3 JSON spec requires serializers to put int64/uint64 values in strings.
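To illustrate the precision argument above, here is a minimal standard-library sketch. It does not use protobuf at all; the key name `foo` is only a placeholder, and the decimal-string encoding is imitated by hand with `str()`.

```python
import json

# 2**53 + 1 is the smallest positive integer that an IEEE 754 double
# cannot represent exactly.
big = 2**53 + 1

# A JSON library that stores every number as a double would effectively
# do this conversion, and the value silently changes.
as_double = int(float(big))

# Writing the value as a decimal string (the approach the proto3 JSON
# mapping takes for 64-bit integers) keeps every digit; the reader has
# to convert it back explicitly.
encoded = json.dumps({"foo": str(big)})
decoded = int(json.loads(encoded)["foo"])

print("survives double conversion:", as_double == big)
print("survives string encoding:", decoded == big)
```

Python's own `json` module keeps integers exact, so the loss only shows up in consumers that parse JSON numbers as doubles, such as JavaScript.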
I wanted to ask the same question for I respect Regardless of this unexpected behavior, I am assuming only by the name The question is, should
Isn't MessageToDict part of the json_format package? It should honor the same proto3 JSON spec as MessageToJson.
It's all nice that the spec is honored, but the following does not work because of type inconsistency:
m = test_pb2.Message.FromString(serialized_proto_bytes)
d = MessageToDict(m)
copy = test_pb2.Message(**d)
I'll add my vote to @WloHu that the code fragment above should work (i.e. the process should be reversible), even if we have to specify some optional argument like:
…SON (#5010)

* Fix an issue with the timestamps in the endpoint response

Signed-off-by: Chenran Li <[email protected]>

According to issue #4037, the returned JSON of the endpoints has `creation_timestamp` and `last_updated_timestamp` as strings, not numbers. This differs from what is documented in the [official doc](https://www.mlflow.org/docs/latest/rest-api.html#mlflowregisteredmodel).

The reason is that we call Google's `MessageToJson` API to convert protobuf to JSON, which implicitly converts int64/fixed64/uint64 fields to strings. They claim it is a feature, not a bug (see the [discussion](protocolbuffers/protobuf#2954 (comment))).

According to the bug reporter, this bug doesn't exist in the Azure ML mlflow server (which is essentially our Databricks mlflow server). That's because we use ScalaPB's `ToJson()` API for all the Databricks endpoints, and it doesn't convert int64 to string.

There is no way to make the `MessageToJson` API stop converting int64 to strings, nor are there any other good Python proto-to-json libraries. So to fix this bug, we have to choose from:

* (what I'm doing in this PR) manually converting the int64/uint64/fixed64 fields back to numbers after calling `MessageToJson`
* (too risky, so I chose not to do it) writing our own customized `MessageToJson` API
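The first option above (converting the string fields back to numbers after serialization) could look roughly like the sketch below. This is not the code from that PR. The helper `restore_int64`, the `INT64_FIELDS` set, and the sample payload (including the `name` and `versions` keys) are hypothetical, and the JSON text stands in for what `MessageToJson` would return.

```python
import json

# Hypothetical set of JSON keys known to hold 64-bit integers. A real
# implementation would more likely derive this from the message descriptor.
INT64_FIELDS = {"creation_timestamp", "last_updated_timestamp"}


def restore_int64(value, int64_fields=INT64_FIELDS):
    """Recursively convert decimal strings back to ints for the named keys.

    Repeated 64-bit fields (lists of strings under a named key) are not
    handled in this sketch.
    """
    if isinstance(value, dict):
        return {
            key: int(item)
            if key in int64_fields and isinstance(item, str)
            else restore_int64(item, int64_fields)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [restore_int64(item, int64_fields) for item in value]
    return value


# Stand-in for the text MessageToJson would produce for such a message.
raw = (
    '{"name": "model", "creation_timestamp": "1636000000000", '
    '"versions": [{"last_updated_timestamp": "1636000000001"}]}'
)
fixed = restore_int64(json.loads(raw))
print(json.dumps(fixed))
```

Matching on key names alone is fragile if an unrelated string field shares a name with a 64-bit field, which is one reason a descriptor-driven approach may be preferable.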
For my case
Even if it is part of the json_format package, it is not formatting to JSON when calling MessageToDict. So why on earth keep this JSON convention as the default behavior?
Here's a very simple proto file:
and I created a protobuf message in Python as follows:
Note that `foo6` and `foo1` should not have quotes as they are integers and not strings. Using the `protobuf_to_dict` module does not have this problem.

My `protoc` version is `3.2.0` and Python version is `3.5.2`.