Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Upgrade GRPC/Proto to Latest #2857

Closed
wants to merge 19 commits into from

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Mar 1, 2021

Which problem is this PR solving?

Short description of the changes

  • Changes to use the latest proto builder from OTel
    • we can build a jaeger image with latest and use it if desired
  • Copy in removed method generateAndRegisterManualResolver
    • this method looks wild to me. Using a time value for the scheme? Possibly something to clean up in a future PR
  • Regenerated proto with the latest builder
  • Changed a few packages to use gogoproto over google proto
  • Added /proto-gen/api_v2/codec.go. This forces grpc to use gogoproto over google proto.
    • I hate the location of this, but unsure if there's a better place that will be always be imported in a package that uses grpc.

@joe-elliott joe-elliott requested a review from a team as a code owner March 1, 2021 15:10
@joe-elliott joe-elliott requested a review from yurishkuro March 1, 2021 15:10
@mergify mergify bot requested a review from jpkrohling March 1, 2021 15:10
@codecov
Copy link

codecov bot commented Mar 1, 2021

Codecov Report

Merging #2857 (c72be74) into master (e4cacd0) will increase coverage by 0.00%.
The diff coverage is 80.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2857   +/-   ##
=======================================
  Coverage   95.91%   95.91%           
=======================================
  Files         223      223           
  Lines        9695     9699    +4     
=======================================
+ Hits         9299     9303    +4     
- Misses        326      327    +1     
+ Partials       70       69    -1     
Impacted Files Coverage Δ
cmd/collector/app/zipkin/http_handler.go 100.00% <ø> (ø)
plugin/storage/memory/memory.go 100.00% <ø> (ø)
cmd/agent/app/reporter/grpc/builder.go 94.23% <80.00%> (-1.61%) ⬇️
pkg/config/tlscfg/cert_watcher.go 92.20% <0.00%> (-2.60%) ⬇️
plugin/storage/integration/integration.go 77.34% <0.00%> (-0.56%) ⬇️
plugin/storage/badger/spanstore/reader.go 96.08% <0.00%> (+0.71%) ⬆️
cmd/query/app/server.go 97.08% <0.00%> (+1.45%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4cacd0...c72be74. Read the comment docs.

Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is model/model/model.pb.go file needed?

Makefile Outdated Show resolved Hide resolved
model/codec.go Outdated
// Unmarshal implements encoding.Codec
func (c *gogoCodec) Unmarshal(data []byte, v interface{}) error {
return proto.Unmarshal(data, v.(proto.Message))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is already explicitly covered by the existing tests for marshalling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all grpc/proto marshalling/unmarshalling will now go through this file.

Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
@joe-elliott
Copy link
Member Author

joe-elliott commented Mar 2, 2021

Is model/model/model.pb.go file needed?

Nope. Has been removed.

Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
@@ -0,0 +1,48 @@
// Copyright (c) 2021 The Jaeger Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe putting this into /model is better, because no other APIs can be used without model, but not vice versa.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also prefer model, but this is where the cracks in this approach show. At this point what I have pushed works, but it is based on the ordering of the init() in two different packages. The init method in /proto-gen/api_v2/codec.go has to occur after this init:

https://github.com/grpc/grpc-go/blob/master/encoding/proto/proto.go#L33

The only package in Jaeger that eventually includes google.golang.org/grpc/encoding/proto is api_v2 so putting codec.go in this package guarantees that we are called second and gogoproto is the registered coded for proto with grpc. See this code to see the registration process.

What this means is that future, unrelated changes to Jaeger could magically break this by accidentally calling the grpc/encoding/proto init after our init. The good news is that CI would catch this, the bad news is that this is bad.

I don't hate this change. It allows us to stay up to date with OTel and grpc, but there are obvious drawbacks. I support either moving forward (with additional comments in codec.go to explain the above issue) or being patient with gogoproto for a few months to see if things resolve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also fine with this, but I suggest literally pasting your explanation into the comments.

Also, if we move codec.go into model/, can we not add an _ import into it to ensure that grpc/encoding/proto is called first?

@joe-elliott
Copy link
Member Author

github is not updating, but currently all tests are passing aside from "CIT Memory And Badger". It is failing with

integration.go:184: plugin error: rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial unix /tmp/plugin047540702: connect: connection refused"

I will look into this when I get a chance.

@yurishkuro
Copy link
Member

currently all tests are passing aside from "CIT Memory And Badger"

It would be exercising gRPC storage plugin infra. A bit concerning that it fails, since wire format / handshakes should not have changed with gRPC upgrade.

@yurishkuro
Copy link
Member

I wonder if this has something to do with grpc-plugin module itself and which version of gRPC it usees. There isn't good logging from the test, unfortunately, my guess is that either start-up fails of the initial handshake fails

    integration.go:89: Waiting for storage backend to update documents, iteration 1 out of 30
    integration.go:184: grpc stream error: rpc error: code = Unavailable desc = transport is closing

@joe-elliott
Copy link
Member Author

joe-elliott commented Mar 5, 2021

Ok, I hit what may be a real blocker that requires either serious changes in the hashicorp go plugin or gogoproto. The memstore plugin is exiting with the below error. (Others are seeing similar issues.)

panic: protobuf tag not enough fields in Empty.state:

Which comes from this line in the gogo proto unmarshalling code:

https://github.com/gogo/protobuf/blob/b03c65ea87cdc3521ede29f62fe3ce239267c1bc/proto/table_unmarshal.go#L341

This is coming from hashicorps use of the Empty type here:

https://github.com/hashicorp/go-plugin/blob/0c19a1387af146eaa3335dad0a021741344ac3e9/internal/plugin/grpc_stdio.pb.go#L9

The empty type is now a wrapper around a google.golang.org/protobuf type here:

https://github.com/protocolbuffers/protobuf-go/blob/master/types/known/emptypb/empty.pb.go#L52

Note that the empty type has 3 metadata fields: state, sizeCache and unknownFields. These fields were formerly ??, XXX_sizecache and XXX_unrecognized.

So I can actually get tests to pass by ignoring these fields, but I have to directly edit gogo proto vendored code to do it. Tests pass by extending this line to include the new fields:

https://github.com/gogo/protobuf/blob/master/proto/table_unmarshal.go#L327

if f.Name == "XXX_NoUnkeyedLiteral" || f.Name == "XXX_sizecache" || f.Name == "state" || f.Name == "sizeCache" || f.Name == "unknownFields" {
    continue
}

I don't think I can take this any farther. The above change technically works, but doesn't feel like a valuable PR to gogoproto. I don't have any real, deep understanding of what this change does in the broader context of proto parsing.

@pavolloffay
Copy link
Member

done in #3056

thanks @joe-elliott and @albertteoh for the prior work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade google.golang.org/grpc to 1.35.0
4 participants