Add grpc reflection functionality to grpc client #1987
Conversation
Thanks for making this PR and sorry for the delay in reviewing!
I am not sure if having a client.reflect() method in the init context is the right approach here. Generally, we try to avoid network calls in the init context... On the other hand, this is somewhat similar to client.load(), so I am not sure if we can't make an exception this time 🤷‍♂️
Other than this API design issue, I skimmed the rest of the code and it looks good. As I mentioned on slack, we've already closed the k6 v0.32.0 release window, so this can only be merged after we release it around May 10th. So we'll discuss the API design internally (or in the issue) and get back to you around then.
Codecov Report
@@ Coverage Diff @@
## master #1987 +/- ##
==========================================
- Coverage 71.75% 71.63% -0.12%
==========================================
Files 182 183 +1
Lines 14259 14286 +27
==========================================
+ Hits 10231 10234 +3
- Misses 3387 3410 +23
- Partials 641 642 +1
Yeah, I was trying to avoid that, but the only other option I could come up with was to do it in the VU loop, and that would cause extra load on the API side because the reflection API would be called on every iteration.
Thanks a lot for your contribution @joshcarp! Are you still available to work on this PR? We'd like to try to get it merged for the upcoming v0.33.0 release (planned for end of June), but after taking a deeper look I'm not sure we'd have time to fix the leftover things here.
As for tests, we should probably add reflection support to the test server and test against that.
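For context, enabling reflection on a Go gRPC server is typically a one-line registration via google.golang.org/grpc/reflection. A minimal standalone sketch (the address and the commented-out service registration are placeholders, not the actual k6 test-server code):

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/reflection"
)

func main() {
	lis, err := net.Listen("tcp", "127.0.0.1:10000")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	srv := grpc.NewServer()
	// Register the actual test service(s) here, e.g.:
	//   pb.RegisterRouteGuideServer(srv, &routeGuideServer{})
	// Registering the reflection service exposes the server's descriptors
	// to clients, which is what reflection-based tests would exercise.
	reflection.Register(srv)

	if err := srv.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```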
js/modules/k6/grpc/reflect.go
Outdated
net.Dialer{
	Timeout:   2 * time.Second,
	KeepAlive: 10 * time.Second,
}, netext.NewResolver(net.LookupIP, 0, types.DNSfirst, types.DNSpreferIPv4)),
Hhmm, shouldn't all of this inherit from the global settings (state.Options)? The UserAgent for sure should be consistent with the one set for HTTP and eventually WS 😅, and the Dialer/TLS/DNS settings also seem out of place here, since these should be configurable and not static.
Hopefully this wouldn't be a big problem and we can fix it before #883.
Now it's in the VU context and this has been removed
samples/grpc_reflection.js
Outdated
import { check } from "k6";

let client = new grpc.Client();
client.reflect("127.0.0.1:10000", { plaintext: true })
I agree with @na-- that having this in the init context is problematic. Keep in mind that the init context is evaluated at least 4 times (for getting the options, for setup(), teardown() and handleSummary(), even if these aren't defined in the script), and then for every created VU. So it's wasteful to have the gRPC connection/reflection happen here multiple times, and it could impact VU initialization time.
To me it would make sense to do this in setup(), but since we can't easily share data from there it wouldn't be feasible for distributed execution. Maybe we should prioritize having a per-VU init function (#785), though even in that case it would be wasteful for each VU to do the reflection, and the service definitions (whether loaded statically or reflected) should be shared among VUs anyway. So it seems like we need some kind of SharedArray and per-VU init to make this happen, which is not trivial to add.
And from an API perspective, I think reflection should be an implementation detail, so I also agree with @na-- that it would make sense for this to be an optional parameter to the Client constructor, instead of a separate method that scripts have to call.
Could it be added as an argument to client.connect() instead, in which case we could cache the definitions internally and avoid the init context problems? 🤔
client.load('local/path/to.proto') needed to be in the init context, simply because if it wasn't, then we couldn't bundle the .proto files for k6 cloud or k6 archive. But for reflection, I'm currently on the fence, though leaning very heavily towards requiring it to be in the VU context...
I don't have a strong opinion either way on whether that should be done as a parameter to the constructor that makes connect() get the service definitions on the first connection, or whether it should be done somewhat like this:
let client = new grpc.Client();
export default function () {
if (__ITER == 0) {
console.log(`Initializing ${__VU}...`)
client.connect("127.0.0.1:10000", { plaintext: true })
client.getProtoDefinitions(); // or whatever, using the already established connection
}
// ...
}
In either case, the iteration_duration of that first iteration will be a bit skewed, but there's nothing we can do about that until we have #785 🤷‍♂️
That could work as well, but I'd prefer if we avoided the use of __ITER for what should be built-in functionality, so I'm for handling that internally in connect() if possible. There's no reason to have scripts manually call getProtoDefinitions() unless we plan to expose them in JS for some reason.
Btw, since it seems like the constructor doesn't have any parameters currently, while connect() does, and since it seems like the very rough consensus is that this should happen during connect(), should the option be a connect() parameter?
I am not very versed in gRPC, but from what I can see:
- In order to have the reflection we need to connect to the server, so it is only natural that it should happen with the connect method or after it.
- If there is a way to detect that the server can give us the proto files, I think k6 should just get them, and then have a way to tell it not to with an argument on the connect method. If that isn't possible, I have no opinion on whether it should be an argument or a function call after the connect 🤷
- We can definitely cache across VUs as well as per-VU with the same/similar method used in SharedArray or the xk6-segment extension ... I do consider this trivial as well ;) Although probably there should be a way to opt out 🤷, maybe in a separate PR. This should remove the whole __ITER == 0 thing, and we can probably use the server address as the key (from what I understand) — see the rough cache sketch right after this list.
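Purely as an illustration of the cross-VU caching idea above: a process-wide map of reflected definitions keyed by server address, protected by a mutex. All names here are hypothetical and not part of the PR or of k6's actual SharedArray mechanism:

```go
package grpc

import (
	"sync"

	"google.golang.org/protobuf/reflect/protoreflect"
)

// reflectionCache is a hypothetical process-wide cache of reflected method
// descriptors, keyed by server address, so that all VUs share a single lookup.
type reflectionCache struct {
	mu      sync.Mutex
	entries map[string]map[string]protoreflect.MethodDescriptor
}

// getOrLoad returns the cached descriptors for addr, calling load at most once
// per address across the whole process.
func (c *reflectionCache) getOrLoad(
	addr string,
	load func() (map[string]protoreflect.MethodDescriptor, error),
) (map[string]protoreflect.MethodDescriptor, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if mds, ok := c.entries[addr]; ok {
		return mds, nil
	}
	mds, err := load()
	if err != nil {
		return nil, err
	}
	if c.entries == nil {
		c.entries = make(map[string]map[string]protoreflect.MethodDescriptor)
	}
	c.entries[addr] = mds
	return mds, nil
}
```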
cross-VU sharing of protobuf definitions is definitely something out of scope for this PR
I've changed it so that the client.reflect function is called at the end of the connect function only if {reflect: true} is passed in. I have used a sync.Once to ensure that it is only called once per client.
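For readers following along, here's a rough, self-contained sketch of the shape described above; the Client fields and the reflect helper are trimmed-down stand-ins for the PR code, not exact copies:

```go
package grpc

import (
	"context"
	"fmt"
	"sync"
)

// Client is a trimmed-down stand-in for the PR's gRPC client type.
type Client struct {
	reflectOnce sync.Once
	// conn, mds (method descriptors), etc. omitted for brevity.
}

// reflect stands in for the PR's call to the server reflection service,
// which populates the client's method descriptors.
func (c *Client) reflect(ctx context.Context) error { return nil }

// Connect dials the target and, if {reflect: true} was passed, runs the
// reflection call at most once per client via sync.Once.
func (c *Client) Connect(ctx context.Context, addr string, params map[string]interface{}) (bool, error) {
	// ... dial addr and store the connection (omitted) ...

	doReflect := false
	if v, ok := params["reflect"]; ok {
		b, ok := v.(bool)
		if !ok {
			return false, fmt.Errorf(`invalid value for 'reflect': '%#v', it needs to be boolean`, v)
		}
		doReflect = b
	}

	if doReflect {
		var err error
		c.reflectOnce.Do(func() {
			err = c.reflect(ctx)
		})
		if err != nil {
			return false, fmt.Errorf("error invoking reflect API: %w", err)
		}
	}
	return true, nil
}
```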
Hey,
Codecov Report
@@ Coverage Diff @@
## master #1987 +/- ##
=========================================
Coverage ? 71.98%
=========================================
Files ? 178
Lines ? 14309
Branches ? 0
=========================================
Hits ? 10301
Misses ? 3376
Partials ? 632
Hi @joshcarp, thanks for picking this back up. The latest changes LGTM. We'd like to get this merged for the upcoming v0.34.0 release, slated for end of August.
A few points besides the minor comments:
- The performance is surprisingly poor when testing against a local samples/grpc_server with samples/grpc_reflection.js:

  grpc_req_duration....: avg=538.05ms min=75.52ms med=440ms    max=959.33ms p(90)=907.96ms p(95)=933.64ms
  iteration_duration...: avg=549.21ms min=76.89ms med=474.45ms max=960.83ms p(90)=914.34ms p(95)=937.59ms

  It's not much better with samples/grpc.js, even without console.log()-ing the message. So the server might be unoptimized, but I'm wondering if you know the reason for this. I suppose it's related to how we connect and disconnect on each iteration. Ideally we could keep the connection open across iterations, but that might be a much bigger change. And this is the way WebSockets work now, so it's probably acceptable as is.
- I see you registered the reflection on the httpmultibin server, but I don't see any tests for it. Do you think some could be added alongside the existing ones in client_test.go?
- There are some linter errors that should be fixed, and we'll look into the xk6 test failures, which are probably related to the recent k6io -> grafana org change.
Hey, yeah, I started testing it, although I ran into some problems because the gRPC test cases seem to all use the same state, which makes it difficult to test both. Apologies for that; investigating.
Hey @imiric
The performance issues seem to already be a known issue in k6: #1846
Thanks for the quick changes @joshcarp!
Nice work on the tests too. I'd prefer smaller tests for individual methods (TestClientLoad, TestClientConnect, etc.), but this is great as well.
This LGTM for merging, but let's wait for someone else from the k6 team to also review it, possibly sometime next week.
Performance issues seem to be an issue in k6 already: #1846
Yeah, I remember that issue, but the performance in this test is orders of magnitude worse. In any case, it's also that way on master with samples/grpc.js, so it's not related to these changes, but probably to the test server itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, great job.
Please do not delete license files, and rebase on top of master, which will also fix the currently failing CI for you. (You can also squash everything, although we will do that either way in the end 🤷)
As I mentioned in one of the comments, I am not certain caching (or, more accurately, not re-reflecting on each connection) is a good idea, especially given all the things this currently breaks. On the other hand, the API currently isn't all that useful, and as I mentioned, we will likely redo it as a whole at some point in the future, so ... maybe have caching for now ... and hope people won't have problems with the fact that it does break some workflows that reflection enables 🤷
@imiric the abysmal performance you are noticing is because the current gRPC test server has a bunch of sleeps in it to be more "realistic", which should probably be dropped. My numbers below are from after I've dropped them.
js/modules/k6/grpc/client.go
Outdated
if tags == nil {
	tags = make(map[string]string)
}
Is this because it gets called on the reflect call and it has no tags? Because maybe we should tag it with something in that case.
js/modules/k6/grpc/client.go
Outdated
var ok bool
reflect, ok = v.(bool)
if !ok {
	return false, fmt.Errorf(`invalid reflect value %#v need bool`, v)
- return false, fmt.Errorf(`invalid reflect value %#v need bool`, v)
+ return false, fmt.Errorf(`invalid value for 'reflect': '%#v', it needs to be boolean`, v)
I hope the added quotes will help users understand that it is the value of 'reflect', instead of some kind of reflect value ... Also, in general, it's a good idea to print arbitrary values you receive wrapped in quotes, so it's easier to see where they start and end.
js/modules/k6/grpc/client.go
Outdated
@@ -69,6 +71,7 @@ var (
type Client struct {
	mds  map[string]protoreflect.MethodDescriptor
	conn *grpc.ClientConn
	sync sync.Once
Maybe rename this to reflectOnce?
js/modules/k6/grpc/client.go
Outdated
var err error
c.sync.Do(func() {
	err = c.reflect(ctxPtr)
})
if err != nil {
	return false, fmt.Errorf("error invoking reflect API: %w", err)
}
The err will only be populated on the first run, which means that if you run multiple iterations and the first one fails to reflect the API, you will get:
ERRO[0000] error invoking reflect API: rpc error: code = Unimplemented desc = unknown service grpc.reflection.v1alpha.ServerReflection
at go.k6.io/k6/js/common.Bind.func1 (native)
at file:///home/mstoykov/work/k6io/k6/samples/grpc_reflection.js:7:64(9) executor=shared-iterations scenario=default source=stacktrace
ERRO[0000] method "/main.RouteGuide/GetFeature" not found in file descriptors
at go.k6.io/k6/js/common.Bind.func1 (native)
at file:///home/mstoykov/work/k6io/k6/samples/grpc_reflection.js:9:13(21) executor=shared-iterations scenario=default source=stacktrace
ERRO[0000] method "/main.RouteGuide/GetFeature" not found in file descriptors
at go.k6.io/k6/js/common.Bind.func1 (native)
at file:///home/mstoykov/work/k6io/k6/samples/grpc_reflection.js:9:13(21) executor=shared-iterations scenario=default source=stacktrace
ERRO[0000] method "/main.RouteGuide/GetFeature" not found in file descriptors
at go.k6.io/k6/js/common.Bind.func1 (native)
at file:///home/mstoykov/work/k6io/k6/samples/grpc_reflection.js:9:13(21) executor=shared-iterations scenario=default source=stacktrace
ERRO[0000] method "/main.RouteGuide/GetFeature" not found in file descriptors
at go.k6.io/k6/js/common.Bind.func1 (native)
at file:///home/mstoykov/work/k6io/k6/samples/grpc_reflection.js:9:13(21) executor=shared-iterations scenario=default source=stacktrace
As you can see, the much more prevalent message is that k6 couldn't find the file descriptors.
This also means that if the connection fails for any reason on the first request, you have practically bricked this connection.
Additional problems:
- if the server reloads to change the definition, and the user decided to use reflection precisely so that that ... works, it won't
- the connection can technically connect to multiple servers over many iterations, but this will prevent it from loading different definitions for them
I am currently of the opinion that it might be better not to cache the definitions and to keep reloading them on each iteration, just so that we don't tie our hands now. On the other hand, IMO this whole gRPC API will be thrown out at some point when we start implementing everything it should support, so ... might as well have caching now 🤷.
From my experiments with 100 VUs: with caching it does 2.3k iter/s, without caching a bit above 1k iter/s, and with the __ITER == 0 hack around 10k.
We can obviously keep adding flags, but again, IMO the whole gRPC module will need a rewrite and a redone API at some point, so I am not certain how reasonable adding flags for corner cases like the ones mentioned above is.
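To make the failure mode above concrete: sync.Once.Do runs its function only on the very first call, so an error captured inside it stays nil on every later iteration. A tiny standalone illustration (not PR code):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

func main() {
	var once sync.Once
	attempts := 0

	connectIteration := func() error {
		var err error
		once.Do(func() {
			attempts++
			// Pretend the reflection call fails on the first (and only) attempt.
			err = errors.New("reflection failed")
		})
		// On every call after the first, Do is a no-op, so err stays nil even
		// though the client never obtained any method descriptors.
		return err
	}

	for i := 0; i < 3; i++ {
		fmt.Printf("iteration %d: err=%v (attempts: %d)\n", i, connectIteration(), attempts)
	}
	// iteration 0: err=reflection failed (attempts: 1)
	// iteration 1: err=<nil> (attempts: 1)
	// iteration 2: err=<nil> (attempts: 1)
}
```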
if the server reloads to change the definition and the user decided to use reflection so that ... works, it won't
Is this a problem? gRPC reflection isn't really a dynamic thing that is expected to ever change definitions, at least in all the time I've used it.
I have some more feedback. Don't feel like you should just start working on it as we are likely going to have to discuss it some more :).
Also thanks for the PR again 🙇
var err error
c.reflectOnce.Do(func() {
	err = c.reflect(ctxPtr)
})
if err != nil {
	return false, fmt.Errorf("error invoking reflect API: %w", err)
}
This code still makes it so that if the reflect doesn't work out, on subsequent iterations you won't even get the error again. In a real script with multiple VUs this means you will just get the error from the Invoke not working (or working sometimes, depending I guess on how it fails, in the case that, for example, some calls get parsed and work while the rest error out - I don't know if that is possible).
Additionally, the current gRPC API lets people connect to different servers through multiple connect calls, which means that in that case this feature will just not work, which seems ... bad to me.
I don't really have a good fix for both cases (and I am not certain the second one is important 🤷), but the proposal is to have something like:
client.Connect(....)
if (!client.HasReflected()) {
	client.Reflect() // this always does the reflection on the current connection
}
But preferably with a better interface :).
Don't just do this change, we will have to discuss it and this change is already unlikely to land in v0.34.0 :(.
}
resp, err := methodClient.Recv()
if err != nil {
	return err
I would prefer if some of those errors were wrapped with fmt.Errorf and some additional context, as this seems to me like something that will otherwise produce ... not very useful error messages.
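For example, the wrapping could look something like this (the exact message is only a suggestion):

```go
resp, err := methodClient.Recv()
if err != nil {
	return fmt.Errorf("error receiving server reflection response: %w", err)
}
```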
client := reflectpb.NewServerReflectionClient(c.conn)
methodClient, err := client.ServerReflectionInfo(*ctxPtr)
if err != nil {
	return err
}
This call emits a metric (which is why you want to return non-nil tags in GetTags), and after some internal discussion we'd rather not emit it at all if possible; the simplest solution so far is to make a new context with cancel and cancel it on exit from here.
For some reason the metric gets emitted on the connection close, but I couldn't find a way to prevent this; that is another problem, it seems like something lives for way too long. This also means it's easier to reproduce with multiple iterations, as otherwise sometimes the metric doesn't get emitted fast enough.
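A minimal sketch of the suggested context-cancel approach, assuming the surrounding code matches the hunk above (simplified; whether cancelling actually suppresses the stray metric is exactly the open question here):

```go
// Derive a cancellable context for the reflection stream and cancel it when
// this function returns, so the stream doesn't outlive the reflection call.
reflCtx, cancel := context.WithCancel(*ctxPtr)
defer cancel()

client := reflectpb.NewServerReflectionClient(c.conn)
methodClient, err := client.ServerReflectionInfo(reflCtx)
if err != nil {
	return fmt.Errorf("can't get server reflection information: %w", err)
}
```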
Also, I guess it might be better to just set an empty map as the tags here, instead of changing the getTags method.
Hey, I think that merging it might give people the (possibly) incorrect idea that the gRPC functionality is reliable and ready to use, which it is not. If I get time I'll run more benchmarks against a non-gRPC server, and possibly use a non-k6 tool to triage gRPC issues.
Hmm, it shouldn't be a big problem if the initial ...
Also, in general, we shouldn't compare k6 gRPC performance to k6 HTTP performance, that would be comparing apples to oranges. If we're worried about k6's gRPC implementation being slow for some reason, we should compare it to other gRPC implementations. But, again, that shouldn't be a big concern here, since the performance of the ...
- grpc.Client().reflect function.
- convertToMethodInfo to convert file descriptors into []MethodInfo and protoreflect.MethodDescriptor. This is used in both Load and Reflect.
- samples/grpc_reflection.js: an example of how to use this new function.

Help needed:
This PR closes: #1930