Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debug flaky TestStreamForServer test #626

Closed
emcfarlane opened this issue Nov 6, 2023 · 7 comments · Fixed by #627
Closed

Debug flaky TestStreamForServer test #626

emcfarlane opened this issue Nov 6, 2023 · 7 comments · Fixed by #627
Labels
bug Something isn't working

Comments

@emcfarlane
Copy link
Contributor

emcfarlane commented Nov 6, 2023

Describe the bug

On bump golang.org/x/net from 0.16.0 to 0.17.0 tests can fail on Send reporting a closed request body error:

--- FAIL: TestStreamForServer (0.25s)
    --- FAIL: TestStreamForServer/get-spec (0.37s)
        connect_ext_test.go:1562: 
            assertion:	assert.Nil
            got:	unknown: EOF

To Reproduce

go test -run=TestStreamForServer -count=1000

Additional context

Tracing down the calls:

  • Write returns an io.EOF error from a close on the request body (io.ErrClosedPipe)
  • Close is called by the transport
  • abortStream from a process reset stream frame with the error:
stream error: stream ID 3; NO_ERROR; received from peer:

Unrelated to changes in x/net 0.17.0. Occurs in the latest v1.12 release.

@emcfarlane emcfarlane added the bug Something isn't working label Nov 6, 2023
@jhump
Copy link
Member

jhump commented Nov 7, 2023

Hmm. Do you know why the stream is being aborted? That error message ("NO_ERROR received from peer") suggests the server sent a GOAWAY frame with error code NO_ERROR, which is supposed to mean that the server is gracefully shutting down. Is there a possible race between an RPC operation completing and the server being shutdown prematurely? (Like maybe there's a defer function that should instead be a t.Cleanup function somewhere?)

@emcfarlane
Copy link
Contributor Author

The PR goes into the details. There are client Sends that are never matched by a server Receive. I think the only reason this was passing is due to buffering of request body reads. So the server is shutting down the connection but I don't think it's a race between close. I did move the server init to use the same *testing.T as the test but not the issue causing the failure.

Two fixes:

  • Send(nil) is ignored if we get a close.
  • Send(msg) is matched by a Receive(msg) to ensure Send succeeds

@jhump
Copy link
Member

jhump commented Nov 7, 2023

The PR goes into the details.

Okay, moving this convo to the PR.

@emcfarlane
Copy link
Contributor Author

Would be good to discuss the intended error behaviour here. I think that Send(nil) should always succeed when initializing the request headers. From the get-spec test above, the behaviour that Send(nil) could return io.EOF is surprising to me and I'd expect shouldn't happen before calling CloseRequest:

connect-go/connect_ext_test.go

Lines 1561 to 1563 in 20b5723

stream := client.CumSum(context.Background())
assert.Nil(t, stream.Send(nil))
assert.Nil(t, stream.CloseRequest())

Currently Send(nil) wait's for request body read. The behaviour of io.Reader states:

If len(p) == 0, Read should always return n == 0. It may return a non-nil error if some error condition is known, such as EOF.

Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0. Callers should treat a return of 0 and nil as indicating that nothing happened; in particular it does not indicate EOF.

We could also remove the call to Write and instead call ensureRequestMade() to avoid capturing Close errors. I think this behaviour would follow the godoc for Send(nil).

func (d *duplexHTTPCall) Write(data []byte) (int, error) {
d.ensureRequestMade()
// Before we send any data, check if the context has been canceled.
if err := d.ctx.Err(); err != nil {

@jhump
Copy link
Member

jhump commented Nov 9, 2023

Your last comment over in the PR helps me understand the situation more clearly. I think I misunderstood your suggestion to never return errors for Send(nil), and I definitely was not grok'ing this:

There are client Sends that are never matched by a server Receive.

The above doesn't seem aberrant to me since a send doesn't have to be received. But I misunderstood how this caused the issue. I think I would have worded it: "the handler closes the stream with a non-error response without accepting any messages".

I think I understand the issue better now. Apologies for being dense about it before. I was interpreting "behavior change" and your previous proposal too broadly. But I do see now that Send(nil) should indeed return nil, even if the server has closed the stream, under the following conditions:

  1. If zero actual messages have been sent or received (so this send is clearly intended to only send headers).
  2. If the error returned from Receive would be io.EOF. In other words: the stream is closed with no error.

@emcfarlane
Copy link
Contributor Author

I've changed the fix to check if the Send(nil) caused the request to be sent. Only then do we avoid writing the nil checking the state of the io.Pipe. Afterwards any Send(nil) will write.

Sorry for the confusion. I've revised the description on the ticket to hopefully clarify the issue.

@jhump
Copy link
Member

jhump commented Nov 9, 2023

Sorry for the confusion.

Nah, not your fault. I think I was splitting my attention too many ways and not spending enough time to truly comprehend the issue here before providing that earlier feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants