Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distsqlrun: add test for stream timeout #14132

Conversation

andreimatei
Copy link
Contributor

@andreimatei andreimatei commented Mar 14, 2017

We didn't have any tests exercising a stream failing to connect within
the registry's timeout.


This change is Reviewable

@andreimatei
Copy link
Contributor Author

cc @RaduBerinde

@RaduBerinde
Copy link
Member

Nice, thanks! :lgtm:


Review status: 0 of 4 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/sql/distsqlrun/flow_registry.go, line 185 at r2 (raw file):

// UnregisterFlow removes a flow from the registry. Any subsequent
// ConnectInboundStream calls will wait for the flow in vain.

This suggests that they will be stuck forever, mention that they will time out.


pkg/sql/distsqlrun/flow_registry_test.go, line 165 at r2 (raw file):

// Test that, if inbound streams are not connected within the timeout, errors
// are propagated to their consumers future attempts to connect them fail.

and future attempts


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Mar 14, 2017

Reviewed 2 of 2 files at r1, 4 of 4 files at r2.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


pkg/sql/distsqlrun/flow_registry_test.go, line 37 at r2 (raw file):

// false.
//
// A copy of the flowRegistry's flowEntry is returned so that the entry's fields

you are accessing a map on this copy, but maps are shallow copies. see race failures in TC.


Comments from Reviewable

@RaduBerinde
Copy link
Member

pkg/sql/distsqlrun/flow_registry_test.go, line 37 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

you are accessing a map on this copy, but maps are shallow copies. see race failures in TC.

The problem is with the fe.inboundStreams[streamID1].timedOut access below. It should be done under the lock.


Comments from Reviewable

@RaduBerinde
Copy link
Member

Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


pkg/sql/distsqlrun/flow_registry_test.go, line 37 at r2 (raw file):

Previously, RaduBerinde wrote…

The problem is with the fe.inboundStreams[streamID1].timedOut access below. It should be done under the lock.

Oh, now I see what you meant. Yeah, we could also make a copy of the map here.


Comments from Reviewable

@andreimatei andreimatei force-pushed the distsql-test-stream-connection-timeout branch 2 times, most recently from 087dbbf to 24cd6fa Compare March 14, 2017 19:39
@andreimatei
Copy link
Contributor Author

Review status: 0 of 5 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


pkg/sql/distsqlrun/flow_registry.go, line 185 at r2 (raw file):

Previously, RaduBerinde wrote…

This suggests that they will be stuck forever, mention that they will time out.

Done.


pkg/sql/distsqlrun/flow_registry_test.go, line 37 at r2 (raw file):

Previously, RaduBerinde wrote…

Oh, now I see what you meant. Yeah, we could also make a copy of the map here.

Thanks; I've done something.


pkg/sql/distsqlrun/flow_registry_test.go, line 165 at r2 (raw file):

Previously, RaduBerinde wrote…

and future attempts

Done.


Comments from Reviewable

@RaduBerinde
Copy link
Member

:lgtm:


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Mar 14, 2017

Reviewed 4 of 4 files at r3, 5 of 5 files at r4.
Review status: all files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.


pkg/sql/distsqlrun/flow_registry.go, line 185 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

"time out"


pkg/sql/distsqlrun/flow_registry.go, line 40 at r4 (raw file):

// stream to a receiver to push rows to.
//
// All fields are protected by the flowRegistry mutex (except the receiver,

this comment is appropriate on the member whose type is inboundStreamInfo, not on the type.


pkg/sql/distsqlrun/flow_registry.go, line 61 at r4 (raw file):

// flowEntry is a structure associated with a (potential) flow.
// All fields are protected by the flowRegistry mutex, except flow, whose

same comment; this should be on the data structure of this type, rather than the type itself


pkg/sql/distsqlrun/flow_registry.go, line 243 at r4 (raw file):

Non-test callers should pass flowStreamDefaultTimeout.

Looks likely to rot and not particularly useful - surely this is easy to discover based on prevailing convention.


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

//
// It returns the Flow that the stream is connecting to, the receiver that the
// stream mush push data to, a cleanup function that must be called to

"must"


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

//
// It returns the Flow that the stream is connecting to, the receiver that the
// stream mush push data to, a cleanup function that must be called to

"and a cleanup function"


pkg/sql/distsqlrun/flow_registry.go, line 247 at r4 (raw file):

// It returns the Flow that the stream is connecting to, the receiver that the
// stream mush push data to, a cleanup function that must be called to
// de-register the flow from the registry after all the data has been pushed.

unregister


pkg/sql/distsqlrun/flow_registry.go, line 274 at r4 (raw file):

	cleanup := func() {
		fr.Lock()
		defer fr.Unlock()

is there a need for this defer?


pkg/sql/distsqlrun/flow_registry.go, line 282 at r4 (raw file):

func (fr *flowRegistry) finishInboundStreamLocked(fid FlowID, sid StreamID) {
	flowEntry := fr.getEntryLocked(fid)
	streamEntry := flowEntry.inboundStreams[sid]

every other location checks for presence; is it intended that this does not?


pkg/sql/distsqlrun/flow_registry_test.go, line 60 at r4 (raw file):

	si, ok := entry.inboundStreams[sid]
	if !ok {
		return inboundStreamInfo{}, errors.Errorf("missing stream entry")

since you're Errorfing this, include sid?


pkg/sql/distsqlrun/flow_registry_test.go, line 225 at r4 (raw file):

	reg.UnregisterFlow(id1)
	if _, _, _, err := reg.ConnectInboundStream(id1, streamID1, jiffy); !testutils.IsError(
		err, "not found") {

again, this wrapping is the hardest possible style to parse


pkg/sql/distsqlrun/server.go, line 194 at r4 (raw file):

	}
	f, receiver, cleanup, err := ds.flowRegistry.ConnectInboundStream(
		flowID, streamID, flowStreamDefaultTimeout)

worst possible wrapping :(


pkg/sql/distsqlrun/server.go, line 199 at r4 (raw file):

	}
	log.VEventf(ctx, 1, "connected inbound stream %s/%d", flowID.Short(), streamID)
	defer cleanup()

this should be above the log line


Comments from Reviewable

We didn't have any tests exercising a stream failing to connect within
the registry's timeout.

I've changed the fr.ConnectInbound interface slightly - made it return a
RowReceiver explicitly so that callers don't access the
inboundStreamInfo's fields without the flow registry lock - which is
documented to be required.
@andreimatei andreimatei force-pushed the distsql-test-stream-connection-timeout branch from 24cd6fa to 1632f44 Compare March 14, 2017 21:36
@andreimatei
Copy link
Contributor Author

Review status: 2 of 5 files reviewed at latest revision, 14 unresolved discussions.


pkg/sql/distsqlrun/flow_registry.go, line 185 at r2 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"time out"

Done.


pkg/sql/distsqlrun/flow_registry.go, line 40 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this comment is appropriate on the member whose type is inboundStreamInfo, not on the type.

meh. I moved it but I'm not sure; for this inner type that only lives in one mape, it might have been more visible here.


pkg/sql/distsqlrun/flow_registry.go, line 61 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

same comment; this should be on the data structure of this type, rather than the type itself

Done.


pkg/sql/distsqlrun/flow_registry.go, line 243 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Non-test callers should pass flowStreamDefaultTimeout.

Looks likely to rot and not particularly useful - surely this is easy to discover based on prevailing convention.

Done.


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"must"

Done.


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"and a cleanup function"

Done.


pkg/sql/distsqlrun/flow_registry.go, line 247 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

unregister

Done.


pkg/sql/distsqlrun/flow_registry.go, line 274 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

is there a need for this defer?

Done.


pkg/sql/distsqlrun/flow_registry.go, line 282 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

every other location checks for presence; is it intended that this does not?

yeah, intended. if it's missing, it'll panic on result dereferencing


pkg/sql/distsqlrun/flow_registry_test.go, line 60 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

since you're Errorfing this, include sid?

Done.


pkg/sql/distsqlrun/flow_registry_test.go, line 225 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

again, this wrapping is the hardest possible style to parse

Done.


pkg/sql/distsqlrun/server.go, line 194 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

worst possible wrapping :(

this one is the best possible :)


pkg/sql/distsqlrun/server.go, line 199 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this should be above the log line

Done.


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Mar 14, 2017

Reviewed 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

y u remove the oxford comma :(


pkg/sql/distsqlrun/flow_registry.go, line 282 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

yeah, intended. if it's missing, it'll panic on result dereferencing

okay. i think you can probably produce a more useful panic message than the NPE panic.


pkg/sql/distsqlrun/flow_registry.go, line 72 at r5 (raw file):

	// inboundStreams are streams that receive data from other hosts, through the
	// FlowStream API. All fields in the inboundStreamInfos are protected by the
	// flowRegistry mutex (except the receiver, whose methods can be called

where is the flow registry mutex? perhaps "this entry's parent registry's mutex"


pkg/sql/distsqlrun/server.go, line 194 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

this one is the best possible :)

Not true!

... := ds.flowRegistry.ConnectInboundStream(
  flowID, streamID, flowStreamDefaultTimeout
)

as-written, it's hard to see why the indentation is set back on the next line (this was the pathological case we discussed in that ancient nerd battle).


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


pkg/sql/distsqlrun/flow_registry.go, line 246 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

y u remove the oxford comma :(

I like oxford commas too but not updating the PR for it. If should have\ come up with one more thing!


pkg/sql/distsqlrun/flow_registry.go, line 72 at r5 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

where is the flow registry mutex? perhaps "this entry's parent registry's mutex"

mine is more clear


pkg/sql/distsqlrun/server.go, line 194 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Not true!

... := ds.flowRegistry.ConnectInboundStream(
  flowID, streamID, flowStreamDefaultTimeout
)

as-written, it's hard to see why the indentation is set back on the next line (this was the pathological case we discussed in that ancient nerd battle).

it's set back because it's a new statement, starts with an if. We'll have to disagree on this one.


Comments from Reviewable

@andreimatei andreimatei merged commit 634e66f into cockroachdb:master Mar 14, 2017
@andreimatei andreimatei deleted the distsql-test-stream-connection-timeout branch March 14, 2017 23:11
@tamird
Copy link
Contributor

tamird commented Mar 14, 2017

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


pkg/sql/distsqlrun/flow_registry.go, line 72 at r5 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

mine is more clear

literally isn't - i asked because I couldn't guess it. then I opened this file and found that flowRegistery has a map of flows, which is why I suggested this edit.


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


pkg/sql/distsqlrun/flow_registry.go, line 72 at r5 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

literally isn't - i asked because I couldn't guess it. then I opened this file and found that flowRegistery has a map of flows, which is why I suggested this edit.

flowRegistry is clearer than "parent". flowEntry is logically a part of flowRegistry. Sounds like you figured it out after opening the file - sometimes you have to open the file :P


Comments from Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants