-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tracing: remove panic in Finish #83625
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @andreimatei, and @stevendanna)
pkg/util/tracing/crdbspan.go
line 547 at r1 (raw file):
// We don't have space for this recording. Let's collect just the structured // records. for ci := range childRecording {
I think a better fix is to get rid of the assertion in the structured case below, and make that code deal with multiple spans too. That's how it was before #81079, and I'm not sure if there was a specific reason why it changed.
I think it's important for all this code to be fairly permissive in what recording it accepts. For example, as written, I think this code would crash if an RPC is made within a RecordingStructured
span, and then the remote child is dynamically changed via \tracez
to RecordingVerbose
: a verbose, multi-span recording will come back and the client span will choke on it.
Please add the following test with this occassion:
// Test that a RecordingStructured parent does not panic when asked to ingest a
// remote verbose recording. Ingesting a recording of different type is unusual,
// since children are created with the parent's recording mode, but it can
// happen if the child's recording mode was changed dynamically.
func TestRemoteSpanWithDifferentRecordingMode(t *testing.T) {
tr := NewTracer()
s1 := tr.StartSpan("p", WithRecording(tracingpb.RecordingStructured))
s2 := tr.StartSpan("c", WithRemoteParentFromSpanMeta(s1.Meta()), WithRecording(tracingpb.RecordingVerbose))
s3 := tr.StartSpan("cc", WithParent(s2), WithRecording(tracingpb.RecordingVerbose))
s3.Finish()
r := s2.FinishAndGetConfiguredRecording()
require.NotPanic(s1.ImportRemoteRecording(r))
s1.Finish()
}
pkg/util/tracing/span_test.go
line 408 at r1 (raw file):
} // TestRecordingFinishWithMaxSpans is a regression test to ensure that a verbose
let's rephrase this test as Aditya was suggesting, and then I don't have to nitpick about this comment.
Call it TestRecordingDowngradesToStructuredIfTooBig
and check that you end up with a recording with just one or two spans or whatever, and all the structured events.
d768e25
to
b94c14f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @andreimatei)
pkg/util/tracing/crdbspan.go
line 547 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think a better fix is to get rid of the assertion in the structured case below, and make that code deal with multiple spans too. That's how it was before #81079, and I'm not sure if there was a specific reason why it changed.
I think it's important for all this code to be fairly permissive in what recording it accepts. For example, as written, I think this code would crash if an RPC is made within a
RecordingStructured
span, and then the remote child is dynamically changed via\tracez
toRecordingVerbose
: a verbose, multi-span recording will come back and the client span will choke on it.Please add the following test with this occassion:
// Test that a RecordingStructured parent does not panic when asked to ingest a // remote verbose recording. Ingesting a recording of different type is unusual, // since children are created with the parent's recording mode, but it can // happen if the child's recording mode was changed dynamically. func TestRemoteSpanWithDifferentRecordingMode(t *testing.T) { tr := NewTracer() s1 := tr.StartSpan("p", WithRecording(tracingpb.RecordingStructured)) s2 := tr.StartSpan("c", WithRemoteParentFromSpanMeta(s1.Meta()), WithRecording(tracingpb.RecordingVerbose)) s3 := tr.StartSpan("cc", WithParent(s2), WithRecording(tracingpb.RecordingVerbose)) s3.Finish() r := s2.FinishAndGetConfiguredRecording() require.NotPanic(s1.ImportRemoteRecording(r)) s1.Finish() }
Done.
pkg/util/tracing/span_test.go
line 408 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
let's rephrase this test as Aditya was suggesting, and then I don't have to nitpick about this comment.
Call itTestRecordingDowngradesToStructuredIfTooBig
and check that you end up with a recording with just one or two spans or whatever, and all the structured events.
Done. We don't get all of the structured events because of the size of the ring buffer, but I've checked that we get as many as possible given the size of the buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but let's wait for Aditya to seewhat he says about that assertion. I think at some point he was relying on it, but maybe that changed.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @andreimatei, and @stevendanna)
pkg/util/tracing/span_test.go
line 460 at r2 (raw file):
s3.Finish() r := s2.FinishAndGetConfiguredRecording() require.NotPanics(t, func() { s1.ImportRemoteRecording(r) })
please also test that s1's recording only has one span in it
In cockroachdb#81079, we added an assertion that failed if a child recording of a RecordingStructured span had more than one span recording. However, this is problematic for couple of reasons: 1) The assertion was on a code path shared with RecordingVerbose spans; and, 2) A RecordingStructured span can have a RecordingVerbose child. The RecordingVerbose child is likely to have more than one span recording. As a result, we have seen roachtests failing with panic: RecordingStructured has 12 recordings; expected 1 Here, we remove the assertion. Release note: None
b94c14f
to
c66abb3
Compare
bors r=adityamaru, |
Build failed (retrying...): |
Build succeeded: |
In #81079, we added an assertion that failed if a child recording of a
RecordingStructured span had more than one span recording. However,
this is problematic for couple of reasons:
The assertion was on a code path shared with RecordingVerbose
spans; and,
A RecordingStructured span can have a RecordingVerbose child. The
RecordingVerbose child is likely to have more than one span recording.
As a result, we have seen roachtests failing with
Here, we remove the assertion.
Fixes #83502
Release note: None