Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stmtdiagnostics: save the bundle on a statement timeout #88080

Merged
merged 1 commit into from
Sep 30, 2022

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Sep 16, 2022

Previously if the traced statement is canceled due to a statement timeout, the statement bundle would be created but would fail on the insertion into the system table. This is suboptimal because we already did all the work to collect the bundle as well as it might be desired to see the partial trace, so this commit makes it so that the bundle is saved correctly.

Fixes: #73477.

Epic: CRDB-14510

Release note (bug fix): Previously, when a statement bundle was collected for a query that results in an error due to a statement_timeout, the bundle would not be saved, and this is now fixed.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)


pkg/sql/stmtdiagnostics/statement_diagnostics.go line 483 at r1 (raw file):

	var diagID CollectedInstanceID
	if ctx.Err() != nil {
		// If the context was canceled (likely due to a statement timeout), we

What if it wasn't a timeout? Are we potentially swallowing/hiding some real error? What if we get here and the statement hasn't timed out yet?

Copy link
Contributor

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)


pkg/sql/stmtdiagnostics/statement_diagnostics.go line 483 at r1 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

What if it wasn't a timeout? Are we potentially swallowing/hiding some real error? What if we get here and the statement hasn't timed out yet?

More specifically if the code gets past this and then times out would the unit test fail?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)


pkg/sql/stmtdiagnostics/statement_diagnostics.go line 483 at r1 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

More specifically if the code gets past this and then times out would the unit test fail?

Maybe we should at least log the error?

@yuzefovich yuzefovich force-pushed the stmt-diag-timeout branch 2 times, most recently from cb27705 to 53316a8 Compare September 27, 2022 22:37
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newly-added unit test exposed a minor bug (namely that the conditional request is not removed from the registry even after it was satisfied), which I'll fix separately. In order to make CI happy I had to move the test a bit.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @rytaft)


pkg/sql/stmtdiagnostics/statement_diagnostics.go line 483 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Maybe we should at least log the error?

This code is only executed at the very end, after the query has completed running and we assembled everything for the bundle. If somehow the context gets an error only when we got here, it seems ok to proceed. It's true that we don't know the true reason for why the context is busted, so I added a deadline of 10 seconds for inserting the bundle. I also added logging of the error.

Previously if the traced statement is canceled due to a statement
timeout, the statement bundle would be created but would fail on the
insertion into the system table. This is suboptimal because we already
did all the work to collect the bundle as well as it might be desired to
see the partial trace, so this commit makes it so that the bundle is
saved correctly.

Release note (bug fix): Previously, when a statement bundle was
collected for a query that results in an error due to
a `statement_timeout`, the bundle would not be saved, and this is now
fixed.
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

@yuzefovich
Copy link
Member Author

TFTRs!

bors r+

@craig craig bot merged commit 5990da6 into cockroachdb:master Sep 30, 2022
@craig
Copy link
Contributor

craig bot commented Sep 30, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stmtdiagnostics: save bundle even in case of statement timeout
4 participants