Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add contention_events to cluster_execution_insights #90660

Merged
merged 1 commit into from
Oct 31, 2022
Merged

sql: add contention_events to cluster_execution_insights #90660

merged 1 commit into from
Oct 31, 2022

Conversation

j82w
Copy link
Contributor

@j82w j82w commented Oct 25, 2022

The original contention column will remain
to make query operations faster. The events
are being put into a json column because it's
possible there could be multiple blocking events
for a single statement. The json column avoids the complexity of adding another table and keeping it
in sync with the insights table.

The table can be joined with index_columns and tables to get the database name, table name, and index name that contention occurred on. This does not contain the blocking statement information, and the blocking fingerprint id.

closes: #88561

Release note (sql change): Adds contention_events
to cluster_execution_insights. This is used
to see which transaction is blocking the specific
statement.

@j82w j82w requested review from a team October 25, 2022 21:38
@j82w j82w requested a review from a team as a code owner October 25, 2022 21:38
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@matthewtodd matthewtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, nice! Insights layer LGTM, would like to hear especially from queries.

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits from me!
As Matthew mentioned, get a review from queries to make sure there isn't any other places that you should add the new events

Reviewed 9 of 21 files at r1, 3 of 7 files at r2, 6 of 9 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w)


pkg/sql/crdb_internal.go line 6409 at r3 (raw file):

	exec_node_ids              INT[] NOT NULL,
	contention                 INTERVAL,
	contention_events				   JSONB,

nit: misalignment here


pkg/sql/execinfrapb/component_stats.proto line 141 at r3 (raw file):

  optional util.optional.Uint num_internal_seeks = 8 [(gogoproto.nullable) = false];

  // contention_events hit at the statement level

nit: period


pkg/sql/execstats/stats.go line 36 at r3 (raw file):

// contention time from the given recording or, if the recording is nil, from
// the tracing span from the context. All contention events found in the trace
// are included.

can you update this comment to reflect the new return values?


pkg/sql/sqlstats/insights/insights.proto line 104 at r3 (raw file):

  google.protobuf.Duration contention = 18 [(gogoproto.stdduration) = true];
  repeated string index_recommendations = 19;
  // contention_events hit at the statement level

nit: period


pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_encoding.go line 298 at r3 (raw file):

//		  "type": "object",
//	   [{
//	  			"blocking_txn_id": { "type": "string" },

is this right? you're using camelCase on all other keys and I can see your encodeJSON also using camelCase for this key


pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_encoding.go line 301 at r3 (raw file):

//	 			"durationMs": { "type": "number" },
//					"rawIndexID": { "type": "number" },
//					"tableId": { "type": "number" }

nit: misalignment
curious: why "raw" index id? seems like you use just indexID as the key on your encodeJSON function

@j82w
Copy link
Contributor Author

j82w commented Oct 26, 2022

pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_encoding.go line 301 at r3 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: misalignment
curious: why "raw" index id? seems like you use just indexID as the key on your encodeJSON function

I copied an outdated json. I will fix it.

@j82w
Copy link
Contributor Author

j82w commented Oct 26, 2022

pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_encoding.go line 298 at r3 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

is this right? you're using camelCase on all other keys and I can see your encodeJSON also using camelCase for this key

Nope, the code is correct. I accidentally copied the wrong json. It's now fixed.

@j82w
Copy link
Contributor Author

j82w commented Oct 26, 2022

pkg/sql/execstats/stats.go line 36 at r3 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

can you update this comment to reflect the new return values?

Done.

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! :lgtm:

Reviewed 7 of 7 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @j82w)

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I have some nits, but :lgtm: otherwise.

Reviewed 9 of 21 files at r1, 3 of 7 files at r2, 4 of 9 files at r3, 6 of 7 files at r4, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @j82w)


pkg/sql/crdb_internal.go line 6494 at r4 (raw file):

		contentionEvents := tree.DNull
		if insight.Statement.ContentionEvents != nil && len(insight.Statement.ContentionEvents) > 0 {

nit: no need for the non-nil check - len and cap of nil slice is 0.


pkg/sql/crdb_internal.go line 6539 at r4 (raw file):

		))

		if err != nil {

nit: hm, we have some inconsistency in how we're handling errors in this function - in some cases we use CombineErrors and don't short-circuit, in others we short-circuit. I think we should choose one or the other.


pkg/sql/colexecop/operator.go line 79 at r4 (raw file):

	// GetCumulativeContentionTime returns the amount of time KV reads spent
	// contending. It must be safe for concurrent use.
	GetCumulativeContentionTime() (time.Duration, []roachpb.ContentionEvent)

nit: we should rename this method, e.g. to GetContentionInfo.


pkg/sql/execstats/traceanalyzer.go line 127 at r4 (raw file):

	NetworkMessages       int64
	ContentionTime        time.Duration
	ContentionEvent       []roachpb.ContentionEvent

nit: do ContentionEvents.


pkg/sql/execstats/traceanalyzer.go line 161 at r4 (raw file):

	s.NetworkMessages += other.NetworkMessages
	s.ContentionTime += other.ContentionTime
	if len(other.ContentionEvent) > 0 {

nit: you can omit the length check here too and just use append - it'll have an exactly the same behavior.


pkg/sql/execstats/traceanalyzer.go line 254 at r4 (raw file):

		a.nodeLevelStats.KVTimeGroupedByNode[instanceID] += stats.KV.KVTime.Value()
		a.nodeLevelStats.ContentionTimeGroupedByNode[instanceID] += stats.KV.ContentionTime.Value()
		if len(stats.KV.ContentionEvents) > 0 {

nit: ditto


pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_encoding.go line 297 at r4 (raw file):

//		  "title": "system.statement_statistics.contention_events",
//		  "type": "object",
//	   [{

nit: misalignment - seems like a couple of spaces are missing.

The original contention column will remain
to make query operations faster. The events
are being put into a json column because it's
possible there could be multiple blocking events
for a single statement. The json column avoids the
complexity of adding another table and keeping it
in sync with the insights table.

The table can be joined with index_columns and tables
to get the database name, table name, and index name
that contention occurred on. This does not contain
the blocking statement information, and the blocking
fingerprint id.

closes #88561

Release note (sql change): Adds contention_events
to cluster_execution_insights. This is used
to see which transaction is blocking the specific
statement.
@j82w
Copy link
Contributor Author

j82w commented Oct 31, 2022

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 31, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Oct 31, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add statement level contention tracking for insights
5 participants