Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

admission: include more work queue info in tracing and errors #114664

Merged

Conversation

aadityasondhi
Copy link
Collaborator

See individual commits.


admission: inject queue kind into work queue tracing

This patch plumbs down a queue type identifier (QueueKind) into the
WorkQueue to help track down the origin of items in the queue.

Informs: #113990

Release note: None


admission: include work priority in trace and errors

This patch now includes work priority in the WorkQueue traces and
error messages to better help track where work items originate from.

Informs #113990.

Release note: None

@aadityasondhi aadityasondhi requested review from a team and sumeerbhola November 17, 2023 20:11
@aadityasondhi aadityasondhi requested a review from a team as a code owner November 17, 2023 20:11
Copy link

blathers-crl bot commented Nov 17, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@aadityasondhi aadityasondhi added the backport-23.2.x Flags PRs that need to be backported to 23.2. label Nov 17, 2023
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi)


pkg/util/admission/admission.go line 553 at r1 (raw file):

const (
	KVAdmissionQueue QueueKind = iota
  • Should we just dotype QueueKind string, since all we care about is the string representation. The indirection just seems to add code.
  • We can leave it as the empty string for WorkKinds SQLKVResponseWork and SQLSQLResponseWork since there is only one queue each for those. Then work %s deadline expired while waiting in queue: %s would use the QueueKind if non-empty, else fallback to using the WorkKind.String().
  • The strings we need are "kv-regular-cpu-queue", "kv-elastic-cpu-queue", "kv-regular-store-queue", "kv-elastic-store-queue".

@aadityasondhi aadityasondhi force-pushed the 20231116.admission-work-queue-tracing branch 2 times, most recently from 1a83a05 to 32a9af6 Compare November 29, 2023 17:17
Copy link
Collaborator Author

@aadityasondhi aadityasondhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/util/admission/admission.go line 553 at r1 (raw file):

Previously, sumeerbhola wrote…
  • Should we just dotype QueueKind string, since all we care about is the string representation. The indirection just seems to add code.
  • We can leave it as the empty string for WorkKinds SQLKVResponseWork and SQLSQLResponseWork since there is only one queue each for those. Then work %s deadline expired while waiting in queue: %s would use the QueueKind if non-empty, else fallback to using the WorkKind.String().
  • The strings we need are "kv-regular-cpu-queue", "kv-elastic-cpu-queue", "kv-regular-store-queue", "kv-elastic-store-queue".

Done. PTAL

@aadityasondhi aadityasondhi force-pushed the 20231116.admission-work-queue-tracing branch from 32a9af6 to d5002ee Compare November 29, 2023 17:48
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi)


pkg/util/admission/admission.go line 555 at r3 (raw file):

//
// It is left empty for SQL types of WorkKind.
type QueueKind string

do we need to mark this by implementing SafeValue so that it plays well with redaction?


pkg/util/admission/work_queue.go line 380 at r3 (raw file):

	ambientCtx log.AmbientContext,
	workKind WorkKind,
	queueKind QueueKind,

in initWorkQueue if the queueKind is empty, let's initialize q.queueKind to workKind.String(). Then we don't need the if-else block when doing the logging and tracing.


pkg/util/admission/work_queue.go line 815 at r3 (raw file):

		recordAdmissionWorkQueueStats(span, waitDur, q.workKind, q.queueKind, true)
		log.Eventf(ctx, "deadline expired, waited in %s queue for %v",
			q.workKind, waitDur)

The log.Eventf should also use the queueKind.


pkg/util/admission/admissionpb/admission_stats.proto line 23 at r3 (raw file):

  int64 wait_duration_nanos = 1 [(gogoproto.casttype) = "time.Duration"];
  // String representation of admission work kind.
  string work_kind = 2;

we could just rename this as queue_kind and use the same tag number. The queue_kind subsumes the information provided by work_kind.

@aadityasondhi aadityasondhi force-pushed the 20231116.admission-work-queue-tracing branch from d5002ee to 10935cf Compare December 5, 2023 21:57
Copy link
Collaborator Author

@aadityasondhi aadityasondhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/util/admission/admission.go line 555 at r3 (raw file):

Previously, sumeerbhola wrote…

do we need to mark this by implementing SafeValue so that it plays well with redaction?

Done.


pkg/util/admission/work_queue.go line 380 at r3 (raw file):

Previously, sumeerbhola wrote…

in initWorkQueue if the queueKind is empty, let's initialize q.queueKind to workKind.String(). Then we don't need the if-else block when doing the logging and tracing.

Done.


pkg/util/admission/work_queue.go line 815 at r3 (raw file):

Previously, sumeerbhola wrote…

The log.Eventf should also use the queueKind.

Done.


pkg/util/admission/admissionpb/admission_stats.proto line 23 at r3 (raw file):

Previously, sumeerbhola wrote…

we could just rename this as queue_kind and use the same tag number. The queue_kind subsumes the information provided by work_kind.

Done.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi)


pkg/util/admission/work_queue.go line 815 at r3 (raw file):

Previously, aadityasondhi (Aaditya Sondhi) wrote…

Done.

I was suggesting the same string as prior to this PR: "deadline expired, waited in %s queue for %v" with the queueKind. The workKind is always inferrable from the queue kind.


pkg/util/admission/work_queue.go line 850 at r5 (raw file):

	}
	if queueKind == "" {
		queueKind = QueueKind(workKind.String())

why do we need this if WorkQueue.queueKind is always being passed here and it is already initialized?

This patch plumbs down a queue type identifier (`QueueKind`) into the
`WorkQueue` to help track down the origin of items in the queue.

Informs: cockroachdb#113990

Release note: None
This patch now includes work priority in the `WorkQueue` traces and
error messages to better help track where work items originate from.

Informs cockroachdb#113990.

Release note: None
@aadityasondhi aadityasondhi force-pushed the 20231116.admission-work-queue-tracing branch from 10935cf to 87f5496 Compare December 6, 2023 03:29
Copy link
Collaborator Author

@aadityasondhi aadityasondhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes requested, PTAL again.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/util/admission/work_queue.go line 815 at r3 (raw file):

Previously, sumeerbhola wrote…

I was suggesting the same string as prior to this PR: "deadline expired, waited in %s queue for %v" with the queueKind. The workKind is always inferrable from the queue kind.

Ah I miss understood, yeah makes sense.


pkg/util/admission/work_queue.go line 850 at r5 (raw file):

Previously, sumeerbhola wrote…

why do we need this if WorkQueue.queueKind is always being passed here and it is already initialized?

Sorry, that was me just being hasty. You are right, we always set this in initWorkQueue.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r8, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi)

Copy link
Collaborator Author

@aadityasondhi aadityasondhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r=sumeerbhola

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

@craig
Copy link
Contributor

craig bot commented Dec 7, 2023

Build succeeded:

@craig craig bot merged commit da982d5 into cockroachdb:master Dec 7, 2023
4 checks passed
Copy link

blathers-crl bot commented Dec 7, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from ba253a2 to blathers/backport-release-23.2-114664: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.


error creating merge commit from ba253a2 to blathers/backport-release-23.2.0-rc-114664: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.0-rc failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants