
admission: introduce soft slots for speculative concurrency #78519

Merged
merged 1 commit on Jul 6, 2022

Conversation

sumeerbhola
Collaborator

This is for using additional threads for Pebble compaction compression
and later may be used for adjusting the number of concurrent compactions.

Since these slots are not acquired or released in a very fine grained
manner, we've chosen to model these in a way that allows for
over-commitment. The SoftSlotGranter should be used for acquiring
and releasing such slots, instead of WorkQueue, since there is no
queueing for such slots.

Release note: None

@sumeerbhola sumeerbhola requested review from bananabrick and a team March 25, 2022 18:55
@cockroach-teamcity
Member

This change is Reviewable

@sumeerbhola sumeerbhola marked this pull request as draft March 25, 2022 18:56
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick)


pkg/util/admission/granter.go, line 481 at r1 (raw file):

	usedSoftSlots       int
	totalSlots          int
	totalSoftSlots      int

don't look at this yet -- I added totalSoftSlots in a last iteration of this code and it is somewhat bolted on without any clear invariants relating totalSlots and totalSoftSlots.

@sumeerbhola sumeerbhola force-pushed the admission_compaction branch from e97d670 to 451f7c6 Compare March 28, 2022 17:52
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick)


pkg/util/admission/granter.go, line 481 at r1 (raw file):

Previously, sumeerbhola wrote…

don't look at this yet -- I added totalSoftSlots in a last iteration of this code and it is somewhat bolted on without any clear invariants relating totalSlots and totalSoftSlots.

cleaned this up, so it is ready for a look.

Contributor

@bananabrick bananabrick left a comment

Leaving some questions here for my own understanding, which I felt was lacking as I was performing the benchmarking.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go, line 470 at r2 (raw file):

// causes a delay in returning slots to admission control, and we don't want
// that delay to cause admission delay for normal work. Hence, we model slots
// granted to background activities as "soft-slots". Granting a soft-slot has

If I understand this correctly, then it is possible that at some point in time totalSlots < usedSlots + usedSoftSlots, right?

For example:
We could have one total slot available, and we could have a get for a soft slot, which would succeed, followed by a get for a regular slot, which would also succeed.

It seems as if we try to enforce this invariant: usedSoftSlots+usedSlots <= totalSlots at soft slot grant time, but it doesn't hold throughout the lifetime of the program.


pkg/util/admission/granter.go, line 590 at r2 (raw file):

		spareSoftSlots = spareSlots
	}
	if spareSoftSlots <= 0 {

spareSlots < 0 could evaluate to true right?


pkg/util/admission/granter.go, line 1508 at r2 (raw file):

	// So the natural behavior of the slot adjustments does not guarantee
	// totalSlots >= totalSoftSlots. We add logic to impose this on top of the

Is this the only invariant which needs to be maintained throughout the lifetime of a program?

Let's say this invariant wasn't true. I'm curious about the bad behaviours which would occur.

totalSoftSlots is only used in tryGetSoftSlots for any decision making:

	spareSoftSlots := sg.totalSoftSlots - sg.usedSoftSlots
	spareSlots := sg.totalSlots - (sg.usedSlots + sg.usedSoftSlots)
	if spareSlots < spareSoftSlots {
		spareSoftSlots = spareSlots
	}

Since, spareSoftSlots is set to the minimum of the spareSoftSlots, and spareSlots, is the totalSlots >= totalSoftSlots invariant required?

I think we want to make sure that totalSoftSlots is actually decremented when it should be. Which leads me to the question, why do we have a used > 0 check in tryDecreaseSlots.

Collaborator Author

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go, line 470 at r2 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

If I understand this correctly, then it is possible that at some point in time totalSlots < usedSlots + usedSoftSlots, right?

For example:
We could have one total slot available, and we could have a get for a soft slot, which would succeed, followed by a get for a regular slot, which would also succeed.

It seems as if we try to enforce this invariant: usedSoftSlots+usedSlots <= totalSlots at soft slot grant time, but it doesn't hold throughout the lifetime of the program.

Correct. We are willing to over-subscribe for the sake of not blocking regular work. What we tend to observe when there is a high amount of regular work is that the slot utilization stays high (i.e., usedSlots/totalSlots is close to 1). The hope is that in that case totalSoftSlots will be < totalSlots and no soft slots will be granted.

But I've realized there is some dubious thinking here, which I need to correct.
There ought to be two slot limits: totalHighLoadSlots, totalLowLoadSlots, with totalHighLoadSlots >= totalLowLoadSlots. The code tries to do something like this in the slot adjustment logic but isn't quite right.
Then there are two kinds of slot consumers: regular and optional background activities, which will have usedRegularSlots, usedSoftSlots.
The invariant at granting a regular slot: usedRegularSlots <= totalHighLoadSlots
The invariant at granting a soft slot: usedRegularSlots + usedSoftSlots <= totalLowLoadSlots

This isn't just a name change from what is here. We are incorrectly doing usedRegularSlots + usedSoftSlots <= totalHighLoadSlots and usedSoftSlots <= totalLowLoadSlots when granting soft slots. And we are incorrectly doing calls like try{Increase,Decrease}Slots(usedSoftSlots, totalLowLoadSlots).


pkg/util/admission/granter.go, line 590 at r2 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

spareSlots < 0 could evaluate to true right?

Correct

@sumeerbhola sumeerbhola force-pushed the admission_compaction branch from 451f7c6 to 1c6b211 Compare April 12, 2022 22:46
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go, line 470 at r2 (raw file):

Previously, sumeerbhola wrote…

Correct. We are willing to over-subscribe for the sake of not blocking regular work. What we tend to observe when there is a high amount of regular work is that the slot utilization stays high (i.e., usedSlots/totalSlots is close to 1). The hope is that in that case totalSoftSlots will be < totalSlots and no soft slots will be granted.

But I've realized there is some dubious thinking here, which I need to correct.
There ought to be two slot limits: totalHighLoadSlots, totalLowLoadSlots, with totalHighLoadSlots >= totalLowLoadSlots. The code tries to do something like this in the slot adjustment logic but isn't quite right.
Then there are two kinds of slot consumers: regular and optional background activities, which will have usedRegularSlots, usedSoftSlots.
The invariant at granting a regular slot: usedRegularSlots <= totalHighLoadSlots
The invariant at granting a soft slot: usedRegularSlots + usedSoftSlots <= totalLowLoadSlots

This isn't just a name change from what is here. We are incorrectly doing usedRegularSlots + usedSoftSlots <= totalHighLoadSlots and usedSoftSlots <= totalLowLoadSlots when granting soft slots. And we are incorrectly doing calls like try{Increase,Decrease}Slots(usedSoftSlots, totalLowLoadSlots).

I've changed the code along these lines and added a comment that makes a distinction between the squishiness of soft slots and the idea of moderate load vs high load slots.

Contributor

@bananabrick bananabrick left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go, line 475 at r3 (raw file):
Thanks for the extra documentation. It answers some of my questions.

I'll think about this:

// the slot is still "using" it, while a soft-slot is "squishy" and in some
// cases we can pretend that it is not being used. Say we are allowed

@sumeerbhola
Collaborator Author

Recording what we discovered in experiments run by @bananabrick with kv50.

The total-moderate-load-slot mechanism doesn't work well in that even though CPU utilization is consistently high, and runnable counts are usually on the high side, compactions are sometimes able to get slots for concurrent compression because the used (moderate-load and regular) slots can be low.
An example is the following:

2022-05-16 18:19:59.624756486 +0000 UTC m=+117.659591841 [runnable count 372 threshold*procs 256]
2022-05-16 18:19:59.624779063 +0000 UTC m=+117.659614417 [very overloaded 254 164 0 1]

The latter log format is [<load> <total high load slots> <total moderate load slots> <used regular slots> <used soft slots>].
Note that even though the runnable count is high, of the (254, 164), (total-high-load, total-moderate-load) slots respectively, only 1 slot is being used.
In hindsight, I should have foreseen this problem: the slot mechanism behaves in a bi-modal manner in nodes that do both KV and SQL processing. While there is backlogged KV work, the slot usage is high, and blocks all SQL work, but eventually all callers have done their KV processing and are queued up for SQL work. The latter causes bursts of grants (because it uses tokens), gated only by the grant-chain mechanism, during which runnable count is high but used (KV) slots are low.

This situation doesn't last long because soon follow-up KV work will start queueing (once the SQL processing is done), but that is exactly when we have low slot usage and high CPU utilization and will mistakenly let through concurrent compression for compactions. This behavior is also the cause of the wide fluctuations in total-high-load slots that one sees in many workloads -- during this time when slot usage is low but runnable counts are high, we start decreasing the total-high-load slots (we've been ok with this since it still gets the job done -- there is nothing specifically bad in having fluctuations).

One observation here is that we can afford to be more conservative in calculating total-moderate-load slots since we don't care about saturating CPU for the less important work that is controlled by these slots. So we could use a slow reacting and conservative signal to decide on the value of total-moderate-load slots.
A second observation is that the current work type that is being controlled by these total-moderate-load slots is 100% cpu (IO time/CPU time ratio is 0), because it is concurrent compression for compactions. This will not be true for future things we integrate with total-moderate-load slots, like ExportRequest (but hopefully cache hit rates are high, and we'll have fixed any false contention on locks/latches since these requests are in the past).

Sketch of approach: Keep the current total-moderate-load slot adjustment logic, but add an additional cap on the value using the runnable goroutine observations, since that observation accurately reflects that the CPU load is high even when slot usage is low.

  • Compute an exponentially smoothed runnable goroutine value (using the 1ms sampling rate) which gives enough weight to the last 1s of samples. Let this value be G_r.
  • runnable_threshold*procs - G_r is roughly the headroom we have for high utilization. But since we are doing this for total-moderate-load slots, compute the headroom as H=runnable_threshold*procs/K - G_r, where say K=2. Some of the runnable may be because of soft slots we have already allocated. It's probably ok to double count them (since we are being conservative).

@bananabrick bananabrick force-pushed the admission_compaction branch from 1c6b211 to 9827839 Compare May 24, 2022 02:41
@bananabrick
Contributor

Just rebased this.

@bananabrick bananabrick force-pushed the admission_compaction branch 5 times, most recently from 821a044 to 0b5f6eb Compare May 26, 2022 20:25
@bananabrick bananabrick marked this pull request as ready for review May 26, 2022 20:29
@bananabrick bananabrick requested a review from a team May 26, 2022 20:29
@bananabrick bananabrick changed the title [DNM] admission: introduce soft slots for speculative concurrency admission: introduce soft slots for speculative concurrency May 27, 2022
Contributor

@bananabrick bananabrick left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go line 1558 at r5 (raw file):

	threshold := int(KVSlotAdjusterOverloadThreshold.Get(&kvsa.settings.SV))

	// Moderate slots can be adjusted using two separate mechanisms. We tune the

moderateSlotsClamp is what I came up with to make the tuning of totalModerateSlots testable. I'm happy to learn about a better way to solve this.


pkg/util/admission/granter.go line 394 at r6 (raw file):

	totalModerateLoadSlots int
	moderateSlotsClamp     int
	failedSoftSlotsGet     bool

I'm unsure that the optimization with failedSoftSlotsGet is necessary. It seems like it's handling a specific case, and it didn't seem to do much during benchmarking.


pkg/util/admission/granter.go line 1600 at r6 (raw file):

		// also handle the case where the used slots are a bit less than total
		// slots, since callers for soft slots don't block.
		if (usedSlots >= total || (usedSlots >= closeToTotalSlots && kvsa.granter.failedSoftSlotsGet)) &&

I'm not con

@bananabrick bananabrick force-pushed the admission_compaction branch from 0b5f6eb to 89ca827 Compare June 1, 2022 21:30
Collaborator Author

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


-- commits line 4 at r7:
threads


-- commits line 5 at r7:
and later may be used for ...


-- commits line 8 at r7:
in a way


-- commits line 15 at r7:
leftover comment. I am assuming the PR description will talk about the approach where this runnable stuff will come up.


pkg/util/admission/granter.go line 1558 at r5 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

moderateSlotsClamp is what I came up with to make the tuning of totalModerateSlots testable. I'm happy to learn about a better way to solve this.

Is the issue here mainly about printing kvSlotAdjuster state for datadriven testing output?
I don't think we should delay the clamping. It is confusing, and it is possible that something that was clamped to 0 because moderateSlotsClamp was negative now rises to 1, which means we could use up 1 CPU on compression when we don't actually want to.

There's literally one of these structs in a whole CockroachDB process, so nothing here needs to be space efficient. We can keep whatever extra state we want (like ioLoadListener.adjustTokenResult does). So how about keeping the following in the kvSlotAdjuster state

totalModerateSlotsPreClamp
runnableEWMA
moderateSlotsClamp

and then write min(totalModerateSlotsPreClamp, moderateSlotsClamp) to granter.totalModerateSlots.

The next time the additive increase/decrease logic will of course start with granter.totalModerateSlots, as you have here.


pkg/util/admission/granter.go line 394 at r6 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

I'm unsure that the optimization with failedSoftSlotsGet is necessary. It seems like it's handling a specific case, and it didn't seem to do much during benchmarking.

responded below


pkg/util/admission/granter.go line 1600 at r6 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

I'm not con

?

I think it's fine to remove failedSoftSlotsGet and closeToTotalSlots, since you are not seeing it being useful in experiments.


pkg/util/admission/granter.go line 393 at r7 (raw file):

	totalHighLoadSlots     int
	totalModerateLoadSlots int
	moderateSlotsClamp     int

Can you add code comments for moderateSlotsClamp, failedSoftSlotsGet, runnableEWMA, runnableAlpha.

Also move all these into kvSlotAdjuster. The only reason kvSlotAdjuster was reaching into slotGranter for the total and used slots is that those were sufficient for the slot adjustment logic, and those were needed by the slotGranter for its granting logic. These new fields are only needed for adjustment, so they belong in kvSlotAdjuster.


pkg/util/admission/granter.go line 457 at r7 (raw file):

	}
	sg.usedSoftSlots += allocatedSlots
	sg.usedSlotsMetric.Update(int64(sg.usedSoftSlots))

usedSoftSlotsMetric


pkg/util/admission/granter.go line 465 at r7 (raw file):

	defer sg.coord.mu.Unlock()
	sg.usedSoftSlots -= count
	sg.usedSlotsMetric.Update(int64(sg.usedSoftSlots))

usedSoftSlotsMetric


pkg/util/admission/granter.go line 741 at r7 (raw file):

	SQLStatementLeafStartWorkSlots: 100,    /* arbitrary, and unused */
	SQLStatementRootStartWorkSlots: 100,    /* arbitrary, and unused */
	RunnableAverageAlpha:           0.991,  /* Gives enough weight to at least a few hundred samples. */

I think the convention is for alpha to be the multiplier for the latest sample, so ewma = sample*alpha + ewma*(1-alpha).

There's a bit of a problem here in that the CPULoad can sometimes be reported at 250ms intervals and then we don't want this 0.991 value.
We could have something like
// RunnableAverageAlphaPerMillisecond (R) is used to compute the alpha for the exponentially weighted runnable count. The actual alpha is R * sampleDurationInMillis, clamped by [0.001,0.5]. Note that this value is made configurable via options only for unit testing purposes.
RunnableAverageAlphaPerMillisecond: 0.009

func (kvsa *kvSlotAdjuster) CPULoad(runnable int, procs int, samplePeriod time.Duration) {
	alpha := RunnableAverageAlphaPerMillisecond * float64(samplePeriod/time.Millisecond)
	if alpha > 0.5 {
		alpha = 0.5
	} else if alpha < 0.001 {
		alpha = 0.001
	}
	...
}

pkg/util/admission/granter.go line 1568 at r7 (raw file):

	// the next time CPULoad is called. Since CPULoad is called at least 1000 times
	// a second, we're only delaying the adjustment of totalModerateSlots by 1ms
	// which is fine.

Can you clean up #78519 (comment) and include most of its substance, including the justification for this design, in a code comment here.

@sumeerbhola sumeerbhola requested a review from a team June 2, 2022 19:38
@bananabrick bananabrick force-pushed the admission_compaction branch from 89ca827 to c711888 Compare June 5, 2022 20:51
Contributor

@bananabrick bananabrick left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


-- commits line 15 at r7:

Previously, sumeerbhola wrote…

leftover comment. I am assuming the PR description will talk about the approach where this runnable stuff will come up.

It was from a commit squash. Updated the commit message.

@bananabrick bananabrick force-pushed the admission_compaction branch from c711888 to 7507986 Compare June 6, 2022 18:47
@bananabrick bananabrick force-pushed the admission_compaction branch from 7507986 to 870072a Compare June 6, 2022 20:31
@sumeerbhola sumeerbhola requested a review from a team June 6, 2022 20:31
This is for using additional threads for Pebble compaction compression
and later may be used for adjusting the number of concurrent compactions.

Since these slots are not acquired or released in a very fine grained
manner, we've chosen to model these in a way that allows for
over-commitment. The SoftSlotGranter should be used for acquiring
and releasing such slots, instead of WorkQueue, since there is no
queueing for such slots.

Moderate slots can be adjusted using two separate mechanisms. We tune the
moderate slots in an additive increase/additive decrease manner based on
the current CPU load, and we also use a historic average of the runnable
count to clamp down on the moderate slots.

We use the runnable count average because work not accounted for by admission
control can still use up CPU resources and cause queueing of runnable
goroutines.

Release note: None
@bananabrick bananabrick force-pushed the admission_compaction branch from 870072a to 3f1b12b Compare June 6, 2022 22:05
Contributor

@bananabrick bananabrick left a comment

Reviewed 1 of 2 files at r8, 1 of 3 files at r9, 2 of 2 files at r10, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


pkg/util/admission/granter.go line 1299 at r10 (raw file):

			switch g := coord.granters[i].(type) {
			case *slotGranter:
				kvsa := coord.cpuLoadListener.(*kvSlotAdjuster)

Accessing the kvSlotAdjuster like this doesn't seem great.

@bananabrick bananabrick requested a review from a team June 6, 2022 23:14
Collaborator Author

@sumeerbhola sumeerbhola left a comment

:lgtm:

Reviewed 1 of 2 files at r8, 1 of 3 files at r9, 2 of 2 files at r10, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @bananabrick and @sumeerbhola)


-- commits line 18 at r10:
can you change this phrase to the following, since the SQL work is accounted for by admission control -- it's just that the accounting is indirect and isn't reflected in the KV slot usage:

... not accounted for by KV admission control slot usage can still ...


pkg/util/admission/granter.go line 1568 at r7 (raw file):

Previously, sumeerbhola wrote…

Can you clean up #78519 (comment) and include most of its substance, including the justification for this design, in a code comment here.

For the future, please add "Done" to the comments that have been addressed so it's clearer when doing review passes (I'm assuming you are using Reviewable).


pkg/util/admission/granter.go line 1299 at r10 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

Accessing the kvSlotAdjuster like this doesn't seem great.

Just add a comment like
// kvSlotAdjuster is the only production implementation of CPULoadListener for nodes with a KV layer.

@sumeerbhola sumeerbhola requested a review from irfansharif June 8, 2022 20:43
@sumeerbhola
Collaborator Author

Added @irfansharif for awareness of this change -- moderate load slots may be something we end up enhancing and using for less important KV work too, if we find that scheduling latency is to blame for higher latency of more important user facing traffic.

@bananabrick github will not let me approve this, because I opened the PR and did the initial moderate load/soft slots changes. But I am confident that the spirit of code reviewing has been fully met since we've reviewed each other's changes. So if github lets you stamp an approval, I don't think there is a need to wait.

Contributor

@bananabrick bananabrick left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @bananabrick, @irfansharif, and @sumeerbhola)


pkg/util/admission/granter.go line 1568 at r7 (raw file):

For the future, please add "Done" to the comments that have been addressed so it's clearer when doing review passes (I'm assuming you are using Reviewable).

Yea, Reviewable used to do that automatically. I presume it's because this was your PR, which I was modifying, so the mechanics were different.

@abarganier abarganier removed the request for review from a team June 22, 2022 20:16
@bananabrick
Contributor

bors r+

@craig
Contributor

craig bot commented Jul 6, 2022

Build failed (retrying...):

@craig
Contributor

craig bot commented Jul 6, 2022

Build succeeded:

@craig craig bot merged commit 264cd04 into cockroachdb:master Jul 6, 2022
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 20, 2023
We originally introduced these notions in admission control (cockroachdb#78519) for
additional threads for Pebble compaction compression. We envisioned
granting these "squishy" slots to background activities, permitting work
only under periods of low load. In working through cockroachdb#86638 (as part of
cockroachdb#75066), we observed experimentally that the moderate-slots count was
not sensitive enough to scheduling latency, and consequently to the latency
observed by foreground traffic. Elastic CPU tokens, the kind now being
used for backups, offer an alternative to soft slots. We've since
replaced uses of soft slots with elastic CPU tokens.

This PR just removes the now-dead code around soft/moderate load
slots (it's better to minimize the number of mechanisms in the admission
package).

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 23, 2023
We originally introduced these notions in admission control (cockroachdb#78519) for
additional threads for Pebble compaction compression. We envisioned
granting these "squishy" slots to background activities, permitting work
only under periods of low load. In working through cockroachdb#86638 (as part of
cockroachdb#75066), we observed experimentally that the moderate-slots count was
not sensitive enough to scheduling latency, and consequently to the latency
observed by foreground traffic. Elastic CPU tokens, the kind now being
used for backups, offer an alternative to soft slots. We've since
replaced uses of soft slots with elastic CPU tokens.

This PR just removes the now-dead code around soft/moderate load
slots (it's better to minimize the number of mechanisms in the admission
package). Fixes cockroachdb#95590.

Release note: None
craig bot pushed a commit that referenced this pull request Jan 23, 2023
95590: admission: remove soft/moderate load slots r=irfansharif a=irfansharif

We originally introduced these notions in admission control (#78519) for additional threads for Pebble compaction compression. We envisioned granting these "squishy" slots to background activities, permitting work only under periods of low load. In working through #86638 (as part of #75066), we observed experimentally that the moderate-slots count was not sensitive enough to scheduling latency, and consequently to the latency observed by foreground traffic. Elastic CPU tokens, the kind now being used for backups, offer an alternative to soft slots. We've since replaced uses of soft slots with elastic CPU tokens.

This PR just removes the now-dead code around soft/moderate load slots (it's better to minimize the number of mechanisms in the admission package). Fixes #88032.

Release note: None

---

First commit is from #95007.

Co-authored-by: irfan sharif <[email protected]>