Radix-sorted, lookup-free, typesafe, nonblocking, priority-boosting pending tasks #85751

DaveCTurner · 2022-04-07T16:03:11Z

Today the master's pending task queue is just the
PriorityBlockingQueue<Runnable> belonging to the underlying
ThreadPoolExecutor. The reasons for this date back a long way but it
doesn't really reflect the structure of the queue as it exists today. In
particular, we must keep track of batches independently of the queue
itself, and must do various bits of unchecked casting to process
multiple items of the same type at once.

This commit introduces an new queueing mechanism, independent of the
executor's queue, which better represents the conceptual structure of
the master's pending tasks:

Today we use a priority queue to allow important tasks to preempt
less-important ones. However there are only a small number of priority
levels, so it is simpler to maintain a queue for each priority,
effectively replacing the sorting within the priority queue with a
radix sort.
Today when a task is submitted we perform a map lookup to see if it
can be added to an existing batch or not. With this change we allow
client code to create its own dedicated queue of tasks. The entries in
the per-priority-level queues are themselves queues, one for each
executor, representing the batches to be run.
Today each task in the queue holds a reference to its executor, but
the executor used to run a task may belong to a different task in the
same batch. In practice we know they're the same executor (that's how
batches are defined) but we cannot express this knowledge in the type
system so we have to do a bunch of unchecked casting to work around it.
With this change we associate each per-executor queue directly with its
executor, avoiding the need to do all this unchecked casting.
Today the master service must block its thread while waiting for each
task to complete, because otherwise the executor would start to process
the next task in the queue. This makes testing using a
DeterministicTaskQueue harder (see FakeThreadPoolMasterService).
With this change we avoid submitting a task to the ThreadPoolExecutor
until the previous task is complete, which means we can make the
implementation avoid blocking while a task is running and therefore run
the whole production implementation in deterministic tests¹.
Today it's possible for a steady drip of high-priority tasks to starve
low-priority tasks of access to the master for an extended period of
time. With this change we separate the queue of tasks from the queue
which determines the execution order. This allows us to implement more
intelligent execution policies. For instance, if we detect that the
queue has not processed any low-priority tasks for too long then we can
make the decision to boost their priorities².

Relates #81626

Not done yet but this is a step in the right direction. ↩
Not done yet but this is a step in the right direction. ↩

Today when submitting a cluster state update task the caller can indicate that it is not to be included in a batch by using an executor obtained from `ClusterStateTaskExecutor#unbatched()`. This is kind of weird: the caller must not re-use the executor, and also the master service has no practical way to handle this special case any differently. This commit introduces a dedicated API for unbatched tasks, opening the door to some future simplifications. Extracted from elastic#85751

Today in `ClusterStateChanges` we use mocks to capture tasks submitted to the master service in order to compute the relevant cluster state updates ourselves. In fact there's no need to do this, we can use the real master service with a customized executor. This commit does that.

Today when submitting a cluster state update task the caller can indicate that it is not to be included in a batch by using an executor obtained from `ClusterStateTaskExecutor#unbatched()`. This is kind of weird: the caller must not re-use the executor, and also the master service has no practical way to handle this special case any differently. This commit introduces a dedicated API for unbatched tasks, opening the door to some future simplifications.

…ending tasks Today the master's pending task queue is just the `PriorityBlockingQueue<Runnable>` belonging to the underlying `ThreadPoolExecutor`. The reasons for this date back a long way but it doesn't really reflect the structure of the queue as it exists today. In particular, we must keep track of batches independently of the queue itself, and must do various bits of unchecked casting to process multiple items of the same type at once. This commit introduces an new queueing mechanism, independent of the executor's queue, which better represents the conceptual structure of the master's pending tasks: * Today we use a priority queue to allow important tasks to preempt less-important ones. However there are only a small number of priority levels, so it is simpler to maintain a queue for each priority, effectively replacing the sorting within the priority queue with a radix sort. * Today when a task is submitted we perform a map lookup to see if it can be added to an existing batch or not. With this change we allow client code to create its own dedicated queue of tasks. The entries in the per-priority-level queues are themselves queues, one for each executor, representing the batches to be run. * Today each task in the queue holds a reference to its executor, but the executor used to run a task may belong to a different task in the same batch. In practice we know they're the same executor (that's how batches are defined) but we cannot express this knowledge in the type system so we have to do a bunch of unchecked casting to work around it. With this change we associate each per-executor queue directly with its executor, avoiding the need to do all this unchecked casting. * Today the master service must block its thread while waiting for each task to complete, because otherwise the executor would start to process the next task in the queue. This makes testing using a `DeterministicTaskQueue` harder (see `FakeThreadPoolMasterService`). With this change we avoid submitting a task to the `ThreadPoolExecutor` until the previous task is complete, which means we can make the implementation avoid blocking while a task is running and therefore run the whole production implementation in deterministic tests[^1]. * Today it's possible for a steady drip of high-priority tasks to starve low-priority tasks of access to the master for an extended period of time. With this change we separate the queue of tasks from the queue which determines the execution order. This allows us to implement more intelligent execution policies. For instance, if we detect that the queue has not processed any low-priority tasks for too long then we can make the decision to boost their priorities[^2]. [^1]: Not done yet but this is a step in the right direction. [^2]: Not done yet but this is a step in the right direction.

(conflicts not yet resolved fully)

DaveCTurner · 2022-11-30T14:52:09Z

Superseded by #92021

DaveCTurner added WIP >refactoring :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.3.0 labels Apr 7, 2022

DaveCTurner force-pushed the 2022-04-07-master-service-new-queues branch from 8d20e99 to db5e3bd Compare April 8, 2022 14:30

DaveCTurner added 2 commits April 8, 2022 16:55

DaveCTurner force-pushed the 2022-04-07-master-service-new-queues branch from db5e3bd to 8bc6804 Compare April 8, 2022 16:00

DaveCTurner force-pushed the 2022-04-07-master-service-new-queues branch from 8bc6804 to ffc3a97 Compare April 8, 2022 16:22

WIP working towards one processor to rule them all

e26a000

DaveCTurner force-pushed the 2022-04-07-master-service-new-queues branch from 76dc00e to e26a000 Compare April 9, 2022 20:27

DaveCTurner added 9 commits April 10, 2022 10:05

Track insertion time/order

ae8daaa

Fix unbatched timeout handler

da05096

TaskTimeoutHandler

143683d

Drop timedOut flag on unbatched tasks

f3ec22a

Merge branch 'master' into 2022-04-07-master-service-new-queues

ca6ee19

(conflicts not yet resolved fully)

Fixup

924b19f

Moar fix

facbd4d

Moar fix

f8350bb

Moar fix

3bbfadb

DaveCTurner mentioned this pull request Apr 28, 2022

Rework shutdown no-op tests to avoid mocks #86236

Open

craigtaverner added v8.4.0 and removed v8.3.0 labels May 25, 2022

DaveCTurner added >tech debt Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. labels Jul 12, 2022

Test fixes

88e487c

kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022

DaveCTurner added 9 commits November 28, 2022 13:12

Merge branch 'main' into 2022-04-07-master-service-new-queues

e58250a

Spotless

6b6a279

Fixup after merge

939cbfa

Test fix

d66f673

TODO temp fix

929fb66

Merge branch 'main' into 2022-04-07-master-service-new-queues

8e2f21f

Make UnbatchedExecutor a class

bd51e30

Comments

391dbd5

Lazy BatchSummary again

315b2b5

DaveCTurner mentioned this pull request Nov 29, 2022

Ensure cached time elapses in ClusterServiceIT #91986

Merged

DaveCTurner added 14 commits November 29, 2022 09:18

Move constant

d8d61f5

Lazy grouping of tasks by source too

686a89a

Test already exists

de02f27

Tests for pending tasks APIs

b3c0643

Test behaviour on rejection/close

b3e21f2

Fix masterService.getMaxTaskWaitTime

8240db9

Fix tests that shut down master service too early

d25e6b8

Add test for timeouts

3aa2260

Migrate starvation watching

343d76b

Merge branch 'main' into 2022-04-07-master-service-new-queues

4e459e4

Trivial reverts

bf5c34e

Move

4c38603

Merge branch 'main' into 2022-04-07-master-service-new-queues

a325d1d

Catch exceptions from failure handler

a6bc582

DaveCTurner closed this Nov 30, 2022

DaveCTurner deleted the 2022-04-07-master-service-new-queues branch March 6, 2023 15:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Radix-sorted, lookup-free, typesafe, nonblocking, priority-boosting pending tasks #85751

Radix-sorted, lookup-free, typesafe, nonblocking, priority-boosting pending tasks #85751

DaveCTurner commented Apr 7, 2022 •

edited

Loading

DaveCTurner commented Nov 30, 2022

Radix-sorted, lookup-free, typesafe, nonblocking, priority-boosting pending tasks #85751

Radix-sorted, lookup-free, typesafe, nonblocking, priority-boosting pending tasks #85751

Conversation

DaveCTurner commented Apr 7, 2022 • edited Loading

Footnotes

DaveCTurner commented Nov 30, 2022

DaveCTurner commented Apr 7, 2022 •

edited

Loading