Allow scaling executors to reject tasks after shutdown #81856
Conversation
Pinging @elastic/es-core-infra (Team:Core/Infra)
Hi @tlrx, I've created a changelog YAML for you.
@tlrx great PR, thank you!
@@ -210,7 +210,7 @@ private static void logAndFailTest(Exception e) {
private final ThreadPool threadPool = new TestThreadPool(
    "TrackedCluster",
    // a single thread for "client" activities, to limit the number of activities all starting at once
    new ScalingExecutorBuilder(CLIENT, 1, 1, TimeValue.ZERO, CLIENT)
    new ScalingExecutorBuilder(CLIENT, 1, 1, TimeValue.ZERO, true, CLIENT)
Please, can we statically import the constant here and use ZERO instead of TimeValue.ZERO? The code would be cleaner. But it's just my opinion.
I prefer to keep it the way it is :)
}

protected final EsRejectedExecutionException newRejectedException(Runnable r, ThreadPoolExecutor executor, boolean isExecutorShutdown) {
    rejected.inc();
For what purpose is the inc() method called here? The method name is newRejectedException, but the implementation also performs an increment. I think that's not very clear.
I agree, I pushed 6e2d87a
builders.put(Names.GENERIC, new ScalingExecutorBuilder(Names.GENERIC, 4, genericThreadPoolMax, TimeValue.timeValueSeconds(30)));
builders.put(
    Names.GENERIC,
    new ScalingExecutorBuilder(Names.GENERIC, 4, genericThreadPoolMax, TimeValue.timeValueSeconds(30), false)
Could we move these values (4 and 30) to constants please?
This PR does not change that so I would prefer to leave this as is. We would have to make all the numbers here constants if we did this and I think that would make the code here harder to read.
I agree with Henning here.
This looks good. I left a few comments.
@Override
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
    if (rejectAfterShutdown && executor.isShutdown()) {
        throw newRejectedException(r, executor, true);
    }
Is there a potential for a race condition here, though very unlikely? What if we get into rejectedExecution due to all threads being active, the pool is shut down at this point, and then all threads go inactive too before we put the runnable on the queue?
After reading the code again I think you are right. I pushed a939ea6 to add another test that executes tasks while concurrently shutting down the executor, making it more likely to hit the race condition on slow machines (it only fails 1 in 10K runs on my workstation though).
I reworked the logic in 0d89228 to avoid the race condition, let me know what you think.
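To make the discussion above concrete, here is a minimal sketch of the check-before-and-after rejection pattern as I read the snippets quoted in this review. The class name, the try/catch around put, and the use of a plain RejectedExecutionException (instead of EsRejectedExecutionException) are my own simplifications, not the committed code:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

// Sketch only: approximates the "force queue, but reject after shutdown" policy
// discussed in this thread. Names such as rejectAfterShutdown follow the diff
// snippets; the real implementation lives in EsExecutors and throws
// EsRejectedExecutionException rather than a plain RejectedExecutionException.
class ForceQueuePolicySketch implements RejectedExecutionHandler {

    private final boolean rejectAfterShutdown;

    ForceQueuePolicySketch(boolean rejectAfterShutdown) {
        this.rejectAfterShutdown = rejectAfterShutdown;
    }

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (rejectAfterShutdown && executor.isShutdown()) {
            // reject up front if the executor is already shut down
            throw newRejectedException(task, executor);
        }
        // otherwise force the task onto the unbounded scaling queue
        final BlockingQueue<Runnable> queue = executor.getQueue();
        try {
            queue.put(task);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("a scaling queue never blocks, so put() should not be interrupted", e);
        }
        // re-check after enqueuing: if the executor shut down concurrently and the
        // task is still sitting in the queue, remove it and reject it so that it
        // does not linger forever in a terminated pool
        if (rejectAfterShutdown && executor.isShutdown() && executor.remove(task)) {
            throw newRejectedException(task, executor);
        }
    }

    private static RejectedExecutionException newRejectedException(Runnable task, ThreadPoolExecutor executor) {
        return new RejectedExecutionException("rejected execution of " + task + " on " + executor + " (shutdown)");
    }
}
```

The key point is the second isShutdown() check after the put: if shutdown races with the enqueue and the task is still in the queue, it is removed and rejected rather than being left in a pool that will never drain it; if a worker already picked it up, running it is fine.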
@@ -561,19 +561,25 @@ protected XPackLicenseState getLicenseState() {

public static ScalingExecutorBuilder[] executorBuilders(Settings settings) {
final int processors = EsExecutors.allocatedProcessors(settings);
// searchable snapshots cache thread pools should always reject tasks once they are shutting down, otherwise some threads might be
Did you look into adding a test provoking this specific issue consistently?
I reproduced the issue in a test but it did not fail consistently and was much too complex to maintain, while the current black-hole behavior can be reproduced consistently in a thread pool test, so I went that way.
@@ -190,6 +191,7 @@ public void testExecutionExceptionOnScalingESThreadPoolExecutor() throws Interru
1,
10,
TimeUnit.SECONDS,
false,
I think we can check both true and false here?
false,
randomBoolean(),
Sure, I changed this in ad729c9
for (int i = 0; i < queuedAfterShutdown; i++) {
    execute(scalingExecutor, () -> {}, executed, rejected, failed);
}
assertThat(scalingExecutor.getQueue().size(), rejectAfterShutdown ? equalTo(queued) : equalTo(queued + queuedAfterShutdown));
Can we also validate here that rejected is 0 or queuedAfterShutdown, depending on rejectAfterShutdown?
Sure, I changed this in ad729c9
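For the record, a sketch of what that extra check could look like (assuming rejected is the test's rejection counter used by the execute helper above; not the exact assertion added in ad729c9):

```java
// tasks submitted after shutdown are either all rejected or all silently queued
assertThat(rejected.get(), rejectAfterShutdown ? equalTo(queuedAfterShutdown) : equalTo(0));
```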
block.countDown();

assertBusy(() -> assertTrue(scalingExecutor.isTerminated()));
Should we also verify that new tasks added after termination are rejected?
++, changed in ad729c9
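Presumably something along these lines (a sketch; expectThrows and EsRejectedExecutionException are used elsewhere in this PR, but the exact assertion added in ad729c9 may differ):

```java
// once the executor is terminated, new submissions must be rejected
// when rejectAfterShutdown is enabled
if (rejectAfterShutdown) {
    expectThrows(EsRejectedExecutionException.class, () -> scalingExecutor.execute(() -> {}));
}
```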
Sorry for the delay @henningandersen, the race condition took me some time to fix. Can you please have another look? Thanks
LGTM. I left a few smaller comments, but no need for another round.
assert executor.getQueue() instanceof ExecutorScalingQueue;
executor.getQueue().put(r);
assert queue instanceof ExecutorScalingQueue;
queue.put(task);
I wonder if we should check if it is shutdown prior to adding to the queue (in addition to the check after adding it)? That would avoid the risk of the task being picked up during shutdown and only leave this "risk" for concurrent races, where it would be perfectly OK to run the task.
Makes sense, I changed that in 3a4e5f4
if (rejectAfterShutdown) {
    if (executor.isShutdown() && executor.remove(task)) {
nit: I would find this slightly more logical as:
if (rejectAfterShutdown) {
    if (executor.isShutdown() && executor.remove(task)) {
if (rejectAfterShutdown && executor.isShutdown()) {
    if (executor.remove(task)) {
since that seems to be the special case. With the remove being mutating, I like having it in its own condition so I don't have to think about and/or semantics and order of evaluation, but we could also collapse this into one if statement.
This is more logical I agree. (I collapsed the statements)
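So, collapsed, presumably something like this (a sketch, not the exact committed lines):

```java
if (rejectAfterShutdown && executor.isShutdown() && executor.remove(task)) {
    throw newRejectedException(task, executor, true);
}
```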
}

protected final EsRejectedExecutionException newRejectedException(Runnable r, ThreadPoolExecutor executor, boolean isExecutorShutdown) {
    return new EsRejectedExecutionException("rejected execution of " + r + " on " + executor, isExecutorShutdown);
I wonder if we should add info to the exception message about the executor being shut down when isExecutorShutdown=true? I think that logging the exception will not show the flag otherwise.
I added (shutdown) to the message for that.
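I.e. roughly the following (a sketch; the exact committed wording may differ):

```java
protected final EsRejectedExecutionException newRejectedException(Runnable r, ThreadPoolExecutor executor, boolean isExecutorShutdown) {
    final String message = "rejected execution of " + r + " on " + executor + (isExecutorShutdown ? " (shutdown)" : "");
    return new EsRejectedExecutionException(message, isExecutorShutdown);
}
```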
final Matcher<Long> executionsMatcher = rejectAfterShutdown
    ? equalTo((long) max + queued)
    : greaterThanOrEqualTo((long) max + queued);
Can we also check that it is lessThanOrEqualTo(max + queued + queuedAfterShutdown)?
Sure
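A sketch of the resulting matcher (assuming queuedAfterShutdown counts the tasks submitted after shutdown, as in the test above; the committed version may differ):

```java
final Matcher<Long> executionsMatcher = rejectAfterShutdown
    ? equalTo((long) max + queued)
    : allOf(
        greaterThanOrEqualTo((long) max + queued),
        lessThanOrEqualTo((long) max + queued + queuedAfterShutdown)
    );
```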
}

assertBusy(() -> assertTrue(scalingExecutor.isTerminated()));
assertThat(scalingExecutor.getCompletedTaskCount(), greaterThanOrEqualTo((long) max));
Also here, can we check less than or equal to (max + barrier.getParties() - 1)?
Sure
assertThat(scalingExecutor.getCompletedTaskCount(), greaterThanOrEqualTo((long) max));
assertThat(scalingExecutor.getQueue().size(), equalTo(0));
assertThat(scalingExecutor.getActiveCount(), equalTo(0));
Can we check that scalingExecutor.getCompletedTaskCount() + rejected.get() == max + barrier.getParties() - 1? To ensure every request is accounted for exactly once.
Sure
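As a sketch, assuming max + barrier.getParties() - 1 is the total number of submitted tasks and rejected is the test's rejection counter:

```java
// every submitted task is accounted for exactly once: either completed or rejected
assertThat(
    scalingExecutor.getCompletedTaskCount() + rejected.get(),
    equalTo((long) max + barrier.getParties() - 1)
);
```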
💔 Backport failed
You can use sqren/backport to manually backport by running
Today scaling thread pools never reject tasks but always add them to the queue of tasks to execute, even when the thread pool executor is shutting down or terminated. This behaviour does not work well when a task is blocked waiting for another task from another scaling thread pool to complete an I/O operation, since that other task will never be executed if it was enqueued just before its scaling thread pool shut down.
This situation is more likely to happen with searchable snapshots, in which multiple threads can be blocked waiting for parts of Lucene files to be fetched and made available in the cache. We saw test failures in CI where Lucene 9, which uses concurrent threads to asynchronously check indices, had threads blocked waiting for cache files to become available and failed because of leaked file handles (see #77017, #77178).
This pull request changes the ForceQueuePolicy used by scaling thread pools so that it now accepts a rejectAfterShutdown flag, which can be set on a per-thread-pool basis to indicate that tasks should be rejected once the thread pool is shut down. Because we rely on many scaling thread pools to be black holes and never reject tasks, this flag is set to false on most of them to keep the current behavior. In some cases where the rejection logic was already implemented correctly, this flag has been set to true.
This pull request also reimplements the interface XRejectedExecutionHandler as an abstract class EsRejectedExecutionHandler, which allows sharing some rejection logic.
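For readers following along, the practical effect is an extra boolean argument on ScalingExecutorBuilder. The sketch below restates the hunks quoted earlier in this conversation; the comments are my reading of the parameters, not the builder's javadoc:

```java
// keep the historical behaviour: tasks submitted after shutdown are still
// queued and never run (the pool acts as a black hole)
builders.put(
    Names.GENERIC,
    new ScalingExecutorBuilder(Names.GENERIC, 4, genericThreadPoolMax, TimeValue.timeValueSeconds(30), false)
);

// opt into the new behaviour: tasks submitted after shutdown are rejected
// with an EsRejectedExecutionException instead of lingering in the queue
new ScalingExecutorBuilder(CLIENT, 1, 1, TimeValue.ZERO, true, CLIENT);
```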