
BulkProcessor can deadlock when bulk requests fail #47599

Closed · jakelandis opened this issue Oct 4, 2019 · 5 comments · Fixed by #48697

Comments

@jakelandis (Contributor) commented Oct 4, 2019

If bulk requests sent through the BulkProcessor fail, they will be retried via the RetryHandler. In versions of Elasticsearch prior to 7.3.0 this can result in a deadlock.

The deadlock stems from the Scheduler that is shared between the Flush and Retry logic. The Scheduler is configured with one worker thread, and that thread can end up blocked in the Flush method. Flush is guarded by synchronized (BulkProcessor.this), and the internalAdd(..) method is guarded by the same monitor lock. What can happen is: a bulk request comes in through internalAdd and obtains the lock; the bulk request is sent and fails, so the retry logic kicks in. The scheduler thread is meanwhile blocked in the Flush method, waiting on the monitor held by internalAdd. When Retry attempts to schedule its retry, it cannot, because Flush is occupying the scheduler's only worker thread. So Flush cannot continue because it is waiting on internalAdd to finish, internalAdd cannot finish because it is waiting on Retry, and Retry cannot finish because it is waiting for a scheduler thread that will not be free until Flush finishes.
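The shape of this cycle can be reproduced outside Elasticsearch with nothing more than a shared monitor and a single-threaded scheduler. The sketch below is illustrative only (the class name, timings, and print statements are invented for the demo, not taken from BulkProcessor): the scheduled "flush" blocks on the monitor held by the caller, and the caller's "retry" can never run because the scheduler's only thread is occupied.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BulkProcessorDeadlockSketch {
    private static final Object LOCK = new Object();

    public static void main(String[] args) throws Exception {
        // One worker thread shared by "flush" and "retry", as in the pre-7.3.0 BulkProcessor.
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        // "Flush": scheduled periodically; it must take the monitor before doing any work.
        scheduler.schedule(() -> {
            synchronized (LOCK) {
                System.out.println("flush ran");
            }
        }, 100, TimeUnit.MILLISECONDS);

        // "internalAdd": the caller takes the monitor, the bulk "fails", and the retry is
        // scheduled on the same single-threaded scheduler and then waited on.
        synchronized (LOCK) {
            Thread.sleep(200); // give the flush task time to occupy the only scheduler thread
            ScheduledFuture<?> retry =
                    scheduler.schedule(() -> System.out.println("retry ran"), 10, TimeUnit.MILLISECONDS);
            try {
                // The real code waits without a timeout, which is the deadlock;
                // the timeout here only lets the demo terminate.
                retry.get(2, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                System.out.println("retry never ran: the only scheduler thread is stuck in flush");
            }
        }
        scheduler.shutdownNow();
    }
}
```

With the timeout removed, the get() call would block forever, which is exactly the cycle described above.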

The change in 7.3.0 fixes this issue because it is much more selective about exactly what is locked, and no longer wraps the execution of the bulk request inside the lock.

Prior to 7.3.0 the only workaround is to set the BackoffPolicy to BackoffPolicy.noBackoff() so that the Retry never kicks in. The default backoff is BackoffPolicy.exponentialBackoff(), which is what Watcher uses; it is not configurable there, so Watcher is susceptible to this bug before 7.3.0. A sketch of the workaround is shown below.
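A minimal sketch of that workaround, assuming the high-level REST client's BulkProcessor.builder(BiConsumer, Listener) variant; the client and listener arguments are placeholders, not something from this report:

```java
import org.elasticsearch.action.bulk.BackoffPolicy;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

// Sketch only: disable retries so the Retry logic never needs the shared scheduler thread.
public class NoRetryBulkProcessor {
    static BulkProcessor build(RestHighLevelClient client, BulkProcessor.Listener listener) {
        return BulkProcessor.builder(
                (request, bulkListener) ->
                        client.bulkAsync(request, RequestOptions.DEFAULT, bulkListener),
                listener)
            .setBackoffPolicy(BackoffPolicy.noBackoff())
            .build();
    }
}
```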

This is related to #41418, but the fix for that does not fix this issue. This issue is fixed by #41451 in 7.3.0.

EDIT: see below for a case where this can still happen in [7.3.0-7.5.0) (if the flush documents themselves fail)

@elasticmachine (Collaborator) commented:

Pinging @elastic/es-core-features (:Core/Features/Watcher)

@jakelandis (Contributor, Author) commented:

Thanks to @suxinglee for finding this issue and providing a test case on #46790.

I logged this issue as a reference point and will close it out as fixed by #41451 in 7.3.0. I will evaluate in a different PR whether Watcher needs to implement the workaround for older versions.

jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Oct 4, 2019
7.3.0 fixes this issue by changing the locking strategy in elastic#41451.
However, that change is not part of 6.x, and the change here is
a minimal workaround to prevent the potential for deadlock.

This change will no longer retry failed bulk requests that go
through the BulkProcessor for Watcher. Specifically, this removes
the retry logic when adding Watcher history and Triggered watches
after the bulk request fails.

Related elastic#47599
@suxinglee commented:

Thank you for your reply.

> Prior to 7.3.0 the only workaround is to set the BackoffPolicy to BackoffPolicy.noBackoff() so that the Retry never kicks in.

PS: Another workaround is not to set the bulk flush interval.

jakelandis added a commit that referenced this issue Oct 7, 2019
suxinglee added a commit to suxinglee/elasticsearch that referenced this issue Oct 14, 2019
suxinglee added a commit to suxinglee/elasticsearch that referenced this issue Oct 14, 2019
@suxinglee commented:

@jakelandis
After modifying the test case so that the Flush logic executes first and a failure occurs, I found that a similar problem still seems to occur in the latest version of ES. #48013 is the newest test case on the master branch. I am not sure and hope you can verify it.

@jakelandis (Contributor, Author) commented:

Re-opening this issue since, as @suxinglee points out, this can still happen if the documents that Flush is flushing fail.

In the original description, and what is fixed in 7.3.0, the case is: "normal requests" fail -> Flush kicks in but is blocked by the request that failed -> Retry tries to happen but is blocked by Flush. The more granular locking fixes that deadlock case.

Here, the question is what happens if the "Flush request" fails (Flush holds the lock and the only slot in the scheduler) and Retry attempts to kick in. Retry can't run because the one and only slot in the scheduler is consumed by Flush, but Flush is waiting on Retry to finish. Hence the deadlock.

The workarounds are the same: disable Flush (pass null), or disable Retry (use BackoffPolicy.noBackoff()). A sketch of the Flush-side workaround follows; the noBackoff() option is sketched in the issue description above.
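For completeness, a minimal sketch of the Flush-side workaround under the same assumptions as the earlier example (placeholder client and listener); the key point is simply never configuring a flush interval:

```java
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

// Sketch only: disable Flush by never configuring a flush interval, so no flush task
// is ever scheduled and it can never occupy the scheduler thread.
public class NoFlushBulkProcessor {
    static BulkProcessor build(RestHighLevelClient client, BulkProcessor.Listener listener) {
        return BulkProcessor.builder(
                (request, bulkListener) ->
                        client.bulkAsync(request, RequestOptions.DEFAULT, bulkListener),
                listener)
            // Deliberately no setFlushInterval(...): rely on size/count thresholds instead.
            .setBulkActions(500)
            .build();
    }
}
```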

@suxinglee - thanks again for the test that illustrates this. I am still weighing the options of increasing the single scheduler's number of slots vs. using separate schedulers for flush and retry.

jakelandis reopened this Oct 29, 2019
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Oct 30, 2019
Currently the BulkProcessor class uses a single scheduler to schedule
flushes and retries. Functionally these are very different concerns, but
sharing a scheduler can result in a deadlock. Specifically, the single
shared scheduler can kick off a flush task, which only finishes when the
bulk that is being flushed finishes. If (for whatever reason) any item in
that bulk fails, it will (by default) schedule a retry. However, that retry
will never run, since the flush task is consuming the one and only thread
available from the shared scheduler.

Since the BulkProcessor is mostly client-based code, the client can
provide their own scheduler. As is, that scheduler would require at
minimum 2 worker threads to avoid the potential deadlock. Since the
number of threads is a configuration option of the scheduler, the code
cannot enforce this 2-worker rule until runtime. For this reason, this
commit splits the single task scheduler into 2 schedulers. This eliminates
the potential for the flush task to block the retry task and removes this
deadlock scenario.

This commit also deprecates the Java APIs that presume a single scheduler,
and updates any internal code to no longer use those APIs.

Fixes elastic#47599

Note - elastic#41451 fixed the general case where a bulk fails and is
retried, which can result in a deadlock. This fix should address that case
as well as the case when a bulk failure *from the flush* needs to be retried.
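To make the reasoning in the commit message concrete, here is a conceptual sketch (illustrative names only, not the actual BulkProcessor change) of why two single-threaded schedulers break the cycle: a flush that blocks on the shared monitor only ties up the flush scheduler's thread, so a retry can always still be scheduled and run.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Conceptual sketch only; names are illustrative and not taken from the actual change.
public class SeparateSchedulersSketch {
    private final Object lock = new Object();
    private final ScheduledExecutorService flushScheduler = Executors.newScheduledThreadPool(1);
    private final ScheduledExecutorService retryScheduler = Executors.newScheduledThreadPool(1);

    void startFlushing() {
        // Flush may block on the monitor, but it can only ever occupy the flush thread.
        flushScheduler.scheduleAtFixedRate(() -> {
            synchronized (lock) {
                // ... flush the pending bulk request ...
            }
        }, 1, 1, TimeUnit.SECONDS);
    }

    void onBulkFailure(Runnable retryTask, long delayMillis) {
        // Retries get their own thread, so they still run even while a flush is blocked.
        retryScheduler.schedule(retryTask, delayMillis, TimeUnit.MILLISECONDS);
    }

    void close() {
        flushScheduler.shutdownNow();
        retryScheduler.shutdownNow();
    }
}
```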
jakelandis added a commit that referenced this issue Oct 31, 2019
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Nov 11, 2019
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Nov 11, 2019
jakelandis added a commit to jakelandis/elasticsearch that referenced this issue Nov 11, 2019
jakelandis added a commit that referenced this issue Nov 11, 2019
jakelandis added a commit that referenced this issue Nov 11, 2019
jakelandis added a commit that referenced this issue Nov 12, 2019
* Prevent deadlock by using separate schedulers (#48697)
