client: fix waiting on preempted alloc #12779
Conversation
LGTM!
It looks like this is a very old bug (pre-0.9) but I spent a little time going through lingering open issues and didn't manage to find one for it. #10200 looks like it could be related but we kind of went off in the weeds of the cluster topology in that issue so I'm not sure.
Don't forget the changelog entry and backport labels!
Fixes #10200

**The bug**

https://github.com/hashicorp/nomad-enterprise/issues/707

A user reported receiving the following error when an alloc was placed that needed to preempt existing allocs:

```
[ERROR] client.alloc_watcher: error querying previous alloc: alloc_id=28... previous_alloc=8e... error="rpc error: alloc lookup failed: index error: UUID must be 36 characters"
```

The previous alloc (8e) was already complete on the client. This is possible if an alloc stops *after* the scheduling decision was made to preempt it, but *before* the node running both allocations was able to pull and start the preemptor. While that is hopefully a narrow window of time, you can expect it to occur in high-throughput, batch-scheduling-heavy systems.

However the RPC error made no sense! `previous_alloc` in the logs was a valid 36 character UUID!

**The fix**

The fix is:

```
- prevAllocID: c.Alloc.PreviousAllocation,
+ prevAllocID: watchedAllocID,
```

The alloc watcher constructor used for preemption improperly referenced `Alloc.PreviousAllocation` instead of the passed-in `watchedAllocID`. When multiple allocs are preempted, a watcher is created for each, with `watchedAllocID` set properly by the caller. In this case `Alloc.PreviousAllocation=""`, which is where the `UUID must be 36 characters` error was coming from! Sadly we *were* properly referencing `watchedAllocID` in the log line, which made the error all the more confusing.

**The repro**

I was able to reproduce this with a dev agent with [preemption enabled](https://gist.github.com/schmichael/53f79cbd898afdfab76865ad8c7fc6a0#file-preempt-hcl) and [lowered limits](https://gist.github.com/schmichael/53f79cbd898afdfab76865ad8c7fc6a0#file-limits-hcl) for ease of repro.

First I started a [low priority count 3 job](https://gist.github.com/schmichael/53f79cbd898afdfab76865ad8c7fc6a0#file-preempt-lo-nomad), then a [high priority job](https://gist.github.com/schmichael/53f79cbd898afdfab76865ad8c7fc6a0#file-preempt-hi-nomad) that evicts 2 low priority jobs. Everything worked as expected.

However, if I force it to use the [remotePrevAlloc implementation](https://github.com/hashicorp/nomad/blob/v1.3.0-beta.1/client/allocwatcher/alloc_watcher.go#L147), it reproduces the bug because the watcher references `PreviousAllocation` instead of `watchedAllocID`.
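To make the failure mode concrete, here is a minimal Go sketch of the pattern described above. The types and names (`Alloc`, `remotePrevAlloc`, `newWatcher`) are hypothetical simplifications for illustration, not Nomad's actual `client/allocwatcher` API; it only shows why each per-preempted-alloc watcher must use the ID passed in by the caller rather than the new alloc's `PreviousAllocation` field.

```go
package main

import "fmt"

// Hypothetical, simplified types for illustration only.
type Alloc struct {
	ID                 string
	PreviousAllocation string   // empty when the placement only preempts allocs
	PreemptedAllocs    []string // IDs of the allocs this placement preempts
}

type remotePrevAlloc struct {
	prevAllocID string // the alloc whose termination this watcher waits on
}

// newWatcher is called once per preempted alloc; watchedAllocID is the ID the
// caller wants this particular watcher to track.
func newWatcher(newAlloc *Alloc, watchedAllocID string) *remotePrevAlloc {
	return &remotePrevAlloc{
		// Pre-fix bug: this field was populated from newAlloc.PreviousAllocation,
		// which is "" for a preemption-only placement, so the server-side alloc
		// lookup failed with "UUID must be 36 characters".
		prevAllocID: watchedAllocID, // fixed: use the ID passed in by the caller
	}
}

func main() {
	preemptor := &Alloc{
		ID:              "preemptor-alloc-id", // placeholder, not a real UUID
		PreemptedAllocs: []string{"preempted-alloc-1", "preempted-alloc-2"},
	}
	// One watcher per preempted alloc, each tracking its own ID.
	for _, id := range preemptor.PreemptedAllocs {
		w := newWatcher(preemptor, id)
		fmt.Println("waiting on preempted alloc:", w.prevAllocID)
	}
}
```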
a606db0 to 5328f3c
@tgross Thanks for making me take another look at #10200. At first I assumed these were unrelated, but now I'm not so sure! At the very least #10200 exhibits the same error, so I'm actually going the ambitious route of closing that one with this. If there's another compounding bug in #10200, hopefully someone will reopen it and file a new issue. I just don't see us making progress on #10200 without a repro after this lands, and I didn't want folks to think there was no progress on it. (I'll leave a similar note on that issue when merging this.)
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.