-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stale allocation data leads to incorrect (and even negative) metrics #5637
Conversation
per the comment above about using nomad/client/allocrunner/alloc_runner.go Lines 541 to 547 in e964c7f
|
client/client.go
Outdated
@@ -2766,6 +2766,8 @@ func (c *Client) allAllocs() map[string]*structs.Allocation { | |||
allocs := make(map[string]*structs.Allocation, len(ars)) | |||
for _, ar := range ars { | |||
a := ar.Alloc() | |||
as := ar.AllocState() | |||
a.ClientStatus = as.ClientStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is safe as you're modifying ar.alloc
. I think you need to copy it first.
This also gives allAllocs
pretty special behavior. It returns a view of each alloc not available anywhere else. At a minimum we should comment this unusual special case, but since there only appears to be a single caller I think we should probably scrap this method entirely and inline the behavior in the caller (or write a one-off function we can unit test).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed with the criticism and the suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Thanks for digging into this!
e964c7f
to
b3e28de
Compare
…loc count towards allocated resources
b3e28de
to
c7501c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, fwiw
CpuShares: 768, | ||
}, | ||
Memory: structs.AllocatedMemoryResources{ | ||
MemoryMB: 768, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured out right here why you were doubling the resource settings through this test, and I like it. Maybe worth a comment up near the top just to draw attention to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i actually started to write a comment and it felt like i was ruining the joke ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such unfortunately subtle code. Thanks for cleaning this up!
Co-Authored-By: cgbaker <[email protected]>
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
client: was not using up-to-date client state in determining which alloc count towards allocated resources
this is the second issue where using
AllocRunner.Alloc()
instead ofAllocRunner.AllocState()
meant that theClientState
had the wrong data. @schmichael previously noted how nefarious this is, that we should look for a solution. i haven't done anything to address the underlying issue with this PR; I've just made the simplest fix to correct the metrics calculation. in fact, the call toAllocState()
may be a little more expensive than is necessary for this; we only need to know whether the allocation is still resident.fixes #5570