Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client/fingerprint/cpu: use fallback total compute value if cpu not detected #9589

Merged
merged 1 commit into from
Dec 9, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 14 additions & 5 deletions client/fingerprint/cpu.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,14 @@ import (
"github.com/hashicorp/nomad/nomad/structs"
)

const (
// defaultCPUTicks is the default amount of CPU resources assumed to be
// available if the CPU performance data is unable to be detected. This is
// common on EC2 instances, where the env_aws fingerprinter will follow up,
// setting an accurate value.
defaultCPUTicks = 1000 // 1 core * 1 GHz
)

// CPUFingerprint is used to fingerprint the CPU
type CPUFingerprint struct {
StaticFingerprinter
Expand Down Expand Up @@ -64,12 +72,13 @@ func (f *CPUFingerprint) Fingerprint(req *FingerprintRequest, resp *FingerprintR
tt = cfg.CpuCompute
}

// Return an error if no cpu was detected or explicitly set as this
// node would be unable to receive any allocations.
// If we cannot detect the cpu total compute, fallback to a very low default
// value and log a message about configuring cpu_total_compute. This happens
// on Graviton instances where CPU information is unavailable. In that case,
// the env_aws fingerprinter updates the value with correct information.
if tt == 0 {
return fmt.Errorf("cannot detect cpu total compute. "+
"CPU compute must be set manually using the client config option %q",
"cpu_total_compute")
f.logger.Info("fallback to default cpu total compute, set client config option cpu_total_compute to override")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log line is a little unfortunate in that it's important to see in non-AWS cases but in AWS cases we're going to fix it in the fingerprinter, which logs at debug. But I don't see a good way around that other than dropping this one to debug... which I think would be a bad idea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I could be convinced to pick any of the possible combinations of Info/Debug. Realistically the most correct solution to all of this would involve plumbing to enable fingerprinters to coordinate. But for this, that would be like driving a thumbtack with sledgehammer.

tt = defaultCPUTicks
}

resp.AddAttribute("cpu.totalcompute", fmt.Sprintf("%d", tt))
Expand Down