Add docs for gomaxprops option #416
docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc
[discrete]
=== Limiting {agent} resources

If you need to limit the amount of resource consumed by {agent} you can use the `agent.limits.go_max_procs` configuration option. This option sets the maximum number of CPUs that can be executing simultaneously.
@kilfoyle let's rewrite this section please. this configuration is not limiting CPU used by the agent, it's "limiting the CPU used by the underlying beats that are supervised by agent." So for example, setting agent.limits.go_max_procs to 1, would mean that only the beats supervised by the agent will be limited to 1 vCPU. "elastic-agent status" would show the user how many beats are being supervised by the agent.
@rdner does this value have to be an integer or are decimals also acceptable? (like half a vcpu)
Thanks for the clarification Nima. I've updated the section as shown. Let me know if I've missed anything:
@rdner I'll add a note about whether the value has to be an integer once you have a chance to confirm.
this configuration is not limiting CPU used by the agent, it's "limiting the CPU used by the underlying beats that are supervised by agent." So for example, setting agent.limits.go_max_procs to 1, would mean that only the beats supervised by the agent will be limited to 1 vCPU. "elastic-agent status" would show the user how many beats are being supervised by the agent.
This is not accurate.
This parameter sets how many CPUs the Go runtime can schedule goroutines on. It does not guarantee that only the given CPU count is used: the runtime might use more for its own internal purposes.
This is what the official documentation says:
The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes the limit.
The value is set for both the agent and the underlying Beats it runs. It does not affect Endpoint because Endpoint is not written in Go.
@rdner does this value have to be an integer or are decimals also acceptable? (like half a vcpu)
It must be an integer; decimals are not accepted.
We already have this parameter exposed in the Beats configuration, by the way. It might make sense to reference and update that too: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-general-options.html#_max_procs
@rdner thanks. The "threads that can execute user-level Go code simultaneously" confuses me a bit. Can you explain what happens when each Beat under agent has GOMAXPROCS set to 1? Let's say we have 4 Beats (ignoring that agent is also written in Go). My understanding is that each Beat would then be limited to a maximum of 1 thread. So across the agent as a whole (again ignoring agent itself) we are limited to 4 CPUs, with each Beat limited to only 1.
@cmacknz can I get your eyes on this docs change also? Thank you.
@nimarezainia there is no simple explanation, unfortunately. If we want to tell our customers the truth, it has to be very complicated. The Go runtime runs its own OS threads on any number of CPUs it has access to, for example for the garbage collector. It's a runtime implementation detail that might change in a later Go release.
What GOMAXPROCS=1 is limiting is what's running on that runtime, so the code that we build and run on it. For example, all the Filebeat code running in parallel will be scheduled only using a single CPU. However, the second CPU might be used by the garbage collector or any other runtime thread. So, it's not accurate to say that we limit everything to a single CPU.
@cmacknz might have a better explanation than I do, but it does not change the facts.
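A small Go sketch of the user-level side of this: with the limit set to 1, several goroutines still all run to completion, they just take turns instead of executing in parallel. This only illustrates the scheduling of our own code; it does not show (or restrict) the runtime's own threads mentioned above. The function name `runWorkers` is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runWorkers starts n goroutines while GOMAXPROCS is set to procs and returns
// how many of them finished. The previous limit is restored on return.
func runWorkers(procs, n int) int {
	previous := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(previous)

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			done++ // counted under the mutex so the total is exact
			mu.Unlock()
		}()
	}
	wg.Wait()
	return done
}

func main() {
	// Four goroutines multiplexed onto a single scheduling slot.
	fmt.Println(runWorkers(1, 4)) // prints 4
}
```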
The simplest explanation for GOMAXPROCS is that it limits the number of operating system threads that can be executing Go code simultaneously. The majority of users will not care for these low level details, as long as our explanation is approximately correct.
GOMAXPROCS accounts for all user level Go code as far as I can tell, that is the code that executes in kernel user space. We can link them back to the definition of GOMAXPROCS in the Go runtime and interested readers can dig as far as they want to. https://pkg.go.dev/runtime#GOMAXPROCS
For completeness, the GCCPUFraction parameter describes GOMAXPROCS in a way that indicates that it does include CPU time spent in the garbage collector.
// GCCPUFraction is the fraction of this program's available
// CPU time used by the GC since the program started.
//
// GCCPUFraction is expressed as a number between 0 and 1,
// where 0 means GC has consumed none of this program's CPU. A
// program's available CPU time is defined as the integral of
// GOMAXPROCS since the program started. That is, if
// GOMAXPROCS is 2 and a program has been running for 10
// seconds, its "available CPU" is 20 seconds. GCCPUFraction
// does not include CPU time used for write barrier activity.
//
// This is the same as the fraction of CPU reported by
// GODEBUG=gctrace=1.
GCCPUFraction float64
Thanks @cmacknz, that seems very clear! Here's some proposed text:
If you need to limit the amount of CPU consumption you can use the `agent.limits.go_max_procs` configuration option. This parameter limits the number of operating system threads that can be executing Go code simultaneously, thereby limiting the CPU used by both the agent and the underlying {beats} that it supervises. The `agent.limits.go_max_procs` option accepts an integer value not less than `0`, which is the default value that stands for "all available CPUs".
The `agent.limits.go_max_procs` configuration option is similar to the {beats} {filebeat-ref}/configuration-general-options.html#_max_procs[`max_procs`] setting. For more detail about the option, refer to the link:https://pkg.go.dev/runtime#GOMAXPROCS[GOMAXPROCS] function in the Go runtime documentation.
To enable the option, run a <<fleet-api-docs,{fleet} API>> request from the {kib} {kibana-ref}/console-kibana.html[Dev Tools console] to override your current {agent} policy and add the `go_max_procs` parameter. For example, to limit Go code to two operating system threads, run:
...API example...
@rdner based on your correction I've updated the text to the following, with a link to the Beats page. How does this look? BTW, the Beats docs are owned by the dev team but I'm happy to review any changes.
…asciidoc Co-authored-by: Denis <[email protected]>
I think it needs to be clearer that the limit here applies to each process started by agent independently. I added an example to hopefully clarify this. Feel free to wordsmith my suggestions if needed; the way this works is a bit awkward. It is more of a stopgap limit until we can build a more intuitive one.
…asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
…asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
Thanks @cmacknz! I've added the suggestions and made a couple of minor cosmetic changes as well. I'm still a bit confused though: Does running two Beats result in three processes? Or should the example be that agent is supervising three Beats?
There are three total processes: Elastic Agent itself and the two Beats. Agent is one of the processes here.
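Since the limit applies to each process independently, the combined ceiling is the per-process value multiplied by the number of Go processes. A rough Go sketch of that arithmetic follows, using the three-process case from this thread. The function name and the `cpus` parameter are illustrative assumptions, and the result is only an upper bound on threads executing Go code, not a guarantee about total CPU usage.

```go
package main

import "fmt"

// totalThreadLimit returns the combined upper bound on OS threads executing
// Go code across all Go processes (Elastic Agent plus the Beats it runs).
// A goMaxProcs of 0 is the default and stands for "all available CPUs", so
// cpus is used per process in that case.
func totalThreadLimit(processes, goMaxProcs, cpus int) int {
	perProcess := goMaxProcs
	if perProcess == 0 {
		perProcess = cpus
	}
	return processes * perProcess
}

func main() {
	// Agent plus two Beats, each limited to 2, on a hypothetical 8-CPU host.
	fmt.Println(totalThreadLimit(3, 2, 8)) // prints 6

	// Same deployment with the default of 0: each process may use all 8 CPUs.
	fmt.Println(totalThreadLimit(3, 0, 8)) // prints 24
}
```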
LGTM, thanks!
Thanks for the further clarifications here. It looks good to me. It is a complex thing to have to explain and the answer is always dependent on what the user has deployed so I think this explanation takes us a long way.
* Add docs for gomaxprops option
* Update command description wrt Beats CPU
* Fixup
* Link to Filebeat max_procs; clarify must be integer
* Update setting description
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Denis <[email protected]>
* Update GOMAXPROCS description
* touchup
* touchup
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc
* touchup
* touchup
---------
Co-authored-by: Denis <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
(cherry picked from commit f279aaa)
* Add docs for gomaxprops option
* Update command description wrt Beats CPU
* Fixup
* Link to Filebeat max_procs; clarify must be integer
* Update setting description
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Denis <[email protected]>
* Update GOMAXPROCS description
* touchup
* touchup
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc Co-authored-by: Craig MacKenzie <[email protected]>
* Update docs/en/ingest-management/elastic-agent/install-elastic-agent.asciidoc
* touchup
* touchup
---------
Co-authored-by: Denis <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
(cherry picked from commit f279aaa)
Co-authored-by: David Kilfoyle <[email protected]>
This adds documentation for the GOMAXPROCS option to limit CPU usage by Elastic Agent, implemented via elastic/elastic-agent#3179
See docs preview
@nimarezainia @rdner This seemed to me like the best spot in the docs for this, right below the minimum requirements for installing agent. Please let me know if I've missed anything or if the API call may need some fixing up.