Add API for obtaining system health statistics #385

Closed
jphickey opened this issue Mar 29, 2023 · 4 comments · Fixed by #386

Labels
enhancement New feature or request

Comments

@jphickey
Contributor

Is your feature request related to a problem? Please describe.
CFS apps (HS in particular) need to monitor and report the health of the system, especially CPU usage. Unfortunately, how this information is exposed varies widely between platforms, and there is no standardized way of getting it via POSIX or other OS APIs; it is generally only obtainable through platform-specific access methods such as the /proc filesystem on Linux.
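
To illustrate the platform-specific part: on Linux, per-core counters appear in /proc/stat as lines such as `cpu0 100 0 50 800 50 0 0 0 ...`. The following is only a minimal sketch, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The names `cpu_sample_t`, `parse_cpu_line`, and `cpu_percent` are hypothetical and not part of any existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical snapshot of one "cpuN" line from /proc/stat (units: clock ticks). */
typedef struct
{
    unsigned long long busy;  /* user + nice + system + irq + softirq + steal */
    unsigned long long total; /* busy + idle + iowait */
} cpu_sample_t;

/* Parse one line such as "cpu0 100 0 50 800 50 0 0 0".
 * Returns 0 on success, -1 if the line is not a usable CPU line. */
int parse_cpu_line(const char *line, cpu_sample_t *out)
{
    unsigned long long v[8] = {0};
    char               name[16];
    int                n;

    assert(line != NULL && out != NULL);

    n = sscanf(line, "%15s %llu %llu %llu %llu %llu %llu %llu %llu",
               name, &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6], &v[7]);

    /* Require the name plus at least user, nice, system, and idle. */
    if (n < 5 || strncmp(name, "cpu", 3) != 0)
    {
        return -1;
    }

    out->busy  = v[0] + v[1] + v[2] + v[5] + v[6] + v[7];
    out->total = out->busy + v[3] + v[4];
    return 0;
}

/* Utilization in percent over the interval between two samples. */
double cpu_percent(const cpu_sample_t *prev, const cpu_sample_t *curr)
{
    unsigned long long dt;

    assert(prev != NULL && curr != NULL);

    dt = curr->total - prev->total;
    if (dt == 0)
    {
        return 0.0; /* no ticks elapsed between samples */
    }
    return 100.0 * (double)(curr->busy - prev->busy) / (double)dt;
}
```

Because the counters are cumulative, a usage figure only makes sense as the difference between two samples taken some interval apart.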

Describe the solution you'd like
Design an API that can obtain system health statistics. Initially this must support per-core CPU usage, but it should be extensible to support arbitrary variables such as temperature, network and disk I/O stats, RAM and swap use, etc.: basically anything that is typically shown in a PC "health monitor" app.

Additional context
Initially the CPU usage stats would allow nasa/HS#3, nasa/HS#4, and nasa/HS#85 to be resolved.

Requester Info
Joseph Hickey, Vantage Systems, Inc.

@jphickey jphickey self-assigned this Mar 29, 2023
@jphickey jphickey added the enhancement New feature or request label Mar 29, 2023
@skliper
Contributor

skliper commented Mar 29, 2023

Would it make sense to also consider adding APIs for task-related info? Stack use, CPU use, CPU affinity, whatever else might be available on that OS?

@skliper
Contributor

skliper commented Mar 29, 2023

I ask partly because there are cases where, by design, CPU use is 100%, and what matters more is how much CPU is being used by the lowest-priority "background" task. Getting and monitoring stack use also seems like it should be a requirement for complex embedded systems...

@jphickey
Contributor Author

Yes, I'm thinking of basing it on key/value pairs, with values as floats or fixed-point ints. Keys could be arbitrary strings. Then we'd just publish a list of "common" key names for things like system CPU use or temperature. HS could attempt to read variable(s) by key name; if a variable doesn't exist, the call would simply return not implemented/unknown. For the ones that do exist on that platform, HS could report the value (or whatever).

This would make it easy to extend with as many platform-specific sensors as you need (e.g. disk temperature, which some hardware has a sensor for and some does not).
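
A rough sketch of what such a key/value lookup could look like. Everything here is an assumption for illustration: the function `health_get`, the status codes, the key names, and the placeholder values are hypothetical and do not come from any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes, for illustration only. */
#define HEALTH_OK              0
#define HEALTH_NOT_IMPLEMENTED (-1)

/* One published variable: an arbitrary string key and a floating-point value. */
typedef struct
{
    const char *key;
    double      value;
} health_var_t;

/* Example table a platform might publish. Keys and values are placeholders. */
static const health_var_t PLATFORM_VARS[] = {
    {"cpu/total/utilization", 42.5},
    {"board/temperature", 31.0},
};

/* Look up a variable by key. Unknown keys report "not implemented"
 * and leave *value untouched. */
int health_get(const char *key, double *value)
{
    size_t i;

    assert(key != NULL && value != NULL);

    for (i = 0; i < sizeof(PLATFORM_VARS) / sizeof(PLATFORM_VARS[0]); ++i)
    {
        if (strcmp(PLATFORM_VARS[i].key, key) == 0)
        {
            *value = PLATFORM_VARS[i].value;
            return HEALTH_OK;
        }
    }
    return HEALTH_NOT_IMPLEMENTED;
}
```

A caller such as HS would check the return code and simply skip reporting any variable the platform does not provide.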

I think by adding a "scope" argument of some type, the same API could be applied to individual tasks/processes too - such as reading the RAM use or CPU use of a particular task.

@jphickey
Contributor Author

Or maybe have a set of variables named something like "<TASK_NAME>/<TASK_VAR>" for that type of stuff....?
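
If the naming route were taken, a consumer would need to split such a name into its task and variable parts. A minimal sketch, assuming the first '/' is the separator. The helper `split_task_key` and the example names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "<TASK_NAME>/<TASK_VAR>" at the first '/'.
 * Returns 0 on success, or -1 if the separator is missing, either part
 * is empty, or an output buffer is too small. */
int split_task_key(const char *key, char *task, size_t task_sz, char *var, size_t var_sz)
{
    const char *sep;
    size_t      task_len;
    size_t      var_len;

    assert(key != NULL && task != NULL && var != NULL);

    sep = strchr(key, '/');
    if (sep == NULL)
    {
        return -1;
    }

    task_len = (size_t)(sep - key);
    var_len  = strlen(sep + 1);
    if (task_len == 0 || var_len == 0 || task_len >= task_sz || var_len >= var_sz)
    {
        return -1;
    }

    memcpy(task, key, task_len);
    task[task_len] = '\0';
    memcpy(var, sep + 1, var_len + 1); /* includes the terminating NUL */
    return 0;
}
```

For example, a name like "HS_MAIN/stack_free" would yield task "HS_MAIN" and variable "stack_free".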

jphickey added a commit to jphickey/PSP that referenced this issue Apr 6, 2023
Defines an "iodriver" interface with a simple module id + opcode +
argument interface, which can be extended as necessary for different
purposes.

Also adds a "linux_sysmon" module that implements this interface to
provide system monitoring capabilities.  This includes, but is not
limited to, the CPU utilization that HS needs.
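
The "module id + opcode + argument" shape described in this commit message can be pictured roughly as follows. This is a hedged sketch only and is not the interface added by the commit: `drv_command`, the `DRV_*` codes, the `SYSMON_OP_*` opcodes, and the placeholder load values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes, for illustration only. */
#define DRV_OK         0
#define DRV_BAD_MODULE (-1)
#define DRV_BAD_OPCODE (-2)
#define DRV_BAD_ARG    (-3)

/* Every module exposes a single entry point taking an opcode and an argument. */
typedef int32_t (*drv_handler_t)(uint32_t opcode, uint32_t arg);

/* Hypothetical opcodes for a system monitor module. */
enum
{
    SYSMON_OP_NOOP         = 0,
    SYSMON_OP_GET_CPU_LOAD = 1 /* arg = core index, returns load in percent */
};

static int32_t sysmon_handler(uint32_t opcode, uint32_t arg)
{
    /* Placeholder per-core load values; a real module would sample the OS. */
    static const int32_t fake_load[] = {12, 87};

    switch (opcode)
    {
        case SYSMON_OP_NOOP:
            return DRV_OK;
        case SYSMON_OP_GET_CPU_LOAD:
            if (arg < sizeof(fake_load) / sizeof(fake_load[0]))
            {
                return fake_load[arg];
            }
            return DRV_BAD_ARG;
        default:
            return DRV_BAD_OPCODE;
    }
}

/* Module table: the module id is simply an index into this array. */
static const drv_handler_t MODULES[] = {sysmon_handler};

/* Route a command to the module selected by module_id. */
int32_t drv_command(uint32_t module_id, uint32_t opcode, uint32_t arg)
{
    if (module_id >= sizeof(MODULES) / sizeof(MODULES[0]))
    {
        return DRV_BAD_MODULE;
    }
    assert(MODULES[module_id] != NULL);
    return MODULES[module_id](opcode, arg);
}
```

The appeal of this shape is that new capabilities can be added as new opcodes or new modules without changing the signature of the entry point.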
jphickey added a commit to jphickey/PSP that referenced this issue Apr 7, 2023
Defines an "iodriver" interface with a simple module id + opcode +
argument interface, which can be extended as necessary for different
purposes.

Also adds a "linux_sysmon" module that implements this interface to
provide system monitoring capabilities.  This includes, but is not
limited to, the CPU utilization that HS needs.
dzbaker added a commit that referenced this issue Apr 10, 2023
Fix #385, adds generic driver interface and Linux sysmon module
jphickey added a commit to jphickey/PSP that referenced this issue Apr 11, 2023
Update the doxygen documentation to correct warnings, and correct some
inconsistent symbol names.
dzbaker added a commit that referenced this issue Apr 12, 2023
Fix #385, adds generic driver interface and Linux sysmon module