Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ecc errors new info type #89

Merged
merged 2 commits into from
Mar 10, 2025

Conversation

jsfakian
Copy link
Contributor

  • Added new types of ZInfoMsg protobuf messages for reporting the status of ECC memory

@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch from 3d6992b to eadcd38 Compare February 21, 2025 12:24
@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch 2 times, most recently from 5dc06dd to 9489472 Compare February 24, 2025 10:08
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a current message ZMetricMsg
I suggest you define a new message ZSlowMetricMsg and put these metrics in there.
(And make it be sent every 24 hours by default. with its own ConfigItem timer to adjust the frequency.)

@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch 2 times, most recently from 09b1142 to d022e7e Compare February 25, 2025 14:31
@jsfakian
Copy link
Contributor Author

I named the new metrics healthchecks because they make more sense to me. @eriknordmark, Should I address the issues with Apache Yetus or ignore them? Are you okay with the new message type?

@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch from d022e7e to 6adf0a9 Compare March 5, 2025 13:18
Comment on lines 21 to 22
// The message is secured within a TLS session, which is bound to the device
// certificate to ensure integrity and authenticity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be using AuthContainer here where the AuthContainer is signed using the device identity?
(We don't seem to be using that for the Info and Metrics messages but it would make sense to introduce it for new messages like this one.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea to use the AuthContainer. However, I need some help implementing it. Can you give me some pointers? I saw how we do it for CompoundEdgeDevConfig in proto/config/compound_devconfig.proto, but I am unsure how to do it in our case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I can provide pointers. I don't know if we have sending code in EVE-OS or only have the receiving code.
For now it makes sense to update the comments in this API PR.


import "google/protobuf/timestamp.proto";

package org.lfedge.eve.healthchecks;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW you can ignore the buflint complaints for this line.

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits plus the key design question whether we should use AuthContainer to sign these.

@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch 3 times, most recently from dcc0f47 to a91bc64 Compare March 10, 2025 16:02
…are.

- Added a new type of protobuf messages, ZHardwareHealth to report issues
concerning the sensitive hardware components of a device
- Currently, we introduce a report for ECC error messages

Signed-off-by: Ioannis Sfakianakis <[email protected]>
Signed-off-by: Ioannis Sfakianakis <[email protected]>
@jsfakian jsfakian force-pushed the ECC-errors-new-info-type branch from a91bc64 to b54eb2e Compare March 10, 2025 16:14
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eriknordmark eriknordmark merged commit a1de6c4 into lf-edge:main Mar 10, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants