Switch SER Callback and Log Info #1307

Closed
wants to merge 4 commits into from

Conversation

JaiOCP
Contributor

@JaiOCP JaiOCP commented Sep 14, 2021

Memory corruption can be classified into two buckets: hard failures and soft failures. A soft failure (soft error) tends to be caused by a single event upset (SEU) in data memory. A hard failure tends to result from raw defects in the silicon wafer, natural aging, or electrical overstress that causes the P/N junction to break down permanently. Unlike a hard failure, a soft error is transient: the memory corruption goes away after the device is reset or re-initialized.

This PR introduces a callback for the soft error recovery (SER) mechanism, along with the associated recovery types and log information.

Signed-off-by: Jai Kumar <[email protected]>
Signed-off-by: Jai Kumar <[email protected]>
@JaiOCP changed the title from "Patch 1" to "Switch SER Callback and Log Info" on Sep 15, 2021
inc/saiswitch.h
*/
SAI_SWITCH_SOFT_ERROR_TYPE_ECC_DOUBLE_BIT = 3,

/**
Contributor

Is this meant to be "all other soft error types"? If so, how does it differ from UNKNOWN? (if not, what is it?)

@kcudnik
Collaborator

kcudnik commented Sep 22, 2021

please fix errors

/**
* @brief Soft error recovery log info flags
*/
sai_switch_ser_log_t flags;
Collaborator

Should this be uint32_t if it will hold a combination of flags? We currently have a similar thing in the stats_mode flags.
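
To illustrate the concern: once two flag values are OR-ed together, the result is generally not one of the named enumerators, so a plain 32-bit integer describes the field more honestly than the enum type. The following is a minimal sketch only. Every `example_` name and flag value here is hypothetical and is not taken from this PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names for illustration only; they are not the values
 * defined in this PR. Each value is a distinct power of 2. */
typedef enum _example_ser_log_t
{
    EXAMPLE_SER_LOG_MEM    = 1 << 0,
    EXAMPLE_SER_LOG_REG    = 1 << 1,
    EXAMPLE_SER_LOG_PARITY = 1 << 2,
} example_ser_log_t;

/* Hypothetical log info record: the field holds an OR-combination of
 * flags, so it is typed as a 32-bit integer rather than as the enum. */
typedef struct _example_ser_info_t
{
    uint32_t flags;
} example_ser_info_t;

/* Returns true when the given flag bit is set in the record. */
static bool example_ser_has_flag(
        const example_ser_info_t *info,
        example_ser_log_t flag)
{
    return (info->flags & (uint32_t)flag) != 0;
}
```

With this layout, a record carrying both `EXAMPLE_SER_LOG_MEM` and `EXAMPLE_SER_LOG_PARITY` stores a value that matches no single enumerator, which is the case the enum-typed field would misrepresent.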

/**
* @brief Memory needs special correction handling
*/
SAI_SWITCH_CORRECTION_TYPE_SPECIAL = 5,
Collaborator

Can you clarify whether the error is corrected or not?

@@ -5124,6 +5124,7 @@ void check_enum_range_base(

SKIP_ENUM(sai_attr_flags_t);
SKIP_ENUM(sai_stats_mode_t);
SKIP_ENUM(sai_switch_ser_log_t);
Collaborator

No need for this change; I updated saisanitycheck to use the flags enum type.

*
* Note enum values must be powers of 2 to be used as bit mask
*
* @flags Contains flags
Collaborator

you will need to change this to "@flags strict"
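
For context, the rule quoted in the header comment (each value must be a power of 2 to be usable as a bit mask) can be checked mechanically. The sketch below shows one way such a check might look. It is not the actual metadata sanity check from the repository, and all `example_` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when exactly one bit is set in value. */
static bool example_is_power_of_two(uint32_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

/* True when every entry is a power of 2 and no two entries share a bit,
 * so any OR-combination of entries can be decoded unambiguously. */
static bool example_flags_are_valid(const uint32_t *values, size_t count)
{
    uint32_t seen = 0;

    for (size_t i = 0; i < count; i++)
    {
        if (!example_is_power_of_two(values[i]) || (seen & values[i]) != 0)
        {
            return false;
        }
        seen |= values[i];
    }

    return true;
}
```

A set such as 1, 2, 4, 8 passes, while a set containing 3 (two bits set) or a repeated value is rejected.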

* @param[in] count Number of notifications
* @param[in] data Array of soft error recovery event types
*/
typedef void (*sai_switch_ser_event_notification_fn)(
Contributor

Should this include the switch_id in the callback?
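
To make the question concrete: without a switch identifier, a process that manages several switches cannot tell which device raised a given batch of events. The sketch below shows what a callback with a leading `switch_id` parameter might look like. The types, field names, and handler are hypothetical stand-ins, not the definitions proposed in this PR.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the types in this PR. */
typedef uint64_t example_object_id_t;

typedef struct _example_ser_event_t
{
    uint32_t error_type;
    uint32_t flags;
} example_ser_event_t;

/* Callback shape with the switch identifier as the first parameter. */
typedef void (*example_ser_event_notification_fn)(
        example_object_id_t switch_id,
        uint32_t count,
        const example_ser_event_t *data);

#define EXAMPLE_MAX_SWITCHES 4

/* Per-switch event counters, indexed by a hypothetical small switch id. */
static uint32_t example_event_count[EXAMPLE_MAX_SWITCHES];

/* Example handler: attributes each batch of events to the switch that
 * raised it and ignores ids outside the tracked range. */
static void example_on_ser_event(
        example_object_id_t switch_id,
        uint32_t count,
        const example_ser_event_t *data)
{
    (void)data; /* A real handler would inspect each of the count entries. */

    if (switch_id < EXAMPLE_MAX_SWITCHES)
    {
        example_event_count[switch_id] += count;
    }
}
```

Registering `example_on_ser_event` once is then enough to keep separate counters per switch, which the two-parameter form (count and data only) cannot do.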

This pull request was closed.