Error handling framework initial draft #391
The requirements for error handling framework are:

1.1.1 Provide registration/de-registration mechanism for applications to enable/disable error notifications on a specific table. More than one application can register for notifications on a given table.
Can the framework support applications registering for notifications at the attribute level?
Notifications can only be enabled per table. The failure status code is reported per object. For example, the port table has multiple objects such as MTU and admin state. Notifications can be enabled at the Port table level, but not for MTU failures specifically.
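The per-table registration model discussed above can be sketched as follows. This is a minimal illustration only: the class name `ErrorNotificationRegistry`, the callback signature, and the failure-code string are assumptions made for the example and are not the framework's actual API.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical callback signature: (table, key, failure_code) -> None
ErrorCallback = Callable[[str, str, str], None]


class ErrorNotificationRegistry:
    """Illustrative per-table registry; not the framework's real interface."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[ErrorCallback]] = defaultdict(list)

    def register(self, table: str, callback: ErrorCallback) -> None:
        # More than one application may register on the same table.
        if callback not in self._listeners[table]:
            self._listeners[table].append(callback)

    def deregister(self, table: str, callback: ErrorCallback) -> None:
        # Removing an unknown callback is a no-op.
        if callback in self._listeners.get(table, []):
            self._listeners[table].remove(callback)

    def notify(self, table: str, key: str, failure_code: str) -> int:
        # Listeners are selected by table only. The failing object is
        # identified by `key`, but it is not used for selection.
        listeners = list(self._listeners.get(table, []))
        for callback in listeners:
            callback(table, key, failure_code)
        return len(listeners)
```

With this shape, an application interested only in MTU failures would still register on the whole port table and inspect the reported object in its own callback.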
- Extensible to all types of errors in the system, not restricted to APP_DB definitions.
- Efficient, as notifications are limited to failures in the DB.
- Notification for delete failures can be supported even when corresponding objects are deleted from APP_DB.
I understand the new DB approach. However, here is my thought: why can't we have a single DB (APP DB) separated by namespaces? For example, configured vs. applied/error entries could live in the same table, which would make it easier to maintain one table.
Can you please provide more details here? Are you suggesting that ERROR tables can be stored in APP_DB, and there is no need to create ERROR_DB? We want to avoid modifying the existing ROUTE_TABLE schema in APP_DB, so that error handling can optionally be disabled while retaining the current behavior.
- Translates it from SAI data types to ERROR_DB data types.
- Adds an entry into the error database. If the entry already exists, the corresponding failure code is updated.
- Publishes the notifications to the respective error listeners.
3. Error listener waits for the incoming notifications, filters them and invokes the application callback.
Can you describe how the Error listener filters the notifications? What criteria are supported? Please add a use case.
Sure, we will add more details on this.
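Until those details are added, the flow in the hunk above (translate, add or update the entry, publish, filter, invoke the callback) could look roughly like the sketch below. Everything here is an assumption for illustration: the in-memory `error_db` dictionary stands in for ERROR_DB, the ERROR_DB code names in `SAI_TO_ERROR_CODE` are invented, and filtering by table name is only one plausible criterion, chosen because registration is per table.

```python
from typing import Callable, Dict, List, Set, Tuple

# In-memory stand-in for ERROR_DB: (table, key) -> failure code.
error_db: Dict[Tuple[str, str], str] = {}

# Hypothetical translation from SAI status names to ERROR_DB codes.
SAI_TO_ERROR_CODE = {
    "SAI_STATUS_TABLE_FULL": "TABLE_FULL",
    "SAI_STATUS_ITEM_NOT_FOUND": "NOT_FOUND",
}


class ErrorListener:
    """Filters notifications by table name (an assumed criterion)."""

    def __init__(self, tables: Set[str],
                 callback: Callable[[str, str, str], None]) -> None:
        self.tables = tables
        self.callback = callback

    def on_notification(self, table: str, key: str, code: str) -> bool:
        if table not in self.tables:
            return False  # filtered out; the application is not called
        self.callback(table, key, code)
        return True


def report_failure(table: str, key: str, sai_status: str,
                   listeners: List[ErrorListener]) -> str:
    # Step 1: translate the SAI status to an ERROR_DB code.
    code = SAI_TO_ERROR_CODE.get(sai_status, "UNKNOWN")
    # Step 2: add the entry, or update the failure code if it exists.
    error_db[(table, key)] = code
    # Step 3: publish to listeners; each listener applies its own filter.
    for listener in listeners:
        listener.on_notification(table, key, code)
    return code
```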
| Create failure | Delete failure | Remove the entry from the database and notify the registered applications |
| Create failure | Update success | Remove the entry from the database and notify the registered applications |
| Create success | Delete failure | Add the entry to the database and notify the registered applications |
| Delete failure | Create success | Remove the entry from the database and notify the registered applications |
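The four rows shown in this hunk can be read as a lookup from a pair of events to a database action. The sketch below assumes the first column is the previous event for an object and the second is the current one; the column headings are not visible in this excerpt, so that reading is a guess, and the event strings, `ACTIONS`, and `apply_transition` are hypothetical names. Pairs outside these four rows are left unhandled rather than guessed.

```python
from typing import Dict, Optional

# (previous event, current event) -> action on the error database.
# Only the four rows visible in the hunk are encoded.
ACTIONS = {
    ("create_failure", "delete_failure"): "remove",
    ("create_failure", "update_success"): "remove",
    ("create_success", "delete_failure"): "add",
    ("delete_failure", "create_success"): "remove",
}


def apply_transition(error_db: Dict[str, str], key: str,
                     previous: str, current: str,
                     failure_code: Optional[str] = None) -> bool:
    """Apply the table's action; return True if listeners should be notified."""
    action = ACTIONS.get((previous, current))
    if action is None:
        return False  # pair not covered by the rows shown here
    if action == "add":
        error_db[key] = failure_code or "UNKNOWN"
    else:
        error_db.pop(key, None)  # removing a missing entry is harmless
    return True
```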
Do applications get out-of-order notifications from the feedback loop? If so, how should that be handled? For example, if the user does create/delete/create, do you expect the error feedback to arrive in order?
The order of notifications will be preserved because changes to APP_DB and ASIC_DB maintain the sequence. In case the same object fails multiple times, we need a unique transaction ID to associate the operation with the failure. To address this, we are looking at adding a unique ID to each APP_DB operation and reporting the ID back as part of the failure notification.
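The transaction-ID idea is still under consideration in this thread, so the following is only a sketch of how such correlation might work. `OperationTracker`, its methods, and the use of a simple incrementing counter are assumptions for illustration, not a description of the eventual design.

```python
import itertools
from typing import Dict, Optional, Tuple


class OperationTracker:
    """Tags each operation with a unique ID and matches failures back to it."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        # txn_id -> (table, key, operation)
        self._pending: Dict[int, Tuple[str, str, str]] = {}

    def submit(self, table: str, key: str, operation: str) -> int:
        # The returned ID would travel with the APP_DB operation.
        txn_id = next(self._next_id)
        self._pending[txn_id] = (table, key, operation)
        return txn_id

    def on_failure(self, txn_id: int,
                   failure_code: str) -> Optional[Tuple[str, str, str, str]]:
        # The failure notification reports the ID back, which identifies
        # exactly one operation even if the same object failed repeatedly.
        entry = self._pending.pop(txn_id, None)
        if entry is None:
            return None  # unknown or already resolved ID
        table, key, operation = entry
        return (table, key, operation, failure_code)
```

In the create/delete/create case raised above, each of the three operations on the same key gets its own ID, so a failure report can be attributed to the first create rather than the second.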
This document describes high-level design details for the Error Handling framework in SONiC.
Signed-off-by: Siva Mukka [email protected]