Add reboot-cause information to telemetry #669

Merged
merged 20 commits into from
Oct 22, 2020
89 changes: 42 additions & 47 deletions doc/system-telemetry/reboot-cause.md
@@ -11,78 +11,73 @@

### Enable sonic streaming telemetry agent to send Reboot-cause information

##### Part 1
For the first part, `process-reboot-cause` processes the reboot cause and then copies
previous-reboot-cause.txt to "/host/reboot-cause/previous-reboot-cause/",
appending a timestamp to the end of the file name.
#### Part 1
During boot, `process-reboot-cause` determines the last reboot cause from the hardware reboot-cause
and the software reboot-cause information and creates previous-reboot-cause.txt with that information.
To log up to 10 previous reboot-cause entries, `process-reboot-cause` saves the previous reboot-cause information
to "/host/reboot-cause/previous-reboot-cause/", appending a timestamp to the end of the file name after processing
the reboot cause, and creates a symbolic link.
Currently previous-reboot-cause.txt is in plain-text format, but its content will be formatted so that it can be parsed easily.
The reboot-cause information is read from each saved previous-reboot-cause file,
and up to 10 reboot-cause entries are written to state-DB.

The example shows the previous reboot-cause files stored in /host/reboot-cause/previous-reboot-cause/.
```
$ ls /host/reboot-cause/previous-reboot-cause/
previous-reboot-cause-20200903T232033.txt
previous-reboot-cause-20200902T101105.txt
previous-reboot-cause-20200902T015048.txt
...
```
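The timestamped-file scheme above can be sketched roughly as follows. This is a minimal illustration, not the actual `process-reboot-cause` code: the function name `save_reboot_cause`, the `MAX_HISTORY` constant, and the policy of deleting the oldest files once more than 10 exist are all assumptions, and the symbolic link is left out because its target is not described here.

```python
from datetime import datetime
from pathlib import Path

MAX_HISTORY = 10  # the design logs up to 10 previous reboot causes


def save_reboot_cause(history_dir: Path, content: str, now: datetime) -> Path:
    """Write one timestamped history file, then prune files beyond MAX_HISTORY."""
    history_dir.mkdir(parents=True, exist_ok=True)
    stamp = now.strftime("%Y%m%dT%H%M%S")  # e.g. 20200903T232033
    target = history_dir / f"previous-reboot-cause-{stamp}.txt"
    target.write_text(content)

    # Timestamps in this format sort chronologically as plain strings.
    files = sorted(history_dir.glob("previous-reboot-cause-*.txt"),
                   key=lambda p: p.name)
    for stale in files[:-MAX_HISTORY]:
        stale.unlink()  # assumed pruning policy: drop the oldest files
    return target
```

Passing the directory and the clock value as parameters keeps the sketch testable without touching "/host/reboot-cause/previous-reboot-cause/" on a real device.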
The following example shows the content of the previous reboot-cause file - previous-reboot-cause-20200903T232033.txt.
```
TIMESTAMP: "20200903T232033"
CAUSE: "reboot"
USER: "admin"
TIME: "Thu 03 Sep 2020 11:15:30 PM UTC"
COMMENT: "User issued 'reboot' command [User: admin, Time: Thu 03 Sep 2020 11:15:30 PM UTC]"
```

Details of the CLI and state-DB are given below.
CLI command to retrieve the reboot-cause information:
```
$ show reboot-cause
```
state-DB schema to store the Reboot-cause information
#### Part 2
A new service will retrieve the saved reboot-cause files, read the reboot-cause information from each file,
and write up to 10 reboot-cause entries to state-DB.
Verify that the state-DB data is available via the CLI command `show reboot-cause history`, which extends `show reboot-cause`.

##### Reboot Cause Schema in state-DB

Here is the definition of Reboot-cause schema which will be stored in state-DB.
```
; Defines information for reboot-cause
key = REBOOT_CAUSE|timestamp ; last reboot-cause processing time
key = REBOOT_CAUSE|<timestamp> ; last reboot-cause processing time
; field = value
cause = STRING ; last reboot cause
time = STRING ; time when the last reboot was initiated
user = STRING ; user who initiated the last reboot
comment = STRING ; unstructured json format data
```
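To make the schema concrete, the sketch below builds the key and field map for one entry from an already-parsed record. It uses plain Python dicts rather than any real database client. The function name `to_state_db_entry` and the empty-string default for missing fields are assumptions, although the default does match the `"Unknown"` entry shown later, where `time` and `user` are empty.

```python
FIELDS = ("cause", "time", "user", "comment")


def to_state_db_entry(record: dict) -> tuple:
    """Return (key, fields) for one reboot-cause record.

    `record` is assumed to hold lowercase keys, including 'timestamp'.
    """
    key = f"REBOOT_CAUSE|{record['timestamp']}"
    # Missing fields fall back to an empty string (an assumption of this sketch).
    fields = {name: record.get(name, "") for name in FIELDS}
    return key, fields
```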
Along with the data, a new entry keyed by timestamp will be added to state_db, up to 10 entries:

```
REBOOT_CAUSE|timestamp
```

##### Part 2
Verify that the state-DB data is available via the telemetry agent.

##### CLI output and corresponding structure in state-DB for reboot-cause information

###### reboot-cause information

Currently `show reboot-cause` displays the last reboot cause by performing `cat /host/reboot-cause/previous-reboot-cause.txt`.
With the new design, the reboot cause will be read from state-DB and displayed in a new format.
`show reboot-cause history` will be added and will display up to 10 previous `reboot-cause` entries from state-DB.
Currently `show reboot-cause` displays the last reboot cause by performing `cat /host/reboot-cause/previous-reboot-cause.txt`.
This will remain the same as in the current design.
With the new design, `show reboot-cause history` will be added to display up to 10 previous `reboot-cause` entries from state-DB.

The following example shows the output of `show reboot-cause`, which is the same as the current output and displays only the last reboot cause.
```
$ show reboot-cause
User issued 'reboot' command [User: admin, Time: Thu 03 Sep 2020 11:15:30 PM UTC]
```
The above output will be stored in state-DB as follows for the reboot-cause information.
The above output is stored in the previous-reboot-cause.txt file, and the reboot-cause information is also stored in state-DB as follows.
```
REBOOT_CAUSE|20200903T232033
"cause"
"reboot"
"time"
"Thu 03 Sep 2020 11:15:30 PM UTC"
"user"
"admin"
"comment"
"User issued 'reboot' command [User: admin, Time: Thu 03 Sep 2020 11:15:30 PM UTC]"
```

The example shows the output of `show reboot-cause history` and the previous reboot cause stored in state-DB in addition to the last reboot-cause.
@@ -96,23 +91,23 @@ TIMESTAMP REBOOT-CAUSE Details
The above output will be stored in state-DB as follows for the previous reboot causes, in addition to the last reboot cause:
```
REBOOT_CAUSE|20200902T101105
"cause"
"Unknown"
"time"
""
"user"
""
"comment"
"Unknown"
```
```
REBOOT_CAUSE|20200902T015048
"cause"
"fast-reboot"
"time"
"Wed 02 Sep 2020 01:48:33 AM UTC"
"user"
"admin"
"comment"
"User issued 'fast-reboot' command [User: admin, Time: Wed 02 Sep 2020 01:48:33 AM UTC]"
```
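One way a history command could pick its rows from entries like these is sketched below. It works on a plain dict standing in for state-DB. The function name `latest_entries` and the newest-first ordering are assumptions; the directory listing shown earlier is also newest-first, but the ordering of `show reboot-cause history` is not confirmed here.

```python
def latest_entries(entries: dict, limit: int = 10) -> list:
    """Return up to `limit` (timestamp, fields) pairs, newest first.

    `entries` maps keys of the form 'REBOOT_CAUSE|<timestamp>' to field dicts.
    """
    prefix = "REBOOT_CAUSE|"
    # Timestamps like 20200903T232033 sort chronologically as strings.
    keys = sorted((k for k in entries if k.startswith(prefix)), reverse=True)
    return [(k[len(prefix):], entries[k]) for k in keys[:limit]]
```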