[Auto-Techsupport] Issues related to Multiple Cores crashing handled #1948
Signed-off-by: Vivek Reddy Karri <[email protected]>
@ganglyu @qiluo-msft Please help review
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
scripts/coredump_gen_handler.py
Outdated
matches = re.findall(TS_PTRN, ts_stdout)
if matches:
    return matches[-1]
else:
The else is not necessary here.
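A minimal sketch of the early-return form this comment suggests. The regular expression and the empty-string fallback are assumptions for illustration only; the real `TS_PTRN` and the value returned when nothing matches are not shown in this conversation, and `parse_ts_dump_name` is a hypothetical name.

```python
import re

# Assumption: this pattern is illustrative; the real TS_PTRN may differ.
TS_PTRN = r"sonic_dump_[\w.-]+\.tar(?:\.gz)?"


def parse_ts_dump_name(ts_stdout):
    """Return the last techsupport dump name found in ts_stdout."""
    matches = re.findall(TS_PTRN, ts_stdout)
    if matches:
        return matches[-1]
    # No else branch needed: this line is reached only when nothing matched.
    # Assumption: an empty string is used as the fallback value.
    return ""
```

Because the `if` branch returns, the code after it already acts as the fallback, so dropping `else` removes one level of nesting without changing behavior.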
tests/coredump_gen_handler_test.py
Outdated
if "show techsupport --since '2 days ago'" in cmd_str:
    patcher.fs.create_file("/var/dump/sonic_dump_random3.tar.gz")
    return 0, "", ""
print(cmd_str)
Is this print used for debugging?
Yes, will remove.
lgtm
…1948)

#### What I did

**Issues seen when multiple cores crash in very quick succession:**

1) The **rate_limit_interval** is not honored. I was previously finding the last created techsupport dump using the glob pattern `sonic_dump_*tar*`, which does not include dumps that are still being generated, because those in-progress dumps do not yet have the .tar.gz extension. Thus, modified `get_ts_dumps` to search based on the TS_ROOT, i.e. `sonic_dump_*`.
2) **show auto-tech support history** is not showing all the created dumps. I previously took the diff of the techsupport dumps before and after the invocation and assigned that diff as the techsupport for this core. This approach is prone to a race condition, since the diff can contain multiple dumps created in that interval. Avoided this by parsing the stdout returned by the `show techsupport` invocation.

#### How to verify it

1) Unit tests.
2) Generate core dumps in very quick succession using the default rate limit interval. Only one entry should appear in the techsupport history.
3) Set the global rate limit interval to 0 and generate cores in quick succession. A few entries should appear in the history.
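The difference between the two glob patterns in the first issue can be illustrated with a small sketch. The file names below are made up, and the in-memory list stands in for a real directory listing; it uses the standard `fnmatch` module rather than the actual `get_ts_dumps` code, which is not shown here.

```python
import fnmatch

# Hypothetical directory listing; the names are invented for illustration.
names = [
    "sonic_dump_host_20220101_000000.tar.gz",  # finished dump
    "sonic_dump_host_20220101_000100",         # dump still being generated
    "unrelated.log",
]

# Old pattern: only matches names that already contain "tar".
finished_only = fnmatch.filter(names, "sonic_dump_*tar*")

# New pattern: matches every dump, including the one in progress.
all_dumps = fnmatch.filter(names, "sonic_dump_*")
```

With the old pattern the in-progress dump is invisible, so a rate-limit check based on the most recent dump would not account for it.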