Fix loganalyzer.py UnicodeDecodeError issue #6524

Merged 1 commit on Oct 12, 2022

@ZhaohuiS ZhaohuiS commented Oct 12, 2022

Signed-off-by: Zhaohui Sun [email protected]

Description of PR

Summary:
Fixes # (issue)
If syslog contains bytes that are not valid UTF-8, such as '\xc0\xaa\xd8w\xfc\x7f', Python 3 cannot decode them with the utf-8 codec and raises UnicodeDecodeError. See this doc for more explanation:
https://blog.finxter.com/fixed-unicodedecodeerror-utf8-codec-cant-decode-byte-0xa5-in-position-0-invalid-start-byte/

    "start": "2022-10-11 03:39:57.519237", 
    "stderr": "Traceback (most recent call last):\n  File \"/tmp/loganalyzer.py\", line 809, in <module>\n    main(sys.argv[1:])\n  File \"/tmp/loganalyzer.py\", line 793, in main\n    analyzer.place_marker(log_file_list, analyzer.create_end_marker(), wait_for_marker=True)\n  File \"/tmp/loganalyzer.py\", line 250, in place_marker\n    if self.wait_for_marker(marker) is False:\n  File \"/tmp/loganalyzer.py\", line 228, in wait_for_marker\n    for l in fp:\n  File \"/usr/lib/python3.9/codecs.py\", line 322, in decode\n    (result, consumed) = self._buffer_decode(data, self.errors, final)\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2203: invalid start byte", 
    "stderr_lines": [
        "Traceback (most recent call last):", 
        "  File \"/tmp/loganalyzer.py\", line 809, in <module>", 
        "    main(sys.argv[1:])", 
        "  File \"/tmp/loganalyzer.py\", line 793, in main", 
        "    analyzer.place_marker(log_file_list, analyzer.create_end_marker(), wait_for_marker=True)", 
        "  File \"/tmp/loganalyzer.py\", line 250, in place_marker", 
        "    if self.wait_for_marker(marker) is False:", 
        "  File \"/tmp/loganalyzer.py\", line 228, in wait_for_marker", 
        "    for l in fp:", 
        "  File \"/usr/lib/python3.9/codecs.py\", line 322, in decode", 
        "    (result, consumed) = self._buffer_decode(data, self.errors, final)", 
        "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2203: invalid start byte"
    ], 

Previously, the log looked like this:
/var/log/syslog.3.gz:Oct 13 04:26:29.526576 str-msn2700-03 ALERT dhcp_relay#dhcpmon[39]: dhcpmon detected disparity in DHCP Relay behavior. Duration: 432 (sec) for vlan: 'Agg-Vlan1000'

Currently, the log looks like this:
Oct 12 04:20:05.395531 str-msn2700-03 ALERT dhcp_relay#dhcpmon[39]: dhcpmon detected disparity in DHCP Relay behavior. Duration: 41634 (sec) for vlan: '\xc0\xaa\xd8w\xfc\x7f'
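
The failure can be reproduced outside loganalyzer.py with the bytes from the vlan field above. This is a minimal sketch, not code from the PR; the variable names are illustrative. It compares the default strict decoding with two of Python's standard error handlers.

```python
# Bytes taken from the vlan field of the log line above.
raw = b"vlan: '\xc0\xaa\xd8w\xfc\x7f'"

# Strict decoding (the default) raises on the first invalid byte, 0xc0.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# errors="ignore" silently drops every byte that cannot be decoded.
ignored = raw.decode("utf-8", errors="ignore")

# errors="backslashreplace" keeps the invalid bytes visible as \xNN escapes.
escaped = raw.decode("utf-8", errors="backslashreplace")

print(strict_error)
print(repr(ignored))
print(repr(escaped))
```

Note the trade-off: "ignore" avoids the crash but discards information, while "backslashreplace" keeps the invalid bytes visible in the decoded text.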

The change in log output comes from this PR:
Add Structured Events w/ YANG Models by zbud-msft · Pull Request #12270 · sonic-net/sonic-buildimage (github.com)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

Fix "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2203: invalid start byte" when running loganalyzer.py on the DUT.
The master and 202205 branches can hit this issue if syslog contains bytes that are not valid UTF-8.

How did you do it?

Add the errors='ignore' parameter when opening the syslog file.
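
A minimal sketch of this approach, assuming the file is opened with Python's built-in open(). The helper name find_marker, the marker string, and the temporary file are hypothetical and are not taken from loganalyzer.py.

```python
import os
import tempfile


def find_marker(path, marker):
    """Return True if any line of the file at `path` contains `marker`."""
    # errors="ignore" drops undecodable bytes instead of raising
    # UnicodeDecodeError while iterating over the file.
    with open(path, "r", encoding="utf-8", errors="ignore") as fp:
        for line in fp:
            if marker in line:
                return True
    return False


# Build a small stand-in for syslog that contains invalid UTF-8 bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"dhcpmon: for vlan: '\xc0\xaa\xd8w\xfc\x7f'\n")
    tmp.write(b"hypothetical-end-marker\n")
    demo_path = tmp.name

found = find_marker(demo_path, "hypothetical-end-marker")
os.remove(demo_path)
print(found)
```

Without errors="ignore", the loop would raise UnicodeDecodeError on the first line and never reach the marker.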

How did you verify/test it?

Run platform_tests/api/test_fan_drawer_fans.py::test_get_status with loganalyzer enabled.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@azure-pipelines

The pre-commit check detected issues in the files touched by this pull request.
The detected issues may be old or new. For new issues, please try to fix them.

For old issues, fixing them is not mandatory because they were not caused by this change, and it would be unfair to blame the
author of this pull request. But if you can make the extra effort to fix the old issues as well, that would be great!

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
autopep8.................................................................Failed
- hook id: autopep8
- files were modified by this hook
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

ansible/library/extract_log.py:3:1: F403 'from ansible.module_utils.basic import *' used; unable to detect undefined names
ansible/library/extract_log.py:125:11: F405 'locale' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:126:5: F405 'locale' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:126:22: F405 'locale' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:144:5: F405 'locale' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:144:22: F405 'locale' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:149:16: E741 ambiguous variable name 'l'
ansible/library/extract_log.py:167:25: E741 ambiguous variable name 'l'
ansible/library/extract_log.py:245:32: E712 comparison to False should be 'if cond is False:' or 'if not cond:'
ansible/library/extract_log.py:276:14: F405 'AnsibleModule' may be undefined, or defined from star imports: ansible.module_utils.basic
ansible/library/extract_log.py:293:5: E722 do not use bare 'except'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:13:121: E501 line too long (312 > 120 characters)
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:27:1: F401 'pprint' imported but unused
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:39:34: W605 invalid escape sequence 's'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:39:38: W605 invalid escape sequence 'd'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:163:13: E741 ambiguous variable name 'l'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:218:33: E741 ambiguous variable name 'l'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:229:21: E741 ambiguous variable name 'l'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:262:59: W605 invalid escape sequence 'd'
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:270:9: E265 block comment should start with '# '
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:272:13: F841 local variable 'original_string' is assigned to but never used
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:273:13: E265 block comment should start with '# '
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:283:9: E265 block comment should start with '# '
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:485:121: E501 line too long (124 > 120 characters)
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:550:121: E501 line too long (125 > 120 characters)
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:558:121: E501 line too long (127 > 120 characters)
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:728:121: E501 line too long (173 > 120 characters)
ansible/roles/test/files/tools/loganalyzer/loganalyzer.py:767:121: E501 line too long (132 > 120 characters)

To run the pre-commit checks locally, follow the steps below:

  1. Ensure that the pre-commit package is installed:
     sudo pip install pre-commit
  2. Go to the repository root folder.
  3. Install the pre-commit hooks:
     pre-commit install
  4. Use pre-commit to check staged files:
     pre-commit
  5. Alternatively, check committed files using:
     pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@ZhaohuiS ZhaohuiS merged commit 94ac4a0 into sonic-net:master Oct 12, 2022
wangxin pushed a commit that referenced this pull request Oct 14, 2022
ZhaohuiS added a commit that referenced this pull request Oct 20, 2022
Blueve pushed a commit that referenced this pull request Oct 20, 2022
wangxin pushed a commit that referenced this pull request Oct 21, 2022
ZhaohuiS added a commit that referenced this pull request Oct 25, 2022
StormLiangMS pushed a commit that referenced this pull request Oct 25, 2022
…" (#6611)

Reverts #6577
It blocks the smoke test because the image fix sonic-net/sonic-buildimage#12425 is not in the internal branch.
wangxin pushed a commit that referenced this pull request Oct 25, 2022
allen-xf pushed a commit to allen-xf/sonic-mgmt that referenced this pull request Oct 28, 2022
allen-xf pushed a commit to allen-xf/sonic-mgmt that referenced this pull request Oct 28, 2022
allen-xf pushed a commit to allen-xf/sonic-mgmt that referenced this pull request Oct 28, 2022
ZhaohuiS added a commit that referenced this pull request Nov 21, 2022
ZhaohuiS added a commit that referenced this pull request Nov 21, 2022
Reverts #6524.
Since the change sonic-net/sonic-buildimage#12425 has been merged into the internal branch, revert this fix to make sure we can capture invalid format characters in syslog.
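
To illustrate what capturing such characters could look like, here is a hypothetical helper, not part of loganalyzer.py, that decodes each line strictly and reports where decoding fails. The sample log lines are abbreviated versions of the ones in the PR description.

```python
def find_invalid_utf8_lines(data):
    """Return (line_number, byte_offset) pairs for lines that fail strict UTF-8 decoding."""
    bad = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte in the line.
            bad.append((lineno, exc.start))
    return bad


sample = (
    b"Oct 13 04:26:29 host dhcpmon: for vlan: 'Agg-Vlan1000'\n"
    b"Oct 12 04:20:05 host dhcpmon: for vlan: '\xc0\xaa\xd8w\xfc\x7f'\n"
)

report = find_invalid_utf8_lines(sample)
print(report)
```

Reading in binary mode and decoding line by line keeps strict error detection while letting the caller decide how to handle each bad line.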
wangxin pushed a commit that referenced this pull request Nov 23, 2022
mannytaheri pushed a commit to mannytaheri/sonic-mgmt that referenced this pull request Jan 25, 2024
mannytaheri pushed a commit to mannytaheri/sonic-mgmt that referenced this pull request Jan 25, 2024