Gracefully handle FileHistory decoding errors #1303

Merged
merged 1 commit into from
Jan 4, 2021

Contributor

@op3 commented Dec 8, 2020

In case of a corrupted history file, instead of raising a UnicodeDecodeError, malformed characters are replaced by U+FFFD (REPLACEMENT CHARACTER) during decoding.

Currently, if a history file is malformed and contains bytes that are not valid UTF-8, a UnicodeDecodeError is raised while the history is loaded. This prevents the prompt from working, and the error cannot be handled properly.
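For illustration, the difference between strict and replacing decoding can be sketched with the standard library alone. This is not the diff of this PR: `decode_history_line` is a hypothetical helper name, and the sketch only assumes that the fix amounts to passing `errors="replace"` to `bytes.decode`.

```python
# Sketch only: standard-library behaviour, not the actual prompt_toolkit code.

def decode_history_line(line_bytes: bytes) -> str:
    """Hypothetical helper: decode one raw history line without raising.

    Bytes that are not valid UTF-8 become U+FFFD (REPLACEMENT CHARACTER).
    """
    return line_bytes.decode("utf-8", errors="replace")


raw = b"\xe9"  # the single stray byte from the example in this PR

# Strict decoding (the default) raises on the malformed input.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Replacing decoding returns a string containing U+FFFD instead.
decoded = decode_history_line(raw)
```

With this approach a corrupted entry shows up in the history as a line containing U+FFFD, while all well-formed entries are decoded unchanged.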

Minimal working example (before fix):

Create a malformed history file:

echo -n -e '\xe9' >> history
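Since the behaviour of `echo -n -e` differs between shells, the same malformed file can also be produced from Python. This is a minimal sketch: `append_invalid_byte` is a hypothetical helper, and it writes into a temporary directory here rather than into the `history` file used in this example.

```python
import tempfile
from pathlib import Path


def append_invalid_byte(path: Path) -> None:
    """Hypothetical helper: append the lone byte 0xe9, which is not valid UTF-8."""
    with path.open("ab") as f:  # binary append, so no encoding is applied
        f.write(b"\xe9")


# Demonstrate on a throwaway file; point `path` at "history" to reproduce the report.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "history"
    append_invalid_byte(path)
    data = path.read_bytes()
```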

Example code:

#!/usr/bin/env python

from prompt_toolkit.shortcuts import prompt
from prompt_toolkit.history import FileHistory

prompt(history=FileHistory("history"))

Execution and output before fix:

$ ./mwe.py
Traceback (most recent call last):
  File "/tmp/./mwe.py", line 6, in <module>
    prompt(history=FileHistory("history"))
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/shortcuts/prompt.py", line 1382, in prompt
    session: PromptSession[str] = PromptSession(history=history)
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/shortcuts/prompt.py", line 463, in __init__
    self.default_buffer = self._create_default_buffer()
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/shortcuts/prompt.py", line 498, in _create_default_buffer
    return Buffer(
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/buffer.py", line 316, in __init__
    self.history.load(new_history_item)
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/history.py", line 70, in load
    for item in self.load_history_strings():
  File "/usr/lib/python3.9/site-packages/prompt_toolkit/history.py", line 210, in load_history_strings
    line = line_bytes.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 0: unexpected end of data

@jonathanslenders
Member

Thanks!
