Fixes UnicodeDecodeError for ES60 files [all tests ci] #1215
I'll add "[all tests ci]" to the PR title, then close and reopen the PR to force all tests to run, to make sure there are no negative consequences for other EK/ES data. As it stands, it looks like none of the tests were actually run in the CI.
Codecov Report
@@ Coverage Diff @@
## dev #1215 +/- ##
==========================================
+ Coverage 83.13% 83.44% +0.31%
==========================================
Files 63 64 +1
Lines 5710 5672 -38
==========================================
- Hits 4747 4733 -14
+ Misses 963 939 -24
... and 7 files with indirect coverage changes
Ping @emiliom, should we add a new test case using the data file provided here - #1195 (comment)?
@praneethratna: thanks for looking into this! I checked the example ES60 file and this happens in the
I am a little reluctant to add a 100 MB file to the test data set. Let's ping the person who raised this issue to see if they could help provide a smaller file. We can merge this once we decide whether to add a smaller test data file (if such a file exists). If not, we can add a comment that this error occurred for an ES60 file.
@praneethratna I've gone ahead and uploaded the new small test file (#1195 (comment)) to our test folder on Google Drive. I created a new top-level folder, I think you can go ahead and create a test for it. Thanks!
@praneethratna: I've just verified that the 11 KB file does contain the type of byte string that was not previously decoded. Please go ahead and add a test for this file under
Addresses #1195 by changing the encoding to unicode_escape instead of using the default encoding utf-8.
CC @leewujung
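For readers unfamiliar with the two codecs, here is a minimal sketch of the difference in behavior. The byte string is hypothetical and only stands in for the kind of non-UTF-8 content reported in #1195; it is not taken from the actual ES60 file, and the snippet does not reproduce the parser code changed in this PR.

```python
# Hypothetical sample: 0xB0 is the degree sign in Latin-1, but on its own
# it is not a valid UTF-8 sequence. The real ES60 bytes are not shown here.
raw = b"Temp 12\xb0C"

# Decoding with the default utf-8 codec fails on the stray 0xB0 byte.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# unicode_escape maps each byte 0x80-0xFF to the code point with the same
# value (as Latin-1 does), so the same bytes decode without an exception.
decoded = raw.decode("unicode_escape")

# Side effect worth knowing: unicode_escape also interprets backslash escape
# sequences, so the two bytes "\" + "n" become a single newline character.
escaped = b"a\\nb".decode("unicode_escape")
```

One design note: because unicode_escape interprets backslash sequences, a string with a malformed escape can still raise an error. If that ever becomes a problem, latin-1 decodes every byte value without touching backslashes, though whether that suits this data is an assumption that would need checking against real files.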