Failed to generate HTML report: UnicodeDecodeError: 'ascii' codec can't decode byte #273

Open
asherkhb-ktx opened this issue Sep 5, 2023 · 10 comments
@asherkhb-ktx

When running CellBender 0.3.0, it reports "Unable to create report" with the following traceback:

cellbender:remove-background: Unable to create report.
cellbender:remove-background: Traceback (most recent call last):
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/cellbender/remove_background/run.py", line 349, in compute_output_denoised_counts_reports_metrics
    run_notebook_make_html(
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/cellbender/remove_background/report.py", line 80, in run_notebook_make_html
    _postprocess_html(
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/cellbender/remove_background/report.py", line 60, in _postprocess_html
    html = f.read()
  File "/opt/conda/envs/pytorch/lib/python3.9/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xee in position 232667: ordinal not in range(128)

"pip install" did report two errors, which may be related but didn't seem to interfere with anything else:

ERROR: nbconvert 6.5.4 has requirement jinja2>=3.0, but you'll have jinja2 2.10.1 which is incompatible.
ERROR: jupyter-events 0.7.0 has requirement jsonschema[format-nongpl]>=4.18.0, but you'll have jsonschema 3.2.0 which is incompatible.
@yfarjoun

yfarjoun commented Sep 6, 2023

I'm having the same problem. I installed yesterday, using conda with python=3.7 and then pip install cellbender.

@sjfleming
Member

Hey Yossi! Huh, I will try to replicate this myself. I probably need to pin a version of something that’s had a recent update. Very likely nbconvert.

@sjfleming
Member

@asherkhb-ktx thanks for reporting this. While I figure out the fix, if you wanted to, you could try to generate the report manually. All you need to do is open the Jupyter notebook in this repository

/cellbender/remove_background/report.ipynb

and manually modify the names of the input and output files and run the notebook. This is how the report gets generated (and it then gets converted to html).

@sjfleming sjfleming self-assigned this Sep 7, 2023
@sjfleming sjfleming added the bug Something isn't working label Sep 7, 2023
@sjfleming
Member

Hm, so far I am not able to reproduce this. I am on a Mac, and if I do

(base) $ conda create -n test python=3.7
(base) $ conda activate test
(test) $ pip install cellbender

and then run this on the tiny_raw_feature_bc_matrix.h5ad created by $ python generate_tiny_10x_dataset.py (where generate_tiny_10x_dataset.py is from a clone of cellbender and can be found here), like this

(test) $ cellbender remove-background --input tiny_raw_feature_bc_matrix.h5ad --output test.h5

then I do get an HTML report:

cellbender:remove-background: Succeeded in writing CellRanger format output to file tiny_test.h5
cellbender:remove-background: Succeeded in writing CellRanger format output to file tiny_test_filtered.h5
cellbender:remove-background: Saved output metrics as tiny_test_metrics.csv
[NbConvertApp] Converting notebook tmp.report.ipynb to notebook
[NbConvertApp] Writing 416208 bytes to tmp.report.nbconvert.ipynb
[NbConvertApp] Converting notebook tmp.report.nbconvert.ipynb to html
[NbConvertApp] Writing 991091 bytes to tmp.report.nbconvert.html
cellbender:remove-background: Succeeded in writing report to tiny_test_report.html
cellbender:remove-background: Completed remove-background.
cellbender:remove-background: 2023-09-20 11:04:10

If I run

(test) $ pip list

I see

Package                           Version
--------------------------------- ------------
anndata                           0.8.0
anyio                             3.7.1
appnope                           0.1.3
argon2-cffi                       23.1.0
argon2-cffi-bindings              21.2.0
attrs                             23.1.0
backcall                          0.2.0
beautifulsoup4                    4.12.2
bleach                            6.0.0
cellbender                        0.3.0
certifi                           2022.12.7
cffi                              1.15.1
click                             8.1.7
comm                              0.1.4
cycler                            0.11.0
debugpy                           1.7.0
decorator                         5.1.1
defusedxml                        0.7.1
entrypoints                       0.4
exceptiongroup                    1.1.3
fastjsonschema                    2.18.0
fonttools                         4.38.0
h5py                              3.8.0
idna                              3.4
importlib-metadata                6.7.0
importlib-resources               5.12.0
ipykernel                         6.16.2
ipython                           7.34.0
ipython-genutils                  0.2.0
ipywidgets                        8.1.1
jedi                              0.19.0
Jinja2                            3.1.2
jsonschema                        4.17.3
jupyter                           1.0.0
jupyter_client                    7.4.9
jupyter-console                   6.6.3
jupyter-contrib-core              0.4.2
jupyter-contrib-nbextensions      0.7.0
jupyter_core                      4.12.0
jupyter-highlight-selected-word   0.2.0
jupyter-nbextensions-configurator 0.6.3
jupyter-server                    1.24.0
jupyterlab-pygments               0.2.2
jupyterlab-widgets                3.0.9
kiwisolver                        1.4.5
llvmlite                          0.39.1
loompy                            3.0.7
lxml                              4.9.3
MarkupSafe                        2.1.3
matplotlib                        3.5.3
matplotlib-inline                 0.1.6
mistune                           0.8.4
natsort                           8.4.0
nbclassic                         1.0.0
nbclient                          0.7.4
nbconvert                         6.5.4
nbformat                          5.8.0
nest-asyncio                      1.5.8
notebook                          6.5.6
notebook_shim                     0.2.3
numba                             0.56.4
numexpr                           2.8.6
numpy                             1.21.6
numpy-groupies                    0.9.22
opt-einsum                        3.3.0
packaging                         23.1
pandas                            1.3.5
pandocfilters                     1.5.0
parso                             0.8.3
pexpect                           4.8.0
pickleshare                       0.7.5
Pillow                            9.5.0
pip                               22.3.1
pkgutil_resolve_name              1.3.10
prometheus-client                 0.17.1
prompt-toolkit                    3.0.39
psutil                            5.9.5
ptyprocess                        0.7.0
pycparser                         2.21
Pygments                          2.16.1
pyparsing                         3.1.1
pyro-api                          0.1.2
pyro-ppl                          1.8.6
pyrsistent                        0.19.3
python-dateutil                   2.8.2
pytz                              2023.3.post1
PyYAML                            6.0.1
pyzmq                             24.0.1
qtconsole                         5.4.4
QtPy                              2.4.0
scipy                             1.7.3
Send2Trash                        1.8.2
setuptools                        65.6.3
six                               1.16.0
sniffio                           1.3.0
soupsieve                         2.4.1
tables                            3.7.0
terminado                         0.17.1
tinycss2                          1.2.1
torch                             1.13.1
tornado                           6.2
tqdm                              4.66.1
traitlets                         5.9.0
typing_extensions                 4.7.1
wcwidth                           0.2.6
webencodings                      0.5.1
websocket-client                  1.6.1
wheel                             0.38.4
widgetsnbextension                4.0.9
zipp                              3.15.0

What do you see?

@sjfleming
Member

(
Note to self: might be some unexpected non-ascii characters in the html report (possibly due to input filename? or gene names?)... and maybe consider this
https://stackoverflow.com/questions/27243129/how-to-open-html-file-that-contains-unicode-characters

instead of this

with open(file, mode='r') as f:
    html = f.read()
)
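For what it's worth, here is a minimal sketch of the alternative being considered above: reading the file with an explicit encoding. When `open()` is called in text mode without an `encoding` argument, Python falls back to the locale's preferred encoding, which appears to be ASCII in the failing environments (the traceback goes through encodings/ascii.py). The helper name `read_html_text` is hypothetical and not part of CellBender, and UTF-8 is an assumption about how the report is encoded (the byte 0xee is at least consistent with that, since it is a lead byte of a three-byte UTF-8 sequence).

```python
import os
import tempfile


def read_html_text(path, encoding="utf-8"):
    """Read an HTML file as text with an explicit encoding.

    Hypothetical helper, not part of CellBender. Passing `encoding`
    means the result no longer depends on the locale's default codec.
    UTF-8 is an assumption about the report file.
    """
    with open(path, mode="r", encoding=encoding) as f:
        return f.read()


# Small demonstration with a throwaway file containing non-ASCII text.
# "\ue000" encodes to bytes starting with 0xee in UTF-8.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_report.html")
    with open(demo_path, mode="w", encoding="utf-8") as f:
        f.write("<html><body>\u00b5m \ue000</body></html>")
    demo_html = read_html_text(demo_path)
```

If the file turned out not to be valid UTF-8, this would still raise a UnicodeDecodeError, so passing `errors="replace"` to `open()` could be an option when only the title needs to be edited.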

@ZhangMH2000

ZhangMH2000 commented Sep 28, 2023

Hi sjfleming. I also encountered this issue, which showed a warning or an error:

cellbender:remove-background: Unable to create report.
cellbender:remove-background: Traceback (most recent call last):
  File "/home/zhangminghe/project/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 351, in compute_output_denoised_counts_reports_metrics
    output=html_report_file,
  File "/home/zhangminghe/project/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/report.py", line 82, in run_notebook_make_html
    title=('CellBender: ' + os.path.basename(output).replace('_report.html', '')),
  File "/home/zhangminghe/project/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/report.py", line 60, in _postprocess_html
    html = f.read()
  File "/home/zhangminghe/project/miniconda3/envs/cellbender/lib/python3.7/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xee in position 232667: ordinal not in range(128)

However, I have a file named ‘tHCC15_cellbender_output_report.html’ in the output. This HTML file also contains information about the process. So, does this warning merely indicate that some characters cannot be encoded correctly, without affecting the overall report output? Thank you!

@sjfleming
Member

Hi @ZhangMH2000, yes, if you have tHCC15_cellbender_output_report.html as an output file, then the contents of that file are fine. The step that is failing (for you, and maybe for some other people) is a superficial step that renames the title of the HTML report so that it looks like this when you open it:
image

When the error you're seeing occurs, the only difference will be that you will probably see a different title when you look at the label on the tab.
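To illustrate what a title-renaming step of this kind might look like, here is a rough sketch. It is not CellBender's actual `_postprocess_html`; the helper `set_html_title` and the regex approach are assumptions for illustration. The title string is built the same way as in the traceback above ('CellBender: ' plus the output basename without '_report.html').

```python
import os
import re


def set_html_title(html, title):
    """Return `html` with the contents of the first <title> element replaced.

    Hypothetical helper for illustration only; CellBender's own
    post-processing may work differently. Only a plain <title> tag
    without attributes is handled.
    """
    pattern = re.compile(r"(<title>).*?(</title>)", flags=re.IGNORECASE | re.DOTALL)
    # A function replacement avoids backslash-escape surprises in `title`.
    return pattern.sub(lambda m: m.group(1) + title + m.group(2), html, count=1)


output = "tHCC15_cellbender_output_report.html"
title = "CellBender: " + os.path.basename(output).replace("_report.html", "")

before = "<html><head><title>tmp.report.nbconvert</title></head><body></body></html>"
after = set_html_title(before, title)
```

Since only the `<title>` element changes, a failure in this step would leave the body of the report as nbconvert wrote it.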

I will fix this eventually though! @ZhangMH2000 do you know if you have non-ascii characters in the name of your file?

@sjfleming
Member

Hm, or maybe the non-ascii characters could be part of feature names, I suppose
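One way to narrow down where the non-ascii characters come from would be to look at the raw bytes of the report around the reported position. The sketch below is a hypothetical diagnostic (the name `find_non_ascii` is made up here, and the sample bytes are invented for the demonstration); it lists the first few bytes above 127 along with some surrounding context.

```python
def find_non_ascii(data, context=20, limit=5):
    """Return (offset, surrounding bytes) for the first few non-ASCII bytes.

    Hypothetical diagnostic helper; `data` is the raw bytes of a file.
    Each byte of a multi-byte character is reported separately.
    """
    hits = []
    for offset, byte in enumerate(data):
        if byte > 127:
            start = max(0, offset - context)
            hits.append((offset, data[start:offset + context]))
            if len(hits) >= limit:
                break
    return hits


# Invented sample: one private-use character, which is 0xee 0x80 0x80 in UTF-8.
sample = "<td>gene_A</td><td>\ue000</td>".encode("utf-8")
hits = find_non_ascii(sample)
```

On a real report, the bytes could be read with `open(path, "rb").read()`, and the context around the position from the traceback (232667) should show whether the character sits in a filename, a feature name, or somewhere in the HTML template.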

@ZhangMH2000

Hi @sjfleming, the title of tHCC15_cellbender_output_report.html is tmp.report.nbconvert. I don't see any non-ascii characters in the name of my input file. Other reports have correct names, such as Cellbender:tHCC16_cellbender_output. Thank you!

@sjfleming
Member

Thanks @ZhangMH2000 , good to know
