wr_hier.py fails with official gene2go and gaf files #142

Closed
matrs opened this issue Nov 6, 2019 · 8 comments

@matrs

matrs commented Nov 6, 2019

Hello,
I'm trying to use the CLI programs from goatools. I tried version 0.9.7 from conda and 0.9.9 from pip. Following the example from here:

wr_hier.py BP MF CC --gene2go=gene2go --taxid=9606 --dash_len=17 --concise -o human_BP_MF_CC.txt
  EXISTS: go-basic.obo
go-basic.obo: fmt(1.2) rel(2019-10-07) 47,285 GO Terms
HMS:0:00:08.996345 323,041 annotations READ: gene2go 
1 taxids stored: 9606
17529 IDs in loaded association branch, BP
INITIALIZING GoSubDag: 47285 sources in 47285 GOs rcnt(True). 2552 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth GO alt GO_name dcnt D1 tcnt tfreq tinfo id
             GoSubDag: relationships: set()
INITIALIZING GoSubDag: 12285 sources in 15653 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth GO alt GO_name dcnt D1 id
             GoSubDag: relationships: set()
Traceback (most recent call last):
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/bin/wr_hier.py", line 16, in <module>
    main()
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/bin/wr_hier.py", line 12, in main
    cli()
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/cli/wr_hierarchy.py", line 56, in cli
    objcli.wrtxt_hier(fout_txt)
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/cli/wr_hierarchy.py", line 125, in wrtxt_hier
    self.prt_hier(prt)
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/cli/wr_hierarchy.py", line 134, in prt_hier
    objwr.prt_hier_down(goid, prt)
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/gosubdag/rpt/write_hierarchy.py", line 37, in prt_hier_down
    obj.prt_hier_rec(goid)
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/rpt/write_hierarchy_base.py", line 49, in prt_hier_rec
    MARK=self.item_marks.get(item_id, self.mark_dflt)))
TypeError: a bytes-like object is required, not 'str'

When I try the GAF file from http://geneontology.org/gene-associations/goa_human.gaf.gz, I get this error:

wr_hier.py BP --gaf=goa_human.gaf --dash_len=17 --concise  -o human_BP_MF_CC.txt      
  EXISTS: go-basic.obo
go-basic.obo: fmt(1.2) rel(2019-10-07) 47,285 GO Terms
Traceback (most recent call last):
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/anno/init/reader_gaf.py", line 85, in _read_gaf_nts
    datobj = GafData(ver, allow_missing_symbol)
  File "/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/anno/init/reader_gaf.py", line 163, in __init__
    self.is_long = ver[0] == '2'
TypeError: 'NoneType' object is not subscriptable

  **FATAL-gaf: 'NoneType' object is not subscriptable

**FATAL-gaf: goa_human.gaf[1]:
UniProtKB       A0A024R0T9      APOC4-APOC2             GO:0006629      GO_REF:0000002  IEA     InterPro:IPR008019      P       Apolipoprotein C-II isoform 1   APOC4-APOC2|APOC2|hCG_20334 protein taxon:9606      20190907        InterPro

Both association files are decompressed, and I tried the commands with fewer arguments, but I always get the same errors.

Any hint?

Regards

@dvklopfenstein
Collaborator

I tried your two example commands using the current version, which is not in pip or conda. Both runs of the script worked successfully.

Then I tried the pip version in both Python 2 and Python 3.6. It succeeds in Python 2, but fails in Python 3 with the errors that you show.

GOATOOLS v0.6.10 was released on October 6, 2019, but the pip version was updated Sept 29, 2016.

@tanghaibao, can we create a new version for the current code and then upload it to pip and conda?

Thank you very much for bringing this to our attention and for taking the time to write us.

@dvklopfenstein
Collaborator

To try a fix quickly, if you have permission to edit, you can modify this file on line 124:

/media/mibu/Storage_SSD/miniconda3/envs/bif/lib/python3.6/site-packages/goatools/cli/wr_hierarchy.py

Change 'wb' to 'w' on line 124:

WAS: with open(fout_txt, 'wb') as prt:
NOW: with open(fout_txt, 'w') as prt:
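
For readers wondering why this one-character change matters, here is a minimal sketch that is independent of GOATOOLS. In Python 3, a file opened with 'wb' only accepts bytes, so writing a str raises the TypeError shown in the traceback above; a file opened with 'w' accepts str. The helper name write_report and the sample line are made up for illustration.

```python
import os
import tempfile


def write_report(path, mode):
    """Write one text line to `path` opened with `mode`.

    Returns the TypeError message if the write fails, else None.
    (Hypothetical helper, not part of GOATOOLS.)
    """
    try:
        with open(path, mode) as prt:
            prt.write("GO:0008150 biological_process\n")  # a str, not bytes
    except TypeError as err:
        return str(err)
    return None


with tempfile.TemporaryDirectory() as tmpdir:
    fout_txt = os.path.join(tmpdir, "hier.txt")
    err_binary = write_report(fout_txt, "wb")  # binary mode rejects str in Python 3
    err_text = write_report(fout_txt, "w")     # text mode accepts str
    print("wb:", err_binary)
    print("w: ", err_text)
```

In Python 2, str is a byte string, which would explain why the same code succeeds there.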

@dvklopfenstein
Collaborator

dvklopfenstein commented Nov 7, 2019

Or, if you cannot edit the file, try redirecting the output into a file:

wr_hier.py BP MF CC --gene2go=gene2go --taxid=9606 --dash_len=17 --concise > human_BP_MF_CC.txt

@matrs
Author

matrs commented Nov 7, 2019

Thanks @dvklopfenstein, I edited the file and now it's working for gene2go, but with the GAF file I still get the same error (it isn't a problem for me, just reporting).
Modernize and Futurize can help with this kind of problem between Python 2 and 3. You've probably heard of them (or are already using them), but just in case:
https://docs.python.org/3/howto/pyporting.html

Thanks again for your help.

@dvklopfenstein
Collaborator

Great. Thanks. Using the latest files, I have not yet replicated the error you see with the GAF file:

**FATAL-gaf: 'NoneType' object is not subscriptable

Does your GAF file have a line that looks like this?

!gaf-version: 2.1
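
A quick way to check for that line is sketched below. This is not the GOATOOLS reader; find_gaf_version is a hypothetical helper, and it assumes the version header appears among the leading '!' comment lines, before the first annotation line.

```python
def find_gaf_version(lines):
    """Return the version from a '!gaf-version:' header, or None if absent.

    Hypothetical helper for illustration; assumes header comments
    (lines starting with '!') come before the first annotation line.
    """
    for line in lines:
        if not line.startswith("!"):
            break  # first annotation line reached: the header is over
        if line.startswith("!gaf-version:"):
            return line.split(":", 1)[1].strip()
    return None


# Sample data: one file with the header, one with comments stripped.
with_header = ["!gaf-version: 2.1\n", "UniProtKB\tA0A024R0T9\tAPOC4-APOC2\n"]
without_header = ["UniProtKB\tA0A024R0T9\tAPOC4-APOC2\n"]
print(find_gaf_version(with_header))
print(find_gaf_version(without_header))
```

To check a real file, pass an open file handle instead of a list, for example `with open("goa_human.gaf") as ifstrm: print(find_gaf_version(ifstrm))`.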

@matrs
Author

matrs commented Nov 7, 2019

My mistake about the GAF file: I had run egrep -v '^!' goa_human.gaf > tmp && mv tmp goa_human.gaf, so my GAF file didn't have any comment lines, including the version line. Now it works.

Thanks!
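
For anyone who wants to strip comments from a GAF file without hitting this error on the affected releases, one option is to drop the '!' lines but keep the version header. The sketch below is only an illustration; strip_gaf_comments is a hypothetical helper, not a GOATOOLS function.

```python
def strip_gaf_comments(lines):
    """Yield lines with '!' comments removed, keeping the '!gaf-version:' header.

    Hypothetical helper for illustration only.
    """
    for line in lines:
        if line.startswith("!") and not line.startswith("!gaf-version:"):
            continue  # drop ordinary comment lines
        yield line


sample = [
    "!gaf-version: 2.1\n",
    "!some other header comment\n",
    "UniProtKB\tA0A024R0T9\tAPOC4-APOC2\n",
]
cleaned = list(strip_gaf_comments(sample))
print("".join(cleaned), end="")
```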

@dvklopfenstein
Collaborator

Terrific. Thank you so much for your interest in GOATOOLS and for taking the time to open this issue.

@dvklopfenstein
Collaborator

@matrs
I too have modified gaf files (for tests) and have run into the same error. So to make reading GAF files an easier user experience, I modified the gaf reader code to set the default version of the gaf file being read to 2.1 if no version is found.

So now we don't need the version string in the gaf file.

Setting the default version to 2.1 is a safe move because it is unlikely that a user will read a GAF file with no version line that is actually in the (old) 1.0 format. Also, GPAD is the new annotation file format that is preferred over GAF, so the current GAF format will likely stay at 2.1 for a while.
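
The fallback described above can be sketched roughly as follows. This is not the actual GOATOOLS change; the names DEFAULT_GAF_VERSION, resolve_gaf_version, and is_long_format are hypothetical, and the last one only mirrors the `ver[0] == '2'` check visible in the traceback earlier in this thread.

```python
DEFAULT_GAF_VERSION = "2.1"  # assumed default, as described in the comment above


def resolve_gaf_version(ver):
    """Return `ver`, or the default when no version line was found (ver is None)."""
    return DEFAULT_GAF_VERSION if ver is None else ver


def is_long_format(ver):
    """Mirror the `ver[0] == '2'` check from the traceback, without failing on None."""
    return resolve_gaf_version(ver)[0] == "2"


print(is_long_format(None))   # no version line: falls back to the default
print(is_long_format("1.0"))  # explicit old version is still honored
```

With a fallback like this, indexing `ver[0]` no longer fails with "'NoneType' object is not subscriptable" when the header is missing.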

I am closing this. Please open a new issue if you need anything further. Thank you so much for your interest in GOATOOLS and for the feedback in this issue.
