Crashing with TIR-Learner #4

Neato-Nick · 2019-06-21T23:01:20Z

Hi,

I copied and pasted the installation instructions from the README and am running the script in the active EDTA environment. It seems that the EDTA.pl script chokes when it tries to use TIR-Learner. Looking at my output, all the correct folders and such are there. After crashing, the Helitron, MITE, and TIR folders are empty but the LTR folder is not. The only file in the parent output folder is genome.fasta.LTR.raw.fa.

Is there a way to run the Perl pipeline script but just not use TIR-Learner, or even just not call TIRs? I'm still interested in the other features, and even if I could just use EDTA for Helitrons, LTRs, MITEs, filtering, consensus calling, and repeat classifying I would be happy.

The lines before the crash start with what's seen in #2 (comment). Then there is a traceback starting from ~/bin/EDTA/bin/TIR-Learner1.12/Module1/Fullcov.py, line 52, in <module> ProcessHomology(genome_Name). After that, there are some cryptic errors, including
cat: '*DTA-+-select.fa': No such file or directory
cat: '*-+-*-+-*.gff3': No such file or directory
There are a few more tracebacks after that, each followed by various file-not-found errors from rm, cp, mv, and cat.

TIR-Learner1.12/Module1 (above)
TIR-Learner1.12/Module1/Lowcomp_M1.py
TIR-Learner1.12/Module2/Lowcomp_M2.py
TIR-Learner1.12/

Lastly, in the last few lines before the crash, I get these lines which tell me that it certainly is a problem with TIR-Learner
FileNotFoundError: [Errno 2] No such file or directory: 'TIR-Learner_FinalAnn.gff3'
mv: cannot stat 'TIR-Learner/*FinalAnn.gff3': No such file or directory
mv: cannot stat 'TIR-Learner/*FinalAnn.fa': No such file or directory
cp: cannot stat 'TIR-Learner-Result/TIR-Learner_FinalAnn.fa': No such file or directory
Error: TIR results not found!

ERROR: Raw TIR results not found in genome.fasta.EDTA.raw/genome.fasta.TIR.raw.fa at ~bin/EDTA/EDTA.pl line 145.

While bug testing I've just been using the first two scaffolds of my genome. That file is attached.

Thanks!

PR-102_JGI_twoscafs.fasta.zip


oushujun · 2019-06-22T04:07:41Z

Hi Nick,

We identified this bug in TIR-Learner as you described in detail. A testing version has been pushed in the EDTA branch named "TIR-Learner1.13". Please try that out under the same active EDTA environment (no need to reinstall). In particular, if you want to just test out TIR-Learner, you can:

nohup sh .....EDTA/bin/TIR-Learner1.13/TIR-Learner.sh genome.fa $CPU

For your other question, yes. To do so, you can run the initial TE finders separately, then feed their results to the EDTA_process.pl pipeline to make the stage 1 library. If you don't have TIR-Learner results, you can pass the MITE-Hunter result to the -tir parameter to trick the program.

Please let me know if you still encounter the same issue. Sorry for the inconvenience.

Best,
Shujun

Neato-Nick · 2019-06-24T16:48:31Z

I pulled from the TIR-Learner 1.13 branch and just ran TIR-Learner as you suggested. Looks like it's still crashing. Module1 did contain some results but there was still a temp file in there, so I'm not sure it finished running. The nohup.out is attached

CPU=4
nohup sh ~/bin/EDTA-TIR-Learner1.13/bin/TIR-Learner1.13/TIR-Learner.sh genome.fasta $CPU

nohup.TIR-Learner1.13.txt

oushujun · 2019-06-25T03:17:02Z

Hi Nick,

Thanks for testing, @weijiaweijia is working on this. I will update you once we have a new version.

Best,
Shujun

oushujun · 2019-06-25T04:47:54Z

Hi Nick,

If you need results urgently, you can run the updated TIR-Learner1.13 branch on your genome. I have temporarily removed the TIR-Learner module from EDTA, so you should be able to run the rest of the pipeline. Note that without TIR-Learner, large TIR elements and autonomous TIR elements will likely be underrepresented in the final library. However, MITE-Hunter should be able to pick up most short TIR elements and MITEs.

Best,
Shujun

DanJeffries · 2019-06-28T19:01:59Z

Hi Shujun,

I have a crash that seems similar to Nick's above. Here is the log file:

Wed Jun 26 13:40:03 CEST 2019   Dependency checking:
                All passed!
Wed Jun 26 13:40:14 CEST 2019   Obtain raw TE libraries using various structure-based programs:
FASTA-Reader: Ignoring invalid residues at position(s): On line 380: 5993-6353
FASTA-Reader: Ignoring invalid residues at position(s): On line 534: 225-229
FASTA-Reader: Ignoring invalid residues at position(s): On line 250: 1224-1597, 1622-1750
FASTA-Reader: Ignoring invalid residues at position(s): On line 252: 520-746
FASTA-Reader: Ignoring invalid residues at position(s): On line 386: 238-242, 1481-1485, 2106-2110, 3127-3131
FASTA-Reader: Ignoring invalid residues at position(s): On line 254: 1936-2299
FASTA-Reader: Ignoring invalid residues at position(s): On line 566: 11-15
FASTA-Reader: Ignoring invalid residues at position(s): On line 472: 268-272
Traceback (most recent call last):
  File "/stn4/djeffrie/EDTA/bin/TIR-Learner1.12/Module1/Fullcov.py", line 52, in <module>
    ProcessHomology(genome_Name)
  File "/stn4/djeffrie/EDTA/bin/TIR-Learner1.12/Module1/Fullcov.py", line 41, in ProcessHomology
    f = pd.read_csv(blast, header=None, sep="\t")
  File "/scratch/temporary/djeffrie/EDTAcondaenv/lib/python3.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/scratch/temporary/djeffrie/EDTAcondaenv/lib/python3.7/site-packages/pandas/io/parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/scratch/temporary/djeffrie/EDTAcondaenv/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/scratch/temporary/djeffrie/EDTAcondaenv/lib/python3.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/scratch/temporary/djeffrie/EDTAcondaenv/lib/python3.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 545, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
cat: *DTA-+-select.fa: No such file or directory
cat: *DTC-+-select.fa: No such file or directory

The issue again seems to be in TIR-learner. Perhaps it is still a TIR learner bug, but one thing that might be worth noting is that I had an issue during installation where scikit-learn=0.19.0 would not install because of some conflict with multiprocesses. I got around this problem by installing them in a different order but I later realised that by default on my cluster, python 3.7 gets installed in the environment and then a version of multiprocess that is only compatible with python 3.7 is installed. I think then the issue with scikit-learn=0.19.0 was because it only works with python 3.6.

So do you think my issue above could be an installation issue, or a bug in TIR-learner?

One more thing, I am dealing with a large genome of about 5 Gb. The LTR programs completed fine, but took about 3 days. So I was wondering if it is possible to re-use these outputs, rather than waiting another 3 days to see if the pipeline will pass the next step?

Thanks a lot in advance for your help

Best

Dan
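The traceback in the log above ends in `pandas.errors.EmptyDataError`, which `pd.read_csv` raises when the file it is handed contains no data. A minimal sketch of a guard for that case follows. It assumes the input is a tab-separated BLAST table that may be empty or absent; `read_blast_table` is a hypothetical helper, not part of TIR-Learner, and the standard-library `csv` module is swapped in for pandas so the sketch has no dependencies.

```python
import csv
import os


def read_blast_table(path):
    """Return the rows of a tab-separated file, or [] if it is missing or empty."""
    # A zero-byte file is the case that makes pd.read_csv raise EmptyDataError;
    # a missing file would raise FileNotFoundError instead. Both are treated
    # here as "no hits" rather than as a crash.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return []
    with open(path, newline="") as handle:
        # Skip blank lines so callers only see real records.
        return [row for row in csv.reader(handle, delimiter="\t") if row]
```

A caller can then branch on an empty list and skip the downstream `cat`/`mv` steps instead of failing on files that were never written.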

oushujun · 2019-06-30T01:13:57Z

Hi Dan,

Thanks for testing. Yes, this is an issue with TIR-Learner. We are working on a better version, so please wait a week or two.

For the conflicts between python, scikit-learn, and multiprocess, you may try different versions of python and multiprocess, but the trained models do require scikit-learn=0.19.0 to work properly.

The last suggestion is actually on my to-do list. Good idea!

Again, I am sorry for the bugs keeping you from getting meaningful results. We hope to resolve this issue in the near future.

Best,
Shujun

philippbayer · 2019-06-30T04:45:19Z

In my case with TIR-Learner, my CentOS installation did not have the realpath executable, which TIR-Learner calls on line 30 of TIR-Learner.sh.

I fixed it like this:

#genomeFile=`realpath $rawFile` #the genome file with real path
genomeFile=`readlink -e $rawFile`

oushujun · 2019-07-01T05:04:26Z

Thanks @philippbayer!

I did some research and found this multi-platform solution:

resolve_link() {
  if type -p realpath >/dev/null; then
    realpath "$1"
  elif type -p greadlink >/dev/null; then
    greadlink -f "$1"
  else
    readlink -f "$1"
  fi
}

Ref: basherpm/basher#49 (comment)

Changes will be reflected in the next version.

Best,
Shujun
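Where the path is resolved from Python rather than from the shell, the standard library covers the same need without an external `realpath` or `readlink` binary. A minimal sketch, assuming a POSIX system; `resolve_genome_path` is a hypothetical helper and not something in TIR-Learner:

```python
import os


def resolve_genome_path(raw_file):
    """Return the absolute, symlink-free path of raw_file."""
    # expanduser handles a leading "~"; realpath then follows symlinks
    # and normalises the result to an absolute path.
    return os.path.realpath(os.path.expanduser(raw_file))
```

Unlike `readlink -e`, `os.path.realpath` does not require the file to exist, so an existence check would still be needed separately.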

Neato-Nick · 2019-07-09T20:20:40Z

@oushujun would it be possible to push these changes to a development branch ahead of the release for your next version? Or do you think you are only 1-2 weeks away from your next release?

If my impatience is overwhelming, it might just be easier for me to fix as you and @philippbayer have suggested.

@philippbayer after you made those changes to TIR-Learner, did EDTA run properly?

philippbayer · 2019-07-10T08:14:02Z

@Neato-Nick The main branch didn't like my GLIBC, so I switched to the origin/TIR-Learner1.13 branch for testing. That one is happily chugging along, but it hasn't finished so far (14 threads, plant genome, still in the 'raw' stage). grf-main and blastall are happily consuming resources, but nothing has been written in a while. The TIR, LTR, and MITE directories have data inside them.

oushujun · 2019-07-10T16:02:44Z

Hi @Neato-Nick and @philippbayer,

Thank you for waiting patiently, and I am sorry for the prolonged time of development. I went to the Evolution meeting 2 weeks ago so there was some delay there.

I am working on a new version of EDTA that will have much better performance in both speed and quality. The main improvement is in TIR-Learner: @weijiaweijia and I are working together on an improved, more generalized prediction model that fits most species. The other is in the downstream filtering of TIR elements and Helitrons: I am working on more thorough filtering of raw predictions, which will make the final library much smaller and better.

I should be able to push these updates in 1-2 weeks if things go well. Our HPC has been down for maintenance for 3 days, so I can do nothing but talk ...

Again, thank you for your interest and testing.

Best,
Shujun

philippbayer · 2019-07-11T02:58:22Z

No worries @oushujun :) I'm just playing around with this software, the outcome doesn't depend on anything. Take all the time you need!!

oushujun · 2019-08-01T21:38:01Z

Dear All,

Sorry for the delayed response. I just pushed a bulk update to EDTA and have tested it on different servers. It seems to work now, but I have not tested it on macOS, so some tiny differences could cause problems.

For testing purposes, please use a small file, e.g., 20 Mb, for faster turnaround. Please let me know if there are any issues.

Best,
Shujun

philippbayer · 2019-08-01T23:46:02Z

Thank you for the update! I'll give it a try this weekend :)

baozg · 2019-08-03T05:08:01Z

Hi, Shujun

I tried the new EDTA release. TIR-Learner still fails, while LTR, MITE, and Helitron are fine. Could it be that my genome (336M eudicot plant) has a low percentage of TIRs?

TIR command is below

perl /data/software/EDTA/20190802/EDTA_raw.pl -genome genome.fa -species others -type tir -threads 24

Here is the error log

cat: *-+-DTA.fa: No such file or directory
cat: *-+-DTC.fa: No such file or directory
cat: *-+-DTH.fa: No such file or directory
cat: *-+-DTM.fa: No such file or directory
cat: *-+-DTT.fa: No such file or directory
cat: *-+-NonTIR.fa: No such file or directory
cat: *-+-*-+-*.gff3: No such file or directory
rm: cannot remove ‘*-+-*-+-*.gff3’: No such file or directory
Traceback (most recent call last):
  File "/data/software/EDTA/20190802/bin/TIR-Learner1.19/Module3_New/CombineAll.py", line 90, in <module>
    keep=removeIRFhomo("%s.gff3"%(genome_Name+spliter+dataset),remove,"%sClean.gff3"%(genome_Name+spliter+dataset+spliter))
  File "/data/software/EDTA/20190802/bin/TIR-Learner1.19/Module3_New/CombineAll.py", line 76, in removeIRFhomo
    f=pd.read_csv(file,header=None,sep="\t")
  File "/data/software/Anaconda3/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/data/software/Anaconda3/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/data/software/Anaconda3/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/data/software/Anaconda3/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/data/software/Anaconda3/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 545, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
Traceback (most recent call last):
  File "/data/software/EDTA/20190802/bin/TIR-Learner1.19/Module3/GetAllSeq.py", line 62, in <module>
    file=open(f,"r+")
FileNotFoundError: [Errno 2] No such file or directory: 'TIR-Learner_FinalAnn.gff3'

WeijiaSu · 2019-08-03T05:37:48Z

Hi Zhigui,

Thanks for testing the program. It looks like your gff file is empty. Either your genome doesn't have intact TIRs (less likely) or the previous steps had problems. For the second case, could you check whether you have a file ending with predi.fa-+-200, and what the size of that file is?

Thanks,
Weijia


baozg · 2019-08-03T05:50:43Z

Hi Weijia,

Thanks for the reply. I didn't find predi.fa-+-200. Where should the result be?

I find that only the Module3_New directory has results; the other directories are empty.

.
├── temp
│   ├── TIR-Learner-+-Chr10.fasta
│   ├── TIR-Learner-+-Chr10-+-GRFmite.fa
│   ├── TIR-Learner-+-Chr10-+-GRFmite.fa-+-p
├── TIR-Learner
│   ├── TIR-Learner-+-Chr10-+-GRFmite.fa-+-p
│   ├── TIR-Learner-+-Chr1-+-GRFmite.fa-+-p
│   ├── TIR-Learner-+-Chr2-+-GRFmite.fa-+-p
│   ├── TIR-Learner-+-Chr3-+-GRFmite.fa-+-p

WeijiaSu · 2019-08-03T05:59:44Z

It should be in a subfolder of Module3_New named after your genome: Module3_New/YourGenomeName. If you don't have this file, please keep this issue open and we will check the process.

Thanks,
Weijia
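The check requested above (does a file ending in `predi.fa-+-200` exist, and how large is it) can be sketched as a short recursive search. The suffix and the `Module3_New` folder name come from this thread; `find_prediction_files` is a hypothetical helper, not part of TIR-Learner.

```python
from pathlib import Path


def find_prediction_files(root, suffix="predi.fa-+-200"):
    """Return sorted (path, size_in_bytes) pairs for files under root ending in suffix."""
    hits = []
    for path in Path(root).rglob("*"):
        # Match on the file name only, so the folder named after the
        # genome does not need to be known in advance.
        if path.is_file() and path.name.endswith(suffix):
            hits.append((str(path), path.stat().st_size))
    return sorted(hits)
```

Calling `find_prediction_files("Module3_New")` from the TIR-Learner work folder returns an empty list when the file was never produced, and a size of 0 when it was created but left empty.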


baozg · 2019-08-03T06:14:32Z

Hi Weijia,

Thanks for checking.
Module3_New only has temp, TIR-Learner, and TIR-Learner-Result.

WeijiaSu · 2019-08-03T06:16:09Z

Oh, ok...can you show me the file/size in TIR-Learner?


baozg · 2019-08-03T06:51:24Z

Hi Weijia,

The total size is 1.1G.

baozg · 2019-08-03T09:35:22Z

Hi Weijia,

I found the reason I was missing the predi.fa-+-200 result. The new EDTA release updated the installation requirements: the script getDataset.py needs tensorflow and keras. I will set up a brand-new environment for EDTA.
Thanks for the update. I will update this issue if I have any new results.

Cheers,
Zhigui
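A missing dependency like this can be caught before a long run by asking the active environment whether the modules can be found at all. A minimal sketch using only the standard library; `missing_modules` is a hypothetical helper, and the module names passed to it are the two mentioned in the comment above:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed,
    # without actually importing the ones that are.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_modules(["tensorflow", "keras"])` inside the activated EDTA environment returns an empty list when both are installed. It only checks that the modules are present, not that their versions are compatible.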

baozg · 2019-08-04T13:12:16Z

Hi all,

Just an update on my testing. It seems that with the new TIR-Learner release this issue can be closed.

Please install a new env for the EDTA 20190802 release.
Follow the steps Shujun provided:

EDTA_raw
EDTA_processF
EDTA -step final

Run time and resource usage for my genome (336M plant genome, 58% repeats as estimated by GenomeScope, 24-core machine):

Step	maxvmem	time(h)	raw_fa size
Helitron	7.914GB	2.352222	1.3Mb
MITE	1.529GB	1.815278	4.9kb
TIR	42.127GB	4.895556	20Mb
LTR	19.049GB	1.417222	2.5Mb
EDTA_Final	19.388GB	19.42389	19Mb

Thanks for developing this.

Best,
Zhigui

oushujun · 2019-08-16T21:47:53Z

I just pushed some new updates to EDTA, mainly to fix the TIR-Learner issue. Please reinstall EDTA and rerun it in the same work folder. Existing results will be reused so there is essentially no waste of time. Thank you for your patience and support!

oushujun · 2019-08-26T06:06:16Z

I consider this issue resolved. Please reopen it if it isn't. Thank you all for testing. Shujun

Neato-Nick · 2019-08-30T17:23:21Z

I can't find TIR-Learner on GitHub, but I'd rather be opening issues there. The errors I'm running into occur almost exclusively while running TIR-Learner, and EDTA itself is doing just fine. @oushujun Do you know if this is the right repo I should be posting to? https://github.com/weijiaweijia/TIR-Learner-Rice

oushujun · 2019-08-30T18:19:21Z

Hi Nick,

Yes, that was the original repo, and the version there is around v1.09. The current TIR-Learner is at v1.23 and we have improved the program substantially, so you may just use the version bundled with EDTA. Please update EDTA and try again. We push updates quite frequently at this point as these programs improve. You may open issues at the EDTA repo if you encounter any more problems.

Thanks!
Shujun


aaronphillips7493 · 2021-01-16T01:54:46Z

Hello, I installed EDTA following the instructions for a conda install. I have run EDTA using the following commands:
perl ../EDTA.pl --genome $GENOME --cds $CDS --curatedlib $CURATEDLIB --overwrite 0 --sensitive 1 --anno 1 --species Rice --evaluate 1 --threads 10

It works for LTR, but it crashes at the TIR step. Please see the error below:
Species: Rice
Traceback (most recent call last):
  File "/hpcfs/users/a1779884/rice_genomics/EDTA/bin/TIR-Learner2.5/Module1/Fullcov.py", line 58, in <module>
    ProcessHomology(genome_Name)
  File "/hpcfs/users/a1779884/rice_genomics/EDTA/bin/TIR-Learner2.5/Module1/Fullcov.py", line 47, in ProcessHomology
    f = pd.read_csv(blast, header=None, sep="\t")
  File "/hpcfs/users/a1779884/.conda/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/hpcfs/users/a1779884/.conda/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/hpcfs/users/a1779884/.conda/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 946, in __init__
    self._make_engine(self.engine)
  File "/hpcfs/users/a1779884/.conda/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 1178, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/hpcfs/users/a1779884/.conda/envs/EDTA/lib/python3.6/site-packages/pandas/io/parsers.py", line 2008, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
cat: *DTC-+-select.fa: No such file or directory
cat: *DTH-+-select.fa: No such file or directory
cat: *DTM-+-select.fa: No such file or directory
cat: *DTT-+-select.fa: No such file or directory

I have read this thread, but have not found anything to help me (I am a novice, so maybe that is why). Can someone please help me understand what is going on here, and help me figure out how to fix it?

Thank you,
Aaron :)

oushujun · 2021-01-16T02:09:26Z

Hi Aaron,

Can you provide the conda command you used to install EDTA? Also, the output of the following command, executed under the EDTA env, will be helpful:

conda list > edta.env.list

Best,
Shujun


aaronphillips7493 · 2021-01-16T02:37:26Z

Hi Shujun,

Thank you for your hasty reply!

To install I did the following on Thursday 14th Jan 2021:

conda create -n EDTA

conda activate EDTA

conda config --env --add channels anaconda --add channels conda-forge --add channels bioconda

conda install -n EDTA -y cd-hit repeatmodeler muscle mdust blast openjdk perl perl-text-soundex multiprocess regex tensorflow=1.14.0 keras=2.2.4 scikit-learn=0.19.0 biopython pandas glob2 python=3.6 tesorter genericrepeatfinder genometools-genometools ltr_retriever ltr_finder numpy=1.16.4

git clone https://github.com/oushujun/EDTA

And then I ran the test, which worked.

Please find the list of packages in the EDTA env attached here.
edta.env.list.txt

I have just refreshed the EDTA page and the instructions for installation appear to be different now. Were they recently updated, and perhaps that is why I am having issues?

Thank you again,
Aaron :)

oushujun · 2021-01-16T03:07:30Z

Hi Aaron,

Thanks for the details. There may be a conflict between keras and numpy. You may use the latest installation method (the yml file), which is frozen from a successful installation. You may need to modify the first line of that file, changing "EDTA" to something else (e.g., EDTA1.9.5), to avoid conflicts with your current env.

Best,
Shujun

aaronphillips7493 · 2021-01-16T04:00:38Z

Hey Shujun,

I reinstalled EDTA using the .yml file. I re-ran my analyses with the overwrite option switched off (to avoid redoing the LTR finding) and I got the same errors again. I am now trying to rerun EDTA with the overwrite option switched on, so will let you know how that goes.

Thanks again for your suggestions,
Aaron :)

oushujun · 2021-01-16T04:29:30Z

Hi Aaron,

That is one of my thoughts too: you may have run EDTA multiple times in the same folder, and some erroneous runs left files in a bad state that prevents new runs from proceeding. Overwriting the existing files is a good choice. If you want to keep the LTR results, you can run EDTA_raw with --type TIR --overwrite 1 to overwrite just the TIR results.

Best,
Shujun
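One way to look for leftovers from failed runs before deciding what to overwrite is to list zero-byte files in the work folder. This is only a sketch under an assumption: the thread does not say what the bad files look like, and an empty file is just one plausible symptom (the `EmptyDataError` above is raised on an empty input). `empty_files` is a hypothetical helper, not part of EDTA.

```python
import os


def empty_files(work_dir):
    """Return the sorted paths of zero-byte regular files under work_dir."""
    found = []
    for folder, _subdirs, files in os.walk(work_dir):
        for name in files:
            path = os.path.join(folder, name)
            # Only regular files with no content are reported.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                found.append(path)
    return sorted(found)
```

An empty result does not prove the folder is clean, so starting in a fresh folder remains the more reliable test.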

aaronphillips7493 · 2021-01-17T03:21:48Z

Hey Shujun,

LTR step worked, but TIR failed again with the same errors.

I noticed that when I try to do just TIR with --type TIR --overwrite 1 I instantly get the error:
Failed to parse command line

Do you have any other suggestions?

Thank you again,
Aaron :)

oushujun · 2021-01-19T02:33:34Z

Hi Aaron,

Did you use overwrite 1 on existing folders or start fresh? You may try the latter.

Shujun



oushujun added the bug label Jun 22, 2019

oushujun mentioned this issue Jun 22, 2019

Can't find label error #3

Closed

oushujun mentioned this issue Aug 21, 2019

Happy EDTA users with successful cases #15

Closed

oushujun closed this as completed Aug 26, 2019

jguhlin added a commit that referenced this issue Oct 30, 2024

Merge pull request #4 from jguhlin/nextflow_reboot_jg

d8d9ea2

Add main.nf into the new branch
