Hi,
I have four nematode genome samples and I'm running BRAKER with GeneMark epmode to annotate with protein + genome, with RNA + genome, and then combine the two with TSEBRA. Three of the genomes process perfectly fine, but on the fourth I keep running into a problem at the gmes_petap.pl step, no matter the size or evolutionary distance of the protein file that I use (I've tried many, from sister species to all metazoa). Although the protein + genome run fails on this sample, the RNA + genome run completes. When running esmode, the protein annotation completes. I ran these samples close together in time, so there were no software updates or changes to my environment between samples.
Any idea as to why this one particular sample won't run like the others?
Error in braker.log:
RUNNING GENEMARK-EX
Preparing genemark_evidence file hints from manual hints...
Checking whether file /home/data/jfierst/veggers/DF5033_BRAKER_odb10/genemark_hintsfile.gff contains enough hints and sufficient multiplicity information...
WARNING:
The hints file(s) for GeneMark-EX contain less than 1000 introns. (In total, 6 unique introns are contained.)
Genemark-EX might fail due to the low number of hints.
WARNING:
The hints file(s) for GeneMark-EX contain less than 150 introns with multiplicity >= 4! (In total, 6 unique introns are contained. 0 have a multiplicity >= 4.)
Possibly, you are trying to run braker.pl on data that does not provide sufficient multiplicity information. This will e.g. happen if you try to use introns generated from assembled RNA-Seq transcripts; or if you try to run braker.pl in epmode with mappings from proteins without sufficient hits per locus. Or if you use the example data set.
A low number of intron hints with sufficient multiplicity may result in a crash of GeneMark-EX (it should not crash with the example data set).
Running GeneMark-EP
changing into GeneMark-EP directory /home/data/jfierst/veggers/DF5033_BRAKER_odb10/GeneMark-EP
cd /home/data/jfierst/veggers/DF5033_BRAKER_odb10/GeneMark-EP
Running gmes_petap.pl
perl /home/data/jfierst/veggers/gmes_linux_64/gmes_petap.pl --verbose --seq /home/data/jfierst/veggers/DF5033_BRAKER_odb10/genome.fa --EP /home/data/jfierst/veggers/DF5033_BRAKER_odb10/genemark_hintsfile.gff --cores=8 --gc_donor 0.001 --evidence /home/data/jfierst/veggers/DF5033_BRAKER_odb10/genemark_evidence.gff --soft_mask auto 1>/home/data/jfierst/veggers/DF5033_BRAKER_odb10/GeneMark-EP.stdout 2>/home/data/jfierst/veggers/DF5033_BRAKER_odb10/errors/GeneMark-EP.stderr
The GeneMark-EP.stderr file is empty
Sorry for the late reply. Is this still an issue or were you able to find a solution? Judging from these error messages, one problem could be that this genome is too small.
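To see what the warnings in braker.log are reacting to, it can help to count the intron hints yourself. Below is a minimal Python sketch, not BRAKER's own check. It assumes genemark_hintsfile.gff is a 9-column, tab-separated GFF with "intron" in column 3, and that multiplicity is given either as a `mult=N` attribute in column 9 or, failing that, as an integer in the score column (column 6). Both of those are assumptions about the file layout, so please compare against a few lines of your own file first. The function names are made up for this example.

```python
import re
import sys
from collections import Counter

# Assumption: multiplicity may appear as "mult=N" in the attribute column.
MULT_RE = re.compile(r"(?:^|;)\s*mult=(\d+)")


def intron_multiplicities(lines):
    """Map (seqid, start, end, strand) -> summed multiplicity of intron hints."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip malformed lines and non-intron features.
        if len(fields) < 9 or fields[2].lower() != "intron":
            continue
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        score, strand, attrs = fields[5], fields[6], fields[8]
        match = MULT_RE.search(attrs)
        if match:
            mult = int(match.group(1))
        elif score.isdigit():
            # Assumption: an integer score column carries the multiplicity.
            mult = int(score)
        else:
            mult = 1
        # If the same intron is listed more than once, add the values up.
        counts[(seqid, start, end, strand)] += mult
    return counts


def summarize(counts, min_mult=4):
    """Return (unique introns, introns with multiplicity >= min_mult)."""
    supported = sum(1 for m in counts.values() if m >= min_mult)
    return len(counts), supported


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        unique, supported = summarize(intron_multiplicities(handle))
    print(f"unique introns: {unique}")
    print(f"introns with multiplicity >= 4: {supported}")
```

Running it on the hints file of a failing sample and of one of the working samples should show whether the failing genome really ends up with only a handful of intron hints, as the "6 unique introns" in the log suggests.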
I ran it in es mode for protein + genome, paired that output with the RNA + genome run, and ran both through TSEBRA. I don't know if this is necessarily 'correct' or best practice, but I got an output: ~14,000 genes were reported, compared with ~19,000 for the other species. The genome is ~74 Mb. Is this too small?
I was going to try braker3 that was released recently and see if that changed anything but haven't had the time.
Ultimately I have data but I still don't know why it isn't working when given a protein file.
14,000 could be a bit low considering C. elegans (~100 Mbp) has ~20,000 genes in the annotation.
> Ultimately I have data but I still don't know why it isn't working when given a protein file.
Apart from BRAKER3, you can also try a new protein-based pipeline, GALBA (preprint available here). It employs miniprot to align the reference proteins and uses the alignments directly to train AUGUSTUS, so it can be helpful in cases where GeneMark-EP fails for whatever reason. I'd recommend extracting nematode proteins from the new OrthoDB v11 release and supplementing the protein set with additional nematodes from RefSeq to get better protein coverage.