Problems in RUNNING GENEMARK-EX #716

ChuanzhengWei · 2023-12-07T03:02:58Z

Dear Braker team,
I got an error when running BRAKER3 (built with Singularity) for gene prediction. I am not sure whether the problem lies in gmetp.pl. My inputs are a protein file and BAM files aligned with HISAT2.
This is my input command:

singularity exec /public/home/weichuanzheng/software/singularity/braker3/braker3.sif braker.pl --JAVA_PATH=/public/home/weichuanzheng/software/jdk/bin --threads=8 --species=s349 \
    --genome=/public/home/weichuanzheng/project/08.pangenome/03.mask/s349/s349.nextpolish.fasta.masked \
    --prot_seq=/public/home/weichuanzheng/project/11.Sorghum_genome/06.prot/structure_annotation1.fasta \
    --bam==/public/home/weichuanzheng/project/08.pangenome/03.mask/s349/bamfile/SRR23260553.sorted.bam,............

The following is the specific content of the error report:
In 'braker.log':

# Wed Dec  6 10:37:36 2023: sorting RNA-Seq BAM files
# Wed Dec  6 12:42:05 2023: Running gmetp.pl
/usr/bin/perl /opt/ETP/bin/gmetp.pl --cfg /public/home/weichuanzheng/project/08.pangenome/03.mask/s349/annotation/braker/GeneMark-ETP/etp_config.yaml --workdir /public/home/weichuanzheng/project/08.pangenome/03.mask/s349/annotation/braker/GeneMark-ETP --bam /public/home/weichuanzheng/project/08.pangenome/03.mask/s349/annotation/braker/GeneMark-ETP/etp_data/ --cores 8 --softmask  1>/public/home/weichuanzheng/project/08.pangenome/03.mask/s349/annotation/braker/errors/GeneMark-ETP.stdout 2>/public/home/weichuanzheng/project/08.pangenome/03.mask/s349/annotation/braker/errors/GeneMark-ETP.stderr

At the end of 'GeneMark-ETP.stderr':

WARNING: 'ptg000031l_np1212' does not match any sequence in the fasta file. Maybe the two files do not belong together.
error
error, file/folder not found: transcripts_merged.fasta.gff

In 'GeneMark-ETP.stdout':

GeneMarkS: error on last system call, error code 256
Abort program!!!

I would appreciate any suggestions.


KatharinaHoff · 2023-12-07T09:26:34Z

What is inside your protein file? Is it an OrthoDB partition? You have to provide a protein file with a large degree of redundancy in protein space, i.e. the proteins should not come from one species only.


ChuanzhengWei · 2023-12-11T03:20:28Z

My protein file includes sequences from 60 varieties of sorghum, one variety of rice, and one variety of maize. Initially, I faced a problem that did not seem to stem from the protein file itself. After renaming and shortening the headers of the sequences in the genome file, I successfully generated the braker.gff file.

However, I've encountered a new challenge: the generated GFF file does not contain UTRs (Untranslated Regions). I think this issue might be related to the limitations of the container environment, as I am running BRAKER through Singularity due to the lack of root privileges on my system.

Given these constraints, could you please advise on how I might obtain a GFF file that includes UTRs? Any guidance or suggestions you can offer would be greatly appreciated, as this is a critical component of my project.

Thank you in advance for your time and assistance. I look forward to your valuable input.

Best regards
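
The header renaming described in this comment could look roughly like the sketch below. The thread does not say exactly how the headers were changed, so this is an assumption: each header is reduced to its first whitespace-delimited token and unusual characters are replaced by underscores. The function name `shorten_fasta_headers` and the example headers are hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator


def shorten_fasta_headers(lines: Iterable[str], mapping: Dict[str, str]) -> Iterator[str]:
    """Yield FASTA lines with each header cut down to a short, simple name.

    Assumption (not taken from the thread): the short name is the first
    whitespace-delimited token, with characters outside [A-Za-z0-9_.-]
    replaced by underscores. `mapping` records short name -> original header.
    """
    for line in lines:
        if not line.startswith(">"):
            yield line  # sequence lines pass through unchanged
            continue
        original = line[1:].strip()
        if not original:
            raise ValueError("empty FASTA header")
        short = re.sub(r"[^A-Za-z0-9_.-]", "_", original.split()[0])
        if short in mapping:
            raise ValueError(f"shortened header is not unique: {short}")
        mapping[short] = original
        yield f">{short}\n"


# Hypothetical in-memory example; real use would stream lines from the genome file.
records = [
    ">ptg000031l_np1212 hypothetical extra text\n",
    "ACGTACGT\n",
    ">contig|2 more text\n",
    "GGCC\n",
]
name_map: Dict[str, str] = {}
cleaned = list(shorten_fasta_headers(records, name_map))
print("".join(cleaned), end="")
```

If the genome is renamed after the RNA-Seq reads were aligned, the BAM files still refer to the old sequence names, so the alignments need to be produced against (or re-headered to match) the renamed genome.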

ChuanzhengWei · 2023-12-11T08:01:25Z

This is the version I'm using
singularity exec braker3.sif braker.pl --version
braker.pl version 3.0.3

KatharinaHoff · 2023-12-11T11:29:43Z

See #587

ChuanzhengWei · 2023-12-12T03:29:53Z

I could not find GeneMark-ETP/rnaseq/stringtie/transcripts_merged.gff. Does that mean I need to run stringtie again to obtain a new GFF file, and then merge that stringtie.gff with braker.gtf using stringtie2utr.py?


KatharinaHoff · 2023-12-12T13:24:18Z

Yes, you need to run stringtie. The script is not connected to BRAKER, yet.
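
As a rough sketch of the StringTie step mentioned here, the helper below only builds the command lines: one assembly per sorted BAM, then a `stringtie --merge` over the per-sample GTFs. The helper name `stringtie_commands`, the output file names, and the example paths are hypothetical. The thread does not show how stringtie2utr.py is then invoked, so that call is left out; its own help text is the place to check.

```python
from pathlib import Path
from typing import List, Sequence


def stringtie_commands(bams: Sequence[str], outdir: str, threads: int = 8) -> List[List[str]]:
    """Build (but do not run) StringTie commands for a set of sorted BAM files.

    Output names are assumptions made for this sketch, not BRAKER conventions.
    """
    out = Path(outdir)
    commands: List[List[str]] = []
    per_sample: List[str] = []
    for bam in bams:
        # One transcript assembly per BAM file.
        gtf = out / (Path(bam).stem + ".stringtie.gtf")
        commands.append(["stringtie", str(bam), "-p", str(threads), "-o", str(gtf)])
        per_sample.append(str(gtf))
    # Merge the per-sample assemblies into one non-redundant transcript set.
    merged = out / "transcripts_merged.gtf"
    commands.append(["stringtie", "--merge", "-o", str(merged), *per_sample])
    return commands


# Hypothetical file names; print the commands for inspection before running them.
for cmd in stringtie_commands(["sample1.sorted.bam", "sample2.sorted.bam"], "stringtie_out"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths have been verified. The merged file is then the StringTie input for stringtie2utr.py alongside braker.gtf.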

ChuanzhengWei · 2023-12-13T08:57:07Z

Thank you. I successfully obtained a GTF file containing UTRs using stringtie2utr.py, but I've encountered a new issue: multiple records are generated for the 5' UTR or 3' UTR of the same gene, like this:

    Chr01   stringtie2utr   five_prime_UTR  36899   36899   1000    -       .       transcript_id "g4.t2"; gene_id "g4";
    Chr01   stringtie2utr   five_prime_UTR  37358   37440   1000    -       .       transcript_id "g4.t2"; gene_id "g4";
    Chr01   stringtie2utr   five_prime_UTR  41705   41825   1000    -       .       transcript_id "g4.t2"; gene_id "g4";
    Chr01   stringtie2utr   five_prime_UTR  42029   42456   1000    -       .       transcript_id "g4.t2"; gene_id "g4"

I want to know if this situation is normal.

KatharinaHoff · 2023-12-19T17:55:04Z

This is not necessarily wrong. UTRs can be spliced.
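
To see why several five_prime_UTR lines for one transcript are consistent with a spliced UTR, the records can be grouped per transcript and read as the exonic segments of a single UTR. The sketch below is illustrative only: the function name `utr_segments` is hypothetical, and it assumes tab-separated GTF lines with a `transcript_id "..."` attribute, as in the records posted above.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

UTR_FEATURES = {"five_prime_UTR", "three_prime_UTR"}
Key = Tuple[str, str]  # (transcript_id, feature type)


def utr_segments(gtf_lines: Iterable[str]) -> Dict[Key, List[Tuple[int, int]]]:
    """Group UTR records by transcript and feature; return sorted (start, end) segments."""
    grouped: Dict[Key, List[Tuple[int, int]]] = defaultdict(list)
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) < 9 or cols[2] not in UTR_FEATURES:
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match is None:
            continue
        grouped[(match.group(1), cols[2])].append((int(cols[3]), int(cols[4])))
    return {key: sorted(segs) for key, segs in grouped.items()}


# The four records from this thread, rebuilt as tab-separated GTF lines.
attrs = 'transcript_id "g4.t2"; gene_id "g4";'
coords = [(36899, 36899), (37358, 37440), (41705, 41825), (42029, 42456)]
lines = [
    "\t".join(["Chr01", "stringtie2utr", "five_prime_UTR", str(s), str(e), "1000", "-", ".", attrs])
    for s, e in coords
]

for (tx, feature), segs in utr_segments(lines).items():
    total = sum(end - start + 1 for start, end in segs)  # GTF coordinates are 1-based, inclusive
    print(f"{tx} {feature}: {len(segs)} segments, {total} bp in total")
```

Each segment is one exonic piece of the same UTR, and the gaps between consecutive segments correspond to introns located within the UTR.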

KatharinaHoff assigned alexlomsadze Dec 7, 2023

KatharinaHoff added the ETP label Dec 7, 2023

KatharinaHoff self-assigned this Dec 7, 2023

KatharinaHoff closed this as completed Dec 11, 2023

spoonbender76 mentioned this issue Dec 14, 2023

stringtie2utr generates multiple five_prime_UTRs and three_prime_UTRs within a gene #723

Closed

pelotbdr mentioned this issue Oct 17, 2024

GeneMark error on the names in the genome file #877

Open
