Error generating Kallisto index #257
Comments
Hi @Upendra19993, that's very unfortunate. It looks like your run stopped because the Kallisto index needed for quantification is missing. That is the root problem, and it is what makes everything downstream fail. Do you have Kallisto installed? Can you try building the Kallisto index from the FASTA file manually?
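One way to try the indexing by hand is to run `kallisto index` outside SQANTI3 and confirm that the index file appears. The snippet below is only a sketch: it assumes `kallisto` is on your `PATH`, and the function names `build_index_command` and `index_manually` are made up for illustration (they are not part of SQANTI3).

```python
import shutil
import subprocess
from pathlib import Path


def build_index_command(fasta, index_path):
    """Return the `kallisto index` command line as a list of arguments."""
    return ["kallisto", "index", "-i", str(index_path), str(fasta)]


def index_manually(fasta, index_path):
    """Run `kallisto index` and report whether the index file was written."""
    if shutil.which("kallisto") is None:
        raise RuntimeError("kallisto was not found on PATH")
    index_path = Path(index_path)
    # Make sure the output directory exists before kallisto writes to it.
    index_path.parent.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        build_index_command(fasta, index_path),
        capture_output=True,
        text=True,
    )
    # Show whatever kallisto printed, so a failure is visible.
    print(result.stdout)
    print(result.stderr)
    return result.returncode == 0 and index_path.is_file()
```

With the paths from the log in this issue, the call would be `index_manually("SQANTI3_output/UHR_chr22_corrected.fasta", "SQANTI3_output/kallisto_output/kallisto_corrected_fasta.idx")`. If it returns `False`, the printed output should hint at why the index was not created.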
I think there is a problem with the latest version of kallisto...
Hi Carolinamonzó and eprdz, I had Kallisto installed but still hit this issue. I then tried a different version (kallisto 0.48.0), and it worked: I got the results without any error. Thank you both!
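For anyone landing here later, a quick way to see which kallisto version an environment provides is to parse the output of `kallisto version`. This is a rough sketch: the helper names are illustrative, and `KNOWN_GOOD` only records the version reported as working in this thread, not an official requirement.

```python
import re

# Version reported as working in this thread; an assumption, not an official pin.
KNOWN_GOOD = (0, 48, 0)


def parse_kallisto_version(text):
    """Extract (major, minor, patch) from text such as 'kallisto, version 0.48.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_known_good(text):
    """True when the parsed version equals the one reported to work here."""
    return parse_kallisto_version(text) == KNOWN_GOOD
```

The input would typically come from something like `subprocess.run(["kallisto", "version"], capture_output=True, text=True).stdout`. In a conda environment, pinning the version would usually look like `conda install -c bioconda kallisto=0.48.0`, assuming that build is available for your platform.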
Hi,
I want to run SQANTI3 on my own dataset. To get familiar with the tool, I first tried it with the example dataset you provide. I ran the SQANTI3 quality control step and got an error. The full output is below.
The command I used is: sqanti3_qc.py UHR_chr22.gtf gencode.v38.basic_chr22.gtf GRCh38.p13_chr22.fasta -o UHR_chr22 -d SQANTI3_output --short_reads UHR_chr22_short_reads.fofn --cpus 4 --report both
The job's progress and error messages are below.
(base) [uqwwijes@bun025 SQANTI3_reinstallation_2]$ sqanti3_qc.py UHR_chr22.gtf gencode.v38.basic_chr22.gtf GRCh38.p13_chr22.fasta -o UHR_chr22 -d SQANTI3_output --short_reads UHR_chr22_short_reads.fofn --cpus 4 --report both
Rscript (R) version 4.3.1 (2023-06-16)
ERROR: genome fasta /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_reinstallation_2/GRCh38.p13_chr22.fasta doesn't exist. Abort!
(base) [uqwwijes@bun025 SQANTI3_reinstallation_2]$ cd ..
(base) [uqwwijes@bun025 Exampla_data]$ sqanti3_qc.py UHR_chr22.gtf gencode.v38.basic_chr22.gtf GRCh38.p13_chr22.fasta -o UHR_chr22 -d SQANTI3_output --short_reads UHR_chr22_short_reads.fofn --cpus 4 --report both
Rscript (R) version 4.3.1 (2023-06-16)
Write arguments to /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/UHR_chr22.params.txt...
**** Running SQANTI3...
**** Parsing provided files....
Reading genome fasta /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/GRCh38.p13_chr22.fasta....
Skipping aligning of sequences because GTF file was provided.
Indels will be not calculated since you ran SQANTI3 without alignment step (SQANTI3 with gtf format as transcriptome input).
**** Predicting ORF sequences...
**** Parsing Reference Transcriptome....
**** Parsing Isoforms....
**** Running STAR for calculating Short-Read Coverage.
START running STAR...
Running indexing...
/sw/local/rocky8/noarch/qcif/software/miniconda3/envs/sqanti3_5.2/bin/STAR-avx2 --runThreadN 4 --runMode genomeGenerate --genomeDir /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_index/ --genomeFastaFiles /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/GRCh38.p13_chr22.fasta --outTmpDir /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_index//_STARtmp/
STAR version: 2.7.11b compiled: 2024-01-29T15:15:38+0000 :/opt/conda/conda-bld/star_1706541070242/work/source
Feb 27 12:20:52 ..... started STAR run
Feb 27 12:20:52 ... starting to generate Genome files
!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=50818468, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 11
Feb 27 12:20:53 ... starting to sort Suffix Array. This may take a long time...
Feb 27 12:20:53 ... sorting Suffix Array chunks and saving them to disk...
Feb 27 12:21:02 ... loading chunks from disk, packing SA...
Feb 27 12:21:02 ... finished generating suffix array
Feb 27 12:21:02 ... generating Suffix Array index
Feb 27 12:21:11 ... completed Suffix Array index
Feb 27 12:21:11 ... writing Genome to disk ...
Feb 27 12:21:11 ... writing Suffix Array to disk ...
Feb 27 12:21:11 ... writing SAindex to disk
Feb 27 12:21:11 ..... finished successfully
Indexing done.
Mapping for UHR_Rep1_chr22.R1 : in progress...
Mapping for UHR_Rep1_chr22.R1 : done.
/sw/local/rocky8/noarch/qcif/software/miniconda3/envs/sqanti3_5.2/bin/STAR-avx2 --runThreadN 4 --genomeDir /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_index/ --readFilesIn /scratch/project/qaafi-cnafs/upendra/Sqanti3/Example/UHR_Rep1_chr22.R1.fastq /scratch/project/qaafi-cnafs/upendra/Sqanti3/Example/UHR_Rep1_chr22.R2.fastq --outFileNamePrefix /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping/UHR_Rep1_chr22.R1 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterType BySJout --outSAMunmapped Within --outFilterMultimapNmax 20 --outFilterMismatchNoverLmax 0.04 --outFilterMismatchNmax 999 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --sjdbScore 1 --genomeLoad NoSharedMemory --outSAMtype BAM SortedByCoordinate --twopassMode Basic
STAR version: 2.7.11b compiled: 2024-01-29T15:15:38+0000 :/opt/conda/conda-bld/star_1706541070242/work/source
Feb 27 12:21:12 ..... started STAR run
Feb 27 12:21:12 ..... loading genome
Feb 27 12:21:12 ..... started 1st pass mapping
Feb 27 12:21:36 ..... finished 1st pass mapping
Feb 27 12:21:36 ..... inserting junctions into the genome indices
Feb 27 12:21:44 ..... started mapping
Feb 27 12:22:09 ..... finished mapping
Feb 27 12:22:09 ..... started sorting BAM
Feb 27 12:22:09 ..... finished successfully
Mapping for UHR_Rep2_chr22.R1 : in progress...
Mapping for UHR_Rep2_chr22.R1 : done.
/sw/local/rocky8/noarch/qcif/software/miniconda3/envs/sqanti3_5.2/bin/STAR-avx2 --runThreadN 4 --genomeDir /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_index/ --readFilesIn /scratch/project/qaafi-cnafs/upendra/Sqanti3/Example/UHR_Rep2_chr22.R1.fastq /scratch/project/qaafi-cnafs/upendra/Sqanti3/Example/UHR_Rep2_chr22.R2.fastq --outFileNamePrefix /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping/UHR_Rep2_chr22.R1 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterType BySJout --outSAMunmapped Within --outFilterMultimapNmax 20 --outFilterMismatchNoverLmax 0.04 --outFilterMismatchNmax 999 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --sjdbScore 1 --genomeLoad NoSharedMemory --outSAMtype BAM SortedByCoordinate --twopassMode Basic
STAR version: 2.7.11b compiled: 2024-01-29T15:15:38+0000 :/opt/conda/conda-bld/star_1706541070242/work/source
Feb 27 12:22:10 ..... started STAR run
Feb 27 12:22:10 ..... loading genome
Feb 27 12:22:10 ..... started 1st pass mapping
Feb 27 12:22:28 ..... finished 1st pass mapping
Feb 27 12:22:28 ..... inserting junctions into the genome indices
Feb 27 12:22:36 ..... started mapping
Feb 27 12:22:55 ..... finished mapping
Feb 27 12:22:55 ..... started sorting BAM
Feb 27 12:22:55 ..... finished successfully
Input pattern: /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping/.
The following files found and to be read as junctions:
/scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping/UHR_Rep2_chr22.R1SJ.out.tab
/scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping/UHR_Rep1_chr22.R1SJ.out.tab
6762 junctions read. 2 junctions added to both strands because no strand information from STAR.
Running calculation of TSS ratio
BAM files identified: ['/scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping//UHR_Rep1_chr22.R1Aligned.sortedByCoord.out.bam', '/scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/STAR_mapping//UHR_Rep2_chr22.R1Aligned.sortedByCoord.out.bam']
Temp files removed.
**** Performing Classification of Isoforms....
Number of classified isoforms: 3925
**** RT-switching computation....
Full-length read abundance files not provided.
**** Adding TSS ratio data... ****
**** Running Kallisto to calculate isoform expressions.
Running kallisto index /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/kallisto_output/kallisto_corrected_fasta.idx using as reference /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/UHR_chr22_corrected.fasta
**Running Kallisto quantification for UHR_Rep1_chr22.R1 sample
Error: kallisto index file not found /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/kallisto_output/kallisto_corrected_fasta.idx
Usage: kallisto quant [arguments] FASTQ-files
Required arguments:
-i, --index=STRING Filename for the kallisto index to be used for
quantification
-o, --output-dir=STRING Directory to write output to
Optional arguments:
-b, --bootstrap-samples=INT Number of bootstrap samples (default: 0)
--seed=INT Seed for the bootstrap sampling (default: 42)
--plaintext Output plaintext instead of HDF5
--single Quantify single-end reads
--single-overhang Include reads where unobserved rest of fragment is
predicted to lie outside a transcript
--fr-stranded Strand specific reads, first read forward
--rf-stranded Strand specific reads, first read reverse
-l, --fragment-length=DOUBLE Estimated average fragment length
-s, --sd=DOUBLE Estimated standard deviation of fragment length
(default: -l, -s values are estimated from paired
end data, but are required when using --single)
-t, --threads=INT Number of threads to use (default: 1)
--verbose Print out progress information every 1M proccessed reads
Running Kallisto quantification for UHR_Rep2_chr22.R1 sample
Error: kallisto index file not found /scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/kallisto_output/kallisto_corrected_fasta.idx
Usage: kallisto quant [arguments] FASTQ-files
Required arguments:
-i, --index=STRING Filename for the kallisto index to be used for
quantification
-o, --output-dir=STRING Directory to write output to
Optional arguments:
-b, --bootstrap-samples=INT Number of bootstrap samples (default: 0)
--seed=INT Seed for the bootstrap sampling (default: 42)
--plaintext Output plaintext instead of HDF5
--single Quantify single-end reads
--single-overhang Include reads where unobserved rest of fragment is
predicted to lie outside a transcript
--fr-stranded Strand specific reads, first read forward
--rf-stranded Strand specific reads, first read reverse
-l, --fragment-length=DOUBLE Estimated average fragment length
-s, --sd=DOUBLE Estimated standard deviation of fragment length
(default: -l, -s values are estimated from paired
end data, but are required when using --single)
-t, --threads=INT Number of threads to use (default: 1)
--verbose Print out progress information every 1M proccessed reads
Traceback (most recent call last):
File "/sw/local/rocky8/noarch/qcif/software/SQANTI3-5.2/sqanti3_qc.py", line 2542, in <module>
main()
File "/sw/local/rocky8/noarch/qcif/software/SQANTI3-5.2/sqanti3_qc.py", line 2525, in main
run(args)
File "/sw/local/rocky8/noarch/qcif/software/SQANTI3-5.2/sqanti3_qc.py", line 1978, in run
exp_dict = expression_parser(expression_files)
File "/sw/local/rocky8/noarch/qcif/software/SQANTI3-5.2/sqanti3_qc.py", line 806, in expression_parser
reader = DictReader(open(exp_file), delimiter='\t')
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/project_mnt/S0030/upendra/Sqanti3/Exampla_data/SQANTI3_output/kallisto_output/UHR_Rep1_chr22.R1/abundance.tsv'**
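Note that the `FileNotFoundError` above is a downstream symptom: `kallisto quant` could not find the index, so no `abundance.tsv` was written for the expression step to read. A check along these lines, run against the output directory, would point at the missing index directly. It is an illustrative sketch rather than code from `sqanti3_qc.py`; `missing_kallisto_outputs` is a hypothetical helper, and the layout (one subdirectory per sample under `kallisto_output`) is inferred from the paths in the log.

```python
from pathlib import Path


def missing_kallisto_outputs(kallisto_dir, index_name="kallisto_corrected_fasta.idx"):
    """List expected kallisto files that are absent under kallisto_dir."""
    kallisto_dir = Path(kallisto_dir)
    if not kallisto_dir.is_dir():
        # Nothing was written at all, so report the directory itself.
        return [kallisto_dir]
    missing = []
    index_file = kallisto_dir / index_name
    if not index_file.is_file():
        missing.append(index_file)
    # Each existing sample subdirectory should hold an abundance.tsv
    # once `kallisto quant` has finished successfully.
    for sample_dir in sorted(p for p in kallisto_dir.iterdir() if p.is_dir()):
        abundance = sample_dir / "abundance.tsv"
        if not abundance.is_file():
            missing.append(abundance)
    return missing
```

Calling `missing_kallisto_outputs("SQANTI3_output/kallisto_output")` returns an empty list when everything is in place. If the index file is in the returned list, the indexing step is the one to investigate first.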
Kindly let me know what I can do to fix this issue.
Many thanks,
Upendra.