kallisto #120

Closed
ewels opened this issue Dec 9, 2015 · 13 comments

Comments

@ewels
Member

ewels commented Dec 9, 2015

Module request: kallisto

ewels added the module: new and waiting: example data labels Dec 9, 2015
ewels added this to the Module Backlog milestone Dec 9, 2015
@darogan

darogan commented May 18, 2016

Is there any way to bump this up on your to-do list?

I can provide some example data if that helps?

@ewels
Member Author

ewels commented May 18, 2016

Yeah absolutely - I just need some example logs and then it's very quick.

@darogan

darogan commented May 18, 2016

I attached two kallisto examples run from my new clusterflow pipeline. Which reminds me, do you want my kallisto.cfmod file too?

HS002-PE-R00059_BD0U5YACXX.RHM066_CGATGT_L002_R1.fastq.gz_fastq_kallisto_clusterFlow.txt
HS002-PE-R00059_BD0U5YACXX.RHM067_CAGATC_L002_R1.fastq.gz_fastq_kallisto_clusterFlow.txt

@ewels
Member Author

ewels commented May 18, 2016

Ok great, thanks! Does Kallisto write any log files itself or is it only this stderr that we have to go on? It's not the nicest log ever..

Would be great to have a module for Kallisto in Cluster Flow - if you could fork and then open a pull request with it that would be brilliant!

@ewels
Member Author

ewels commented May 18, 2016

So the kallisto docs say that there should be a file called run_info.json which is "a json file containing information about the run". That sounds much easier to parse...
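
As a rough illustration of why a JSON file is easier to work with, here is a minimal Python sketch that reads a run_info.json file. The function name `load_run_info` is hypothetical, and the field names are assumed from the example run_info.json posted later in this thread; this is not the MultiQC module's code.

```python
import json
from pathlib import Path


def load_run_info(path):
    """Read a kallisto run_info.json file and return a few fields as a dict.

    Hypothetical helper: the keys below are assumed from the example file
    shared in this thread and may differ between kallisto versions.
    """
    with Path(path).open(encoding="utf-8") as handle:
        info = json.load(handle)
    # .get() yields None for any key that is absent from the file.
    return {
        "kallisto_version": info.get("kallisto_version"),
        "n_targets": info.get("n_targets"),
        "n_processed": info.get("n_processed"),
        "n_bootstraps": info.get("n_bootstraps"),
    }
```

Calling `load_run_info("run_info.json")` on a kallisto output directory would return a small dict of run metadata, with no text scraping needed.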

@darogan

darogan commented May 18, 2016

I attached a run_info.json file and pasted it below:

{
     "n_targets": 176241,
     "n_bootstraps": 100,
     "n_processed": 34503308,
     "kallisto_version": "0.42.5",
     "index_version": 10,
     "start_time": "Sat May 14 20:47:31 2016",
     "call": "kallisto quant -t 1 --pseudobam -i /home/rsh46/scratch/Genomes/Homo_sapiens/GRCh38/Homo_sapiens.GRCh38.cdna.all.idx -o HS002-PE-R00059_BD0U5YACXX.RHM076_CAGATC_L005_R1_val_1.fq.gz_kallisto_output -b 100 HS002-PE-R00059_BD0U5YACXX.RHM076_CAGATC_L005_R1_val_1.fq.gz HS002-PE-R00059_BD0U5YACXX.RHM076_CAGATC_L005_R2_val_2.fq.gz"
}

But it doesn't include any pseudo-alignment metrics like these from the stderr:

[quant] finding pseudoalignments for the reads ... done
[quant] processed 34,503,308 reads, 29,228,276 reads pseudoaligned
[quant] estimated average fragment length: 169.568

and then a simple % alignment rate could be calculated from the above.
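
The calculation described above can be sketched in Python as follows. The regular expressions are written against the three stderr lines quoted in this comment only; the function name `parse_quant_stderr` and the result keys are hypothetical, and other kallisto versions may word these lines differently.

```python
import re

# Patterns based only on the stderr lines quoted in this thread.
PROCESSED_RE = re.compile(
    r"\[quant\] processed ([\d,]+) reads, ([\d,]+) reads pseudoaligned"
)
FRAGLEN_RE = re.compile(r"\[quant\] estimated average fragment length: ([\d.]+)")


def parse_quant_stderr(text):
    """Return read counts, % pseudoaligned and fragment length, or None."""
    match = PROCESSED_RE.search(text)
    if match is None:
        return None
    # Counts are printed with thousands separators, e.g. 34,503,308.
    total = int(match.group(1).replace(",", ""))
    aligned = int(match.group(2).replace(",", ""))
    result = {
        "total_reads": total,
        "pseudoaligned_reads": aligned,
        "percent_aligned": 100.0 * aligned / total if total else 0.0,
    }
    frag = FRAGLEN_RE.search(text)
    if frag is not None:
        result["fragment_length"] = float(frag.group(1))
    return result
```

For the numbers quoted above, 29,228,276 of 34,503,308 reads works out to roughly 84.7% pseudoaligned.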

@ewels
Member Author

ewels commented May 18, 2016

Damn, that's a shame. A JSON file with a consistent filename is a lot easier to find and parse than a bit of stderr printed somewhere in the middle of some file.

Ok, I'll write the module up to look for the stderr as you say anyway. At least the stderr repeats the input filenames.
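
To illustrate what "looking for the stderr" involves when there is no fixed filename, here is a minimal Python sketch that scans a directory tree for files containing the kallisto marker line. This is an assumption-laden illustration, not how the MultiQC module is implemented: the function name `find_kallisto_logs`, the marker string and the size limit are all hypothetical choices.

```python
from pathlib import Path

# Marker taken from the stderr excerpt quoted earlier in this thread.
MARKER = "[quant] processed"


def find_kallisto_logs(root, max_bytes=10 * 1024 * 1024):
    """Return sorted paths under root whose text contains the marker line."""
    hits = []
    for path in Path(root).rglob("*"):
        # Skip directories and very large files (arbitrary 10 MB cut-off).
        if not path.is_file() or path.stat().st_size > max_bytes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if MARKER in text:
            hits.append(path)
    return sorted(hits)
```

A search like this only finds something if the kallisto stderr was saved to a file in the first place, for example by redirecting it when running kallisto quant.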

ewels removed the waiting: example data label May 18, 2016
ewels closed this as completed in c9ae961 May 18, 2016
@darogan

darogan commented May 18, 2016

Just tested the new kallisto module - works beautifully for me!

Another request though...

Could you add a kallisto plot for the insert sizes from the line

[quant] estimated average fragment length: 169.568

@ewels
Member Author

ewels commented May 18, 2016

Apologies, that's already parsed but looks like I forgot to add it to the report. As it's a single value I'll probably add it to the General Statistics table instead if that's ok? It should already be in multiqc_data/multiqc_kallisto.txt as well.

@darogan

darogan commented May 18, 2016

Yes, that will work for me

@ewels
Member Author

ewels commented May 18, 2016

OK, great! Let me know if you run into any other problems or would like any new features.

@ale07alvarez

ale07alvarez commented Apr 6, 2017

I would very much appreciate it if you could clarify which file from the kallisto output is parsed. There are only three output files from kallisto, and when I run multiqc . I get the following:

[INFO   ]         multiqc : This is MultiQC v1.0.dev0
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Searching '.'
[WARNING]         multiqc : No analysis results found. Cleaning up..
[INFO   ]         multiqc : MultiQC complete

thanks!

@ewels
Member Author

ewels commented Apr 6, 2017

Hi @ale07alvarez, I replied on your new issue #440.

Phil
