MultiQC: Update + fixing issues #1821
- Update to version 1.5
- Add new supported tools
- Escape identifiers
- Use os.path.join to generate path
- Add more tokens as shared code
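As a rough illustration of the "escape identifiers" and "use os.path.join" items, here is a minimal Python sketch. The function names `escape_identifier` and `build_input_path` and the choice of allowed characters are assumptions made for this example; they are not taken from the wrapper in this PR.

```python
import os
import re


def escape_identifier(identifier):
    """Replace characters that are awkward in a file name with underscores."""
    # Hypothetical rule: keep word characters, dots and dashes only.
    return re.sub(r"[^\w.\-]", "_", identifier)


def build_input_path(tool_dir, identifier, extension="txt"):
    """Build the path with os.path.join instead of hand-written separators."""
    file_name = f"{escape_identifier(identifier)}.{extension}"
    return os.path.join(tool_dir, file_name)
```

For instance, `build_input_path("fastqc", "sample 1/R1")` would place the file inside `fastqc` under the name `sample_1_R1.txt`, so an identifier containing a slash cannot create an unintended subdirectory.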
@bebatut I can't replicate #1624 anymore, so I guess I'd say it's "fixed". Conda dependency resolution is black magic to me, and I suspect that something changed with an underlying package (and not with MultiQC packages) that both broke and then subsequently fixed things... 🤷‍♂️ At any rate, this is some excellent work! Thanks.
See also #1805
@nekrut I added the SnpEff module too and some tests 😄
Thanks @lparsons for the check!
I think this PR is ready for review 😄
Fantastic work @bebatut! One thought I had was that for tools that put the input filename into the output (e.g. BCFTools, cutadapt, etc.) perhaps it would be a good "best practice" to first link the input files using the
@lparsons you mean linking inside the cutadapt wrapper?
@bgruening Yes, within the wrapper of tools that include the input filenames as part of the output, link the input files first to a name that includes the
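The linking idea can be sketched in Python as follows. This is only an illustration: `link_input` is a hypothetical helper, and the `identifier` argument stands in for whatever per-dataset name the wrapper would choose, which the comments above do not spell out.

```python
import os


def link_input(source_path, identifier, work_dir, extension):
    """Symlink source_path into work_dir under a readable name.

    A tool that echoes its input file name into its log would then
    report `identifier` rather than an opaque dataset path.
    """
    link_path = os.path.join(work_dir, f"{identifier}.{extension}")
    os.symlink(os.path.abspath(source_path), link_path)
    return link_path
```

The tool would then be run on the returned link path instead of the original file.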
I feel that if this is really needed and is a best practice, it should be part of the job staging, and Galaxy should provide a variable that we can simply use. That way Galaxy does the linking for us and a tool can use it. We should move this discussion to another issue, I guess. If you are fine with the PR, please approve and/or merge it.
Great work!
Thanks a lot for working on this @bebatut, I'm excited to use it!
I just tested adding some text to the "Report title" field. If I do that, I get an empty ("0 bytes") HTML output (tested with a Picard MarkDuplicates output). If I add the same text to "Custom comment" instead, the report is produced correctly with the comment.
tools/multiqc/multiqc.xml
@@ -647,6 +769,9 @@ sp:
</when>
</conditional>
</repeat>
<param name="title" type="text" value="" optional="true" label="Report title" help="It is rinted as page header"/>
s/rinted/printed/
Fixed
@@ -411,15 +511,18 @@ sp:
<!--<option value="salmon">Salmon</option>-->
Remove the module names from RSeQC here? "(bam_stat, gene_body_coverage, infer_experiment, ...)"
Why list them for RSeQC and not for Picard etc.? I don't think they're needed here, imho.
Done... Thanks
@mblue9 Thanks for testing the title and comment. I fixed the issue (a change in the generated file names) and added some tests.
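For readers wondering how a changed file name can lead to an empty output: if a wrapper copies a report from a fixed path and the report is written under a different name, nothing gets copied. One defensive approach is sketched below in Python. It assumes that setting a title only changes the prefix of the report file name and that the name still ends in `multiqc_report.html`; this thread does not confirm that detail, and `find_report` is a hypothetical helper, not the fix applied in this PR.

```python
import glob
import os


def find_report(output_dir):
    """Return the single HTML report in output_dir, whatever its prefix."""
    # Assumed naming pattern: "<optional prefix>multiqc_report.html".
    pattern = os.path.join(output_dir, "*multiqc_report.html")
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise RuntimeError(f"expected exactly one report, found {len(matches)}")
    return matches[0]
```

Failing loudly when no report is found is a deliberate choice here, since a silent miss is what produces a "0 bytes" output.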
Thanks a lot @bebatut! LGTM. Shall I merge it?
HiCExplorer does not work if you use files with identical names in Galaxy. Just upload the following file twice.
This issue is probably not limited to the HiCExplorer module. We probably need to fix that for all modules by adding a subfolder for each file (if we want to keep the file identifier intact).
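A minimal Python sketch of the subfolder idea, under the assumption that inputs arrive as (identifier, path) pairs; `stage_in_subfolders` and the `file_<n>` folder naming are hypothetical and not part of the wrapper.

```python
import os


def stage_in_subfolders(inputs, work_dir):
    """Symlink each (identifier, path) pair to work_dir/file_<n>/<identifier>.

    Two inputs with the same identifier land in different subfolders,
    so they no longer overwrite each other and the identifier itself
    is left unchanged.
    """
    staged = []
    for index, (identifier, source_path) in enumerate(inputs):
        subfolder = os.path.join(work_dir, f"file_{index}")
        os.makedirs(subfolder, exist_ok=True)
        link_path = os.path.join(subfolder, identifier)
        os.symlink(os.path.abspath(source_path), link_path)
        staged.append(link_path)
    return staged
```

This only avoids the collision on disk. How MultiQC then names two samples that share an identifier is a separate question that the sketch does not address.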
Just to check, as I am curious: are there use cases for analysing files with identical names? That doesn't seem like a good idea to me. Wouldn't it be better to recommend using unique file names? (That said, I do currently have the situation where MACS gives the same name to all files when using a control with collections, which is not good imo.)
I appreciate that this is a special case. It happens, e.g., if you run a workflow with multiple inputs (batch processing) and all QC reports are collected at the end: if they have an identical Galaxy name, MultiQC crashes. It is quite easy to fix: just add a unique prefix/suffix to each file and we are more stable.
@joachimwolff My concern would be ensuring that the sample identifiers remain the unadulterated
@joachimwolff that is a bigger issue as we want to preserve the
As this was "broken" before and still is, I don't consider this blocking and will merge now, as the PR contains so many other useful fixes. Thanks @bebatut!
@bebatut Fantastic work on MultiQC, I find it a very useful tool. I was wondering if you had any insight as to why Salmon is commented out. I was looking into enabling it in the Galaxy wrapper but thought I'd see if there was already some work done... Thanks!
Solved with this PR:
-cvsStats option of snpEff #1805

Not solved with this PR:
element_identifier as sample name
element_identifier + align
ID (https://github.com/ewels/MultiQC/blob/v1.5/multiqc/modules/bcftools/stats.py#L41)
Command line (https://github.com/ewels/MultiQC/blob/v1.5/multiqc/modules/cutadapt/cutadapt.py#L104)
element_identifier(s) on which deepTools has been run
Filename

Unknown status (the status of element_identifier parsing for these tools is unverified)
element_identifier(s) on which featureCounts has been run
File
Filename
Run File (looks like a log output needs to be added to the kallisto wrapper to work with multiqc, see here log file from Kallisto MultiQC/MultiQC#440)
INPUT on the line starting with # picard.analysis
organism
element_identifier(s) on which QUAST has been run