Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate in samtools, kraken, vcflib #1148

Merged
merged 40 commits into from
Apr 28, 2017

Conversation

martenson
Copy link
Member

part of galaxyproject/tools-devteam#406
xref: #1060

this is an effort to address part of galaxyproject/tools-devteam#416

<test>
<param ftype="bam" name="input1" value="bam_to_sam_in1.bam" />
<param name="header" value="-h" />
<output file="bam_to_sam_out1.sam" name="output1" sorted="True" />
Copy link
Member Author

@martenson martenson Jan 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

travis: /tmp/tmp4XWX1b:35:0:ERROR:SCHEMASV:SCHEMAV_CVC_COMPLEX_TYPE_3_2_1: Element 'output', attribute 'sorted': The attribute 'sorted' is not allowed.

should this be just sort?
source https://docs.galaxyproject.org/en/latest/dev/schema.html#id74

@@ -0,0 +1,296 @@
<tool id="samtools_stats" name="Stats" version="2.0">
Copy link
Member Author

@martenson martenson Jan 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file has

.. ERROR: Invalid tmpmXYONH found. Errors [/tmp/tmpmXYONH:218:0:ERROR:SCHEMASV:SCHEMAV_ELEMENT_CONTENT: Element 'filter': This element is not expected.]

problem

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martenson See my comment below.

<when value="cached">
<param name="ref_file" type="select" label="Using genome">
<options from_data_table="fasta_indexes" />
<filter type="data_meta" ref="input_file" key="dbkey" column="1" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be:

                            <options from_data_table="fasta_indexes">
                                <filter type="data_meta" ref="input_file" key="dbkey" column="1" />
                            </options>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought so, thanks for clarification, fixing

@martenson
Copy link
Member Author

both samtools_mpileup tests fail with

======================================================================
FAIL: MPileup ( samtools_mpileup ) > Test-2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/tmpW4VvWu/galaxy-dev/test/functional/test_toolbox.py", line 302, in test_tool
    self.do_it( td )
  File "/tmp/tmpW4VvWu/galaxy-dev/test/functional/test_toolbox.py", line 78, in do_it
    raise e
JobOutputsError: [Errno 2] No such file or directory: '/home/travis/build/galaxyproject/tools-iuc/tool_collections/samtools/samtools_mpileup/test-data/samtools_mpileup_out_2.log'

@bgruening
Copy link
Member

@martenson can we remove all tool_dependencies.xml files and make sure all are condafied?

@martenson
Copy link
Member Author

@bgruening will do

<xml name="requirements">
<requirements>
<requirement type="package" version="0.10.6_eaf8fb68">kraken</requirement>
<requirement type="package" version="0.10.6-eaf8fb68">kraken</requirement>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davebx Do you know why is the requirement here twice? Once with dash and once with underscore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is because of the bioconda vs TS package names.

@martenson
Copy link
Member Author

thank you @yhoogstrate !!

@bgruening
Copy link
Member

it seems there is a wrong samtools version used:

[main] unrecognized command '--version'
[sam_header_read2] 1 sequences loaded.
sort: invalid option -- 'O'
sort: invalid option -- 'T'

@yhoogstrate
Copy link
Member

yhoogstrate commented Feb 15, 2017

This error is coming from samtools 0.1.19, installed by travis: https://github.com/galaxyproject/tools-iuc/blob/master/.travis.yml#L20 . I think Galaxy does not override it with 1.2.* properly and uses 0.1,19 that was/is necessary for the Galaxy backend, instead.

@jmchilton is trying to solve it in this PR: galaxyproject/galaxy#3620 , thanks 👍

Once this works, I would like all tools that make use of samtools, bcftools and ucsc-tools to re-trigger a travis test to make sure they were not tested with wrong dependencies.

yhoogstrate referenced this pull request in galaxyproject/galaxy Feb 15, 2017
This has never happened in my testing, but there have been multiple reports of Conda crafting environments for the set metadata tool that include Python. When this happens Galaxy's Python environment is lost and the tool cannot function properly.

Fixes #3238.
Fixes #3224.
Fixes https://biostar.usegalaxy.org/p/20865/.

Rebased to add a comment as suggested by @nsoranzo.
#else:
--fasta-input
#end if
"$forward_input" "$reverse_input"
Copy link
Member

@nsoranzo nsoranzo Feb 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

            '${single_paired.forward_input}' '${single_paired.reverse_input}'

</macros>
<expand macro="requirements"/>
<expand macro="version_command"/>
<expand macro="stdio"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<expand macro="stdio"/>

should be moved between requirements and version_command

<param name="threshold" value="0"/>
<param name="kraken_database" value="test_db"/>

<output name="output" file="kraken_filter_test1_output.tab" ftype="tabular"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/"output"/"filtered_output"/


<expand macro="stdio"/>
<expand macro="version_command"/>
<expand macro="requirements" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<expand macro="requirements" />

should be moved before stdio.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, rmdup should stay on 0.1.19 or move to 1.3.*, see galaxyproject/tools-devteam#284 and galaxyproject/tools-devteam#286 .

</inputs>

<outputs>
<data name="output1" format="txt" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation.

<test>
<param name="input1" value="samtools_flagstat_input1.bam" ftype="bam" />

<output name="output1" file="samtools_flagstat_out1.txt" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation of the previous 2 lines.

0 + 0 with mate mapped to a different chr (mapQ>=5)

</help>
<expand macro="citations"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation of the previous 2 lines.

kraken-filter
@INPUT_DATABASE@
--threshold $threshold
'${input}' >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the > on the next line.


kraken-report
@INPUT_DATABASE@
'${kraken_output}' >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the > on the next line.

kraken-translate
@INPUT_DATABASE@
$mpa_format
'${input}' >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the > on the next line.

<command><![CDATA[
samtools reheader
'${input_header}'
'${input_file}' >
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the > on the next line.

@martenson
Copy link
Member Author

@nsoranzo I addressed your comments

@yhoogstrate
Copy link
Member

Nice @martenson , planemo test --update_test_data and I think we're there 👍

<macros>
<xml name="requirements">
<requirements>
<requirement type="package" version="1.3">samtools</requirement>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/1.3/1.3.1/

1.3.1 is the latest and is already used by tools/htseq_count/htseq-count.xml

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this go together with a version bump?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all samtools tool versions should be bumped, for bam_to_cram, cram_to_bam and samtools_split the new version should be 1.3.1 (using a macro ideally), all other tools have already a major version 2 so I'd increase the 'patch' version number, hoping one day we can sync again with the upstream version.

</inputs>

<outputs>
<data name="output1" format="txt" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation.

@@ -0,0 +1,372 @@
<!DOCTYPE html>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should be removed.

@@ -0,0 +1,43 @@
phiX174 1411 A 1 ^P. $
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file correct here?

@@ -0,0 +1,43 @@
phiX174 1411 A 0 1
phiX174 1412 G 0 2,1,1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the last failing test, the last column needs to be removed to pass.
Is this correct? @nekrut have you used mpileup and now what the last column is and if it is ok if it get's removed?

Copy link
Member

@yhoogstrate yhoogstrate Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a fixed bug in samtools and results in backwards incompatibility (compared to 0.19).
The option -O returns the base positions of the corresponding reads.
The 4th column indicates zero bases are included in the mpileup, while the last column indicates multiple reads are present. To me the behaviour in 1.3 looks appropriate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this file has multiple columns, the end is just a lot of tabulars afaik.

@@ -0,0 +1,252 @@
# ACGT content per cycle. The columns are: cycle, and A,C,G,T counts (percent)
1 21.50 29.50 33.50 15.50
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file also needs to be updated.

@nsoranzo
Copy link
Member

It's green 🎉 , thanks @yhoogstrate!

@yhoogstrate yhoogstrate removed the wip label Apr 28, 2017
@bgruening
Copy link
Member

🎉 !!!

@bgruening bgruening merged commit 0b2d726 into galaxyproject:master Apr 28, 2017
@bgruening
Copy link
Member

Thanks all, this is awesome!

@nsoranzo
Copy link
Member

😟 samtools tool versions have not been updated for the new samtools 1.3 requirement, see https://github.com/galaxyproject/tools-iuc/pull/1148/files#r104935513

@nsoranzo
Copy link
Member

Luckily all repo updates have failed because @martenson hasn't given the necessary permissions to iuc yet, so we can still fix the tool versions.

@martenson
Copy link
Member Author

@nsoranzo yay! :)

@martenson
Copy link
Member Author

all repos from this PR have been added to IUC-permitted

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants