Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dada2 #2483

Merged
merged 2 commits into from
Nov 8, 2019
Merged

dada2 #2483

merged 2 commits into from
Nov 8, 2019

Conversation

bernt-matthias
Copy link
Contributor

@bernt-matthias bernt-matthias commented Jul 5, 2019

moved over from here https://github.com/bernt-matthias/mb-galaxy-tools/pulls

TODOs:

  • generate yml based workflow with tests
    • workflow moved to IWC

FOR CONTRIBUTOR:

  • - I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • - License permits unrestricted use (educational + commercial)
  • - This PR adds a new tool or tool collection
  • - This PR updates an existing tool or tool collection
  • - This PR does something else (explain below)

@bernt-matthias
Copy link
Contributor Author

ping @shiltemann @yvanlebras

@bernt-matthias
Copy link
Contributor Author

Seems that the datatypes have are not yet in the Galaxy version used by planemo. They are alreadey in the Galaxy sources: galaxyproject/galaxy#7895 but I don't know if I was in time for 19.05.

I could backport them if it would help.

The data manager tests should actually work. Are they invoked correctly?

@bernt-matthias bernt-matthias changed the title dada2: initial commit dada2 Jul 8, 2019
tools/dada2/macros.xml Outdated Show resolved Hide resolved
@loraine-gueguen
Copy link
Contributor

hi @bernt-matthias (cc @lecorguille)
Thanks a lot for you great job on dada2! We are very interested to add these tools in our galaxy instance.
Can I help in any way to solve the issues and complete the development ?

@bernt-matthias
Copy link
Contributor Author

Hi @loraine-gueguen. At the moment you can install the tools from the test toolshed (I just made sure that they are at the latest version). I guess a review and eventually approval of this PR would be the best.

The errors only affect the data manager (aka dadamanager) .. and I did not understand them .. just tried to rebase to the latest master .. maybe this helps. My local tests are successful. Otherwise I will ask the IUC for help.

@loraine-gueguen
Copy link
Contributor

Thanks a lot for the testtoolshed update !
I am completely ignorant about dada2, but I can have a look at the "tool wrapper syntax" stuff of the PR if you want (except data manager unfortunately, I know nothing about data manager :-|)

@bernt-matthias
Copy link
Contributor Author

This would already help a lot.

@bernt-matthias
Copy link
Contributor Author

Surprise: v1.12 (not yet in the TTS)

would be really happy to bring this to an end soon. reviews would be great.

@loraine-gueguen
Copy link
Contributor

Sorry I didn't have time to review last 2 weeks (urgent projects and days off)... But I'll do it as soon as possible

@gregvonkuster
Copy link
Contributor

@bernt-matthias @bgruening Sorry I'm getting into this late, I have a new customer that needs this tool in Galaxy. Is there anything I can do to help move this PR along?

@yvanlebras
Copy link
Contributor

+1 I can try to test last version "as a biologist"

@gregvonkuster
Copy link
Contributor

@bernt-matthias This code is beautiful! Just a couple of nit-picky comments, but I don't feel that these necessarily warrant changes unless you agree.

  1. Could these lines in the 3 tools become a macro? https://github.com/galaxyproject/tools-iuc/pull/2483/files#diff-246b665596cf9b58d86b16d961ebf249R18-R19

  2. Maybe remove commented code in some places just to make the code as clean as possible.

I've restricted my review to the tool configs, including data managers, and R code. If nothing is missing form this PR to make the tools all work within Galaxy (I'm late to the game here), then I'm +1 on this when tests pass.

@bernt-matthias
Copy link
Contributor Author

hey @gregvonkuster thanks for the review. I removed the comments (in one case I decided to leave it, but format it properly).

I decided against a macro for the 2 lines, I had the feeling that it does not help with the readability of the code.

@bernt-matthias
Copy link
Contributor Author

Anyone problems with profile="19.09" (which is not even released yet)? Needed because of the new data types. If one would vote for an older Galaxy release I would need to backport the corresponding commit.

@bernt-matthias
Copy link
Contributor Author

The most important question to me is how well the output of the pipeline (i.e. of assignTaxonomy / makeSequenctable) as implemented in the wrapppers

  • can be fed to connect to possible downstream tools
  • are consistent with other amplicon approaches in Galaxy (qiime / mothur)

Where the first seems more important to me.

@gregvonkuster
Copy link
Contributor

Anyone problems with profile="19.09" (which is not even released yet)? Needed because of the new data types. If one would vote for an older Galaxy release I would need to backport the corresponding commit.

I'll need this on Galaxy main which states that it is currently running 19.09, but not sure since it is not yet released. ;)

Copy link
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for finding the data manager test workaround. The fastq.gz files are relatively large, any chance that you could downsample them a bit more ?

tools/dada2/datatypes_conf.xml Outdated Show resolved Hide resolved
@bernt-matthias
Copy link
Contributor Author

The fastq.gz files are relatively large, any chance that you could downsample them a bit more ?

Consider it done. Do I need to rebase and squash the commits to actually affect the size of the git repo?

@nsoranzo
Copy link
Member

nsoranzo commented Nov 6, 2019

Do I need to rebase and squash the commits to actually affect the size of the git repo?

That would be better, otherwise whoever merges this PR will need to remember to select "Squash and merge".

@bgruening
Copy link
Member

Just send my two last comments. Please check the tool owners ... I can squash during merge.

@bernt-matthias
Copy link
Contributor Author

Will be a few days offline and implement the changes afterwards.

I think squashing is not necessary anymore - I rebased already.

Copy link
Contributor

@loraine-gueguen loraine-gueguen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2 cents, but the tool configurations look good to me.

data_managers/data_manager_dada2/.shed.yml Outdated Show resolved Hide resolved
data_managers/data_manager_dada2/.shed.yml Outdated Show resolved Hide resolved
tools/dada2/.shed.yml Outdated Show resolved Hide resolved
tools/dada2/dada2_filterAndTrim.xml Outdated Show resolved Hide resolved
@loraine-gueguen
Copy link
Contributor

We have a few interested users that will test the suite. Is the code strictly limited to installation and use with release 19.09 ?

@bernt-matthias
Copy link
Contributor Author

Is the code strictly limited to installation and use with release 19.09 ?

The only aspect that requires 19.09 are the data types. You can add them manually to your datatypes.xml. I'm not sure if installation from the TS is checking for the version, but manual installation is definitely possible.

Let me know if I should update the version in the TTS.

@loraine-gueguen
Copy link
Contributor

It seems not possible to install a tool with a profile higher than the current galaxy profile, neither from the TS or manually (unless editing every tool config to change the profile, what I turned out to do).

From the TS, the tools appear as "invalid" in the admin "manage tools" section with no error messages. Manually, I had the following error messages:

  File "lib/galaxy/tools/__init__.py", line 446, in __init__
    raise e
Exception: The tool dada2_assignTaxonomyAddspecies targets version 19.09 of Galaxy, you should upgrade Galaxy to ensure proper functioning of this tool.

@bernt-matthias
Copy link
Contributor Author

@loraine-gueguen good to know. manually editing the tool xml files should boil down to something like:

find . -name "dada2_*xml" -exec sed -i -e 's/profile="19.09"/profile="19.01"/' {} ;

and adding the data types.

Could be added to the docs.

if there is more than 1 changed repo (eg 1 tool and 1 data manager)
then planemo is called per repo, which fails for data manager because
the path to the xml is needed.

hence if the repo is in the data_managers directory the path to the
xml file is determined with an additional call to `planemo
ci_find_tools` and then used in the planemo test call.
@bgruening
Copy link
Member

Ok, let's get this in, way too long open. Hope that is ok for everyone ...

@bgruening bgruening merged commit f8b6b6e into galaxyproject:master Nov 8, 2019
@loraine-gueguen
Copy link
Contributor

loraine-gueguen commented Nov 27, 2019

A few comments regarding our DADA2 installation in Galaxy from the TS:

For Galaxy releases older than 19.09, it seems not enough to manually change profile in the xml files (restrictions due to a profile parameter registered in the database ?). To make the tools available in our 18.05 instance, I had to add the tools manually in galaxy-dist/tools/dada2/ (in fact, I added symbolic links pointing to the repositories of each tool in shed_tools). The best would be to update our galaxy instance to 19.09, which is now released.

Moreover, a release of conda >4.5.11 is definitely needed to have the [email protected] conda environment installing correctly: I faced the same issue as bioconda/bioconda-recipes#13846 (comment), and it was ok after updating conda to 4.7.12 (and python).

Small stuffs: the image pairpipe.png is not displaying in the help section (text pairpipe.png instead), and the doi citation displays an error message Failed to fetch BibTeX for DOI (although the hyperlinks are ok). Do you want me to open an issue for these ?

A user is currently testing the tool suite.

@gregvonkuster
Copy link
Contributor

This tool suite is also currently being tested on Galaxy test https://test.galaxyproject.org, although the reference datasets are not yet installed for the tools that use them.

@bernt-matthias
Copy link
Contributor Author

For Galaxy releases older than 19.09, it seems not enough to manually change profile in the xml files (restrictions due to a profile parameter registered in the database ?). To make the tools available in our 18.05 instance, I had to add the tools manually in galaxy-dist/tools/dada2/ (in fact, I added symbolic links pointing to the repositories of each tool in shed_tools). The best would be to update our galaxy instance to 19.09, which is now released.

Odd. Do the dada tools end up in shed_tools.xml? Or in which of the tool.xml files?

Moreover, a release of conda >4.5.11 is definitely needed to have the [email protected] conda environment installing correctly: I faced the same issue as bioconda/bioconda-recipes#13846 (comment), and it was ok after updating conda to 4.7.12 (and python).

A small comment on the README in the repo would be nice. Could you do this?

Small stuffs: the image pairpipe.png is not displaying in the help section (pairpipe.png instead), and the doi citation displays an error message Failed to fetch BibTeX for DOI (although the hyperlinks are ok). Do you want me to open an issue for these ?

Images seem to work only when installed from the TS.

@bernt-matthias
Copy link
Contributor Author

@gregvonkuster cool :) .. manual tests I guess?

@gregvonkuster
Copy link
Contributor

A lab here on the Penn State campus is running some of the tools on their data. If all goes well, I'll try to get the tools installed on Galaxy main before the Christmas break or shortly after the new year.

@gregvonkuster
Copy link
Contributor

I need to get in touch with @jennaj to get the reference datasets installed.

@gregvonkuster
Copy link
Contributor

This @bernt-matthias I've opened this issue galaxyproject/usegalaxy-playbook#273 requesting dada2 reference databases to be installed on Galaxy test / main, but I'm not quite sure which databases are the most used. So if I've chosen incorrectly, please let me know.

@loraine-gueguen
Copy link
Contributor

loraine-gueguen commented Nov 28, 2019

Odd. Do the dada tools end up in shed_tools.xml? Or in which of the tool.xml files?

The dada tools were correctly added in a shed_tool_conf file dedicated to dada2 tools (shed_tool_conf_dada2.xml) which has been added in the tool_config_file parameter in galaxy.yml. And the installation went fine for dada2 v1.10 from the testtoolshed in the same shed tool_conf file.

A small comment on the README in the repo would be nice. Could you do this?

done: #2699

Images seem to work only when installed from the TS.

ok. For DOI citation as well ?

Thanks

@bernt-matthias
Copy link
Contributor Author

Any immediate requests for dada2 could be added here: #2705

@bernt-matthias bernt-matthias deleted the topic/dada2 branch March 8, 2020 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants