interaction:fixed effects and corrections modified #5928

renu-pal · 2024-04-02T20:09:20Z

FOR CONTRIBUTOR:
[ *] I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.

License permits unrestricted use (educational + commercial)
This PR adds a new tool or tool collection
[ *] This PR updates an existing tool or tool collection
This PR does something else (explain below)

bgruening

One question inline. Please also bump the tool version.

bgruening · 2024-04-02T20:26:13Z

tools/maaslin2/maaslin2.xml

@@ -55,7 +55,7 @@ cd outputFolder && mkdir -p figures/ && cp *.pdf figures
    <inputs>
        <param name="input_data" type="data" format="tabular" label="Data (or features) file"/>
        <param name="input_metadata" type="data" format="tabular" label="Metadata file"/>
-        <param argument="--fixed_effects" type="select" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">
+        <param argument="--fixed_effects" type="text" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">


Why should this be a text paraeter? We should probably fix the help instead?

We can do that now but if more association factors get added in the future, we might have to make changes again as we are doing now. So instead take it as a text parameter.

In fact, the fixed_effects as well as the random_effects should be of type: data_column (see:

tools-iuc/tools/bedtools/overlapBed.xml

Line 17 in 4e3719a

<param name="cols" type="data_column" data_ref="input" multiple="true"

for an example).
Since for both multiple columns from the input can be chosen. The original optional values are nonsense, since they hard-coded the column names of the test data.

I am getting column numbers instead of names after using type="data_column" . Do we have any other type where we can get names of columns directly instead of numbers?

Have you tried setting numerical=false? https://docs.galaxyproject.org/en/latest/dev/schema.html#id51

setting numerical="false" did not work instead setting use_header_names="true" worked and UI shows column names now but the test cases are failing. Do not understand why it is still accepting numerical values as valid options for tests. So working on that now.

tool_test_output.json

I could be wrong, but the use_header_names="true" option is really just cosmetic and for the UI, you still need to give numbers in the tests.

Very good idea to use use_header_names="true"!!!

hi @renu-pal it was a surprise for me, that there is not easy way to get the column_names available for the command line from Galaxy, however after some fideling I found a way to get them from the input file:

## get column names of fixed and random effect from the input file, since galaxy ## can only return indices with type="data_column" ## get header #set $input = open(str($input_metadata), 'r') #set $header = $input.readlines()[0].split('\t') ## get fixed effects #set $fixed_effects_val = [] #for $i in $fixed_effects: #silent $fixed_effects_val.append($header[int($i)]) #end for #set $fixed_effects = ','.join($fixed_effects_val) ## get random effects #set $random_effects_val = [] #for $i in $random_effects: #silent $random_effects_val.append($header[int($i)]) #end for #set $random_effects = ','.join($random_effects_val)

Maybe let the user know in the help, that the column names must be in the first line of the table or modify the code to skip comments when parsing the header

forget about that, we need to get the column names in the command line not the cheetah code...

So update, this way, using awk the file is only parsed on script execution

## get column names of fixed and random effect from the input file, since galaxy ## can only return indices with type="data_column" ## using awk so that the file is only parsed on command line execution #set idx = [] #for $i in $fixed_effects: #silent idx.append(f'${i}') #end for #set idx_for_awk = ','.join(idx) fixed_effects=`awk -v OFS=',' -F"\t" 'NR == 1 { print $idx_for_awk}' '$input_metadata'` && echo 'Assigned fixed effects as:' \$fixed_effects && #set idx = [] #for $i in $random_effects: #silent idx.append(f'${i}') #end for #set idx_for_awk = ','.join(idx) random_effects=`awk -v OFS=',' -F"\t" 'NR == 1 { print $idx_for_awk}' '$input_metadata'` && echo 'Assigned random effects as:' \$random_effects &&

tools/maaslin2/maaslin2.xml

paulzierep

Thanks a lot @renu-pal ! Ones this is fixed, the tool is finally functional !
Ones they update the bioconda recipe: bioconda/bioconda-recipes#46915 we can also update the wrapper, but lets do this one first !

paulzierep · 2024-04-03T09:46:40Z

tools/maaslin2/maaslin2.xml

@@ -55,7 +55,7 @@ cd outputFolder && mkdir -p figures/ && cp *.pdf figures
    <inputs>
        <param name="input_data" type="data" format="tabular" label="Data (or features) file"/>
        <param name="input_metadata" type="data" format="tabular" label="Metadata file"/>
-        <param argument="--fixed_effects" type="select" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">
+        <param argument="--fixed_effects" type="text" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">


In fact, the fixed_effects as well as the random_effects should be of type: data_column (see:

tools-iuc/tools/bedtools/overlapBed.xml

Line 17 in 4e3719a

<param name="cols" type="data_column" data_ref="input" multiple="true"

for an example).
Since for both multiple columns from the input can be chosen. The original optional values are nonsense, since they hard-coded the column names of the test data.

paulzierep · 2024-04-03T09:48:27Z

tools/maaslin2/maaslin2.xml

+            <option value="hochberg">hochberg</option>
+            <option value="hommel">hommel</option>
+            <option value="bonferroni">bonferroni</option>
+            <option value="BH">BH</option>


Can you add the full names for BH and BY as well, maybe also add a link to the explanation to the help and check if they all work, maybe add a test or two as well

paulzierep · 2024-04-03T09:54:20Z

Thanks @renu-pal.

One question inline. Please also bump the tool version.

@renu-pal that means you need to increase the VERSION_SUFFIX by 1: https://github.com/renu-pal/tools-iuc/blob/ea393cd20a2651a554f59c8afa1620b5792951f8/tools/maaslin2/macros.xml#L4

The TOOL_VERSION can be increased to match the latest bioconde recipe ones its updated: https://forum.biobakery.org/t/update-maaslin2-bioconda-recipe/6830 and bioconda/bioconda-recipes#46915

…ion bumped

paulzierep

Please update to work with no columns selected

paulzierep · 2024-04-18T12:08:46Z

tools/maaslin2/macros.xml

@@ -1,6 +1,6 @@
 <?xml version="1.0"?>
 <macros>
-    <token name="@TOOL_VERSION@">0.99.12</token>
+    <token name="@TOOL_VERSION@">1.7.3</token>


@atm we are still at 0.99.12, no ?

paulzierep · 2024-04-18T12:10:35Z

tools/maaslin2/maaslin2.xml

+## can only return indices with type="data_column" 
+## using awk so that the file is only parsed on command line execution
+
+#set idx = []


@renu-pal you said Yh it did a bit, but it's returning the complete list in case of 'random effect " when user does not choose any option
Could you try to fix that here, i.e. add an if statement, that does not choose any columns if no columns are specified by the user - and also add a test for it ...?

okay, working on that

renu-pal · 2024-04-19T21:36:30Z

warning : the correction methods other than BH and BY will not work due to older version of maaslin.

bgruening · 2024-04-20T10:49:55Z

@renu-pal the newer version of https://anaconda.org/bioconda/maaslin2/files is online.

renu-pal · 2024-04-22T12:52:50Z

Thanks for the update @bgruening ! I will bump the version to 1.16.0 but will need some time to push the updated version as test cases are failing because of the version update. So I need to look into that and make necessary changes.

paulzierep · 2024-04-23T07:32:39Z

Since for the current version the wrapper is broken (not logically usable), would it make sense to merge this version first and then bump to the next version in a new PR ? @bgruening any preference ?

bgruening · 2024-04-23T09:07:49Z

tools/maaslin2/maaslin2.xml

@@ -245,7 +278,7 @@ cd outputFolder && mkdir -p figures/ && cp *.pdf figures
        <test expect_num_outputs="5">
            <param name="input_data" value="HMP2_taxonomy.tsv"/>
            <param name="input_metadata" value="HMP2_metadata.tsv"/>
-            <param name="fixed_effects" value="diagnosis,dysbiosisnonIBD"/>
+            <param name="fixed_effects" value="c4: diagnosis,c9: dysbiosisnonIBD"/>


@mvdbeek does this look like a regression in Galaxy to you? I thought that either the column name or the c4 should work, but in combination, it looks wired to me.

The combination is not necessary. I mentioned header names in test cases to make it easier for us to recognize which values we are taking. I will change it to just numbers.

…n tests

renu-pal · 2024-04-24T17:55:24Z

I changed the values to just numbers. I was wondering if I should disable the other options other than BY and BH in correction methods so the user does not choose it until we do not update the version.

tools/maaslin2/maaslin2.xml

bgruening · 2024-04-25T12:56:08Z

@renu-pal it was a good idea. I was just confused and learned now that this is also possible. If you like you can add a comment to say what those columns are. Overall this looks like its ready to go. I hope I fixed the failing test.

renu-pal · 2024-04-25T19:12:51Z

Ready to go from my end too!!

paulzierep · 2024-04-26T07:38:59Z

tools/maaslin2/maaslin2.xml

@@ -347,6 +474,9 @@ Output
        - It only includes associations with q-values <= to the threshold.
    - Data frame with residuals for each feature (R data file)
        - This file contains a data frame with residuals for each feature.
+
+Correction methods to compute the q-value : https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/p.adjust


If this is called for the correction, why can it only take BH / BY ? Or is this fixed now ?

Can you check this link.
https://forum.biobakery.org/t/invalid-p-value-correction-options/3321

It seems there was some issue from their end. So I believe if we update the version, it should work but I am not sure if they fixed the bug. Any way to check that ?

OK good to know, then the wrapper is fine as is, but you can remove this text from the help section (it will confuse the user since the option is not there); and put it back when we update the tool :)

Okay, will update that

renu-pal · 2024-04-26T12:59:46Z

I have updated the correction help. Please let me know if anything else needs to be updated.

paulzierep · 2024-04-29T09:10:28Z

Looks good from my side, can we merge it @bgruening ?

bgruening · 2024-04-29T10:10:57Z

Thanks @renu-pal! Great work!

renu-pal · 2024-04-29T10:17:53Z

Thanks for the guidance @paulzierep and @bgruening :)

interaction:fixed effects and corrections modified

3918739

bgruening reviewed Apr 2, 2024

View reviewed changes

paulzierep reviewed Apr 3, 2024

View reviewed changes

fixed effects, random effects , corrections methods modified and vers…

8fa089e

…ion bumped

renu-pal force-pushed the branch1 branch from 52ff3d0 to 8fa089e Compare April 17, 2024 21:51

paulzierep reviewed Apr 18, 2024

View reviewed changes

maaslin2: test cases added

bfc500c

bgruening reviewed Apr 23, 2024

View reviewed changes

maaslin2 : updated fixed and random effects to include just numbers i…

02ab4b6

…n tests

renu-pal force-pushed the branch1 branch from 5eddb27 to 02ab4b6 Compare April 24, 2024 17:49

bgruening reviewed Apr 25, 2024

View reviewed changes

tools/maaslin2/maaslin2.xml Outdated Show resolved Hide resolved

Update tools/maaslin2/maaslin2.xml

18bb723

paulzierep reviewed Apr 26, 2024

View reviewed changes

maaslin2: update correction help

621439a

bgruening merged commit c682728 into galaxyproject:main Apr 29, 2024
11 checks passed

renu-pal deleted the branch1 branch May 6, 2024 23:15

interaction:fixed effects and corrections modified #5928

interaction:fixed effects and corrections modified #5928

Conversation

renu-pal commented Apr 2, 2024

bgruening left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

paulzierep left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

paulzierep commented Apr 3, 2024

paulzierep left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

renu-pal commented Apr 19, 2024

bgruening commented Apr 20, 2024

renu-pal commented Apr 22, 2024

paulzierep commented Apr 23, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

renu-pal commented Apr 24, 2024

bgruening commented Apr 25, 2024

renu-pal commented Apr 25, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

renu-pal commented Apr 26, 2024

paulzierep commented Apr 29, 2024

bgruening commented Apr 29, 2024

renu-pal commented Apr 29, 2024