New biobb_pytorch Molecular dynamics autoencoder wrapper #173

Merged
merged 20 commits into from
Dec 5, 2024
19 changes: 19 additions & 0 deletions tools/biobb_pytorch/.shed.yml
@@ -0,0 +1,19 @@
name: biobb_pytorch
owner: chemteam
description: "biobb_pytorch is the Biobb module collection to create and train ML & DL models using the popular [PyTorch](https://pytorch.org/) Python library."
homepage_url: https://github.com/bioexcel/biobb_pytorch
long_description: |
  biobb_pytorch is the Biobb module collection to create and train ML & DL models using the popular [PyTorch](https://pytorch.org/) Python library.
  Biobb (BioExcel building blocks) packages are Python building blocks that
  create a new layer of compatibility and interoperability over popular
  bioinformatics tools.
  The latest documentation of this package can be found on our Read the Docs site:
  [latest API documentation](http://biobb-pytorch.readthedocs.io/en/latest/).
remote_repository_url: https://github.com/galaxycomputationalchemistry/galaxy-tools-compchem/tree/master/tools/biobb_pytorch
type: unrestricted
categories:
- biobb
maintainers:
- PauAndrio
- gbayarri
- adamhospital
103 changes: 103 additions & 0 deletions tools/biobb_pytorch/biobb_apply_mdae.xml
@@ -0,0 +1,103 @@
<tool id="biobb_pytorch_apply_mdae" name="ApplyMdae" version="@TOOL_VERSION@" profile="22.05">
<description>Apply a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.</description>
<macros>
<token name="@TOOL_VERSION@">4.2.1</token>
</macros>

<requirements>
<requirement type="package" version="@TOOL_VERSION@">biobb_pytorch</requirement>
</requirements>

<command detect_errors="exit_code"><![CDATA[

ln -s '$input_data_npy_path' ./input_data_npy_path.$input_data_npy_path.ext &&
ln -s '$input_model_pth_path' ./input_model_pth_path.$input_model_pth_path.ext &&
#if $config_json:
ln -s '$config_json' ./config_json.$config_json.ext &&
#end if

apply_mdae

#if $config_json:
--config ./config_json.$config_json.ext
#end if

--input_data_npy_path ./input_data_npy_path.$input_data_npy_path.ext
--input_model_pth_path ./input_model_pth_path.$input_model_pth_path.ext
--output_reconstructed_data_npy_path '$outname_output_reconstructed_data_npy_path'
#if $outname_output_latent_space_npy_path:
--output_latent_space_npy_path '$outname_output_latent_space_npy_path'
#end if
;

mv '$outname_output_reconstructed_data_npy_path' '$output_reconstructed_data_npy_path' &&
if test -f '$outname_output_latent_space_npy_path'; then mv '$outname_output_latent_space_npy_path' '$output_latent_space_npy_path'; fi;

]]>
</command>

<inputs>
<param name="input_data_npy_path" type="data" format="npy" optional="False" label="Input NPY file" help="Input data file"/>
<param name="input_model_pth_path" type="data" format="pth" optional="False" label="Input PTH file" help="Path to the input model file. Format: [input].pth"/>
<!-- The two default names differ so the outputs do not overwrite each other in the working directory. -->
<param name="outname_output_reconstructed_data_npy_path" type="text" value="myapply_mdae_reconstructed_data.npy" optional="False" label="Output reconstructed data NPY name" help="Path to the output reconstructed data file. Format: [output].npy"/>
<param name="outname_output_latent_space_npy_path" type="text" value="myapply_mdae_latent_space.npy" optional="True" label="Output latent space NPY name" help="Path to the reduced dimensionality (latent space) file. Format: [output].npy"/>
<param name="config_json" type="data" format="json" optional="True" label="Configuration file" help="File containing tool settings. See below for the syntax"/>
</inputs>

<outputs>
<data name="output_reconstructed_data_npy_path" format="npy" />
<data name="output_latent_space_npy_path" format="npy" />
</outputs>

<tests>
<test>
<param name="config_json" value="config_apply_mdae.json" ftype="json" />
<param name="input_data_npy_path" value="train_mdae_traj.npy" ftype="npy" />
<param name="input_model_pth_path" value="ref_output_model.pth" />
<param name="outname_output_reconstructed_data_npy_path" value="output_reconstructed_data.npy" />
<param name="outname_output_latent_space_npy_path" value="output_latent_space.npy" />
<output name="output_reconstructed_data_npy_path" ftype="npy">
<assert_contents>
<has_size value="123k" delta="50k"/>
</assert_contents>
</output>
<output name="output_latent_space_npy_path" ftype="npy">
<assert_contents>
<has_size value="928" delta="200"/>
</assert_contents>
</output>
</test>
</tests>

<help>
.. class:: infomark

Check the syntax for the tool parameters at the original library documentation: https://biobb-pytorch.readthedocs.io/en/latest

-----

.. image:: http://mmb.irbbarcelona.org/biobb/assets/layouts/layout3/img/logo.png
   :width: 150

**https://mmb.irbbarcelona.org/biobb**

.. image:: https://bioexcel.eu/wp-content/uploads/2019/08/Bioexcel_logo_no_subheading_660px.png
   :width: 150

**https://bioexcel.eu**
</help>

<citations>
<citation type="bibtex">
@misc{githubbiobb,
author = {Andrio, P. and Bayarri, G. and Hospital, A. and Gelpi, J. L.},
year = {2019-21},
title = {biobb: BioExcel building blocks},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/bioexcel/biobb_pytorch},
}
</citation>
<citation type="doi">10.1038/s41597-019-0177-4</citation>
</citations>
</tool>
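Editorial note on the test assertions in this file: `has_size value="123k" delta="50k"` passes when the output size is within `delta` bytes of `value`. The Python sketch below illustrates that tolerance check. The helper names `parse_size` and `size_within` are hypothetical, and the suffix handling (decimal `k`, `M`, `G`; binary when followed by `i`) is an assumption made for illustration; Galaxy's own parser is authoritative.

```python
import re

# Assumed multipliers: decimal by default, binary when the unit is followed by "i".
_DECIMAL = {"": 1, "k": 10**3, "M": 10**6, "G": 10**9}
_BINARY = {"k": 2**10, "M": 2**20, "G": 2**30}


def parse_size(text: str) -> int:
    """Convert a size string such as "123k" or "928" into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKMG]?)(i?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit, binary = match.groups()
    unit = unit.replace("K", "k")  # accept either case for kilo
    if binary and not unit:
        raise ValueError(f"unrecognised size: {text!r}")
    factor = _BINARY[unit] if binary else _DECIMAL[unit]
    return int(number) * factor


def size_within(actual_bytes: int, value: str, delta: str) -> bool:
    """Return True when actual_bytes differs from value by at most delta."""
    return abs(actual_bytes - parse_size(value)) <= parse_size(delta)
```

Under these assumptions, a 150,000-byte file passes the `123k` with `50k` tolerance check used for the reconstructed data, while a 180,000-byte file does not.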
114 changes: 114 additions & 0 deletions tools/biobb_pytorch/biobb_train_mdae.xml
@@ -0,0 +1,114 @@
<tool id="biobb_pytorch_train_mdae" name="TrainMdae" version="@TOOL_VERSION@" profile="22.05">
<description>Train a Molecular Dynamics AutoEncoder (MDAE) PyTorch model.</description>
<macros>
<token name="@TOOL_VERSION@">4.2.1</token>
</macros>

<requirements>
<requirement type="package" version="@TOOL_VERSION@">biobb_pytorch</requirement>
</requirements>

<command detect_errors="exit_code"><![CDATA[

ln -s '$input_train_npy_path' ./input_train_npy_path.$input_train_npy_path.ext &&
#if $input_model_pth_path:
ln -s '$input_model_pth_path' ./input_model_pth_path.$input_model_pth_path.ext &&
#end if
#if $config_json:
ln -s '$config_json' ./config_json.$config_json.ext &&
#end if
Comment on lines +18 to +20
Contributor
Suggested change
#if $config_json:
ln -s '$config_json' ./config_json.$config_json.ext &&
#end if

Contributor Author
That’s something we’re planning for a future release, where the most relevant properties will be integrated into Galaxy's UI through sliders, multi-select options, number validators, filters, etc., making the configuration process more user-friendly. However, for now, I’d like to keep things as simple as possible and focus on getting my first tool published in the Galaxy Toolshed.


train_mdae

#if $config_json:
--config ./config_json.$config_json.ext
Contributor
Suggested change
--config ./config_json.$config_json.ext
--config ./$train_config

Contributor Author
That’s something we’re planning for a future release, where the most relevant properties will be integrated into Galaxy's UI through sliders, multi-select options, number validators, filters, etc., making the configuration process more user-friendly. However, for now, I’d like to keep things as simple as possible and focus on getting my first tool published in the Galaxy Toolshed.

#end if

--input_train_npy_path ./input_train_npy_path.$input_train_npy_path.ext
#if $input_model_pth_path:
--input_model_pth_path ./input_model_pth_path.$input_model_pth_path.ext
#end if
--output_model_pth_path '$outname_output_model_pth_path'
#if $outname_output_train_data_npz_path:
--output_train_data_npz_path '$outname_output_train_data_npz_path'
#end if
#if $outname_output_performance_npz_path:
--output_performance_npz_path '$outname_output_performance_npz_path'
#end if
;
Contributor
Suggested change
;


mv '$outname_output_model_pth_path' '$output_model_pth_path' &&
if test -f '$outname_output_train_data_npz_path'; then mv '$outname_output_train_data_npz_path' '$output_train_data_npz_path'; fi;
if test -f '$outname_output_performance_npz_path'; then mv '$outname_output_performance_npz_path' '$output_performance_npz_path'; fi;

]]>
</command>

Contributor
Suggested change
<configfiles>
    <configfile name="train_config">
{
    "properties": {
        "num_epochs": $num_epoch,
        "seed": $seed
    }
}
    </configfile>
</configfiles>

This way you can create those config files on the fly and ask your users for the input values.

Contributor Author
That’s something we’re planning for a future release, where the most relevant properties will be integrated into Galaxy's UI through sliders, multi-select options, number validators, filters, etc., making the configuration process more user-friendly. However, for now, I’d like to keep things as simple as possible and focus on getting my first tool published in the Galaxy Toolshed.
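
To make the configfile suggestion in this thread concrete, the following Python sketch builds the same JSON document that the proposed `train_config` template would render once the user's values are substituted. The function name `render_train_config` is hypothetical; the keys `properties`, `num_epochs`, and `seed` come from the suggestion and from `config_train_mdae.json` in this pull request.

```python
import json


def render_train_config(num_epochs: int, seed: int) -> str:
    """Return the JSON text of a train_mdae configuration file."""
    config = {
        "properties": {
            "num_epochs": num_epochs,
            "seed": seed,
        }
    }
    # indent=4 only affects readability; the parsed content is the same.
    return json.dumps(config, indent=4)
```

Calling `render_train_config(50, 1)` produces the same settings as the test file `config_train_mdae.json`.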

<inputs>
<param name="input_train_npy_path" type="data" format="npy" optional="False" label="input NPY file" help="Path to the input train data file. Format: [input].npy"/>
<param name="input_model_pth_path" type="data" format="pth" optional="True" label="input PTH file" help="Path to the input model file. Format: [input].pth"/>
<param name="outname_output_model_pth_path" type="text" value="mytrain_mdae.pth" optional="False" label="output PTH name" help="Path to the output model file. Format: [output].pth"/>
<!-- The two NPZ default names differ so the optional outputs do not overwrite each other. -->
<param name="outname_output_train_data_npz_path" type="text" value="mytrain_mdae_train_data.npz" optional="True" label="output train data NPZ name" help="Path to the output train data file. Format: [output].npz"/>
<param name="outname_output_performance_npz_path" type="text" value="mytrain_mdae_performance.npz" optional="True" label="output performance NPZ name" help="Path to the output performance file. Format: [output].npz"/>
<param name="config_json" type="data" format="json" optional="True" label="Configuration file" help="File containing tool settings. See below for the syntax"/>
</inputs>

<outputs>
<data name="output_model_pth_path" format="pth" />
<data name="output_train_data_npz_path" format="npz" />
<data name="output_performance_npz_path" format="npz" />
</outputs>

<tests>
<test>
<param name="config_json" value="config_train_mdae.json" ftype="json" />
<param name="input_train_npy_path" value="train_mdae_traj.npy" ftype="npy" />
<param name="outname_output_model_pth_path" value="output_model.pth" />
<param name="outname_output_train_data_npz_path" value="output_train_data.npz" />
<param name="outname_output_performance_npz_path" value="output_performance.npz" />
<output name="output_model_pth_path" file="ref_output_model.pth" compare="sim_size" />
<output name="output_train_data_npz_path">
<assert_contents>
<has_size value="1k" delta="500"/>
</assert_contents>
</output>
<output name="output_performance_npz_path">
<assert_contents>
<has_size value="124k" delta="50k"/>
</assert_contents>
</output>
</test>
</tests>

<help>
.. class:: infomark

Check the syntax for the tool parameters at the original library documentation: https://biobb-pytorch.readthedocs.io/en/latest

-----

.. image:: http://mmb.irbbarcelona.org/biobb/assets/layouts/layout3/img/logo.png
   :width: 150

**https://mmb.irbbarcelona.org/biobb**

.. image:: https://bioexcel.eu/wp-content/uploads/2019/08/Bioexcel_logo_no_subheading_660px.png
   :width: 150

**https://bioexcel.eu**
</help>

<citations>
<citation type="bibtex">
@misc{githubbiobb,
author = {Andrio, P. and Bayarri, G. and Hospital, A. and Gelpi, J. L.},
year = {2019-21},
title = {biobb: BioExcel building blocks},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/bioexcel/biobb_pytorch},
}
</citation>
<citation type="doi">10.1038/s41597-019-0177-4</citation>
</citations>
</tool>
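Editorial note: the Cheetah template in this tool appends an option only when its parameter is set. A minimal Python sketch of the same assembly logic follows; it is not part of the tool. The function name `build_train_command` and its argument names are hypothetical, while the flags are those used in the `<command>` block. `shlex.join` quotes each argument so that a user-supplied file name containing spaces stays a single shell word.

```python
import shlex
from typing import List, Optional


def build_train_command(
    train_npy: str,
    output_model: str,
    input_model: Optional[str] = None,
    train_data: Optional[str] = None,
    performance: Optional[str] = None,
    config: Optional[str] = None,
) -> str:
    """Assemble a train_mdae command line, skipping unset optional arguments."""
    args: List[str] = ["train_mdae"]
    if config:
        args += ["--config", config]
    args += ["--input_train_npy_path", train_npy]
    if input_model:
        args += ["--input_model_pth_path", input_model]
    args += ["--output_model_pth_path", output_model]
    if train_data:
        args += ["--output_train_data_npz_path", train_data]
    if performance:
        args += ["--output_performance_npz_path", performance]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)
```

For example, `build_train_command("in.npy", "my model.pth")` wraps the output name in single quotes, whereas an unquoted substitution would split it into two arguments.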
Binary file added tools/biobb_pytorch/test-data/.DS_Store
Binary file not shown.
5 changes: 5 additions & 0 deletions tools/biobb_pytorch/test-data/config_apply_mdae.json
@@ -0,0 +1,5 @@
{
    "properties": {
        "batch_size": 1
    }
}
6 changes: 6 additions & 0 deletions tools/biobb_pytorch/test-data/config_train_mdae.json
@@ -0,0 +1,6 @@
{
    "properties": {
        "num_epochs": 50,
        "seed": 1
    }
}
Binary file not shown.
Binary file not shown.