Add tutorial for tool annotation with EDAM via bio.tools #4734

Merged
merged 11 commits into from
Mar 6, 2024
16 changes: 16 additions & 0 deletions topics/dev/tutorials/tool-annotation/add_edam_function.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
> 1. Click on the **Function** tab
> 2. Click on **Add function**
> 3. Add EDAM operation terms
>    1. Click on **Add operation**
>    2. Search for operation in the filter or in the hierarchy
>    3. Click on the selected term
>    4. Repeat to add as many operation terms as needed
> 4. (Optional) Add input
>    1. Click on **Add input**
>    2. Add EDAM data term
>       1. Click on **Add data type**
>       2. Search for data type in the filter or in the hierarchy
>    3. Add EDAM format term
>       1. Click on **Add data format**
>       2. Search for data format in the filter or in the hierarchy
>    4. Repeat to add as many inputs as needed
bebatut marked this conversation as resolved.
8 changes: 8 additions & 0 deletions topics/dev/tutorials/tool-annotation/add_edam_topic.md
@@ -0,0 +1,8 @@
> 1. Click on **Labels** tab
> 2. Add EDAM topic terms
>    1. Click on **Add topic**
>    2. Search for topic in the filter or in the hierarchy
>    3. Click on the selected term
>    4. Repeat to add as many topic terms as needed
> 3. Add license
> 4. Fill in any known extra information
20 changes: 20 additions & 0 deletions topics/dev/tutorials/tool-annotation/tutorial.bib
@@ -0,0 +1,20 @@
@article{black2021edam,
title={EDAM: The bioscientific data analysis ontology (update 2021) [version 1; not peer reviewed]},
author={Black, Melissa and Lamothe, Lucie and Eldakroury, Hager and Kierkegaard, Mads and Priya, Ankita and Machinda, Anne and Singh Khanduja, Uttam and Patoliya, Drashti and Rathi, Rashika and Che Nico, Tawah Peggy and others},
year={2021},
publisher={F1000},
doi={10.7490/f1000research.1118900.1}
}

@article{ison2016tools,
title={Tools and data services registry: a community effort to document bioinformatics resources},
author={Ison, Jon and Rapacki, Kristoffer and M{\'e}nager, Herv{\'e} and Kala{\v{s}}, Mat{\'u}{\v{s}} and Rydza, Emil and Chmura, Piotr and Anthon, Christian and Beard, Niall and Berka, Karel and Bolser, Dan and others},
journal={Nucleic acids research},
volume={44},
number={D1},
pages={D38--D47},
year={2016},
publisher={Oxford University Press},
doi={10.1093/nar/gkv1116}
}

204 changes: 204 additions & 0 deletions topics/dev/tutorials/tool-annotation/tutorial.md
@@ -0,0 +1,204 @@
---
layout: tutorial_hands_on
title: Annotation of a tool with EDAM ontology terms using bio.tools
Collaborator

Alternate title suggestion: Linking Galaxy tools to metadata in the bio.tools registry

Member Author

I feel that we cover more than just linking in the tutorial. We also cover creating bio.tools entries and improving the annotation on bio.tools.

Collaborator

What about: Adding and updating best practice metadata for Galaxy tools using the bio.tools registry?

level: Introductory
subtopic: tooldev
questions:
- How are Galaxy tools linked to EDAM ontology?
- How to connect Galaxy tools to bio.tools?
objectives:
- Identify Galaxy tools without bio.tools entry
- Create a bio.tools entry
- Update a bio.tools entry
- Add EDAM ontology terms to a bio.tools entry
- Link a Galaxy tool to its corresponding bio.tools entry
time_estimation: 1H
key_points:
- Galaxy tools can get EDAM ontology terms from bio.tools
- A bio.tools entry can be created and modified to provide the best EDAM annotations
Collaborator

Maybe we can add a point here?

  • An up-to-date bio.tools entry provides readily accessible metadata that can help users to find and understand tools that are available on Galaxy

Member

Please consider using the "Suggestion Mode" feature of GitHub (see step 6).

By providing a suggestion using the proper suggestion mode:

  1. For authors, it is unambiguous what you are proposing
  2. It's also easier for them to simply accept the suggestion, PR authors prefer suggestions!
  3. You get credited in the Git commit helping us properly track attribution

- An up-to-date bio.tools entry provides readily accessible metadata that can help users find and understand tools that are available on Galaxy
- A bio.tools entry can easily be added to a Galaxy tool
contributions:
authorship:
- bebatut

Member Author

@supernord, @paulzierep add yourselves here

---

Galaxy offers thousands of tools. Many of these tools either have incomplete metadata or are not yet linked to sources of high-quality metadata such as [bio.tools](https://bio.tools/).

This prevents filtering for all tools in a specific research community or domain, and makes it all but impossible to employ advanced filtering with ontology terms like the ones from EDAM or to group tools based on an ontology to improve the Galaxy tool panel.

[EDAM](https://edamontology.org/page) ({% cite black2021edam %}) is a comprehensive ontology of well-established, familiar concepts that are prevalent within bioscientific data analysis and data management. It includes 4 main sections of concepts (sub-ontologies):

- **Topic**: A category denoting a rather broad domain or field of interest, of study, application, work, data, or technology. Topics have no clearly defined borders between each other.
- **Operation**: A function that processes a set of inputs and results in a set of outputs, or associates arguments (inputs) with values (outputs).
- **Data**: Information, represented in an information artefact (data record) that is "understandable" by dedicated computational tools that can use the data as input or produce it as output.
- **Format**: A defined way or layout of representing and structuring data in a computer file, blob, string, message, or elsewhere.

![Simplified data flow diagram in EDAM architecture: boxes for concepts, lines for relations. Streamlined data management.](./images/EDAMrelations.png "EDAM architecture is simple. Boxes indicate top-level concepts (sections, sub-ontologies), and lines indicate types of relations. Source: <a href="https://edamontology.org/page">EDAM website</a>")

The ontology can be navigated using [EDAM Browser](https://edamontology.github.io/edam-browser/):

<iframe id="edam" src="https://edamontology.github.io/edam-browser/#operation_0291" frameBorder="0" width="80%" height="600px"> ![EDAM ontology browser](./images/edam_browser.png) </iframe>

A tool or software can then be characterized by different EDAM terms:
- A topic term, *e.g.* [`Proteomics`](https://edamontology.github.io/edam-browser/#topic_0121),
- An operation (a specific scientific thing that a tool does) term, *e.g.* [`Peptide identification`](https://edamontology.github.io/edam-browser/#operation_3631),
- A data term for the type of biological data, *e.g.* [`Mass spectrum`](https://edamontology.github.io/edam-browser/#data_0943),
- A format term, *e.g.* [`Thermo RAW`](https://edamontology.github.io/edam-browser/#format_3712).
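
As a rough illustration of how these identifiers are structured, the sketch below splits an EDAM Browser link into its sub-ontology and numeric ID. It assumes the URL fragment always has the form `<section>_<digits>`, as in the four links above; the helper name `parse_edam_link` is hypothetical and not part of any EDAM tooling.

```python
from urllib.parse import urlparse

# The four EDAM sub-ontologies named in this tutorial.
EDAM_SECTIONS = {"topic", "operation", "data", "format"}


def parse_edam_link(url: str) -> tuple[str, str]:
    """Split an EDAM Browser link into (sub-ontology, numeric ID).

    Assumption: the fragment looks like "<section>_<digits>",
    which holds for the links used in this tutorial.
    """
    fragment = urlparse(url).fragment  # e.g. "topic_0121"
    section, _, number = fragment.partition("_")
    if section not in EDAM_SECTIONS or not number.isdigit():
        raise ValueError(f"not an EDAM concept link: {url!r}")
    return section, number


links = [
    "https://edamontology.github.io/edam-browser/#topic_0121",
    "https://edamontology.github.io/edam-browser/#operation_3631",
    "https://edamontology.github.io/edam-browser/#data_0943",
    "https://edamontology.github.io/edam-browser/#format_3712",
]
for link in links:
    print(parse_edam_link(link))
```

The `<section>_<digits>` form (for example `topic_0121`) is the same shape of identifier that the snippet pulls out of each link.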

The annotation of tools can be done on [bio.tools](https://bio.tools/). bio.tools ({% cite ison2016tools %}) is a global portal for bioinformatics resources that helps researchers to find, understand, compare, and select resources suitable for their work. It relies on the EDAM ontology for standardizing the annotations.

In Galaxy, tools can be annotated with EDAM concepts, either by adding them directly to the `XML` wrapper or extracting them from their corresponding bio.tools entry by linking to it in the wrapper. The advantage of the second approach is that there is one source of truth (i.e. the bio.tools entry), which centralises the location for storage and update of metadata, including EDAM concepts, as well as preventing replication of metadata across multiple platforms.
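
To make the two approaches concrete, here is a minimal sketch that inspects a wrapper and reports which annotation sources it uses. The wrapper below is a hypothetical, heavily trimmed example written for this illustration, and `example-id` is a placeholder rather than a real bio.tools ID. The sketch assumes direct annotations live in `edam_topic` and `edam_operation` elements and that the bio.tools link is an `xref` element with `type="bio.tools"`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed wrapper used only for illustration.
WRAPPER = """\
<tool id="example_tool" name="Example" version="1.0">
    <edam_topics>
        <edam_topic>topic_0121</edam_topic>
    </edam_topics>
    <edam_operations>
        <edam_operation>operation_3631</edam_operation>
    </edam_operations>
    <xrefs>
        <xref type="bio.tools">example-id</xref>
    </xrefs>
</tool>
"""


def annotation_sources(tool_xml: str) -> dict:
    """Collect direct EDAM terms and bio.tools IDs found in a wrapper."""
    root = ET.fromstring(tool_xml)
    return {
        # EDAM terms written directly into the wrapper
        "edam_topics": [e.text.strip() for e in root.iter("edam_topic") if e.text],
        "edam_operations": [e.text.strip() for e in root.iter("edam_operation") if e.text],
        # Link to a bio.tools entry, from which EDAM terms can be pulled
        "biotools_ids": [
            x.text.strip()
            for x in root.iter("xref")
            if x.get("type") == "bio.tools" and x.text
        ],
    }


print(annotation_sources(WRAPPER))
```

A wrapper with an empty `biotools_ids` list is a candidate for the linking steps described later in this tutorial.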

The aim of this tutorial is to improve the annotation of a given Galaxy tool by either:

- Linking it to an existing bio.tools identifier,
- Creating a new bio.tools identifier first and then linking the Galaxy tool, or
- Updating an existing bio.tools entry with the proper EDAM concepts.

> <agenda-title></agenda-title>
>
> In this tutorial, we will cover:
>
> 1. TOC
> {:toc}
>
{: .agenda}

# Choose a tool without a bio.tools identifier

To start, we need to select a tool without a bio.tools identifier.

> <hands-on-title>Choose a tool without a bio.tools identifier</hands-on-title>
>
> 1. Open [the list of Galaxy tools](https://galaxyproject.github.io/galaxy_tool_metadata_extractor/)
> 2. Click on **Add Condition**
> 3. Select *bio.tool id* in **Data** drop-down
> 4. Select *Empty* in **Condition** drop-down
> 5. Select a tool in the list
Collaborator

Suggested change
- > 5. Select a tool in the list
+ > 5. Select a tool in the list, and proceed to the next section of the tutorial

>
{: .hands_on}
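
The same filter can be sketched in code for readers who have the tool list as a table. The column names (`tool_name`, `biotool_id`) and the sample rows below are assumptions made up for this illustration; they are not taken from the actual export, so adjust them to whatever the real table uses.

```python
import csv
import io

# Hypothetical table; column names and rows are illustrative assumptions.
SAMPLE = """\
tool_name,biotool_id
tool_a,tool_a
tool_b,
tool_c,tool-c
tool_d,
"""


def tools_without_biotools_id(table_text, id_column="biotool_id", name_column="tool_name"):
    """Return the names of tools whose bio.tools ID cell is empty."""
    reader = csv.DictReader(io.StringIO(table_text))
    return [
        row[name_column]
        for row in reader
        if not (row[id_column] or "").strip()
    ]


print(tools_without_biotools_id(SAMPLE))
```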

# Determine if a bio.tools entry exists for your chosen tool

Now let's search for our selected tool on bio.tools.

> <hands-on-title>Search for a tool in bio.tools</hands-on-title>
>
> 1. Open [bio.tools](https://bio.tools/)
> 2. Type the name of your tool in the "Search bio.tools" bar at the top
>
{: .hands_on}
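
For scripted checks, the search can probably also be expressed as a URL against the bio.tools web API. The sketch below only builds that URL and does not send a request. The endpoint `https://bio.tools/api/tool/` and the `q` and `format` parameters are assumptions to verify against the bio.tools API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the bio.tools API documentation.
API_ROOT = "https://bio.tools/api/tool/"


def search_url(tool_name: str) -> str:
    """Build a search URL for a tool name (no request is sent)."""
    query = urlencode({"q": tool_name, "format": "json"})
    return f"{API_ROOT}?{query}"


print(search_url("example tool"))
# https://bio.tools/api/tool/?q=example+tool&format=json
```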

{% include _includes/cyoa-choices.html option1="No existing entry" option2="Existing entry" default="No existing entry" text="Have you found the tool in bio.tools?" disambiguation="biotool"%}

<div class="No-existing-entry" markdown="1">

# Create a bio.tools entry for a tool

If the tool is not on bio.tools, we need to create a new entry and populate it with metadata.

> <hands-on-title>Create a bio.tools entry with minimum metadata</hands-on-title>
>
> 1. Sign up for bio.tools
> 2. Select **Add a tool** from the drop-down **Menu**
> 3. Fill in general information
>    1. Fill in **Tool name**
>    2. Fill in **Description**
>    3. Fill in **Homepage URL**
> 4. Add EDAM operation concepts, as well as EDAM data concepts for both inputs and outputs
>
> {% include topics/dev/tutorials/tool-annotation/add_edam_function.md %}
>
> 5. Add EDAM topic concepts
>
> {% include topics/dev/tutorials/tool-annotation/add_edam_topic.md %}
>
> 6. Add the tool type, language, and other metadata as needed
> 7. Click on **Validate** at the top
> 8. Click on **Save** to create the bio.tools entry
> 9. Copy the bio.tools ID
{: .hands_on}
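
Before filling in the form, it can help to draft the minimum metadata and check that nothing is missing. The sketch below is only a checklist helper. Its field names (`name`, `description`, `homepage`, `operations`, `topics`) mirror the steps above and are illustrative assumptions, not the bio.tools schema, and the draft values are placeholders.

```python
# Checklist derived from the hands-on steps above (illustrative field names).
REQUIRED_FIELDS = ("name", "description", "homepage", "operations", "topics")


def missing_fields(entry: dict) -> list[str]:
    """Return the checklist fields that are absent or empty in a draft."""
    return [field for field in REQUIRED_FIELDS if not entry.get(field)]


# Placeholder draft; values are made up for the example.
draft = {
    "name": "Example tool",
    "description": "Identifies peptides from mass spectra.",
    "homepage": "https://example.org/example-tool",
    "operations": ["operation_3631"],
    "topics": [],
}
print(missing_fields(draft))
```

In this draft, the topic list is still empty, so `topics` is reported as missing.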

</div>

<div class="Existing-entry" markdown="1">

# Review and update the EDAM terms for an existing bio.tools entry

Before linking a Galaxy tool with its corresponding bio.tools entry, we need to check if the tool is correctly annotated with EDAM concepts.

> <hands-on-title>Check EDAM terms in a bio.tools entry</hands-on-title>
>
> 1. Open the bio.tools entry for the tool
> 2. Check the EDAM topic terms by looking at the green boxes (if they exist) below the tool name, URL, and available versions
> 3. Check the EDAM operation terms by looking at the blue boxes (if they exist) below the tool description
{: .hands_on}

{% include _includes/cyoa-choices.html option1="Terms to be modified" option2="Correct terms" default="Terms to be modified" text="What do you think about the EDAM terms in the bio.tools entry?" disambiguation="edamUpdate"%}

<div class="Terms-to-be-modified" markdown="1">

To modify EDAM terms in a bio.tools entry, we need to request editing rights and then modify this entry.

> <hands-on-title>Modify EDAM terms in a bio.tools entry</hands-on-title>
>
> 1. Sign up for bio.tools
> 2. Click on **Request editing rights** at the bottom of the bio.tools entry page
> 3. Wait for the request to be approved
> 4. Click on **Update this record**
> 5. Update, or add, EDAM operation term(s) and EDAM data term(s) for both inputs and outputs
>
> {% include topics/dev/tutorials/tool-annotation/add_edam_function.md %}
>
> 6. Update, or add, EDAM topic term(s)
>
> {% include topics/dev/tutorials/tool-annotation/add_edam_topic.md %}
>
> 7. Update, or add, the metadata for tool type, language, and other fields as needed
> 8. Click on **Validate** at the top
> 9. Click on **Save** to update the bio.tools entry
> 10. Copy the bio.tools ID
{: .hands_on}

</div>

</div>

# Linking a Galaxy tool to a bio.tools entry

To link a Galaxy tool to its corresponding bio.tools entry, we need to first find the source of the wrapper.

> <hands-on-title>Find the Galaxy wrapper</hands-on-title>
>
> 1. Go to the tool on any Galaxy server
Collaborator

I think we should use the parsed source folder instead: galaxyproject/galaxy_codex#65
Which will be correct for all tools in our list.
For many tools I checked the location in the .shed file is wrong... that might confuse the helpers !

> 2. Click on the drop-down menu next to the **Run tool** button
> 3. Select **See in Tool Shed**
> 4. Once in the Tool Shed, click on the link to the development repository
> 5. Fork the repository
{: .hands_on}

Now that we have the wrapper, we can add the bio.tools entry.

> <hands-on-title>Add bio.tools entry to the Galaxy wrapper</hands-on-title>
>
> 1. Open the Galaxy tool XML file
> 2. Add the xref snippet indicated below:
>
> ```xml
> <xrefs>
>     <xref type="bio.tools">biotool-id</xref>
> </xrefs>
> ```
>
> It should appear below the `macros` section and before the `requirements` section
>
> 3. Replace `biotool-id` in the example snippet above with the bio.tools ID for your tool
> 4. Commit the change on a new branch
> 5. Make a pull request (PR) against the original repository
> 6. Make sure to respond to any feedback from the owner of the wrapper
> 7. Wait patiently for the PR to be merged, at which point the new bio.tools reference will be added to the Galaxy tool wrapper
>
{: .hands_on}
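
The placement rule (below `macros`, before `requirements`) can be sketched programmatically. This is an illustration only, on a hypothetical trimmed wrapper with a placeholder ID: Python's `xml.etree.ElementTree` drops comments and does not preserve CDATA sections or original formatting, so for real wrappers editing the file by hand, as in the steps above, is the safer route. The helper name `add_biotools_xref` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed wrapper used only for illustration.
WRAPPER = """\
<tool id="example_tool" name="Example" version="1.0">
    <macros/>
    <requirements/>
    <command>echo hello</command>
</tool>
"""


def add_biotools_xref(tool_xml: str, biotools_id: str) -> str:
    """Insert an <xrefs> block after <macros> / before <requirements>."""
    root = ET.fromstring(tool_xml)
    if root.find("xrefs") is not None:
        raise ValueError("wrapper already has an <xrefs> section")

    xrefs = ET.Element("xrefs")
    xref = ET.SubElement(xrefs, "xref", {"type": "bio.tools"})
    xref.text = biotools_id

    tags = [child.tag for child in root]
    if "requirements" in tags:
        position = tags.index("requirements")  # directly before requirements
    elif "macros" in tags:
        position = tags.index("macros") + 1  # directly after macros
    else:
        raise ValueError("no <macros> or <requirements>; place <xrefs> manually")

    root.insert(position, xrefs)
    return ET.tostring(root, encoding="unicode")


updated = add_biotools_xref(WRAPPER, "example-id")
print([child.tag for child in ET.fromstring(updated)])
# ['macros', 'xrefs', 'requirements', 'command']
```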

# Conclusion