Update for version 0.1
iwc-workflows-bot committed Nov 22, 2024
0 parents commit db96f9c
Showing 7 changed files with 597 additions and 0 deletions.
11 changes: 11 additions & 0 deletions .dockstore.yml
@@ -0,0 +1,11 @@
version: 1.2
workflows:
- name: main
  subclass: Galaxy
  publish: true
  primaryDescriptorPath: /iwc-clinicalmp-database-generation.ga
  testParameterFiles:
  - /iwc-clinicalmp-database-generation-tests.yml
  authors:
  - name: Subina Mehta
    orcid: 0000-0001-9818-0537
20 changes: 20 additions & 0 deletions .github/workflows/wftest.yml
@@ -0,0 +1,20 @@
name: Periodic workflow test
on:
  schedule:
    - cron: '0 3 * * *'
  workflow_dispatch:
jobs:
  setup:
    name: Setup cache
    uses: galaxyproject/iwc/.github/workflows/setup.yml@main
    with:
      galaxy-fork: galaxyproject
  test:
    name: Test workflow
    needs: setup
    uses: galaxyproject/iwc/.github/workflows/test_workflows.yml@main
    with:
      galaxy-head-sha: ${{ needs.setup.outputs.galaxy-head-sha }}
      galaxy-fork: galaxyproject
      repository-list: '.'
      check-outputs: true
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,4 @@
# Changelog

## [0.1] 2024-11-18
First release.
29 changes: 29 additions & 0 deletions README.md
@@ -0,0 +1,29 @@
# Clinical Metaproteomics 1: Database Generation
Metaproteomics is the large-scale identification and analysis of the proteins expressed by a microbiota. In clinical samples, however, abundant human (host) proteins can obscure the detection of less abundant microbial proteins.

To overcome this challenge, we developed a metaproteomics workflow that combines tandem mass spectrometry (MS/MS) data with bioinformatics tools on the Galaxy platform. The workflow enables the characterization of metaproteomes in clinical samples.

The first step of this workflow is database generation. The Galaxy-P team has created a workflow that compiles a large database by downloading protein sequences of known disease-causing microorganisms. The MetaNovo tool then reduces this large database to a compact, relevant one.
A Galaxy Training Network (GTN) tutorial is available for this workflow: [https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.html](https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.html)

## Input datasets

### Search Databases (FASTA) from [Zenodo](https://zenodo.org/records/14181725)
- `HUMAN SwissProt Protein_Database.fasta`
- `Species UniProt Protein Database FASTA.fasta`
- `Contaminants (cRAP) Protein Database.fasta`

### MS/MS files (MGF) from [Zenodo](https://zenodo.org/records/14181725)
- `PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf`
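
As a convenience, the direct-download links for the MS/MS files can be built from the Zenodo record ID. The sketch below assumes the URL pattern `https://zenodo.org/records/<record>/files/<name>?download=1`, which is the pattern used in this workflow's test file; the function name `zenodo_file_url` is hypothetical and only illustrates the idea.

```python
from urllib.parse import quote

# Record ID taken from the Zenodo link above.
ZENODO_RECORD = "14181725"

# MS/MS file names as listed in this README.
MGF_FILES = [
    "PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf",
]


def zenodo_file_url(record_id: str, filename: str) -> str:
    """Build a direct-download URL for one file in a Zenodo record.

    Assumption: the record exposes files under /files/<name>?download=1.
    """
    return f"https://zenodo.org/records/{record_id}/files/{quote(filename)}?download=1"


if __name__ == "__main__":
    for name in MGF_FILES:
        print(zenodo_file_url(ZENODO_RECORD, name))
```

The printed URLs can be pasted into Galaxy's upload dialog. The sketch only builds the links and does not download anything.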

## Input Values
For MetaNovo:
- Peptide Length
- Variable modifications
- Labeled element

## Processing
- Merge all the resulting FASTA files
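
To illustrate what merging FASTA files means, here is a minimal Python sketch. It is not the Galaxy tool the workflow runs. It assumes that entries are duplicates when the first whitespace-delimited token of their header line matches, and it keeps the first occurrence. The helper names `read_fasta` and `merge_fasta` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_fasta(sources: Iterable[Iterable[str]]) -> List[Tuple[str, str]]:
    """Concatenate several FASTA sources, keeping the first entry per ID.

    Assumption: the ID is the first whitespace-delimited header token,
    e.g. ">sp|P12345|NAME_HUMAN".
    """
    seen = set()
    merged: List[Tuple[str, str]] = []
    for lines in sources:
        for header, sequence in read_fasta(lines):
            entry_id = header.split()[0]
            if entry_id in seen:
                continue  # drop duplicates of an entry already kept
            seen.add(entry_id)
            merged.append((header, sequence))
    return merged
```

Each source can be an open file handle, for example `merge_fasta([open(p) for p in paths])`, since iterating over a file yields its lines.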
47 changes: 47 additions & 0 deletions iwc-clinicalmp-database-generation-tests.yml
@@ -0,0 +1,47 @@
- doc: Test outline for iwc-clinicalmp-database-generation
  job:
    Human SwissProt Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/HUMAN-SwissProt-Protein-Database.fasta?download=1
      filetype: fasta
    Species UniProt Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/Species_UniProt_FASTA.fasta?download=1
      filetype: fasta
    Contaminants cRAP Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/Contaminants(cRAP)-Protein-Database.fasta?download=1
      filetype: fasta
    Tandem Mass Spectrometry (MS/MS) datasets:
      class: Collection
      collection_type: list
      elements:
      - class: File
        identifier: PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf?download=1
  outputs:
    Human UniProt Microbial Proteins cRAP for MetaNovo:
      asserts:
      - that: has_text
        text: ">sp|"
    Metanovo Compact database:
      asserts:
      - that: has_text
        text: ">sp|"
    Metanovo Compact CSV database:
      asserts:
      - that: has_text
        text: "index"
    Human UniProt Microbial Proteins from MetaNovo cRAP:
      asserts:
      - that: has_text
        text: ">sp|"
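
The `has_text` assertions above pass when the given text occurs somewhere in the corresponding output. The Python sketch below mirrors that check outside Galaxy. It is an illustration only, not the Galaxy or Planemo implementation; the function name `failed_outputs` is hypothetical.

```python
from typing import Dict, List

# Output labels and expected substrings, copied from the test file above.
EXPECTED_TEXT: Dict[str, str] = {
    "Human UniProt Microbial Proteins cRAP for MetaNovo": ">sp|",
    "Metanovo Compact database": ">sp|",
    "Metanovo Compact CSV database": "index",
    "Human UniProt Microbial Proteins from MetaNovo cRAP": ">sp|",
}


def failed_outputs(outputs: Dict[str, str]) -> List[str]:
    """Return the labels whose content lacks the expected substring.

    `outputs` maps an output label to the text content of that output;
    a missing label counts as a failure.
    """
    return [
        label
        for label, text in EXPECTED_TEXT.items()
        if text not in outputs.get(label, "")
    ]
```

An empty list means every output contained its expected text. A substring check like this only confirms that the output has the expected shape (SwissProt-style FASTA headers, or a CSV with an `index` column), not that the content is correct.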