Rmmiller rescue and resolve #113

Merged
83 commits merged into main on Dec 14, 2020

Conversation

rmmiller22 (Collaborator) commented on Dec 14, 2020

This PR implements the Rescue and Resolve protein inference algorithm within MetaMorpheus. Transcript CPM (counts per million) values are used to rescue proteins that would otherwise be eliminated during the parsimony step. Unit tests are included.
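
The description does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the MetaMorpheus implementation. It assumes a greedy set-cover style parsimony step followed by a rescue step that restores eliminated proteins whose transcript CPM meets a threshold. The function names `greedy_parsimony` and `rescue_by_cpm`, the `min_cpm` parameter, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_parsimony(protein_to_peptides: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: keep the protein that explains the most
    still-unexplained peptides until every peptide is explained."""
    unexplained: Set[str] = set().union(*protein_to_peptides.values())
    kept: List[str] = []
    while unexplained:
        # Iterating over sorted names makes tie-breaking deterministic
        # (max returns the first of several equally good candidates).
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & unexplained),
        )
        kept.append(best)
        unexplained -= protein_to_peptides[best]
    return kept


def rescue_by_cpm(
    protein_to_peptides: Dict[str, Set[str]],
    kept: List[str],
    transcript_cpm: Dict[str, float],
    min_cpm: float = 1.0,  # hypothetical threshold, not taken from the PR
) -> List[str]:
    """Add back proteins dropped by parsimony when their transcript
    abundance (CPM) is at or above the threshold."""
    rescued = [
        p
        for p in sorted(protein_to_peptides)
        if p not in kept and transcript_cpm.get(p, 0.0) >= min_cpm
    ]
    return kept + rescued


# Toy data: protein A explains every peptide, so parsimony alone drops B and C.
proteins = {
    "A": {"pep1", "pep2", "pep3"},
    "B": {"pep2", "pep3"},
    "C": {"pep3"},
}
cpm = {"A": 50.0, "B": 12.0, "C": 0.2}

kept = greedy_parsimony(proteins)
final = rescue_by_cpm(proteins, kept, cpm)
print(kept)   # ['A']
print(final)  # ['A', 'B']
```

In this toy case, B is rescued because its transcript evidence is strong, while C stays eliminated because its CPM falls below the assumed threshold.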

PR checklist

  • This comment contains a description of changes (with reason)
  • CHANGELOG.md is updated
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

kyuubi430 and others added 30 commits November 7, 2020 17:41
* Adds author name in README.md

* Adds author name in README.md

* Deletes temp file

* Adds author name in README.md
basic proteogenomic object info in metamorpheus
* add author name to readme.md

* add one line to refresh commit

* add author name

Co-authored-by: Michael Shortreed
Co-authored-by: cgpu
* Added authorname in README

Co-authored-by: cgpu
* Adds Rachel Miller to the author names in the README

* Minor typo

Co-authored-by: cgpu
* Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling

* aggregation of FL and CPM by cluster

Co-authored-by: Robert Millikin
Co-authored-by: Gloria Sheynkman
* Add author name in README.md

* orf calling updated to run from command line

Co-authored-by: gsheynkman
* Adds nf-core template for nextflow pips

* Cleans up template main.nf and adds swag cli message

* Updates nextflow.config

* Adds Dockerfile and env yaml updates

* Removes redundant files from assets

* Deleted nf schema json

* Removes redundant configs

* Updates README with template structure

* Updates docs/

* Updates repo name in changelog

* Updates template test.config

* Adds bin folder and template wrapper R script

* Adds pbccs in env.yml

* Changes the location of pipeline info, logs

* Adds .github folder

* Removes redundant files from GH actions

* Removes AWS tests

* Adds misspelling test

* Removes linting.yml

* Removes igenomes config

* Adds tentative LICENSE (MIT)

* Adds nudge for asking help via GH issues
weighted protein inference in MetaMorpheus
This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies.
* update README contributions

* new readme
* update README contributions

* new readme

* fix readme errors
MetaMorpheus: excel compatible tsv by default
kyuubi430 and others added 21 commits November 18, 2020 14:25
* Adds author name in README.md

* Adds author name in README.md

* Deletes temp file

* Adds author name in README.md

* Modified README.md File in LR_TranscriptomeSummary

* Add files via upload

This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies.

* Update README.md

* Updated version of previous files with less typos

* Delete Transcriptomic_Proteomic_Comparison.ipynb

* Delete m_MMprocess.py

* Delete m_gen_maps.py

* Delete m_make_gene_length_table.py

* Delete m_sqantitable.py

* Delete m_squantitable.py

* Updated version with less typos

* Update README.md

* Preliminary module for analyzing peptide space

* Add files via upload

Update of peptide analysis jupyter notebook script

* Convert jupyter notebook into python

* Updated peptide_analysis script for review and added required files/tables

* Update peptide_analysis.py

* Updated .gitignore with a local data file

* Updated peptide_analysis.py to include new path info

* Delete gene_based_info.tsv

* Delete trans_to_gene.tsv
* Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling

* Started code for protein group mapping

* add toy tables for the protein inference mapping

* edited 6frm translate readme

* delete mock files for protein inference (protein group) comparisons. Rachel and Kyndalanne have continued to work on this and these may be outdated.

Co-authored-by: Robert Millikin
Co-authored-by: Gloria Sheynkman
* Separate module for greedy protein inference

* protein_inference bug fix

* added rescue to greedy algorithm

* connected peptides changed to set

* small bug fix. cleaned up notebook
* Adds author name in README.md

* Adds author name in README.md

* Deletes temp file

* Adds author name in README.md

* Modified README.md File in LR_TranscriptomeSummary

* Add files via upload

This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies.

* Update README.md

* Updated version of previous files with less typos

* Delete Transcriptomic_Proteomic_Comparison.ipynb

* Delete m_MMprocess.py

* Delete m_gen_maps.py

* Delete m_make_gene_length_table.py

* Delete m_sqantitable.py

* Delete m_squantitable.py

* Updated version with less typos

* Update README.md

* Preliminary module for analyzing peptide space

* Add files via upload

Update of peptide analysis jupyter notebook script

* Convert jupyter notebook into python

* Updated peptide_analysis script for review and added required files/tables

* Update peptide_analysis.py

* Updated .gitignore with a local data file

* Updated peptide_analysis.py to include new path info

* Delete gene_based_info.tsv

* Delete trans_to_gene.tsv

* Removed unnecessary files from Transcriptome Module

* Removed unnecessary files from Transcriptome module

* Removed unnecessary files from Transcriptome module

* Removed unnecessary files from Transcriptome module
…odules (#78)

* Files in progress to create three modules: ReferenceTables, TranscriptomeAnalysis, PeptideAnalysis. Also, debugged orf_calling.py, found that minus strand ORFs not included.

* Prepared a script that makes reference tables

* Updated Transcriptomic Script

* Updated Transcriptomic Script (#77)

Co-authored-by: kyuubi430

* Remove files for making three modules with simi.

* Cleaned up referencetable module, Simi to edit.

* Modified Reference Tables Script

* Deleted plots.

* Simi and Gloria finalized the prepare_reference_tables. Works on commandline. Correct outputs to results/PG_ReferenceTables.

* Small edits to peptide_analysis, not done, push to Simi.

* Modified the names of output files from Prepare Reference Tables script

* Changed file names in reference tables script and modified the transcriptome summary

* Delete unneeded files in transcriptome summary module.

* Finalized ReferenceTables. tested Transcriptome Summary. Started modifying the PeptideAnalysis.

* Made the transcriptome summary script command line executable

* Made the peptide analysis script command line runnable

* In process of modifying MMprocessing script

* Move scripts between TranscriptomeSummary and PeptideAnalysis modules. Code related to MM peptide/protein processing will now be exclusively in PeptideAnalysis.

* Added fasta/tsv and the results directory to gitignore

* Delete jurkat_orf_refined.fasta

Don't want to include *fasta in pull request.

* Delete genes_in_refined.tsv

Don't want to include *tsv output file in PR.
Added *tsv to gitignore, so shouldn't upload in future PR.

Co-authored-by: kyuubi430
* Adds nf-core template for nextflow pips

* Cleans up template main.nf and adds swag cli message

* Updates nextflow.config

* Adds Dockerfile and env yaml updates

* Removes redundant files from assets

* Deleted nf schema json

* Removes redundant configs

* Updates README with template structure

* Updates docs/

* Updates repo name in changelog

* Updates template test.config

* Adds bin folder and template wrapper R script

* Adds pbccs in env.yml

* Changes the location of pipeline info, logs

* Adds .github folder

* Removes redundant files from GH actions

* Updates CONTRIBUTING.md

* Updates ISSUE_TEMPLATE

* Update PULL_REQUEST_TEMPLATE.md

* Removes AWS tests

* Adds misspelling test

* Removes linting.yml

* Corrects typo

* Removes igenomes config

* Fixes typos caught by review-dog

* Adds tentative LICENSE

* Adds environment.yml with pandas, numpy, biopython

* Adds CCS process

* Adds pbbam (required for ccs --chunk subsequent routine)

* Adds pbindex, ccs processes (w/ parallel --chunks)

* Removes redundant bai (pbi is needed)

* Adds temp process mock ccs and flag for testing

* Deletes commented out section

To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks".

* Makes the section note more informative
* Adds Rachel Miller to the author names in the README

* custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models

* Make protein inference analysis script command line executable

* spelling fixes

* Update PI_proteinInferenceAnalysis.py

fix merge conflicts
bj8th merged commit 67f0ced into main on Dec 14, 2020

8 participants