Adds modules nextflow pseudocode #79

cgpu · 2020-11-24T20:32:41Z

This PR adds the Nextflow-specific files in the root of the repo. The files were auto-generated (using the nf-core create command) and then edited to simplify them and remove boilerplate.

…man-lab/Long-Read-Proteogenomics into adds-nextflow-boilerplate

environment.yml

main.nf

To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks".

@trishorts

* Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Adds Author Name in README (#15) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * add long read info basics * gui app manifest gitignore fix * Adds @trishorts name in README.md (#18) * add author name to readme.md * add one line to refresh commit * add author name Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> * Adds @gsheynkman name to README.md (#16) * Added authorname in README Co-authored-by: cgpu <[email protected]> * Adds Rachel Miller to the author names in the README (#14) * Adds Rachel Miller to the author names in the README * Minor typo Co-authored-by: cgpu <[email protected]> * added author and ORCID (#12) Co-authored-by: cgpu <[email protected]> * Refined database orig code (#29) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * aggregation of FL and CPM by cluster Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Bj8th orf calling (#25) * Add author name in README.md * orf calling updted to run from command line Co-authored-by: gsheynkman <[email protected]> * Adds nextflow files and folders based on nf-core template (#26) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Removes AWS tests * Adds misspelling test 
* Removes linting.yml * Removes igenomes config * Adds tentative LICENSE (MIT) * Adds nudge for asking help via GH issues * added pull scripts from the zenodo site * weighted protein inference * fixes * remove script to hopefully avoid merge conflict.. * update mzLib * require equal long read weight for indistinguishable proteins * add contrib * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * update main readme with author names (#38) * update README contributions * new readme * fix minor errors in readme (#40) * update README contributions * new readme * fix readme errors * excel compatible tsv by default * accept thermo license by default * Updated READMEs, uploaded scripts for isoform mapping, protein clustering. (#41) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * delete files that shouldn't be part of the previous commit, old scripts accidentally put in DatabaseAnnotation * updated contributions (#44) * Attempt to create a container for blast mapping process within nextflow. (#48) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * adding cpat and transdecoder containers (#36) * added pull scripts from the zenodo site * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Added commands to run CPAT. 'output' directory to .gitignore. (#53) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Add nextflow-related files, sundries. (#55) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Adding Simi's transcriptomic and peptide analysis scripts. (#56) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * argparse modification (#57) * adds lr_orfcalling.nf and lr_orfcalling_nextflow.config in the module LR_ORFCalling (#59) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * Protein Inference Analysis Module Custom Script (#64) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Add files via upload Update of peptide analysis jupyter notebook script * Need to merge with latest dev 
(disregarding differences in dev_gloria, which are outdated) (#67) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * Update orf_calling.py Removed a line to match spacing with the py file in dev. * Convert jupyter notebook into python * ORF Calling - bug fix (#70) * argparse modification * small import bug fix Co-authored-by: gsheynkman <[email protected]> * Update README.md (#65) * Update README.md * Update README.md spelling fixes * Update README.md * ORF Filtering bug fixes and RefineDB (#71) * argparse modification * small import bug fix * fixed bugs in orf_filter. 
module for refining db * refine orf working Co-authored-by: gsheynkman <[email protected]> * Adds Dockerfile, environment.yml for SQANTI3 (#72) * Adds Dockerfile, environment.yml for SQANTI3 * Improves container files Co-authored-by: EC2 Default User <[email protected]> * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Updated peptide analysis file (#66) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * 6frm readme (#45) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * Started code for protein group mapping * add toy tables for the protein inference mapping * edited 6frm translate readme * delete mock files for protein inference (protein group) comparisons. 
Rachel and Kyndalanne have continued to work on this and these may be outdated. Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Protein Inference (#74) * Separate module for greedy protein inference * protein_inference bug fix * added rescue to greedy algorithm * connected peptides changed to set * small bug fix. cleaned up notebook * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module (#75) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Simi and Gloria - Update referencetable, transcriptome, and 
peptide modules (#78) * Files in progress to create three modules: ReferenceTables, TranscriptomeAnalysis, PeptideAnalysis. Also, debugged orf_calling.py, found that minus strand ORFs not included. * Prepared a script that makes reference tables * Updated Transcriptomic Script * Updated Transcriptomic Script (#77) Co-authored-by: kyuubi430 <[email protected]> * Remove files for making three modules with simi. * Cleaned up referencetable module, Simi to edit. * Modified Reference Tables Script * Deleted plots. * Simi and Gloria finalized the prepare_reference_tables. Works on commandline. Correct outputs to results/PG_ReferenceTables. * Small edits to peptide_analysis, not done, push to Simi. * Modified the names out output files from Prepare Reference Tabe script * Changed file names in reference tables script and modified the transcriptome summary * Delete unneeded files in transcriptome summary module. * Finalized ReferenceTables. tested Transcriptome Summary. Started modifying the PeptideAnalysis. * Made the transcriptome summary script command line executable * Made the peptide analysis script command line runnable * In process of modifying MMprocessing script * Move scripts between TranscriptomeSummary and PeptideAnalysis modules. Code related to MM peptide/protein processing will now be exclusively in PeptideAnalysis. * Added fasta/tsv and the results directory to gitignore * Delete jurkat_orf_refined.fasta Don't want to include *fasta in pull request. * Delete genes_in_refined.tsv Don't want to include *tsv output file in PR. Added *tsv to gitignore, so shouldn't upload in future PR. 
Co-authored-by: kyuubi430 <[email protected]> * Adds modules nextflow pseudocode (#79) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redendant files from GH actions * Updates CONTRIBUTING.md * Updates ISSUE_TEMPLATE * Update PULL_REQUEST_TEMPLATE.md * Removes AWS tests * Adds misspelling test * Removes linting.yml * Corrects typo * Removes igenomes config * Fixes typos caught by review-dog * Adds tentative LICENSE * Adds environment.yml with pandas, numpy, biopython * Adds CCS process * Adds pbbam (required for ccs --chunk subsequent routine) * Adds pbindex, ccs processes (w/ parallel --chunks) * Removes redundant bai (pbi is needed) * Adds temp process mock ccs and flag for testing * Deletes commented out section To respect the rule, "we do not choose to modify cod ebehaviour by commenting in and out code chunks", * Makes the section note more informative * Dev rmmiller protein inference (#83) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Make protein inference analysis script command line executable * spelling fixes * Update PI_proteinInferenceAnalysis.py fix merge conflicts * Adds nextflow referencetable, notes to SQANTI module. (#88) * Initiated files for nextflowifying reference-table module. * Add nextflow code for reference table module. Successfully run on Lifebit/CloudOS. * Rename transcript script. 
Add notes on requirements and commands to run SQANTI. * Adds nextflow refined db (#80) * Moves refine_orfs.py * Restores tsv vs csv in refine_orfs.py * Adds Nextflow files for refined db generation * Refactor Dockerfile to eliminate duplication of env name * Updates README.md in the modules/LR_ORFCalling subdirectory and lr_orfcalling.nf and lr_orfcalling_nextflow.config (#63) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * added to README.md instructions for executing nextflow within a jupyter notebook for debugging * Update README.md * Update README.md * Updates ORF calling Nextflow files (#73) * Updates ORF calling Nextflow snippet * Deletes redundant global container definition * Removes superfluous new line * Clean up of comments * Updates channel syntax to simplify to 1 line * Adds TransDecoder in log.info summary * Adds nextflow_run.sh with commands, expected exits * Improves code readability, adds file exists check Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: bj8th <[email protected]> * Add nextflow io (#100) * added input output to readmes and modified to run via main * refine database readme * pull request changes made * Clean up peptideanalysismodule (#99) * Clean up peptide analysis module. * Remove README * Bj8th readme (#102) * added readme information to several modules. 
updated modules to run from command line * added source modules Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: rob <[email protected]> Co-authored-by: trishorts <[email protected]> Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: gsheynkman <[email protected]> Co-authored-by: rmmiller22 <[email protected]> Co-authored-by: Anne Deslattes Mays <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]>

@trishorts

* Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Adds Author Name in README (#15) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * add long read info basics * gui app manifest gitignore fix * Adds @trishorts name in README.md (#18) * add author name to readme.md * add one line to refresh commit * add author name Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> * Adds @gsheynkman name to README.md (#16) * Added authorname in README Co-authored-by: cgpu <[email protected]> * Adds Rachel Miller to the author names in the README (#14) * Adds Rachel Miller to the author names in the README * Minor typo Co-authored-by: cgpu <[email protected]> * added author and ORCID (#12) Co-authored-by: cgpu <[email protected]> * Refined database orig code (#29) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * aggregation of FL and CPM by cluster Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Bj8th orf calling (#25) * Add author name in README.md * orf calling updted to run from command line Co-authored-by: gsheynkman <[email protected]> * Adds nextflow files and folders based on nf-core template (#26) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Removes AWS tests * Adds misspelling test 
* Removes linting.yml * Removes igenomes config * Adds tentative LICENSE (MIT) * Adds nudge for asking help via GH issues * added pull scripts from the zenodo site * weighted protein inference * fixes * remove script to hopefully avoid merge conflict.. * update mzLib * require equal long read weight for indistinguishable proteins * add contrib * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * update main readme with author names (#38) * update README contributions * new readme * fix minor errors in readme (#40) * update README contributions * new readme * fix readme errors * excel compatible tsv by default * accept thermo license by default * Updated READMEs, uploaded scripts for isoform mapping, protein clustering. (#41) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * delete files that shouldn't be part of the previous commit, old scripts accidentally put in DatabaseAnnotation * updated contributions (#44) * Attempt to create a container for blast mapping process within nextflow. (#48) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * adding cpat and transdecoder containers (#36) * added pull scripts from the zenodo site * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Added commands to run CPAT. 'output' directory to .gitignore. (#53) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Add nextflow-related files, sundries. (#55) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Adding Simi's transcriptomic and peptide analysis scripts. (#56) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * argparse modification (#57) * adds lr_orfcalling.nf and lr_orfcalling_nextflow.config in the module LR_ORFCalling (#59) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * Protein Inference Analysis Module Custom Script (#64) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Add files via upload Update of peptide analysis jupyter notebook script * Need to merge with latest dev 
(disregarding differences in dev_gloria, which are outdated) (#67) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * Update orf_calling.py Removed a line to match spacing with the py file in dev. * Convert jupyter notebook into python * ORF Calling - bug fix (#70) * argparse modification * small import bug fix Co-authored-by: gsheynkman <[email protected]> * Update README.md (#65) * Update README.md * Update README.md spelling fixes * Update README.md * ORF Filtering bug fixes and RefineDB (#71) * argparse modification * small import bug fix * fixed bugs in orf_filter. 
module for refining db * refine orf working Co-authored-by: gsheynkman <[email protected]> * Adds Dockerfile, environment.yml for SQANTI3 (#72) * Adds Dockerfile, environment.yml for SQANTI3 * Improves container files Co-authored-by: EC2 Default User <[email protected]> * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Updated peptide analysis file (#66) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * 6frm readme (#45) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * Started code for protein group mapping * add toy tables for the protein inference mapping * edited 6frm translate readme * delete mock files for protein inference (protein group) comparisons. 
Rachel and Kyndalanne have continued to work on this and these may be outdated. Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Protein Inference (#74) * Separate module for greedy protein inference * protein_inference bug fix * added rescue to greedy algorithm * connected peptides changed to set * small bug fix. cleaned up notebook * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module (#75) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Simi and Gloria - Update referencetable, transcriptome, and 
peptide modules (#78) * Files in progress to create three modules: ReferenceTables, TranscriptomeAnalysis, PeptideAnalysis. Also, debugged orf_calling.py, found that minus strand ORFs not included. * Prepared a script that makes reference tables * Updated Transcriptomic Script * Updated Transcriptomic Script (#77) Co-authored-by: kyuubi430 <[email protected]> * Remove files for making three modules with simi. * Cleaned up referencetable module, Simi to edit. * Modified Reference Tables Script * Deleted plots. * Simi and Gloria finalized the prepare_reference_tables. Works on commandline. Correct outputs to results/PG_ReferenceTables. * Small edits to peptide_analysis, not done, push to Simi. * Modified the names out output files from Prepare Reference Tabe script * Changed file names in reference tables script and modified the transcriptome summary * Delete unneeded files in transcriptome summary module. * Finalized ReferenceTables. tested Transcriptome Summary. Started modifying the PeptideAnalysis. * Made the transcriptome summary script command line executable * Made the peptide analysis script command line runnable * In process of modifying MMprocessing script * Move scripts between TranscriptomeSummary and PeptideAnalysis modules. Code related to MM peptide/protein processing will now be exclusively in PeptideAnalysis. * Added fasta/tsv and the results directory to gitignore * Delete jurkat_orf_refined.fasta Don't want to include *fasta in pull request. * Delete genes_in_refined.tsv Don't want to include *tsv output file in PR. Added *tsv to gitignore, so shouldn't upload in future PR. 
Co-authored-by: kyuubi430 <[email protected]> * Adds modules nextflow pseudocode (#79) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Updates CONTRIBUTING.md * Updates ISSUE_TEMPLATE * Update PULL_REQUEST_TEMPLATE.md * Removes AWS tests * Adds misspelling test * Removes linting.yml * Corrects typo * Removes igenomes config * Fixes typos caught by review-dog * Adds tentative LICENSE * Adds environment.yml with pandas, numpy, biopython * Adds CCS process * Adds pbbam (required for ccs --chunk subsequent routine) * Adds pbindex, ccs processes (w/ parallel --chunks) * Removes redundant bai (pbi is needed) * Adds temp process mock ccs and flag for testing * Deletes commented out section To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks", * Makes the section note more informative * Dev rmmiller protein inference (#83) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Make protein inference analysis script command line executable * spelling fixes * Update PI_proteinInferenceAnalysis.py fix merge conflicts * Adds nextflow referencetable, notes to SQANTI module. (#88) * Initiated files for nextflowifying reference-table module. * Add nextflow code for reference table module. Successfully run on Lifebit/CloudOS. * Rename transcript script. 
Add notes on requirements and commands to run SQANTI. * Adds nextflow refined db (#80) * Moves refine_orfs.py * Restores tsv vs csv in refine_orfs.py * Adds Nextflow files for refined db generation * Refactor Dockerfile to eliminate duplication of env name * Updates README.md in the modules/LR_ORFCalling subdirectory and lr_orfcalling.nf and lr_orfcalling_nextflow.config (#63) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * added to README.md instructions for executing nextflow within a jupyter notebook for debugging * Update README.md * Update README.md * Updates ORF calling Nextflow files (#73) * Updates ORF calling Nextflow snippet * Deletes redundant global container definition * Removes superfluous new line * Clean up of comments * Updates channel syntax to simplify to 1 line * Adds TransDecoder in log.info summary * Adds nextflow_run.sh with commands, expected exits * Improves code readability, adds file exists check Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: bj8th <[email protected]> * Clean up peptide analysis module. * Remove README * Add nextflow io (#100) * added input output to readmes and modified to run via main * refine database readme * pull request changes made * Clean up peptideanalysismodule (#99) * Clean up peptide analysis module. * Remove README * Add module to make gencode protein database. * Add module to make PacBio CDS. 
* Bj8th readme (#102) * added readme information to several modules. updated modules to run from command line * added source modules Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: rob <[email protected]> Co-authored-by: trishorts <[email protected]> Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: rmmiller22 <[email protected]> Co-authored-by: Anne Deslattes Mays <[email protected]> Co-authored-by: bj8th <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]>

@trishorts

* Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Adds Author Name in README (#15) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * add long read info basics * gui app manifest gitignore fix * Adds @trishorts name in README.md (#18) * add author name to readme.md * add one line to refresh commit * add author name Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> * Adds @gsheynkman name to README.md (#16) * Added authorname in README Co-authored-by: cgpu <[email protected]> * Adds Rachel Miller to the author names in the README (#14) * Adds Rachel Miller to the author names in the README * Minor typo Co-authored-by: cgpu <[email protected]> * added author and ORCID (#12) Co-authored-by: cgpu <[email protected]> * Refined database orig code (#29) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * aggregation of FL and CPM by cluster Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Bj8th orf calling (#25) * Add author name in README.md * orf calling updated to run from command line Co-authored-by: gsheynkman <[email protected]> * Adds nextflow files and folders based on nf-core template (#26) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Removes AWS tests * Adds misspelling test 
* Removes linting.yml * Removes igenomes config * Adds tentative LICENSE (MIT) * Adds nudge for asking help via GH issues * added pull scripts from the zenodo site * weighted protein inference * fixes * remove script to hopefully avoid merge conflict.. * update mzLib * require equal long read weight for indistinguishable proteins * add contrib * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * update main readme with author names (#38) * update README contributions * new readme * fix minor errors in readme (#40) * update README contributions * new readme * fix readme errors * excel compatible tsv by default * accept thermo license by default * Updated READMEs, uploaded scripts for isoform mapping, protein clustering. (#41) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * delete files that shouldn't be part of the previous commit, old scripts accidentally put in DatabaseAnnotation * updated contributions (#44) * Attempt to create a container for blast mapping process within nextflow. (#48) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * adding cpat and transdecoder containers (#36) * added pull scripts from the zenodo site * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Added commands to run CPAT. 'output' directory to .gitignore. (#53) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Output directory to hold intermediate datafiles. * Add nextflow-related files, sundries. (#55) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Output directory to hold intermediate datafiles. * Adding Simi's transcriptomic and peptide analysis scripts. (#56) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Output directory to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * argparse modification (#57) * adds lr_orfcalling.nf and lr_orfcalling_nextflow.config in the module LR_ORFCalling (#59) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * Protein Inference Analysis Module Custom Script (#64) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Add files via upload Update of peptide analysis jupyter notebook script * Need to merge with latest dev 
(disregarding differences in dev_gloria, which are outdated) (#67) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Output directory to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * Update orf_calling.py Removed a line to match spacing with the py file in dev. * Convert jupyter notebook into python * ORF Calling - bug fix (#70) * argparse modification * small import bug fix Co-authored-by: gsheynkman <[email protected]> * Update README.md (#65) * Update README.md * Update README.md spelling fixes * Update README.md * ORF Filtering bug fixes and RefineDB (#71) * argparse modification * small import bug fix * fixed bugs in orf_filter. 
module for refining db * refine orf working Co-authored-by: gsheynkman <[email protected]> * Adds Dockerfile, environment.yml for SQANTI3 (#72) * Adds Dockerfile, environment.yml for SQANTI3 * Improves container files Co-authored-by: EC2 Default User <[email protected]> * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Updated peptide analysis file (#66) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * 6frm readme (#45) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * Started code for protein group mapping * add toy tables for the protein inference mapping * edited 6frm translate readme * delete mock files for protein inference (protein group) comparisons. 
Rachel and Kyndalanne have continued to work on this and these may be outdated. Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Protein Inference (#74) * Separate module for greedy protein inference * protein_inference bug fix * added rescue to greedy algorithm * connected peptides changed to set * small bug fix. cleaned up notebook * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module (#75) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Simi and Gloria - Update referencetable, transcriptome, and 
peptide modules (#78) * Files in progress to create three modules: ReferenceTables, TranscriptomeAnalysis, PeptideAnalysis. Also, debugged orf_calling.py, found that minus strand ORFs not included. * Prepared a script that makes reference tables * Updated Transcriptomic Script * Updated Transcriptomic Script (#77) Co-authored-by: kyuubi430 <[email protected]> * Remove files for making three modules with simi. * Cleaned up referencetable module, Simi to edit. * Modified Reference Tables Script * Deleted plots. * Simi and Gloria finalized the prepare_reference_tables. Works on commandline. Correct outputs to results/PG_ReferenceTables. * Small edits to peptide_analysis, not done, push to Simi. * Modified the names of output files from Prepare Reference Table script * Changed file names in reference tables script and modified the transcriptome summary * Delete unneeded files in transcriptome summary module. * Finalized ReferenceTables. tested Transcriptome Summary. Started modifying the PeptideAnalysis. * Made the transcriptome summary script command line executable * Made the peptide analysis script command line runnable * In process of modifying MMprocessing script * Move scripts between TranscriptomeSummary and PeptideAnalysis modules. Code related to MM peptide/protein processing will now be exclusively in PeptideAnalysis. * Added fasta/tsv and the results directory to gitignore * Delete jurkat_orf_refined.fasta Don't want to include *fasta in pull request. * Delete genes_in_refined.tsv Don't want to include *tsv output file in PR. Added *tsv to gitignore, so shouldn't upload in future PR. 
Co-authored-by: kyuubi430 <[email protected]> * Adds modules nextflow pseudocode (#79) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Updates CONTRIBUTING.md * Updates ISSUE_TEMPLATE * Update PULL_REQUEST_TEMPLATE.md * Removes AWS tests * Adds misspelling test * Removes linting.yml * Corrects typo * Removes igenomes config * Fixes typos caught by review-dog * Adds tentative LICENSE * Adds environment.yml with pandas, numpy, biopython * Adds CCS process * Adds pbbam (required for ccs --chunk subsequent routine) * Adds pbindex, ccs processes (w/ parallel --chunks) * Removes redundant bai (pbi is needed) * Adds temp process mock ccs and flag for testing * Deletes commented out section To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks", * Makes the section note more informative * Dev rmmiller protein inference (#83) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Make protein inference analysis script command line executable * spelling fixes * Update PI_proteinInferenceAnalysis.py fix merge conflicts * rescue algorithm implemented * eliminate added conflict files * config file remove * spelling fix Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: rob <[email protected]> Co-authored-by: trishorts <[email protected]> 
Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: gsheynkman <[email protected]> Co-authored-by: Anne Deslattes Mays <[email protected]> Co-authored-by: bj8th <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]>

@trishorts

* Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Adds Author Name in README (#15) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * add long read info basics * gui app manifest gitignore fix * Adds @trishorts name in README.md (#18) * add author name to readme.md * add one line to refresh commit * add author name Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> * Adds @gsheynkman name to README.md (#16) * Added authorname in README Co-authored-by: cgpu <[email protected]> * Adds Rachel Miller to the author names in the README (#14) * Adds Rachel Miller to the author names in the README * Minor typo Co-authored-by: cgpu <[email protected]> * added author and ORCID (#12) Co-authored-by: cgpu <[email protected]> * Refined database orig code (#29) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * aggregation of FL and CPM by cluster Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Bj8th orf calling (#25) * Add author name in README.md * orf calling updted to run from command line Co-authored-by: gsheynkman <[email protected]> * Adds nextflow files and folders based on nf-core template (#26) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Removes AWS tests * Adds misspelling test 
* Removes linting.yml * Removes igenomes config * Adds tentative LICENSE (MIT) * Adds nudge for asking help via GH issues * added pull scripts from the zenodo site * weighted protein inference * fixes * remove script to hopefully avoid merge conflict.. * update mzLib * require equal long read weight for indistinguishable proteins * add contrib * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * update main readme with author names (#38) * update README contributions * new readme * fix minor errors in readme (#40) * update README contributions * new readme * fix readme errors * excel compatible tsv by default * accept thermo license by default * Updated READMEs, uploaded scripts for isoform mapping, protein clustering. (#41) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * delete files that shouldn't be part of the previous commit, old scripts accidentally put in DatabaseAnnotation * updated contributions (#44) * Attempt to create a container for blast mapping process within nextflow. (#48) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * adding cpat and transdecoder containers (#36) * added pull scripts from the zenodo site * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Added commands to run CPAT. 'output' directory to .gitignore. (#53) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Add nextflow-related files, sundries. (#55) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. 
* add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Adding Simi's transcriptomic and peptide analysis scripts. (#56) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * argparse modification (#57) * adds lr_orfcalling.nf and lr_orfcalling_nextflow.config in the module LR_ORFCalling (#59) * adding cpat and transdecoder containers * adding the empty data directory with README.md explaining why it is empty on github * github actions caught my spelling error * left out in front of the conda commands for both these containers * added debugged containers * moved test.config to conf/executor/test.config * fixed syntax error executor -> executors * created local lr_orfcalling.nf and _nextflow.config to aid in local testing and debugging before final merge into pipeline * adding main.nf and nextflow.config * Protein Inference Analysis Module Custom Script (#64) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Add files via upload Update of peptide analysis jupyter notebook script * Need to merge with latest dev 
(disregarding differences in dev_gloria, which are outdated) (#67) * Updated various readmes. Uploaded scripts for isoform mappings, protein clustering, etc. * Add scripts that run blast and parse blast results to find at-length matches between isoforms. Need to combine with Anne's pipeline. * add scripts for blast mapping between protein databases * Attempt to create nextflow container to do blast searching for isoform accession mapping * correct spelling error * Made amendments to orf_calling.py. * CPAT commands. Added output directory to gitignore. Outupt directly to hold intermediate datafiles. * Gloria adding Simi's transcriptome analysis and peptide analysis scripts. * Update orf_calling.py Removed a line to match spacing with the py file in dev. * Convert jupyter notebook into python * ORF Calling - bug fix (#70) * argparse modification * small import bug fix Co-authored-by: gsheynkman <[email protected]> * Update README.md (#65) * Update README.md * Update README.md spelling fixes * Update README.md * ORF Filtering bug fixes and RefineDB (#71) * argparse modification * small import bug fix * fixed bugs in orf_filter. 
module for refining db * refine orf working Co-authored-by: gsheynkman <[email protected]> * Adds Dockerfile, environment.yml for SQANTI3 (#72) * Adds Dockerfile, environment.yml for SQANTI3 * Improves container files Co-authored-by: EC2 Default User <[email protected]> * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Updated peptide analysis file (#66) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * 6frm readme (#45) * Add initial code to extract and cluster pacbio protein sequences, based on input from LR_ORFCalling * Started code for protein group mapping * add toy tables for the protein inference mapping * edited 6frm translate readme * delete mock files for protein inference (protein group) comparisons. 
Rachel and Kyndalanne have continued to work on this and these may be outdated. Co-authored-by: Robert Millikin <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> * Protein Inference (#74) * Separate module for greedy protein inference * protein_inference bug fix * added rescue to greedy algorithm * connected peptides changed to set * small bug fix. cleaned up notebook * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module (#75) * Adds author name in README.md * Adds author name in README.md * Deletes temp file * Adds author name in README.md * Modified README.md File in LR_TranscriptomeSummary * Add files via upload This adds the genomic data compilation and comparison jupyter notebook script and adds several custom module dependencies. * Update README.md * Updated version of previous files with less typos * Delete Transcriptomic_Proteomic_Comparison.ipynb * Delete m_MMprocess.py * Delete m_gen_maps.py * Delete m_make_gene_length_table.py * Delete m_sqantitable.py * Delete m_squantitable.py * Updated version with less typos * Update README.md * Preliminary module for analyzing peptide space * Add files via upload Update of peptide analysis jupyter notebook script * Convert jupyter notebook into python * Updated peptide_analysis script for review and added required files/tables * Update peptide_analysis.py * Updated .gitignore with a local data file * Updated peptide_analysis.py to include new path info * Delete gene_based_info.tsv * Delete trans_to_gene.tsv * Removed unnecessary files from Transcriptome Module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Removed unnecessary files from Transcriptome module * Simi and Gloria - Update referencetable, transcriptome, and 
peptide modules (#78) * Files in progress to create three modules: ReferenceTables, TranscriptomeAnalysis, PeptideAnalysis. Also, debugged orf_calling.py, found that minus strand ORFs not included. * Prepared a script that makes reference tables * Updated Transcriptomic Script * Updated Transcriptomic Script (#77) Co-authored-by: kyuubi430 <[email protected]> * Remove files for making three modules with simi. * Cleaned up referencetable module, Simi to edit. * Modified Reference Tables Script * Deleted plots. * Simi and Gloria finalized the prepare_reference_tables. Works on commandline. Correct outputs to results/PG_ReferenceTables. * Small edits to peptide_analysis, not done, push to Simi. * Modified the names of output files from Prepare Reference Table script * Changed file names in reference tables script and modified the transcriptome summary * Delete unneeded files in transcriptome summary module. * Finalized ReferenceTables. tested Transcriptome Summary. Started modifying the PeptideAnalysis. * Made the transcriptome summary script command line executable * Made the peptide analysis script command line runnable * In process of modifying MMprocessing script * Move scripts between TranscriptomeSummary and PeptideAnalysis modules. Code related to MM peptide/protein processing will now be exclusively in PeptideAnalysis. * Added fasta/tsv and the results directory to gitignore * Delete jurkat_orf_refined.fasta Don't want to include *fasta in pull request. * Delete genes_in_refined.tsv Don't want to include *tsv output file in PR. Added *tsv to gitignore, so shouldn't upload in future PR. 
Co-authored-by: kyuubi430 <[email protected]> * Adds modules nextflow pseudocode (#79) * Adds nf-core template for nextflow pips * Cleans up template main.nf and adds swag cli message * Updates nextflow.config * Adds Dockerfile and env yaml updates * Removes redundant files from assets * Deleted nf schema json * Removes redundant configs * Updates README with template structure * Updates docs/ * Updates repo name in changelog * Updates template test.config * Adds bin folder and template wrapper R script * Adds pbccs in env.yml * Changes the location of pipeline info, logs * Adds .github folder * Removes redundant files from GH actions * Updates CONTRIBUTING.md * Updates ISSUE_TEMPLATE * Update PULL_REQUEST_TEMPLATE.md * Removes AWS tests * Adds misspelling test * Removes linting.yml * Corrects typo * Removes igenomes config * Fixes typos caught by review-dog * Adds tentative LICENSE * Adds environment.yml with pandas, numpy, biopython * Adds CCS process * Adds pbbam (required for ccs --chunk subsequent routine) * Adds pbindex, ccs processes (w/ parallel --chunks) * Removes redundant bai (pbi is needed) * Adds temp process mock ccs and flag for testing * Deletes commented out section To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks". * Makes the section note more informative * Dev rmmiller protein inference (#83) * Adds Rachel Miller to the author names in the README * custom script for the comparison of protein group output from MetaMorpheus searches using different protein database reference models * Make protein inference analysis script command line executable * spelling fixes * Update PI_proteinInferenceAnalysis.py fix merge conflicts * rescue algorithm implemented * eliminate added conflict files * config file remove * spelling fix * Update mzLib version- includes database parsing changes * Test Update Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: kyuubi430 <[email protected]> Co-authored-by: 
rob <[email protected]> Co-authored-by: trishorts <[email protected]> Co-authored-by: Michael Shortreed <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: gsheynkman <[email protected]> Co-authored-by: Anne Deslattes Mays <[email protected]> Co-authored-by: bj8th <[email protected]> Co-authored-by: Gloria Sheynkman <[email protected]> Co-authored-by: cgpu <[email protected]> Co-authored-by: EC2 Default User <[email protected]>

cgpu and others added 30 commits November 7, 2020 20:38

Adds nf-core template for nextflow pips

5080c4d

Absorbs dev latest changes

2998236

Cleans up template main.nf and adds swag cli message

137aa41

Updates nextflow.config

eac977e

Adds Dockerfile and env yaml updates

6bcc81a

Removes redundant files from assets

9075836

Deleted nf schema json

780a115

Removes redundant configs

420340e

Updates README with template structure

4495b05

Updates docs/

2a785f3

Updates repo name in changelog

75571d3

Updates template test.config

492d1ae

Adds bin folder and template wrapper R script

da3687d

Adds pbccs in env.yml

35de804

Changes the location of pipeline info, logs

5638c54

Adds .github folder

8fdecfd

Merge branch 'adds-nextflow-boilerplate' of https://github.com/sheynkman-lab/Long-Read-Proteogenomics into adds-nextflow-boilerplate

6f812e9

Removes redundant files from GH actions

6fcae1c

Updates CONTRIBUTING.md

2ae1719

Updates ISSUE_TEMPLATE

dccbaad

Update PULL_REQUEST_TEMPLATE.md

e5c1ab3

Removes AWS tests

5fb8e1b

Adds misspelling test

d7da707

Removes linting.yml

91ea8d9

Corrects typo

2ae3e35

Removes igenomes config

10f556e

Merge branch 'adds-nextflow-boilerplate' of https://github.com/sheynkman-lab/Long-Read-Proteogenomics into adds-nextflow-boilerplate

02fb11d

Fixes typos caught by review-dog

31744bb

Adds tentative LICENSE

2388635

Adds environment.yml with pandas, numpy, biopython

50b8058

cgpu added 6 commits November 8, 2020 11:28

Adds CCS process

7672e16

Adds pbbam (required for ccs --chunk subsequent routine)

f04dc7c

Adds pbindex, ccs processes (w/ parallel --chunks)

df6bd40

Removes redundant bai (pbi is needed)

f9b6153

Adds temp process mock ccs and flag for testing

76ab7a8

Absorbs latest changes from dev

e860d54

github-actions bot reviewed Nov 24, 2020

environment.yml (outdated)

main.nf (outdated)

cgpu commented Nov 24, 2020

environment.yml (outdated)

Corrects typo caught by reviewdog gh-action

5120695

cgpu commented Nov 24, 2020

main.nf (outdated)

Typo fix

34e3db5

github-actions bot reviewed Nov 24, 2020

main.nf (outdated)

cgpu commented Nov 24, 2020

main.nf (outdated)

Deletes commented out section

c5c8317

To respect the rule, "we do not choose to modify code behaviour by commenting in and out code chunks".

cgpu commented Nov 24, 2020

main.nf (outdated)

Makes the section note more informative

38a5079

cgpu requested a review from gsheynkman November 24, 2020 20:41

gsheynkman approved these changes Nov 24, 2020

cgpu merged commit 6e42ac3 into dev Nov 24, 2020

cgpu deleted the adds-modules-nextflow-pseudocode branch November 24, 2020 20:43
