        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.21

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/MultiQC/MultiQC


        A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.

        This report has been generated by the nf-core/metapep analysis pipeline. For information about how to interpret these results, please see the documentation.

        Report generated on 2024-09-06, 11:46 UTC based on data in: /home/kxmte01/metapep_tests/stats_multiqc/work/ed/28d16837b34e5a54407bf04abc53df


        Overall Summary

        Summary of the number of proteins, peptides, unique peptides and binders per condition, as well as the number of binders per allele per condition. The last row corresponds to a total across all conditions.

        Showing 3/3 rows and 6/6 columns.
        Condition name  Unique proteins  Total peptides  Unique peptides  # Binder (best allele)  # Binders for allele A*01:01  # Binders for allele B*07:02
        cond_1          702.0            772098.0        537947.0         31983.0                 10041.0                       22072.0
        cond_2          702.0            772098.0        537947.0         31983.0                 10041.0                       22072.0
        total           702.0            772098.0        537947.0         31983.0                 10041.0                       22072.0
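
The per-allele binder counts sum to more than the "best allele" count, which suggests that some peptides are predicted to bind both alleles. The short sketch below works through that arithmetic for the cond_1 row. It assumes that "# Binder (best allele)" counts peptides binding at least one of the two alleles, and that the binder counts refer to unique peptides. Neither assumption is stated in this report, so treat the result as illustrative only.

```python
# Values copied from the cond_1 row of the Overall Summary table.
unique_peptides = 537947
binders_best_allele = 31983  # "# Binder (best allele)"
binders_a0101 = 10041        # "# Binders for allele A*01:01"
binders_b0702 = 22072        # "# Binders for allele B*07:02"

# Assumption (not confirmed by the report): the "best allele" column is the
# union of the two per-allele binder sets. Inclusion-exclusion then gives
# the number of peptides predicted to bind both alleles.
binders_both = binders_a0101 + binders_b0702 - binders_best_allele

# Assumption: binder counts refer to unique peptides, so this is the share
# of unique peptides predicted to bind at least one allele.
binder_fraction = binders_best_allele / unique_peptides

print(f"Binders for both alleles: {binders_both}")
print(f"Fraction of unique peptides that bind: {binder_fraction:.4f}")
```

Under these assumptions, 130 peptides would be binders for both alleles and just under 6% of the unique peptides would be binders for at least one allele.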

        Software Versions

        Software Versions lists versions of software tools extracted from file contents.

        Group                            Software         Version
        ASSIGN_NUCL_ENTITY_WEIGHTS       pandas           1.5.2
                                         python           3.11.0
        CHECK_SAMPLESHEET_CREATE_TABLES  epytope          3.3.1
                                         mhcflurry        1.4.3
                                         mhcnuggets       2.3.2
                                         pandas           1.3.5
                                         python           3.7.12
                                         syfpeithi        1.0
        COLLECT_STATS                    pandas           1.5.2
                                         python           3.11.0
        DOWNLOAD_PROTEINS                biopython        1.78
                                         python           3.9.1
        FINALIZE_MICROBIOME_ENTITIES     pandas           1.5.2
                                         python           3.11.0
        GENERATE_PEPTIDES                biopython        1.79
                                         numpy            1.23.5
                                         pandas           1.5.2
                                         python           3.11.0
        GENERATE_PROTEIN_AND_ENTITY_IDS  biopython        1.79
                                         numpy            1.23.5
                                         pandas           1.5.2
                                         python           3.11.0
        MERGE_PREDICTIONS                pandas           1.5.2
                                         python           3.11.0
        PLOT_ENTITY_BINDING_RATIOS       R                4.2.3
                                         data.table       1.14.8
                                         dplyr            1.1.2
                                         ggplot2          3.4.2
                                         ggpubr           0.6.0
                                         optparse         1.7.3
                                         stringr          1.5.0
        PLOT_SCORE_DISTRIBUTION          R                4.1.1
                                         data.table       1.14.2
                                         dplyr            1.0.7
                                         ggplot2          3.3.5
                                         stringr          1.4.0
        PREDICT_EPITOPES                 epytope          3.3.1
                                         mhcflurry        1.4.3
                                         mhcnuggets       2.3.2
                                         pandas           1.3.5
                                         python           3.7.12
                                         syfpeithi        1.0
        PREPARE_ENTITY_BINDING_RATIOS    pandas           1.5.2
                                         python           3.11.0
        PREPARE_SCORE_DISTRIBUTION       pandas           1.5.2
                                         python           3.11.0
        SPLIT_PRED_TASKS                 pandas           1.5.2
                                         python           3.11.0
        UNIFY_MODEL_LENGTHS              epytope          3.3.1
                                         pandas           1.3.5
                                         python           3.7.12
                                         syfpeithi        1.0
        Workflow                         Nextflow         24.4.3
                                         nf-core/metapep  1.0.0
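
Several tools appear in more than one version across processes (for example pandas 1.3.5 and 1.5.2), which matters when citing software versions. The following is a minimal sketch of how such cases could be found. The dictionary is a hand-copied excerpt of the table above, and the function name `tools_with_multiple_versions` is a hypothetical helper, not part of MultiQC or the pipeline.

```python
from collections import defaultdict

# Hand-copied excerpt of the Software Versions table: process -> {tool: version}.
versions_by_process = {
    "COLLECT_STATS": {"pandas": "1.5.2", "python": "3.11.0"},
    "DOWNLOAD_PROTEINS": {"biopython": "1.78", "python": "3.9.1"},
    "GENERATE_PEPTIDES": {
        "biopython": "1.79",
        "numpy": "1.23.5",
        "pandas": "1.5.2",
        "python": "3.11.0",
    },
    "PREDICT_EPITOPES": {
        "epytope": "3.3.1",
        "mhcflurry": "1.4.3",
        "mhcnuggets": "2.3.2",
        "pandas": "1.3.5",
        "python": "3.7.12",
        "syfpeithi": "1.0",
    },
}


def tools_with_multiple_versions(table):
    """Return {tool: versions} for tools that occur in more than one version.

    Versions are sorted as plain strings, not as semantic version numbers.
    """
    seen = defaultdict(set)
    for tools in table.values():
        for tool, version in tools.items():
            seen[tool].add(version)
    return {tool: sorted(found) for tool, found in seen.items() if len(found) > 1}


mixed = tools_with_multiple_versions(versions_by_process)
for tool, found in sorted(mixed.items()):
    print(tool, ", ".join(found))
```

For this excerpt the helper reports biopython, pandas and python, each of which would need its per-process version stated if cited precisely.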

        nf-core/metapep Methods Description

        Suggested text and references to use when describing pipeline usage within the methods section of a publication.

        Methods

        Data was processed using nf-core/metapep v1.0.0 of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects. Briefly, the pipeline either uses Prodigal (Hyatt et al., 2010) to predict proteins from the genomic input files or downloads the proteins for the taxid input directly using Entrez (Maglott et al., 2005). Peptides of discrete lengths are generated from the proteins, and their binding to the chosen alleles is predicted using SYFPEITHI (Rammensee et al., 1999), MHCflurry (O'Donnell et al., 2020) or MHCnuggets (Shao et al., 2020), which are embedded in the epytope framework (Schubert et al., 2016). The resulting epitope prediction score distributions and entity binding ratios are plotted using R (R Core Team, 2022). The large amounts of data are handled using a Python (Python Core Team, 2022) framework. All specific software versions and libraries used can be found in the following section and the CITATIONS.md file.

        The pipeline was executed with Nextflow v24.04.3 (Di Tommaso et al., 2017) with the following command:

        nextflow run ../../metapep -profile test_taxa_only,docker --outdir test

        Tools used in the workflow included: Entrez (Maglott et al. 2005), Prodigal (Hyatt et al. 2010), Python (Python Core Team 2022), R (R Core Team 2022), Pandas (The pandas development team 2022), Epytope (FRED2) (Schubert et al. 2016), SYFPEITHI (Rammensee et al. 1999) and MultiQC (Ewels et al. 2016). For a more detailed list, check the references and tool versions.
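
The methods text states that peptides are generated in discrete lengths from proteins. One common way to do this is a sliding window over each protein sequence, sketched below. This is an illustration under assumptions, not the pipeline's actual implementation: the function `generate_peptides`, the toy sequences and the lengths 8 and 9 are all hypothetical. It also shows why total and unique peptide counts can differ.

```python
def generate_peptides(protein, lengths):
    """Yield every contiguous substring of `protein` for each length in `lengths`."""
    for k in lengths:
        # range() is empty when the protein is shorter than k.
        for start in range(len(protein) - k + 1):
            yield protein[start:start + k]


# Hypothetical toy input; real runs use proteins predicted by Prodigal
# or downloaded via Entrez, and the peptide lengths are configurable.
proteins = {
    "prot_1": "MKTAYIAKQR",
    "prot_2": "TAYIAKQRLL",
}
lengths = (8, 9)

all_peptides = [
    peptide
    for sequence in proteins.values()
    for peptide in generate_peptides(sequence, lengths)
]

total_peptides = len(all_peptides)        # every occurrence counts
unique_peptides = len(set(all_peptides))  # identical sequences count once

print(total_peptides, unique_peptides)
```

Here the 8-mer TAYIAKQR occurs in both toy proteins, so it contributes twice to the total count but only once to the unique count.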

        References

        • Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. doi: 10.1038/nbt.3820
        • Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047–3048. doi: 10.1093/bioinformatics/btw354
        • Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276-278. doi: 10.1038/s41587-020-0439-x
        • Grüning, B., Dale, R., Sjödin, A., Chapman, B. A., Rowe, J., Tomkins-Tinch, C. H., Valieris, R., Köster, J., & Bioconda Team. (2018). Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods, 15(7), 475–476. doi: 10.1038/s41592-018-0046-7
        • Hyatt, D., Chen, GL., LoCascio, P. F., Land M. L., Larimer F. W., Hauser L. J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119. doi: 10.1186/1471-2105-11-119
        • Maglott D, Ostell J, Pruitt KD, Tatusova T. (2005) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D54-8. Update in: Nucleic Acids Res. 2007 Jan;35(Database issue):D26-31. doi: 10.1093/nar/gki031.
        • O'Donnell T. J., Rubinsteyn A., Laserson U., (2020). MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing. Cell Systems 11, 42-48. doi: 10.1016/j.cels.2020.06.010.
        • Python Core Team (2022). Python: A dynamic, open source programming language. Python Software Foundation. https://www.python.org/.
        • Rammensee H., Bachmann J., Emmerich N. P., Bachor O. A., Stevanović S. (1999). SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999 Nov;50(3-4):213-9. doi: 10.1007/s002510050595.
        • R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
        • Schubert, B., Walzer, M., Brachvogel, H-P., Szolek, A., Mohr, C., and Kohlbacher, O. (2016). FRED 2 - An Immunoinformatics Framework for Python. Bioinformatics 2016. doi: 10.1093/bioinformatics/btw113.
        • Shao X. M., Bhattacharya R., Huang J., Sivakumar I. K. A., Tokheim C., Zheng L., Hirsch D., Kaminow B., Omdahl A., Bonsack M., Riemer A. B., Velculescu V. E., Anagnostou V., Pagel K. A., Karchin R. (2020). High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets. Cancer Immunol Res. 2020 Mar;8(3):396-408. doi: 10.1158/2326-6066.cir-19-0464.
        • The pandas development team. (2022). pandas-dev/pandas: Pandas (v1.5.2). Zenodo. doi: 10.5281/zenodo.7344967.
        • da Veiga Leprevost, F., Grüning, B. A., Alves Aflitos, S., Röst, H. L., Uszkoreit, J., Barsnes, H., Vaudel, M., Moreno, P., Gatto, L., Weber, J., Bai, M., Jimenez, R. C., Sachsenberg, T., Pfeuffer, J., Vera Alvarez, R., Griss, J., Nesvizhskii, A. I., & Perez-Riverol, Y. (2017). BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics (Oxford, England), 33(16), 2580–2582. doi: 10.1093/bioinformatics/btx192
        Notes:
        • If available, make sure to update the text to include the Zenodo DOI of the version of the pipeline used.
        • The command above does not include parameters contained in any configs or profiles that may have been used. Ensure the config file is also uploaded with your publication!
        • You should also cite all software used within this run. Check the "Software Versions" section of this report to get version information.

        nf-core/metapep Workflow Summary

        This information is collected when the pipeline is started.

        Core Nextflow options

        runName: gloomy_lavoisier
        containerEngine: docker
        launchDir: /home/kxmte01/metapep_tests/stats_multiqc
        workDir: /home/kxmte01/metapep_tests/stats_multiqc/work
        projectDir: /home/kxmte01/metapep
        userName: kxmte01
        profile: test_taxa_only,docker
        configFiles: N/A

        Input/output options

        input: https://raw.githubusercontent.com/nf-core/test-datasets/metapep/samplesheets/v1.0/samplesheet.taxa_only.csv
        outdir: test

        Institutional config options

        config_profile_name: Test profile
        config_profile_description: Minimal test dataset to check pipeline function

        Max job request options

        max_cpus: 2
        max_memory: 6 GB
        max_time: 2d