add busco 5.8.0 #1095

Merged
merged 3 commits into from
Nov 5, 2024

Kincekara
Collaborator

Closes #1087
All bundled tools were updated along with BUSCO. SEPP no longer causes trouble. Unfortunately, Augustus v3.4.0 caused problems on Ubuntu Jammy, so I had to stay on Focal.

diff busco/5.7.1 busco/5.8.0/
diff busco/5.7.1/Dockerfile busco/5.8.0/Dockerfile
1c1
< ARG BUSCO_VER="5.7.1"
---
> ARG BUSCO_VER="5.8.0"
3c3
< FROM ubuntu:focal as app
---
> FROM ubuntu:focal AS app
6,8c6,10
< ARG BBMAP_VER="39.06"
< ARG BLAST_VER="2.15.0"
< ARG MINIPROT_VER="0.12"
---
> ARG BBMAP_VER="39.10"
> ARG BLAST_VER="2.16.0"
> ARG MINIPROT_VER="0.13"
> ARG SEPP_VER="4.5.5"
> ARG METAEUK_VER="7-bba0d80"
28,29c30,31
<     hmmer \   
<     prodigal \    
---
>     hmmer \
>     prodigal \
31c33
<     r-cran-ggplot2 \      
---
>     r-cran-ggplot2 \
33c35
<     openjdk-8-jre-headless \ 
---
>     openjdk-8-jre-headless \
47,51c49,53
< # sepp (greengenes version)
< RUN wget https://raw.githubusercontent.com/smirarab/sepp-refs/54415e8905c5fa26cdd631c526b21f2bcdba95b5/gg/sepp-package.tar.bz &&\  
<     tar xvfj sepp-package.tar.bz && rm sepp-package.tar.bz &&\
<     cd sepp-package/sepp &&\
<     python setup.py config -c && chmod 755 run_*
---
> # sepp 
> RUN wget https://github.com/smirarab/sepp/archive/refs/tags/v${SEPP_VER}.tar.gz &&\
>     tar -xvf v${SEPP_VER}.tar.gz && rm v${SEPP_VER}.tar.gz &&\
>     cd sepp-${SEPP_VER} &&\
>     python setup.py config -c && python setup.py install
57c59
< RUN wget https://github.com/soedinglab/metaeuk/releases/download/6-a5d39d9/metaeuk-linux-sse41.tar.gz &&\
---
> RUN wget https://github.com/soedinglab/metaeuk/releases/download/${METAEUK_VER}/metaeuk-linux-sse41.tar.gz &&\
61c63
< RUN wget https://github.com/lh3/miniprot/releases/download/v0.12/miniprot-${MINIPROT_VER}_x64-linux.tar.bz2 &&\
---
> RUN wget https://github.com/lh3/miniprot/releases/download/v${MINIPROT_VER}/miniprot-${MINIPROT_VER}_x64-linux.tar.bz2 &&\
72,74c74,76
< ENV AUGUSTUS_CONFIG_PATH="/usr/share/augustus/config/"
< ENV PATH="${PATH}:/ncbi-blast-${BLAST_VER}+/bin:/sepp-package/sepp:/usr/share/augustus/scripts:/busco-${BUSCO_VER}/scripts"
< ENV LC_ALL=C
---
> ENV AUGUSTUS_CONFIG_PATH="/usr/share/augustus/config/" \
>     PATH="${PATH}:/ncbi-blast-${BLAST_VER}+/bin:/usr/share/augustus/scripts:/busco-${BUSCO_VER}/scripts" \
>     LC_ALL=C
81c83
< FROM app as test
---
> FROM app AS test
diff busco/5.7.1/README.md busco/5.8.0/README.md
6c6
< - BBTools 39.06
---
> - BBTools 39.10
9c9
< - BLAST+ 2.15.0
---
> - BLAST+ 2.16.0
11,12c11,12
< - MetaEuk (Release 6-a5d39d9)
< - SEPP 4.5.1
---
> - MetaEuk (Release 7-bba0d80)
> - SEPP 4.5.5
18c18
< - Miniprot 0.12
---
> - Miniprot 0.13
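
One effect of the diff above is that every download URL is now derived from a version `ARG`, so the old mismatch (a hardcoded `v0.12` in the miniprot URL next to `${MINIPROT_VER}`) can no longer happen. A minimal shell sketch of the same templating, with the version pins copied from the diff; the function names are hypothetical and only for illustration:

```shell
#!/usr/bin/env bash
# Version pins copied from the 5.8.0 Dockerfile diff.
SEPP_VER="4.5.5"
METAEUK_VER="7-bba0d80"
MINIPROT_VER="0.13"

# Hypothetical helpers: each URL depends only on its version variable,
# mirroring the wget lines in the new Dockerfile.
sepp_url() {
  printf 'https://github.com/smirarab/sepp/archive/refs/tags/v%s.tar.gz\n' "$SEPP_VER"
}
metaeuk_url() {
  printf 'https://github.com/soedinglab/metaeuk/releases/download/%s/metaeuk-linux-sse41.tar.gz\n' "$METAEUK_VER"
}
miniprot_url() {
  # The release tag and the file name both use the same variable.
  printf 'https://github.com/lh3/miniprot/releases/download/v%s/miniprot-%s_x64-linux.tar.bz2\n' \
    "$MINIPROT_VER" "$MINIPROT_VER"
}

sepp_url
metaeuk_url
miniprot_url
```

On the next version bump, only the three variables at the top should need to change.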

Pull Request (PR) checklist:

  • Include a description of what is in this pull request in this message.
  • The dockerfile successfully builds to a test target for the user creating the PR (e.g. docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15).
  • The directory structure is the name of the tool in lower case with special characters removed, with a subdirectory for the version number (e.g. spades/3.12.0/Dockerfile)
    • (optional) All test files are located in the same directory as the Dockerfile (e.g. shigatyper/2.0.1/test.sh)
  • Create a simple container-specific README.md in the same directory as the Dockerfile (e.g. spades/3.12.0/README.md)
    • If this README is longer than 30 lines, there is an explanation as to why more detail was needed
  • Dockerfile includes the recommended LABELS
  • Main README.md has been updated to include the tool and/or version of the dockerfile(s) in this PR
  • Program_Licenses.md contains the tool(s) used in this PR and has been updated to add any that were missing
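
For this PR, the test-target build from the checklist would look roughly like the sketch below. It only prints the command rather than running it, and it assumes the command is issued from the root of the repository (the checklist example instead prefixes the path with `docker-builds/`). The helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the test-target build command
# for a given tool and version, following the checklist's tag pattern.
build_test_cmd() {
  local tool="$1" version="$2"
  printf 'docker build --tag %s:%stest --target test %s/%s\n' \
    "$tool" "$version" "$tool" "$version"
}

build_test_cmd busco 5.8.0
# prints: docker build --tag busco:5.8.0test --target test busco/5.8.0
```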

@erinyoung
Contributor

This test worked:

#15 [test 2/3] RUN wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/010/941/835/GCA_010941835.1_PDT000052640.3/GCA_010941835.1_PDT000052640.3_genomic.fna.gz  &&     gzip -d GCA_010941835.1_PDT000052640.3_genomic.fna.gz &&     busco --offline -l /busco_downloads/lineages/bacteria_odb10 -m genome -i GCA_010941835.1_PDT000052640.3_genomic.fna -o offline --cpu 4 &&     head offline/short_summary*.txt
#15 1.373 2024-11-04 20:20:54 INFO:	***** Start a BUSCO v5.8.0 analysis, current time: 11/04/2024 20:20:54 *****
#15 1.373 2024-11-04 20:20:54 INFO:	Configuring BUSCO with local environment
#15 1.374 2024-11-04 20:20:54 INFO:	Running genome mode
#15 1.377 2024-11-04 20:20:54 INFO:	Input file is /data/GCA_010941835.1_PDT000052640.3_genomic.fna
#15 1.377 2024-11-04 20:20:54 INFO:	Using local lineages directory /busco_downloads/lineages/bacteria_odb10
#15 1.495 2024-11-04 20:20:54 INFO:	Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2024-01-08)
#15 1.495 2024-11-04 20:20:54 INFO:	Running 1 job(s) on bbtools, starting at 11/04/2024 20:20:54
#15 2.206 2024-11-04 20:20:55 INFO:	[bbtools]	1 of 1 task(s) completed
#15 2.224 2024-11-04 20:20:55 INFO:	***** Run Prodigal on input to predict and extract genes *****
#15 2.225 2024-11-04 20:20:55 INFO:	Running Prodigal with genetic code 11 in single mode
#15 2.225 2024-11-04 20:20:55 INFO:	Running 1 job(s) on prodigal, starting at 11/04/2024 20:20:55
#15 5.456 2024-11-04 20:20:58 INFO:	[prodigal]	1 of 1 task(s) completed
#15 5.604 2024-11-04 20:20:58 INFO:	Genetic code 11 selected as optimal
#15 5.604 2024-11-04 20:20:58 INFO:	***** Run HMMER on gene sequences *****
#15 5.605 2024-11-04 20:20:58 INFO:	Running 124 job(s) on hmmsearch, starting at 11/04/2024 20:20:58
#15 6.467 2024-11-04 20:20:59 INFO:	[hmmsearch]	13 of 124 task(s) completed
#15 6.608 2024-11-04 20:20:59 INFO:	[hmmsearch]	25 of 124 task(s) completed
#15 6.846 2024-11-04 20:21:00 INFO:	[hmmsearch]	38 of 124 task(s) completed
#15 6.984 2024-11-04 20:21:00 INFO:	[hmmsearch]	50 of 124 task(s) completed
#15 7.217 2024-11-04 20:21:00 INFO:	[hmmsearch]	63 of 124 task(s) completed
#15 7.339 2024-11-04 20:21:00 INFO:	[hmmsearch]	75 of 124 task(s) completed
#15 7.458 2024-11-04 20:21:00 INFO:	[hmmsearch]	87 of 124 task(s) completed
#15 7.767 2024-11-04 20:21:01 INFO:	[hmmsearch]	100 of 124 task(s) completed
#15 8.044 2024-11-04 20:21:01 INFO:	[hmmsearch]	112 of 124 task(s) completed
#15 8.362 2024-11-04 20:21:01 INFO:	[hmmsearch]	124 of 124 task(s) completed
#15 8.421 2024-11-04 20:21:01 INFO:	Results:	C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124	   
#15 8.421 
#15 8.443 2024-11-04 20:21:01 INFO:	
#15 8.443 
#15 8.443     ---------------------------------------------------
#15 8.443     |Results from dataset bacteria_odb10               |
#15 8.443     ---------------------------------------------------
#15 8.443     |C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124       |
#15 8.443     |122    Complete BUSCOs (C)                        |
#15 8.443     |122    Complete and single-copy BUSCOs (S)        |
#15 8.443     |0    Complete and duplicated BUSCOs (D)           |
#15 8.443     |0    Fragmented BUSCOs (F)                        |
#15 8.443     |2    Missing BUSCOs (M)                           |
#15 8.443     |124    Total BUSCO groups searched                |
#15 8.443     ---------------------------------------------------
#15 8.444 2024-11-04 20:21:01 INFO:	BUSCO analysis done. Total running time: 7 seconds
#15 8.444 2024-11-04 20:21:01 INFO:	Results written in /data/offline
#15 8.444 2024-11-04 20:21:01 INFO:	For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html
#15 8.444 
#15 8.444 2024-11-04 20:21:01 INFO:	Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
#15 9.957 # BUSCO version is: 5.8.0 
#15 9.957 # The lineage dataset is: bacteria_odb10 (Creation date: 2024-01-08, number of genomes: 4085, number of BUSCOs: 124)
#15 9.957 # Summarized benchmarking in BUSCO notation for file /data/GCA_010941835.1_PDT000052640.3_genomic.fna
#15 9.957 # BUSCO was run in mode: prok_genome_prod
#15 9.957 # Gene predictor used: prodigal
#15 9.957 
#15 9.957 	***** Results: *****
#15 9.957 
#15 9.957 	C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124	   
#15 9.957 	122	Complete BUSCOs (C)			   
#15 DONE 10.0s
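
If a test stage should fail on a poor result instead of just printing the summary, the score line can be parsed. This is only a sketch: the helper name and the 95% threshold are assumptions for illustration, not something the Dockerfile in this PR does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read BUSCO output on stdin and print the
# "Complete" percentage from the first score line it finds.
busco_complete_pct() {
  sed -n 's/.*C:\([0-9.]*\)%\[.*/\1/p' | head -n 1
}

# Score line copied from the log above.
line='C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124'
pct=$(printf '%s\n' "$line" | busco_complete_pct)
echo "complete=${pct}%"

# Illustrative threshold check; awk is used because the value is a decimal.
if awk -v p="$pct" 'BEGIN { exit !(p >= 95) }'; then
  echo "above threshold"
fi
```

In a real test stage the input would come from something like `cat offline/short_summary*.txt` rather than a hardcoded string.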

#16 [test 3/3] RUN busco -m genome -i GCA_010941835.1_PDT000052640.3_genomic.fna -o auto --cpu 4 --auto-lineage-prok &&     head auto/short_summary*.txt
#16 0.412 2024-11-04 20:21:03 INFO:	***** Start a BUSCO v5.8.0 analysis, current time: 11/04/2024 20:21:03 *****
#16 0.412 2024-11-04 20:21:03 INFO:	Configuring BUSCO with local environment
#16 0.413 2024-11-04 20:21:03 INFO:	Running genome mode
#16 0.414 2024-11-04 20:21:03 INFO:	Downloading information on latest versions of BUSCO data...
#16 3.704 2024-11-04 20:21:07 INFO:	Input file is /data/GCA_010941835.1_PDT000052640.3_genomic.fna
#16 3.704 2024-11-04 20:21:07 INFO:	No lineage specified. Running lineage auto selector.
#16 3.704 
#16 3.706 2024-11-04 20:21:07 INFO:	***** Starting Auto Select Lineage *****
#16 3.706 	This process runs BUSCO on the generic lineage datasets for the domains archaea, bacteria and eukaryota. Once the optimal domain is selected, BUSCO automatically attempts to find the most appropriate BUSCO dataset to use based on phylogenetic placement.
#16 3.706 	--auto-lineage-euk and --auto-lineage-prok are also available if you know your input assembly is, or is not, an eukaryote. See the user guide for more information.
#16 3.706 	A reminder: Busco evaluations are valid when an appropriate dataset is used, i.e., the dataset belongs to the lineage of the species to test. Because of overlapping markers/spurious matches among domains, busco matches in another domain do not necessarily mean that your genome/proteome contains sequences from this domain. However, a high busco score in multiple domains might help you identify possible contaminations.
#16 3.707 2024-11-04 20:21:07 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/lineages/archaea_odb10.2024-01-08.tar.gz'
#16 5.866 2024-11-04 20:21:09 INFO:	Decompressing file '/data/busco_downloads/lineages/archaea_odb10.tar.gz'
#16 6.257 2024-11-04 20:21:09 INFO:	Running BUSCO using lineage dataset archaea_odb10 (prokaryota, 2024-01-08)
#16 6.257 2024-11-04 20:21:09 INFO:	Running 1 job(s) on bbtools, starting at 11/04/2024 20:21:09
#16 7.028 2024-11-04 20:21:10 INFO:	[bbtools]	1 of 1 task(s) completed
#16 7.045 2024-11-04 20:21:10 INFO:	***** Run Prodigal on input to predict and extract genes *****
#16 7.045 2024-11-04 20:21:10 INFO:	Running Prodigal with genetic code 11 in single mode
#16 7.045 2024-11-04 20:21:10 INFO:	Running 1 job(s) on prodigal, starting at 11/04/2024 20:21:10
#16 10.28 2024-11-04 20:21:13 INFO:	[prodigal]	1 of 1 task(s) completed
#16 10.43 2024-11-04 20:21:13 INFO:	Genetic code 11 selected as optimal
#16 10.43 2024-11-04 20:21:13 INFO:	***** Run HMMER on gene sequences *****
#16 10.43 2024-11-04 20:21:13 INFO:	Running 194 job(s) on hmmsearch, starting at 11/04/2024 20:21:13
#16 11.28 2024-11-04 20:21:14 INFO:	[hmmsearch]	20 of 194 task(s) completed
#16 11.46 2024-11-04 20:21:14 INFO:	[hmmsearch]	39 of 194 task(s) completed
#16 11.65 2024-11-04 20:21:14 INFO:	[hmmsearch]	59 of 194 task(s) completed
#16 12.02 2024-11-04 20:21:15 INFO:	[hmmsearch]	78 of 194 task(s) completed
#16 12.38 2024-11-04 20:21:15 INFO:	[hmmsearch]	97 of 194 task(s) completed
#16 12.70 2024-11-04 20:21:16 INFO:	[hmmsearch]	117 of 194 task(s) completed
#16 13.05 2024-11-04 20:21:16 INFO:	[hmmsearch]	136 of 194 task(s) completed
#16 13.29 2024-11-04 20:21:16 INFO:	[hmmsearch]	156 of 194 task(s) completed
#16 13.52 2024-11-04 20:21:16 INFO:	[hmmsearch]	175 of 194 task(s) completed
#16 13.71 2024-11-04 20:21:17 INFO:	[hmmsearch]	194 of 194 task(s) completed
#16 13.77 2024-11-04 20:21:17 INFO:	Results:	C:19.1%[S:18.0%,D:1.0%],F:5.7%,M:75.3%,n:194	   
#16 13.77 
#16 13.78 2024-11-04 20:21:17 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz'
#16 15.76 2024-11-04 20:21:19 INFO:	Decompressing file '/data/busco_downloads/lineages/bacteria_odb10.tar.gz'
#16 16.29 2024-11-04 20:21:19 INFO:	Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2024-01-08)
#16 16.29 2024-11-04 20:21:19 INFO:	Skipping BBTools as already run
#16 16.29 2024-11-04 20:21:19 INFO:	***** Run Prodigal on input to predict and extract genes *****
#16 16.44 2024-11-04 20:21:19 INFO:	Genetic code 11 selected as optimal
#16 16.44 2024-11-04 20:21:19 INFO:	***** Run HMMER on gene sequences *****
#16 16.44 2024-11-04 20:21:19 INFO:	Running 124 job(s) on hmmsearch, starting at 11/04/2024 20:21:19
#16 17.28 2024-11-04 20:21:20 INFO:	[hmmsearch]	13 of 124 task(s) completed
#16 17.43 2024-11-04 20:21:20 INFO:	[hmmsearch]	25 of 124 task(s) completed
#16 17.69 2024-11-04 20:21:21 INFO:	[hmmsearch]	38 of 124 task(s) completed
#16 17.82 2024-11-04 20:21:21 INFO:	[hmmsearch]	50 of 124 task(s) completed
#16 18.00 2024-11-04 20:21:21 INFO:	[hmmsearch]	63 of 124 task(s) completed
#16 18.16 2024-11-04 20:21:21 INFO:	[hmmsearch]	75 of 124 task(s) completed
#16 18.31 2024-11-04 20:21:21 INFO:	[hmmsearch]	87 of 124 task(s) completed
#16 18.61 2024-11-04 20:21:21 INFO:	[hmmsearch]	100 of 124 task(s) completed
#16 18.86 2024-11-04 20:21:22 INFO:	[hmmsearch]	112 of 124 task(s) completed
#16 19.20 2024-11-04 20:21:22 INFO:	[hmmsearch]	124 of 124 task(s) completed
#16 19.26 2024-11-04 20:21:22 INFO:	Results:	C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124	   
#16 19.26 
#16 19.28 2024-11-04 20:21:22 INFO:	bacteria_odb10 selected
#16 19.28 
#16 19.28 2024-11-04 20:21:22 INFO:	***** Searching tree for chosen lineage to find best taxonomic match *****
#16 19.28 
#16 19.54 2024-11-04 20:21:22 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/list_of_reference_markers.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 20.19 2024-11-04 20:21:23 INFO:	Decompressing file '/data/busco_downloads/placement_files/list_of_reference_markers.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 20.19 2024-11-04 20:21:23 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/tree.bacteria_odb10.2019-12-16.nwk.tar.gz'
#16 21.51 2024-11-04 20:21:24 INFO:	Decompressing file '/data/busco_downloads/placement_files/tree.bacteria_odb10.2019-12-16.nwk.tar.gz'
#16 21.51 2024-11-04 20:21:24 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/tree_metadata.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 22.18 2024-11-04 20:21:25 INFO:	Decompressing file '/data/busco_downloads/placement_files/tree_metadata.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 22.19 2024-11-04 20:21:25 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/supermatrix.aln.bacteria_odb10.2019-12-16.faa.tar.gz'
#16 25.00 2024-11-04 20:21:28 INFO:	Decompressing file '/data/busco_downloads/placement_files/supermatrix.aln.bacteria_odb10.2019-12-16.faa.tar.gz'
#16 25.53 2024-11-04 20:21:28 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/mapping_taxids-busco_dataset_name.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 26.17 2024-11-04 20:21:29 INFO:	Decompressing file '/data/busco_downloads/placement_files/mapping_taxids-busco_dataset_name.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 26.17 2024-11-04 20:21:29 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/placement_files/mapping_taxid-lineage.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 27.82 2024-11-04 20:21:31 INFO:	Decompressing file '/data/busco_downloads/placement_files/mapping_taxid-lineage.bacteria_odb10.2019-12-16.txt.tar.gz'
#16 27.84 2024-11-04 20:21:31 INFO:	Extract markers...
#16 27.87 2024-11-04 20:21:31 INFO:	Place the markers on the reference tree...
#16 27.87 2024-11-04 20:21:31 INFO:	Running 1 job(s) on sepp, starting at 11/04/2024 20:21:31
#16 76.68 2024-11-04 20:22:20 INFO:	[sepp]	1 of 1 task(s) completed
#16 77.23 2024-11-04 20:22:20 INFO:	Lineage enterobacterales is selected, supported by 43 markers out of 45
#16 77.23 2024-11-04 20:22:20 INFO:	Downloading file 'https://busco-data.ezlab.org/v5/data/lineages/enterobacterales_odb10.2024-01-08.tar.gz'
#16 79.58 2024-11-04 20:22:22 INFO:	Decompressing file '/data/busco_downloads/lineages/enterobacterales_odb10.tar.gz'
#16 80.39 2024-11-04 20:22:23 INFO:	Running BUSCO using lineage dataset enterobacterales_odb10 (prokaryota, 2024-01-08)
#16 80.39 2024-11-04 20:22:23 INFO:	Skipping BBTools as already run
#16 80.39 2024-11-04 20:22:23 INFO:	***** Run Prodigal on input to predict and extract genes *****
#16 80.55 2024-11-04 20:22:23 INFO:	Genetic code 11 selected as optimal
#16 80.55 2024-11-04 20:22:23 INFO:	***** Run HMMER on gene sequences *****
#16 80.55 2024-11-04 20:22:23 INFO:	Running 440 job(s) on hmmsearch, starting at 11/04/2024 20:22:23
#16 82.26 2024-11-04 20:22:25 INFO:	[hmmsearch]	44 of 440 task(s) completed
#16 83.07 2024-11-04 20:22:26 INFO:	[hmmsearch]	88 of 440 task(s) completed
#16 83.98 2024-11-04 20:22:27 INFO:	[hmmsearch]	132 of 440 task(s) completed
#16 84.59 2024-11-04 20:22:27 INFO:	[hmmsearch]	176 of 440 task(s) completed
#16 85.30 2024-11-04 20:22:28 INFO:	[hmmsearch]	220 of 440 task(s) completed
#16 85.87 2024-11-04 20:22:29 INFO:	[hmmsearch]	264 of 440 task(s) completed
#16 86.32 2024-11-04 20:22:29 INFO:	[hmmsearch]	308 of 440 task(s) completed
#16 86.96 2024-11-04 20:22:30 INFO:	[hmmsearch]	352 of 440 task(s) completed
#16 87.35 2024-11-04 20:22:30 INFO:	[hmmsearch]	396 of 440 task(s) completed
#16 87.74 2024-11-04 20:22:31 INFO:	[hmmsearch]	440 of 440 task(s) completed
#16 87.88 2024-11-04 20:22:31 INFO:	Results:	C:99.1%[S:98.9%,D:0.2%],F:0.0%,M:0.9%,n:440	   
#16 87.88 
#16 87.96 2024-11-04 20:22:31 INFO:	
#16 87.96 
#16 87.96     ---------------------------------------------------
#16 87.96     |Results from generic domain bacteria_odb10        |
#16 87.96     ---------------------------------------------------
#16 87.96     |C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124       |
#16 87.96     |122    Complete BUSCOs (C)                        |
#16 87.96     |122    Complete and single-copy BUSCOs (S)        |
#16 87.96     |0    Complete and duplicated BUSCOs (D)           |
#16 87.96     |0    Fragmented BUSCOs (F)                        |
#16 87.96     |2    Missing BUSCOs (M)                           |
#16 87.96     |124    Total BUSCO groups searched                |
#16 87.96     ---------------------------------------------------
#16 87.96 
#16 87.96     ---------------------------------------------------
#16 87.96     |Results from dataset enterobacterales_odb10       |
#16 87.96     ---------------------------------------------------
#16 87.96     |C:99.1%[S:98.9%,D:0.2%],F:0.0%,M:0.9%,n:440       |
#16 87.96     |436    Complete BUSCOs (C)                        |
#16 87.96     |435    Complete and single-copy BUSCOs (S)        |
#16 87.96     |1    Complete and duplicated BUSCOs (D)           |
#16 87.96     |0    Fragmented BUSCOs (F)                        |
#16 87.96     |4    Missing BUSCOs (M)                           |
#16 87.96     |440    Total BUSCO groups searched                |
#16 87.96     ---------------------------------------------------
#16 87.96 2024-11-04 20:22:31 INFO:	BUSCO analysis done. Total running time: 84 seconds
#16 87.96 2024-11-04 20:22:31 INFO:	Results written in /data/auto
#16 87.96 2024-11-04 20:22:31 INFO:	For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html
#16 87.96 
#16 87.96 2024-11-04 20:22:31 INFO:	Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
#16 88.78 ==> auto/short_summary.generic.bacteria_odb10.auto.txt <==
#16 88.78 # BUSCO version is: 5.8.0 
#16 88.78 # The lineage dataset is: bacteria_odb10 (Creation date: 2024-01-08, number of genomes: 4085, number of BUSCOs: 124)
#16 88.78 # Summarized benchmarking in BUSCO notation for file /data/GCA_010941835.1_PDT000052640.3_genomic.fna
#16 88.78 # BUSCO was run in mode: prok_genome_prod
#16 88.78 # Gene predictor used: prodigal
#16 88.78 
#16 88.78 	***** Results: *****
#16 88.78 
#16 88.78 	C:98.4%[S:98.4%,D:0.0%],F:0.0%,M:1.6%,n:124	   
#16 88.78 	122	Complete BUSCOs (C)			   
#16 88.78 
#16 88.78 ==> auto/short_summary.specific.enterobacterales_odb10.auto.txt <==
#16 88.78 # BUSCO version is: 5.8.0 
#16 88.78 # The lineage dataset is: enterobacterales_odb10 (Creation date: 2024-01-08, number of genomes: 212, number of BUSCOs: 440)
#16 88.78 # Summarized benchmarking in BUSCO notation for file /data/GCA_010941835.1_PDT000052640.3_genomic.fna
#16 88.78 # BUSCO was run in mode: prok_genome_prod
#16 88.78 # Gene predictor used: prodigal
#16 88.78 
#16 88.78 	***** Results: *****
#16 88.78 
#16 88.78 	C:99.1%[S:98.9%,D:0.2%],F:0.0%,M:0.9%,n:440	   
#16 88.78 	436	Complete BUSCOs (C)			   
#16 DONE 89.0s
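
As a quick sanity check, the percentages in the summaries above can be recomputed from the raw counts in the result tables. A small sketch (the helper name is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: percentage of count out of total, one decimal place.
busco_pct() {
  awk -v c="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * c / n }'
}

busco_pct 122 124   # bacteria_odb10, complete          -> 98.4
busco_pct 436 440   # enterobacterales_odb10, complete  -> 99.1
busco_pct 435 440   # single-copy                       -> 98.9
busco_pct 1 440     # duplicated                        -> 0.2
busco_pct 4 440     # missing                           -> 0.9
```

These match the `C:`, `S:`, `D:` and `M:` values that BUSCO reported for both datasets.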

@erinyoung
Contributor

I can't find the output of the other test because the step was served from the cache:

#21 [test 6/6] RUN wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/010/941/835/GCA_010941835.1_PDT000052640.3/GCA_010941835.1_PDT000052640.3_genomic.fna.gz  &&     gzip -d GCA_010941835.1_PDT000052640.3_genomic.fna.gz &&     busco -m genome -i GCA_010941835.1_PDT000052640.3_genomic.fna -o busco_GCA_010941835.1 --cpu 4 --auto-lineage-prok &&     head busco_GCA_010941835.1/short_summary*.txt
#21 sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2 17.83MB / 27.51MB 0.2s
#21 sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2 27.51MB / 27.51MB 0.3s done
#21 extracting sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2
#21 extracting sha256:d9802f032d6798e2086607424bfe88cb8ec1d6f116e11cd99592dcaf261e9cd2 0.6s done
#21 sha256:29a23fd494993cecba84313a53844055a7c548379eb467d1dc6f81b497b6e104 13.63MB / 283.55MB 0.2s
#21 sha256:29a23fd494993cecba84313a53844055a7c548379eb467d1dc6f81b497b6e104 29.36MB / 283.55MB 0.3s
#21 sha256:dadd254d72b825015767ed89b81dd32f35d78c4f914ff44cd4d66ada05c33c08 688.16kB / 688.16kB 0.0s done
#21 extracting sha256:dadd254d72b825015767ed89b81dd32f35d78c4f914ff44cd4d66ada05c33c08 0.0s done
#21 sha256:14f09e2b93e06cfbacb9e0dff7d13376573e933c72de1fd024f284dc4494b5ff 94B / 94B done
#21 extracting sha256:14f09e2b93e06cfbacb9e0dff7d13376573e933c72de1fd024f284dc4494b5ff done
#21 sha256:4e786963f09335f7d68d354c60f46c07594daefe5aa2d4059b897c4370f841af 131B / 131B done
#21 extracting sha256:4e786963f09335f7d68d354c60f46c07594daefe5aa2d4059b897c4370f841af done
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 13.63MB / 119.26MB 0.2s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 31.46MB / 119.26MB 0.3s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 45.09MB / 119.26MB 0.5s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 58.72MB / 119.26MB 0.6s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 71.31MB / 119.26MB 0.8s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 87.03MB / 119.26MB 0.9s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 100.66MB / 119.26MB 1.1s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 116.39MB / 119.26MB 1.2s
#21 sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 119.26MB / 119.26MB 1.4s done
#21 extracting sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d
#21 extracting sha256:9abbb54a7121c6a02c939015e003b236a6b6f2aab9abba0eabfc431a7a64742d 1.8s done
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 12.58MB / 160.07MB 0.2s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 27.26MB / 160.07MB 0.3s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 40.89MB / 160.07MB 0.5s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 52.43MB / 160.07MB 0.6s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 65.01MB / 160.07MB 0.8s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 75.50MB / 160.07MB 0.9s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 91.23MB / 160.07MB 1.1s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 103.81MB / 160.07MB 1.2s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 120.59MB / 160.07MB 1.4s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 134.22MB / 160.07MB 1.5s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 147.85MB / 160.07MB 1.7s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 160.07MB / 160.07MB 1.8s
#21 sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 160.07MB / 160.07MB 1.9s done
#21 extracting sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f
#21 extracting sha256:3aad17cf0176024673c41d087f3276bc937f4d79de2406db83fbafe3d680360f 3.4s done
#21 sha256:ac8d0580fbdf6dc398b2863e6abb3df3e59723b93c2e507c2e26928242c0572a 1.05MB / 1.05MB 0.2s
#21 sha256:ac8d0580fbdf6dc398b2863e6abb3df3e59723b93c2e507c2e26928242c0572a 1.05MB / 1.05MB 0.3s done
#21 extracting sha256:ac8d0580fbdf6dc398b2863e6abb3df3e59723b93c2e507c2e26928242c0572a
#21 extracting sha256:ac8d0580fbdf6dc398b2863e6abb3df3e59723b93c2e507c2e26928242c0572a 0.1s done
#21 sha256:c85af6ebdd100551fc42f1566d27a73cefeaab1a56cb838359b19bc580d05a41 133.12kB / 133.12kB done
#21 extracting sha256:c85af6ebdd100551fc42f1566d27a73cefeaab1a56cb838359b19bc580d05a41 done
#21 sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10 13.63MB / 43.80MB 0.2s
#21 sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10 29.36MB / 43.80MB 0.3s
#21 sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10 43.80MB / 43.80MB 0.5s
#21 sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10 43.80MB / 43.80MB 0.5s done
#21 extracting sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10
#21 extracting sha256:9af5bca65a404dc498ad567b7931beadf7603cf111e8b56b4183ee1d8fc5fb10 0.9s done
#21 CACHED
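
A step marked `CACHED` prints no command output, which is why the test result is missing from this log. A small sketch for spotting such steps in a saved build log; the helper name is hypothetical and the sample lines are taken from the logs in this thread:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a BuildKit log on stdin and list the
# step IDs that were served from the layer cache.
cached_steps() {
  grep -E '^#[0-9]+ CACHED$' | cut -d' ' -f1
}

cached_steps <<'EOF'
#15 DONE 10.0s
#16 DONE 89.0s
#21 CACHED
EOF
# prints: #21
```

When building locally, `docker build --no-cache` forces every step to run again so that the test output is shown.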

@erinyoung
Contributor

Just kidding!!!

#15 [test 1/6] RUN busco -h && generate_plot.py -h
#15 0.411 usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
#15 0.411 
#15 0.411 Welcome to BUSCO 5.8.0: the Benchmarking Universal Single-Copy Ortholog assessment tool.
#15 0.411 For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide. Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
#15 0.411 
#15 0.411 optional arguments:
#15 0.411   -i SEQUENCE_FILE, --in SEQUENCE_FILE
#15 0.411                         Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
#15 0.411   -o OUTPUT, --out OUTPUT
#15 0.411                         Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. The path to the output folder is set with --out_path.
#15 0.411   -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
#15 0.411                         There are three valid modes:
#15 0.411                         - geno or genome, for genome assemblies (DNA)
#18 40.95 2024-11-04 20:04:13 INFO:	[gff2gbSmallDNA.pl]	27 of 38 task(s) completed
#18 40.97 2024-11-04 20:04:13 INFO:	[gff2gbSmallDNA.pl]	31 of 38 task(s) completed
#18 41.00 2024-11-04 20:04:13 INFO:	[gff2gbSmallDNA.pl]	35 of 38 task(s) completed
#18 41.01 2024-11-04 20:04:13 INFO:	[gff2gbSmallDNA.pl]	38 of 38 task(s) completed
#18 41.05 2024-11-04 20:04:13 INFO:	All files converted to short genbank files, now training Augustus using Single-Copy Complete BUSCOs
#18 41.05 2024-11-04 20:04:13 INFO:	Running 1 job(s) on new_species.pl, starting at 11/04/2024 20:04:13
#18 42.06 2024-11-04 20:04:15 INFO:	[new_species.pl]	1 of 1 task(s) completed
#18 42.09 2024-11-04 20:04:15 INFO:	Running 1 job(s) on etraining, starting at 11/04/2024 20:04:15
#18 43.16 2024-11-04 20:04:16 INFO:	[etraining]	1 of 1 task(s) completed
#18 43.20 2024-11-04 20:04:16 INFO:	Re-running Augustus with the new metaparameters, number of target BUSCOs: 217
#18 43.20 2024-11-04 20:04:16 INFO:	Running Augustus gene predictor on BLAST search results.
#18 43.20 2024-11-04 20:04:16 INFO:	Running Augustus prediction using BUSCO_test_eukaryota_augustus as species:
#18 43.20 2024-11-04 20:04:16 INFO:	Running 40 job(s) on augustus, starting at 11/04/2024 20:04:16
#18 44.52 2024-11-04 20:04:17 INFO:	[augustus]	4 of 40 task(s) completed
#18 44.76 2024-11-04 20:04:17 INFO:	[augustus]	8 of 40 task(s) completed
#18 44.96 2024-11-04 20:04:17 INFO:	[augustus]	12 of 40 task(s) completed
#18 45.29 2024-11-04 20:04:18 INFO:	[augustus]	16 of 40 task(s) completed
#18 45.69 2024-11-04 20:04:18 INFO:	[augustus]	20 of 40 task(s) completed
#18 45.97 2024-11-04 20:04:18 INFO:	[augustus]	24 of 40 task(s) completed
#18 46.32 2024-11-04 20:04:19 INFO:	[augustus]	28 of 40 task(s) completed
#18 47.19 2024-11-04 20:04:20 INFO:	[augustus]	32 of 40 task(s) completed
#18 49.05 2024-11-04 20:04:21 INFO:	[augustus]	36 of 40 task(s) completed
#18 49.67 2024-11-04 20:04:22 INFO:	[augustus]	40 of 40 task(s) completed
#18 49.70 2024-11-04 20:04:22 INFO:	Extracting predicted proteins...
#18 49.71 2024-11-04 20:04:22 INFO:	***** Run HMMER on gene sequences *****
#18 49.71 2024-11-04 20:04:22 INFO:	Running 35 job(s) on hmmsearch, starting at 11/04/2024 20:04:22
#18 50.73 2024-11-04 20:04:23 INFO:	[hmmsearch]	4 of 35 task(s) completed
#18 50.75 2024-11-04 20:04:23 INFO:	[hmmsearch]	7 of 35 task(s) completed
#18 50.77 2024-11-04 20:04:23 INFO:	[hmmsearch]	11 of 35 task(s) completed
#18 50.80 2024-11-04 20:04:23 INFO:	[hmmsearch]	14 of 35 task(s) completed
#18 50.82 2024-11-04 20:04:23 INFO:	[hmmsearch]	18 of 35 task(s) completed
#18 50.84 2024-11-04 20:04:23 INFO:	[hmmsearch]	21 of 35 task(s) completed
#18 50.87 2024-11-04 20:04:23 INFO:	[hmmsearch]	25 of 35 task(s) completed
#18 50.88 2024-11-04 20:04:23 INFO:	[hmmsearch]	28 of 35 task(s) completed
#18 50.89 2024-11-04 20:04:23 INFO:	[hmmsearch]	32 of 35 task(s) completed
#18 50.90 2024-11-04 20:04:23 INFO:	[hmmsearch]	35 of 35 task(s) completed
#18 50.94 2024-11-04 20:04:23 INFO:	42 exons in total
#18 50.94 2024-11-04 20:04:23 INFO:	Results:	C:18.4%[S:18.4%,D:0.0%],F:0.8%,M:80.8%,n:255	   
#18 50.94 
#18 50.97 2024-11-04 20:04:23 INFO:	
#18 50.97 
#18 50.97     ---------------------------------------------------
#18 50.97     |Results from dataset eukaryota_odb10              |
#18 50.97     ---------------------------------------------------
#18 50.97     |C:18.4%[S:18.4%,D:0.0%],F:0.8%,M:80.8%,n:255      |
#18 50.97     |47    Complete BUSCOs (C)                         |
#18 50.97     |47    Complete and single-copy BUSCOs (S)         |
#18 50.97     |0    Complete and duplicated BUSCOs (D)           |
#18 50.97     |2    Fragmented BUSCOs (F)                        |
#18 50.97     |206    Missing BUSCOs (M)                         |
#18 50.97     |255    Total BUSCO groups searched                |
#18 50.97     ---------------------------------------------------
#18 50.97 2024-11-04 20:04:23 INFO:	BUSCO analysis done. Total running time: 48 seconds
#18 50.97 2024-11-04 20:04:23 INFO:	Results written in /data/test_eukaryota_augustus
#18 50.97 2024-11-04 20:04:23 INFO:	For assistance with interpreting the results, please consult the userguide: https://busco.ezlab.org/busco_userguide.html
#18 50.97 
#18 50.97 2024-11-04 20:04:23 INFO:	Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO
#18 DONE 51.4s

@erinyoung
Contributor

Thank you for putting both of these together!!!

@erinyoung
Contributor

It looks like they both work and I don't have any changes to recommend.

@erinyoung erinyoung merged commit fe5ae13 into StaPH-B:master Nov 5, 2024
3 checks passed
@erinyoung
Contributor

Thank you for putting both of these together!

You can check the status of the deploy at https://github.com/StaPH-B/docker-builds/actions/runs/11693868144 and https://github.com/StaPH-B/docker-builds/actions/runs/11693863688

@Kincekara Kincekara deleted the busco branch November 6, 2024 18:20
Development

Successfully merging this pull request may close these issues.

[Container Request]: busco version 5.8.0
2 participants