What is the process for updating databases #61
-
Hello, there is no official distribution of updated databases. As you already know, all databases are hosted at https://edge-dl.lanl.gov/EDGE/dev/.

For the BWA index, users can download the sequences from NCBI, build their own BWA index to update it, and also update the associated taxonomy database as described here.

Since GOTTCHA2 is the sequel to GOTTCHA, we recommend GOTTCHA2 as the preferred tool. @poeli recently built an updated GOTTCHA2 database, but it is too big to be hosted on the edge-dl site. We are looking for a better solution for hosting large db files.

For the diamond database, after the diamond database is created, users should update the corresponding taxonomy files (names.dmp, nodes.dmp, and merged.dmp from the NCBI taxdump.tar.gz) in $EDGE_HOME/database/diamond/taxonomy to make EDGE aware of the updated TAXIDs.

--Chienchi
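The taxonomy refresh for diamond described above could be scripted roughly as follows. This is only a minimal sketch, not an official EDGE tool: the function name `update_taxonomy` is hypothetical, and it assumes you have already downloaded taxdump.tar.gz from NCBI and that overwriting the three .dmp files in the taxonomy directory is all that is required.

```python
import os
import shutil
import tarfile
from pathlib import Path

# The three files named in the reply above.
TAXONOMY_FILES = ("names.dmp", "nodes.dmp", "merged.dmp")


def update_taxonomy(taxdump_path, taxonomy_dir):
    """Copy names.dmp, nodes.dmp and merged.dmp out of an NCBI taxdump archive.

    Hypothetical helper: taxonomy_dir is assumed to be
    $EDGE_HOME/database/diamond/taxonomy. Existing files are overwritten.
    """
    taxonomy_dir = Path(taxonomy_dir)
    taxonomy_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    with tarfile.open(taxdump_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Match on the base name so a leading "./" in the archive is harmless.
            base = os.path.basename(member.name)
            if member.isfile() and base in TAXONOMY_FILES:
                source = archive.extractfile(member)
                with source, open(taxonomy_dir / base, "wb") as target:
                    shutil.copyfileobj(source, target)
                copied.append(base)
    missing = sorted(set(TAXONOMY_FILES) - set(copied))
    if missing:
        raise FileNotFoundError("not found in archive: " + ", ".join(missing))
    return sorted(copied)


# Example call (paths are assumptions; back up the old files first):
# edge_home = os.environ["EDGE_HOME"]
# update_taxonomy("taxdump.tar.gz",
#                 os.path.join(edge_home, "database", "diamond", "taxonomy"))
```

It would be sensible to keep a copy of the old .dmp files before running something like this, so the previous state can be restored if a run fails afterwards.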
-
Hello. We are interested in updating the taxonomic profiling databases on our local instance. Are there any instructions for how to do this?
First, I want to make sure that there is not an 'official' distribution of updated databases. I looked at the databases distribution that is described in the installation instructions (https://edge-dl.lanl.gov/EDGE/dev/), and it looks like these are several years old. For instance, the BWA distribution has a 'modified date' of March 2020, but the actual database is from 2017. Likewise, the GOTTCHA DB files are from 2015.
Since there doesn't seem to be an updated DB collection for EDGE, I attempted to create updated databases for BWA and Diamond. I am unsure whether these can be updated independently of the other databases -- specifically, whether EDGE needs additional lookup tables to provide a taxonomic interpretation of these results. My updated BWA library appears to work correctly. However, my Diamond database fails, specifically because EDGE is not aware of the TAXIDs that come from Diamond (I created the database using the taxonmap, taxonnames, and taxonnodes options).
Below is the error message from the bottom of the log file at
ReadsBasedAnalysis/Taxonomy/log/allReads-diamond.log