Genome Additions Master Ticket #242
For reference: master spreadsheet of dbkeys and indexes, done and to-do. Older genomes removed. dbkeys with fasta loaded. |
Issues detected: To meet the goal of consistency in nomenclature that permits DMs to function, change the genome label in all_fasta (kill "full" in all descriptions/dbkeys). Other loc files may need mods.
|
Genomes that need followup: Not at http://genome.ucsc.edu (browser or download). Are at http://genome-test.cse.ucsc.edu/. Pending release?
|
Completed and indexes promoted to http://usegalaxy.org (Galaxy Main)
April 2016
Fasta
New genomes (confirmed, to be indexed for all)
2bit
Note: Lastz indexes created by same DM
New genomes
Existing
Sam
New genomes
Existing
Picard
New genomes
Existing
Bowtie2/Tophat2
Issue about Bowtie2 DM creating duplicate indexes: galaxyproject/tools-devteam#319
New genomes
Existing
BWA/BWA-MEM
New genomes
Existing
HISAT2
New genomes
Existing
Liftover
See distinct tracking checklist, below |
2018
Fasta
New genomes (confirmed, to be indexed for all)
2bit
Note: Lastz indexes created by same DM
New genomes
Existing
Sam
New genomes
Existing
Picard
New genomes
Existing
Bowtie2/Tophat2
New genomes
Existing
BWA/BWA-MEM
New genomes
Existing
HISAT2
New genomes
Existing
Liftover
See distinct tracking checklist, below
RNA STAR
Fast tracked genomes https://github.com/galaxyproject/galaxy/issues/1470#issuecomment-307517254
New genomes
Existing
|
New genomes under review (source/licence)
|
Liftover
Needs DM: galaxyproject/galaxy#1904
Workaround: Use the LiftOver tool at UCSC (the source for the wrapped version in Galaxy) and upload the results to Galaxy to use with other analyses. http://genome.ucsc.edu/cgi-bin/hgLiftOver
New genomes
Existing (update)
|
@jennaj I updated the April 2016 comment to include the missing BWA indexes that I was able to build with the BWT-SW algorithm. I rebuilt some (like galGal3 and panTro3) that have full/canonical variants. The only difference from the original DM run is that after selecting the correct build from "Source FASTA Sequence", I put the build variant name (e.g. galGal3canon) in the "ID for sequence" field. Otherwise the builds clobber each other in the index dir on disk: the "ID for sequence" field is used for naming the index subdirectory and defaults to the dbkey, which for both full and canonical builds is still just e.g. galGal3. This could be a bit more intuitive in the DM; I had no idea what "ID for sequence" was for until I noticed that two loc file entries pointed to the same directory/indexes on disk and then dug into the DM code to understand it. I rebuilt these for any indexes which had the variants built originally, and cleaned up the old directories and their entries in the location files. These BWA indexes and the rest of the indexes in that comment are now in the process of being published to CVMFS and, once done (this may take a long time), will be available on usegalaxy.org after a restart. I'll comment again when it's all ready. |
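The clobbering described above shows up as two location-file entries pointing at the same directory. A minimal sketch of a check for that, assuming tab-separated .loc lines with the index path in the last column and the entry ID in the first; the function name, the column layout, and the example paths are illustrative assumptions, not taken from the DM code:

```python
from collections import defaultdict
import os


def find_shared_index_dirs(loc_lines):
    """Group .loc entries by the directory their path column points at.

    Assumption: lines are tab-separated, the entry ID is the first column
    and the index path is the last column. Comment and blank lines are skipped.
    """
    by_dir = defaultdict(list)
    for line in loc_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        entry_id, path = fields[0], fields[-1]
        by_dir[os.path.dirname(path)].append(entry_id)
    # Keep only directories referenced by more than one entry.
    return {d: ids for d, ids in by_dir.items() if len(ids) > 1}


# Hypothetical entries: both galGal3 builds ended up in the dbkey-named directory.
example = [
    "#value\tdbkey\tname\tpath",
    "galGal3\tgalGal3\tChicken (galGal3) Full\t/data/galGal3/bwa_index/galGal3/galGal3.fa",
    "galGal3canon\tgalGal3\tChicken (galGal3) Canonical\t/data/galGal3/bwa_index/galGal3/galGal3.fa",
    "panTro3\tpanTro3\tChimp (panTro3)\t/data/panTro3/bwa_index/panTro3/panTro3.fa",
]
print(find_shared_index_dirs(example))
```

With the hypothetical entries above, only the galGal3 directory is reported, listing both the full and canonical entry IDs. Giving each variant its own "ID for sequence" value would place them in separate subdirectories, and the check would return an empty result.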
@jennaj The publishing is finished and Main has been restarted. |
Add hg38 MAF alignments. Request: https://biostar.usegalaxy.org/p/17690 |
Hello, I saw that 7 weeks ago another user made this same request for a newer version of the sheep reference genome. You currently have OviAri1, which is 6 years old, and there are two newer versions (about to be three). Could we get a newer version? Sheep are an amazing agricultural species, important for meat, milk, and wool production, and more researchers should study them! I request the current version on NCBI/ENSEMBL, Ovis aries Oar_v4.0 (from late 2015), for all tools: Bowtie and other mapping tools, and ChIP-seq and RNA-seq tools too. Thank you for considering! |
Hi, I think the best way is to use your genome of interest: use the fasta. I don't think they are uploading any more reference genomes. Sayali. Update by @jennaj: Yes, use a custom reference genome for now. I will add in sheep and other requests to the next list of updates https://github.com/galaxyproject/galaxy/issues/1470#issuecomment-208444904 |
It would be great if you included the tomato 2.40 genome from: @jennaj Yes, they are in NCBI (tomato and pepper): |
Priority indexes RNA STAR
Indexes
Future requests (may be moved to a new post in this same issue)
|
We're looking for an X. tropicalis index to be added for HISAT2
|
New Pig genome: genome: lookup table nice names: https://test.galaxyproject.org/u/bickj/h/pig-genome-lookup-table |
Dear Galaxy Team, If we could get Brassica napus (Bna) as a built-in genome, that would be amazing: Please note that although the annotation is titled v5 while the genome itself is v4.1, it should work just fine, as we have had no problems with it. |
Add NCBI's Xenopus laevis and Xenopus tropicalis genomes (indexed for all tools). The genome is at https://usegalaxy.eu -- so when we get the data synced between all mirrors that might be the best solution. |
Request: add Medicago truncatula https://biostar.usegalaxy.org/p/5916/#30132 To-do: Check if present in ELIXIR plant genomes already indexed (to be added in cvmfs): https://www.elixir-europe.org/about/groups/galaxy-wg |
Request: add https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Canis_lupus_dingo/100/ re: https://help.galaxyproject.org/t/dingo-reference-genome-upload-request/529 To-do: Check if UCSC has processed the genome |
Dear Galaxy Team, Thanks for your amazing work. Please note that the genome is version 2.0 and made from the surface eco-morphotype (unlike the previous version 1.02 from cave eco-morphotype). In the meantime, I am working with a custom genome. Best, |
Migrate to usegalaxy-playbook? |
Request: Genome: Citrus sinensis v1.1 |
Request: Human herpesvirus 1 with ref accession number NC_001806 |
Request: Genome: Tribolium castaneum genome assembly (Tcas5.2) Source: https://www.ncbi.nlm.nih.gov/genome?term=tribolium%20castaneum |
Request: Dada tools: #273 |
Dear Galaxy Team, It would be great if the genome and annotation release of Physcomitrella patens can be added to Galaxy Main. They are available at NCBI: Best, |
@echoyps & everyone else with genome requests: We will now be adding new genomes over the next few months and throughout the upcoming year. Please continue to post requests here. UCSC and NCBI are the preferred data sources. Others are possible. However requested, be specific.
Reminders:
Anyone can use genome (or transcriptome/exome)
Be sure to format the genome
Mapping jobs will usually not "fail" due to chromosome identifier mismatch issues. Instead, if the annotation is input during the mapping step, the annotation will not really be used, creating problematic scientific results that may not be obvious to detect. Tools used downstream with a mismatched genome+annotation can also produce problematic scientific results that are not obvious, or may fail outright with errors that are difficult to interpret. Problematic annotation formatting itself will also lead to problems. Try to avoid issues by preparing your inputs correctly at the start :)
Finally, when loading these data with the
If you ever have a problem that you cannot figure out how to resolve, know that the vast majority of tool errors or unexpected results are due to input issues that can be fixed to achieve a successful and correct scientific/technical result. First, review the tool form help; most have examples of the expected input's content and format. Next, review our Troubleshooting and other FAQs. If those do not resolve the issue, the Galaxy Help forum is a great place to review prior Q&A or to ask a novel question. The Galaxy Training Network (GTN) tutorials are also a very useful resource: compare your methods to the examples.
I'm only posting this advice here now since it hasn't been covered for a while at GitHub, and there are newer related FAQs plus prior Q&A available. Any followup/clarification should be asked about at Galaxy Help (not here). The FAQs/links below will help with all of the above.
All FAQs: https://galaxyproject.org/support/
Start with these to learn how to use a custom genome and the associated annotation:
Error or unexpected result FAQ:
Galaxy Help forum:
GTN Tutorials:
Thanks! Jen |
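Because a chromosome identifier mismatch usually does not make a mapping job fail, it can help to compare identifiers before running anything. A minimal sketch, assuming a FASTA genome and a tab-separated annotation (GTF/GFF style) whose first column holds the sequence name; the function names and the example data are hypothetical, and this is not a Galaxy tool:

```python
def fasta_ids(fasta_lines):
    """Return the set of sequence identifiers (first word after '>')."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                ids.add(header[0])
    return ids


def annotation_seqnames(annotation_lines):
    """Return the set of values found in column 1 of a tab-separated annotation."""
    names = set()
    for line in annotation_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/pragma lines
        names.add(line.split("\t", 1)[0].strip())
    return names


def unmatched_seqnames(fasta_lines, annotation_lines):
    """Annotation sequence names that have no matching FASTA identifier."""
    return sorted(annotation_seqnames(annotation_lines) - fasta_ids(fasta_lines))


# Hypothetical inputs: the genome uses "chr1", the annotation uses "1".
fasta = [">chr1 example chromosome 1\n", "ACGT\n", ">chr2\n", "GGCC\n"]
gtf = [
    "#!genome-build example\n",
    '1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n',
    'chr2\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n',
]
print(unmatched_seqnames(fasta, gtf))
```

An empty list means every sequence name in the annotation also appears in the genome. Any names that are reported (here, "1") point at the kind of mismatch that leads to the annotation being silently ignored.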
Hi! Could you kindly add the Macaca fascicularis genome for BWA? It's now at NCBI: https://www.ncbi.nlm.nih.gov/genome/776 [There is a 2015 petition for the same here: https://trello.com/c/mJWnAuuQ/1511-reference-genome-requests-for-http-usegalaxyorg by AmyK, Feb 4, 2015 at 6:47 PM, and another in 2016, but I guess the source wasn't validated then?] Best! David |
Genome and indexes for CVMFS and http://usegalaxy.org
CONVERT THIS ISSUE TO A PROJECT @jennaj
Current plans are to bring http://usegalaxy.org up to date with UCSC's released genomes, indexed for all tools, so those do not need to be requested by users at this time.
Main:
Other reference data:
Admin/Local Data and DM usage enhancements are included here.
Resolved data issues:
New genomes and indexes will be installed at https://test.galaxyproject.org/ first for testing. If your genome is listed and checked as complete, community testing and feedback can be posted to https://biostar.usegalaxy.org or through a bug report from the error dataset (from a mapping tool, etc).
All data will later be promoted to http://usegalaxy.org. Timeline is not firm.
Making a Reference genome request