-
Is there an easy way to convert the data we're displaying in jbrowse1 so it can be displayed in jbrowse2? this seems like a very basic question but i can't seem to find docs on the topic. Our IT group has already set up jbrowse2, and now i have to get our data to display on it. i'm hoping this is an easy conversion? thanks.
Replies: 8 comments · 39 replies
-
Hi there,
-
Thanks very much for your offer to help. our jbrowse1 sites are public and can be viewed at:
FYI, we really just want to replicate in JB2 what we currently have in JB1, ie, we're not looking to take advantage of any new features that JB2 provides. The motivating factor in moving to JB2 is that our IT department identified some security vulnerabilities in the JB1 javascript library that are not present in JB2. and... you folks are focusing your dev efforts on JB2, so that's where we should be. I've attached the jbrowse.conf (which seems to have most lines commented out) and the hopefully more informative trackList.json. Both of these files are from the MTB server.
jbrowse.conf.txt
thanks for your help, Mike.
-
Here is an example script that I made to try to help convert your jbrowse 1 trackList.json into a jbrowse 2 config.json. It just does a very basic conversion and does not attempt to copy color settings and such.

const fs = require("fs");

const trackList = JSON.parse(fs.readFileSync("./trackList.json", "utf8"));
const newTracks = [];
const unconverted = [];

// Fields shared by every converted track.
function basicData(track) {
  return {
    name: track.key || track.label,
    trackId: track.label,
    assemblyNames: ["salmonella"],
    // Split a "/"-separated jbrowse 1 category into an array of names.
    // Optional chaining keeps tracks without metadata from throwing.
    category: (track.category || track.metadata?.category)
      ?.split("/")
      .map((t) => t.trim()),
  };
}

for (const track of trackList.tracks) {
  if (track.type?.includes("XYPlot")) {
    // XYPlot (bigwig) track -> QuantitativeTrack
    newTracks.push({
      ...basicData(track),
      type: "QuantitativeTrack",
      adapter: {
        type: "BigWigAdapter",
        bigWigLocation: { uri: track.urlTemplate },
      },
    });
  } else if (track.storeClass?.includes("NCList")) {
    // NCList feature track -> FeatureTrack
    newTracks.push({
      ...basicData(track),
      type: "FeatureTrack",
      adapter: {
        type: "NCListAdapter",
        rootUrlTemplate: { uri: track.urlTemplate },
      },
    });
  } else {
    // Anything else is set aside for manual conversion.
    unconverted.push(track);
  }
}

console.log(
  `failed to convert: ${unconverted.length} tracks, written to unconverted.json`,
);
console.log(`converted: ${newTracks.length} tracks, written to config.json`);

fs.writeFileSync("unconverted.json", JSON.stringify(unconverted, null, 2));
fs.writeFileSync(
  "config.json",
  JSON.stringify(
    {
      assemblies: [
        {
          name: "salmonella",
          sequence: {
            type: "ReferenceSequenceTrack",
            trackId: "salmonella_seq",
            adapter: {
              type: "IndexedFastaAdapter",
              fastaLocation: {
                uri: "salmonella.fa",
              },
              faiLocation: {
                uri: "salmonella.fa.fai",
              },
            },
          },
        },
      ],
      tracks: newTracks,
    },
    null,
    2,
  ),
);

This is not a complete ready-to-go conversion, but it created many of the basic feature tracks and bigwig tracks.
Example new config.json for salmonella.wadsworth.org
-
You would basically put the config.json in the data folder, and also create a
-
I will note, regarding the security vulnerabilities: the jbrowse 1 security issues (and jbrowse 2's, for that matter) are likely not very severe, because both are "pure client side apps" (no server side code), so they would not generally lead to such devastating things as remote code execution. At worst, it is an XSS, which is not great, but generally does not get a high CVE severity score. Hope that helps clarify :)
-
thanks much for the help. i hope to work on this throughout this week.
-
i installed the config.json file (sort of) as you instructed, but i'm getting errors. i've fiddled a bit and am still getting errors. the 'sort of' part is that your response above says: "put the config.json in the data folder, and also create a samtools faidx salmonella.fa and put that in the data folder as well".
Error: [mobx-state-tree] Error while converting
i appreciate any help you can offer. thx.
-
hi there, your changes should be fine, I just had a bug in the code/config that I pasted... I needed to replace "label" with "trackId". I know the errors are pretty unreadable... we have some ongoing brainstorms to try to improve them. see my original post above for the corrected code and outputted config.
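A quick sanity check along these lines could catch a missing "trackId" before JBrowse 2 reports a mobx-state-tree error. This is only a minimal sketch and not part of JBrowse: `findTrackProblems` is a hypothetical helper, and the list of required keys is an assumption based on the fields that the conversion script above emits, not on the real JBrowse 2 schema.

```javascript
// Hypothetical helper, not a JBrowse API.
// Assumption: every track should carry the keys the conversion script emits.
const REQUIRED_KEYS = ["trackId", "name", "assemblyNames", "type", "adapter"];

function findTrackProblems(config) {
  const problems = [];
  const seenIds = new Set();
  for (const [i, track] of (config.tracks || []).entries()) {
    // Report each required key that is absent on this track.
    for (const key of REQUIRED_KEYS) {
      if (track[key] === undefined) {
        problems.push(`track #${i}: missing "${key}"`);
      }
    }
    // Report trackIds that are used more than once.
    if (track.trackId !== undefined) {
      if (seenIds.has(track.trackId)) {
        problems.push(`track #${i}: duplicate trackId "${track.trackId}"`);
      }
      seenIds.add(track.trackId);
    }
  }
  return problems;
}
```

Running it on the parsed config.json and printing the returned array gives a short, readable list to fix before loading the file in the browser.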
-
thanks for the corrected code. i've had some success (see attached image). Clearly the data is being read in some fashion, but the actual data itself is not displaying. any suggestions appreciated.
-
@mjpworth good to see. The most common reason that data does not display is that the refNames in the data files don't match the refNames in the reference genome FASTA file. I can see that JBrowse thinks the "refName" that you are browsing is "14028s_chr.fa". normally it is something like "chr1" rather than a fasta filename. you can use "bigWigInfo" to print out the refNames the bigwigs are using, and then you can also look at the FASTA file to see if they match.
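The comparison described here can be sketched in a few lines. This is an illustration under stated assumptions, not a JBrowse feature: `refNamesFromFai` and `unmatchedRefNames` are hypothetical helper names, the assembly refNames are read from the first tab-separated column of the samtools .fai index, and the data-file refNames are assumed to have been collected by hand (for example from bigWigInfo output).

```javascript
// Hypothetical helpers for comparing refNames; not part of JBrowse.

// A .fai index is tab-separated and its first column is the sequence name.
function refNamesFromFai(faiText) {
  return faiText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => line.split("\t")[0]);
}

// Return the data-file refNames that do not occur in the assembly.
function unmatchedRefNames(assemblyRefNames, dataRefNames) {
  const known = new Set(assemblyRefNames);
  return dataRefNames.filter((name) => !known.has(name));
}
```

If `unmatchedRefNames` returns anything, those are the names that would need either a refName alias or a rebuilt reference or data file.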
-
Again, thank you for your help. I am in the process of getting bigWigInfo installed... but i don't think this is the issue. i will be happy to be wrong! i understand it's odd, but this is the header line of the fasta file:
-
I would still suspect that it is a refName mismatch. as you mention, the refNames do look odd, so it would be worth checking that they match between the BigWig and the reference genome FASTA
-
also, the easiest way to install e.g. bigWigInfo IMO is to download the binaries from https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/ if you are on linux (mac builds are also under https://hgdownload.cse.ucsc.edu/admin/exe/). there is also "brew install brewsci/bio/kent-tools" if you use homebrew, and probably other options
-
final add-on: I'd like jbrowse itself to do a better job of reporting the refNames that a track uses. I proposed a PR here that could help
-
here's a surprise, I think you're right! :) this .bw file was created from a wig file whose first two lines are:
although there is no 'refName' here either, there is a 'chrom' and it's NOT 14028s_chr.fa
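For anyone checking their own wig sources, the chrom names can be pulled out of the declaration lines with a short script. This is a minimal sketch: `wigChromNames` is a hypothetical helper, and it assumes the standard wiggle layout in which each data block starts with a "fixedStep" or "variableStep" line carrying a chrom=... field.

```javascript
// Hypothetical helper: collect the distinct chrom= values from the
// "fixedStep" / "variableStep" declaration lines of a wiggle file.
function wigChromNames(wigText) {
  const names = new Set();
  for (const line of wigText.split("\n")) {
    // Skip track lines and data lines; only declaration lines name a chrom.
    if (!/^(fixedStep|variableStep)\s/.test(line)) continue;
    const match = line.match(/\bchrom=(\S+)/);
    if (match) names.add(match[1]);
  }
  return [...names];
}
```

The names it returns are the ones the resulting bigwig will use, so they are what has to match (or be aliased to) the refNames of the assembly.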
-
you can now create a "refname alias" configuration. this is documented here https://jbrowse.org/jb2/docs/config_guides/assemblies/#configuring-reference-name-aliasing
for our human genome, it is basically for example this file https://s3.amazonaws.com/jbrowse.org/genomes/hg19/hg19_aliases.txt which says e.g. "1" or "chr1" or "NC_000001.10" all refer to the same sequence.
a simplified way of thinking about how this affects how tracks are displayed is like this: the data file (e.g. a bigwig file) tells jbrowse what reference name system it uses, and then jbrowse says "OK, I will request data from the file using that naming scheme!"
however, i would also double check that this is really intended. the refNames that jbrowse refers to are basically chromosomes. we use the term refName to be a little more 'broadly encompassing' of sequences that are not just chromosomes, as a matter of being sort of inclusive at the expense of being a bit vague. the refNames for the human genome are like chr1, chr2, chr3, chr4, etc.
I am just making sort of guesses from the limited info I have from this thread, but it is a little odd for the refNames to look like a filename like "14028s_chr.fa". i see on this jbrowse 1 for salmonella i found from google (maybe the one you set up!) https://salmonella.wadsworth.org/?loc=14028s_plasmid%3A18721..93832&tracks=DNA%2Cdel-hilC_pBAD-empty_1_plas_minus&highlight= it is "14028s_chromo" and "14028s_plasmid". that looks more natural to me as it does not have the .fa file extension in the refName.
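A rough sketch of what the alias setup could look like when scripted is below. Treat it as an assumption-laden illustration: `buildAliasFile` and `withRefNameAliases` are hypothetical helpers, the file name "aliases.txt" is made up, and the `refNameAliases` block with a `RefNameAliasAdapter` follows my reading of the linked docs, which should be checked for the exact shape and for which column must hold the FASTA's own names.

```javascript
// Hypothetical helpers; verify the config shape against the linked docs.

// One row per sequence, with all names for that sequence separated by tabs.
function buildAliasFile(rows) {
  return rows.map((names) => names.join("\t")).join("\n") + "\n";
}

// Return a copy of an assembly config that points at the alias file.
function withRefNameAliases(assembly, aliasUri) {
  return {
    ...assembly,
    refNameAliases: {
      adapter: {
        type: "RefNameAliasAdapter",
        location: { uri: aliasUri },
      },
    },
  };
}

// Example with the names seen in this thread (illustrative only):
// buildAliasFile([["14028s_chr.fa", "14028s_chromo"]]) gives one tab-separated row,
// and withRefNameAliases(assembly, "aliases.txt") adds the refNameAliases block.
```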
-
i should add i did this work a long time ago... i think 2015 is when i first set this up on jb1.
-
bump. output of bigWigInfo, as you requested, is above.
-
just added a reply above. i think the refnames are more important than the assemblynames, but if interested you can also join our office hours :) it's part of a new outreach effort we are trying and you can schedule 1-on-1 help with us! https://jbrowse.org/jb2/contact/
-
thank you for this info. most of what you state makes sense, but i will still need to work through the details.
-
see the other post with the google drive folder... i think it is an effective translation and didn't involve a "rebuild"! I just copied your old bigwigs into it: #3927 (reply in thread)
-
Is there someplace in the jbrowse2 docs that defines the syntax (not of json) and/or meaning of the labels (if labels is the right word), and the acceptable values, in a block like this:
-
Colin,
-
@mjpworth I think I had another script (I might have lost the source code) that initially created the display config with the min/max score values, and then I added that other test.js script to add the color to that display config.
you can use whatever language is easiest for you; basically you just manipulate the JSON of the config file. if python is easiest, it is probably fine to make a config.json manipulator in python.
the config.json is admittedly tricky and you may come up against other config scenarios. one way I sometimes recommend people learn about the config format is to use the in-app config editor (the "Settings" panel) and then copy that config manually back into your config.json, or use it as a template for other tracks. by using the in-app config editor, you know that the resulting config json that is exported is something the app understands.
This is a screenshot showing the steps I take for this. The example is a multibigwig but it works on other track types.
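As a rough idea of what such a config manipulator might do, here is a sketch that copies the score range and color from a jbrowse 1 track onto the matching jbrowse 2 track. It is not the lost script: `withWiggleDisplay` is a hypothetical helper, the jbrowse 1 field names (`min_score`, `max_score`, `style.pos_color`) and the jbrowse 2 display shape (`LinearWiggleDisplay`, `minScore`, `maxScore`, `XYPlotRenderer`) are assumptions that are best confirmed by exporting a track from the in-app config editor.

```javascript
// Hypothetical helper; field names are assumptions to verify in the config editor.
function withWiggleDisplay(jb2Track, jb1Track) {
  const display = {
    type: "LinearWiggleDisplay",
    displayId: `${jb2Track.trackId}-LinearWiggleDisplay`,
  };
  // Copy the score range only when the jbrowse 1 track defines it.
  if (jb1Track.min_score !== undefined) {
    display.minScore = Number(jb1Track.min_score);
  }
  if (jb1Track.max_score !== undefined) {
    display.maxScore = Number(jb1Track.max_score);
  }
  // Copy the plot color only when the jbrowse 1 track defines it.
  const color = jb1Track.style?.pos_color;
  if (color) {
    display.renderers = {
      XYPlotRenderer: { type: "XYPlotRenderer", color },
    };
  }
  // Return a new track object rather than mutating the input.
  return { ...jb2Track, displays: [display] };
}
```

Matching the two track lists by `label` (jbrowse 1) and `trackId` (jbrowse 2) would let this run over every quantitative track in one pass.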
-
Colin, i broke down and wrote a perl(!) conversion script that reads both the (old) jb1 trackList.json file and the new (incomplete) jb2 config.json file. it pulls appropriate data out of the jb1 trackList file and adds it to the corresponding track in the jb2 config.json file. Mostly, it worked fine, ie, colors and min and maxScores seem to be in place.
I executed both of these commands and got similar results (listed below):
output:
a 'trix' dir was created, but it's empty. this text was added to the config.json file (at its end):
any suggestions? thanks. Mike; i think i'm close!
-
The jbrowse text-index command currently only works on VcfTabixAdapter and Gff3TabixAdapter.
The NCList adapter can be used, but you have to hand configure it to point at an existing JBrowse1TextSearchAdapter (https://jbrowse.org/jb2/docs/config/jbrowse1textsearchadapter/) index that was created by the older bin/generate-names.pl. @scottcain has used this method for wormbase and you can see these in this config.json (it is hand edited into the config.json; there are no tools in jbrowse cli for this).
my recommendation would be to use Gff3Tabix instead of the NCListAdapter tracks, and then you can use the newer
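To see which tracks in a config.json fall on which side of that limitation, a small script can group them by adapter type. This is a sketch with hypothetical helper names (`splitByIndexable`, `gff3TabixTrack`); the Gff3TabixAdapter config shape shown is my understanding of the usual layout and assumes the .tbi index sits next to the bgzipped GFF3.

```javascript
// Adapter types that text-index handles, per the comment above.
const INDEXABLE_ADAPTERS = new Set(["VcfTabixAdapter", "Gff3TabixAdapter"]);

// Hypothetical helper: list trackIds by whether text-index can handle them.
function splitByIndexable(config) {
  const indexable = [];
  const notIndexable = [];
  for (const track of config.tracks || []) {
    const target = INDEXABLE_ADAPTERS.has(track.adapter?.type)
      ? indexable
      : notIndexable;
    target.push(track.trackId);
  }
  return { indexable, notIndexable };
}

// Hypothetical helper: build a Gff3Tabix feature track to replace an NCList one.
// Assumption: the tabix index is stored at "<gzUri>.tbi".
function gff3TabixTrack(trackId, name, assemblyName, gzUri) {
  return {
    type: "FeatureTrack",
    trackId,
    name,
    assemblyNames: [assemblyName],
    adapter: {
      type: "Gff3TabixAdapter",
      gffGzLocation: { uri: gzUri },
      index: { location: { uri: `${gzUri}.tbi` } },
    },
  };
}
```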
-
note: the folder on the google drive i posted includes a gff3tabix with trix index created by
-
the fact that it is empty is probably the most notable thing there, do you have an example of the gff3?
-
I've attached the top 30 lines of my gff3 - as a txt file. the thing i notice is that i only have CDS, ie, i don't have any features labeled as 'gene'.
-
The text-index command by default ignores CDS features, as the IDs for these, when they have gene parents, are often 'meaningless'. Unfortunately, that is not a great default for a GFF that is just CDS features without parent "gene" features. You can change this by setting the "exclude" to empty.
Example:
The text-index command does not hierarchically reconstruct GFF features, so it just works line by line, checking the feature type on each line. It's possible this can be revisited in the future so that we handle this case better by default
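The line-by-line behaviour described here can be illustrated with a toy filter. This is not the text-index implementation, only a sketch of the idea: `indexableGffLines` is a hypothetical function that looks at the feature type in column 3 of each GFF3 line on its own and drops any line whose type is in the exclude list.

```javascript
// Illustration only; not the actual text-index code.
// Each line is judged independently by its type column (column 3),
// with no reconstruction of parent/child relationships.
function indexableGffLines(gffText, excludeTypes) {
  const excluded = new Set(excludeTypes);
  return gffText
    .split("\n")
    .filter((line) => line && !line.startsWith("#"))
    .filter((line) => !excluded.has(line.split("\t")[2]));
}
```

With a CDS-only file, an exclude list that contains "CDS" leaves nothing to index, which is consistent with the empty trix directory reported above, while an empty exclude list keeps every feature line.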
-
@cmdcolin when i go to the URL, i don't see what i thought i'd see... i still need to click either 'Open' or 'SHOW ALL REGIONS IN ASSEMBLY'. regardless of which i select, i get this error:
Thanks for your assistance.
-
I have actually not used set-default-session and it may even have bugs that cause it to be broken right now. We may need to check. I suggest instead using File->Export session to file, then copying the JSON from the session key in that file into the defaultSession key in the config.json (or alternatively, use the admin-server, which has a UI dialog to set the default session and writes directly to the config.json)
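The copy step can also be scripted instead of done by hand. The following is a minimal sketch: `withDefaultSession` is a hypothetical helper, and the file names in the commented usage lines are assumptions about where the exported session and the config live.

```javascript
// Hypothetical helper: copy the "session" value of an exported session file
// into the "defaultSession" key of a config object.
function withDefaultSession(config, exportedSessionFile) {
  if (!exportedSessionFile.session) {
    throw new Error('exported file has no "session" key');
  }
  // Return a new config object rather than mutating the input.
  return { ...config, defaultSession: exportedSessionFile.session };
}

// Possible usage with Node's fs module (file names are assumptions):
// const fs = require("fs");
// const config = JSON.parse(fs.readFileSync("config.json", "utf8"));
// const exported = JSON.parse(fs.readFileSync("session.json", "utf8"));
// fs.writeFileSync(
//   "config.json",
//   JSON.stringify(withDefaultSession(config, exported), null, 2),
// );
```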
-
I was unaware of this thread until this morning, when I read through it all. At one point, I considered adding a jokey comment that I have several perl scripts that I've used for manipulating JB2 config.json, but then I saw that @mjpworth wrote his own code in perl, so I'll point out a few things I've used too when doing JB1->JB2 conversions. For Xenopus and yeast, I have: https://github.com/alliance-genome/agr_amplify_jbrowse2/tree/stage/templates/JB1_parsing
These are pretty hacky, as the output is built with qq() instead of properly using the JSON module from CPAN, but I was looking for quick turnaround and the input trackList.json and tracks.conf files were pretty consistent, so I knew I'd be getting pretty consistent output.
-
@scottcain thanks for your comments. i bet my perl script is hackier than yours! :)
-
@cmdcolin thanks for your help w/ these issues. I've completed the conversion of the three jb1 servers to jb2. (they are still only 'dev' on our side; i'm waiting for our IT to promote them to prod).