You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, Is there any way to extract all mutations in a Particular gene for all samples , say in the Spike region and then build a phylogenetic tree with it
Something like this ? matUtils extract --i my_pb_file.pb -m S_gene_mutation -t tree_with_S_gene_mutations.nwk
Or use the Mutation Annotated jsonl tree file to filter mutation in specific gene?
Thank you ,
The text was updated successfully, but these errors were encountered:
Hi @Sanyukta2001, I don't think we have a single command to do that, but I think it could be done by extracting VCF from your protobuf tree file, filtering the VCF to keep only the Spike mutations, and running usher-sampled on the filtered VCF to build a new tree. It would go something like this, assuming the reference sequence is NC_045512.2 (Wuhan/Hu-1) for your tree so Spike coords are 21563-25384:
matUtils extract -i my_pb_file.pb -v my_pb_mutations.vcf
# Filter VCF to only mutations within the Spike gene coordinates
grep ^# my_pb_mutations.vcf > my_pb_mutations.filtered.vcf
grep -v ^# my_pb_mutations.vcf | awk '$2 >= 21563 && $2 <= 25384' >> my_pb_mutations.filtered.vcf
# Use filtered VCF to build a new tree
echo '()' > emptyTree.nwk
usher-sampled -t emptyTree.nwk -v my_pb_mutations.filtered.vcf -o my_pb_file.spikeOnly.pb
If your tree is large (say >10,000 samples) then it might be faster to add the usher-sampled option --optimization_radius 0 to prevent usher-sampled from doing rounds of optimization interspersed with adding samples from VCF, and then run matOptimize after usher-sampled.
Hi, Is there any way to extract all mutations in a Particular gene for all samples , say in the Spike region and then build a phylogenetic tree with it
Something like this ?
matUtils extract --i my_pb_file.pb -m S_gene_mutation -t tree_with_S_gene_mutations.nwk
Or use the Mutation Annotated jsonl tree file to filter mutation in specific gene?
Thank you ,
The text was updated successfully, but these errors were encountered: