I think I may have an idea of the right answer to my question, but I'm hoping to check that I'm not about to do something stupid. I'm wondering whether I need to create a multiple sequence alignment (MSA) file rather than using an unaligned multifasta. My ultimate aim is to produce a haplotype network.
I have a locus which I've amplified, but which has lots of small indels within an intronic part of the sequence.
I have used the following code to load a multifasta file into R, but cannot convert it to a matrix because the sequences are of unequal length. I can't see any way to load a matrix with 'NA' for gaps. I'm also not sure what pegas would do if the matrix padded each sequence with NAs at the end rather than internally: would it re-align the sequences, or assume they were already aligned?
library("apex")
library("adegenet")
library("pegas")
library("mmod")
library("poppr")
# To get a SINGLE fasta file in:
myseq<-read.FASTA("ASV_multifasta.fa")
myseq # Provides the summary information of the file
2172 DNA sequences in binary format stored in a list.
Mean sequence length: 329.453
Shortest sequence: 294
Longest sequence: 350
Labels:
ASV3 BO_04_M
ASV3 BO_04_M
ASV3 BO_04_M
ASV3 BO_04_M
ASV3 BO_04_M
ASV3 BO_04_M
...
Base composition:
a c g t
0.282 0.204 0.190 0.324
(Total: 715.57 kb)
# We need to make it as a matrix:
myseqmatrix<-as.matrix(myseq)
Then I get the error telling me it won't work because the sequences are different lengths.
Error in as.matrix.DNAbin(myseq) :
DNA sequences in list not of the same length.
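A minimal sketch of what triggers this error, using three made-up toy sequences rather than the real ASV data (the names `to_bases`, `unaligned` and `aligned` are purely illustrative). It checks the sequence lengths, reproduces the error, and shows that the conversion goes through once a gap character pads the deletion internally:

```r
library(ape)

# Hypothetical toy data standing in for the unaligned multifasta.
to_bases <- function(s) strsplit(s, "")[[1]]
unaligned <- as.DNAbin(list(
  seqA = to_bases("acgtacgt"),
  seqB = to_bases("acacgt"),    # 2 bp shorter: carries a deletion
  seqC = to_bases("acgtacgt")
))

# More than one distinct length means as.matrix() cannot build a matrix.
seq_lengths <- lengths(unaligned)
print(table(seq_lengths))

# Capture the error message instead of stopping the script.
res <- tryCatch(as.matrix(unaligned), error = function(e) conditionMessage(e))
print(res)

# The same sequences after alignment: "-" marks the deletion in place,
# so every sequence has the same length.
aligned <- as.DNAbin(list(
  seqA = to_bases("acgtacgt"),
  seqB = to_bases("ac--acgt"),
  seqC = to_bases("acgtacgt")
))
aligned_matrix <- as.matrix(aligned)
print(dim(aligned_matrix))
```

The gaps here were placed by hand; for real data they would come from an alignment program, since padding the ends of the sequences would put homologous bases in different columns.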
If I make a multifasta file that contains the aligned sequences from a multiple sequence alignment (gaps included) instead of the raw sequences, would that work for pegas and a haplotype network? Or would it change the output? Would it even work for the conversion to a matrix?
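A rough sketch of that workflow, assuming the alignment has already been produced by an external aligner and saved as FASTA. The four records below are invented stand-ins for that output, and `aln_file` is a temporary file rather than a real path:

```r
library(ape)
library(pegas)

# Hypothetical aligner output: all records have equal length,
# with "-" marking indel positions.
aligned_fasta <- c(
  ">ind1", "acgtacgtaa",
  ">ind2", "acgtacgtaa",
  ">ind3", "acgaacgtaa",
  ">ind4", "acgta--tta"
)
aln_file <- tempfile(fileext = ".fa")
writeLines(aligned_fasta, aln_file)

aln <- read.FASTA(aln_file)    # still a list of sequences at this point
aln_matrix <- as.matrix(aln)   # works because all lengths are equal

haps <- haplotype(aln_matrix)  # collapse identical sequences into haplotypes
net <- haploNet(haps)          # build the haplotype network
print(haps)
print(net)
# plot(net, size = attr(net, "freq"))  # draw it, circles scaled by frequency
```

pegas takes the matrix columns as given, so it relies on the file already being aligned. The quality of the network therefore depends on the quality of the alignment.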
What do people do with uneven sequence lengths?
Many thanks!
Ah sorry, I had totally missed issue #92 raised by FischHa. From looking at their data, I'm assuming that loading an MSA might work to get it into a matrix. So I guess my remaining question is whether it will then work to plot a haplotype network, or whether indels etc. will be discarded?
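One way to explore the indel question is to compare the distance models in `ape::dist.dna()` on a toy alignment. This is only a sketch with two invented sequences that differ by a single 2 bp gap. As far as I understand the ape documentation, model `"N"` counts substitution differences, while `"indel"` and `"indelblock"` count gapped sites and contiguous gap blocks respectively:

```r
library(ape)

# Hypothetical two-sequence alignment differing only by one 2 bp indel.
to_bases <- function(s) strsplit(s, "")[[1]]
aln <- as.matrix(as.DNAbin(list(
  ref   = to_bases("acgtacgt"),
  indel = to_bases("ac--acgt")
)))

d_subst <- dist.dna(aln, model = "N")           # substitution differences only
d_sites <- dist.dna(aln, model = "indel")       # sites gapped in one sequence only
d_block <- dist.dna(aln, model = "indelblock")  # contiguous gaps counted once

print(c(N          = as.numeric(d_subst),
        indel      = as.numeric(d_sites),
        indelblock = as.numeric(d_block)))
```

If the substitution-only distance is what a network is built from, sequences that differ only by indels would not be separated. My reading of the pegas documentation, which is worth checking against the installed version, is that `haplotype()` has a `strict` argument affecting how gaps are treated and that `haploNet()` accepts a user-supplied distance through its `d` argument, so an indel-aware distance could be passed in.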