Trouble exporting multidna object as fasta file... #48

Open
Wesleyhattingh opened this issue Aug 3, 2016 · 1 comment

Comments

@Wesleyhattingh

Dear Dr Jombart,

I am hoping you might be able to help me. I have successfully used the multidna constructor in apex to combine sequence data from 2 gene regions into a single matrix. This is really a fantastic addition to R, thank you!

However, I would now like to export this concatenated file in the fasta format but am having some difficulty doing that. I generally use the "writeXStringSet" function to export sequence data in fasta format but R returns the following error "'x' must be an XStringSet object". I have tried to coerce this object into a string set but this does not work either. I have copied some of my code below for your reference if you have a chance to look.

Any help on how to export a multidna object as a fasta file would be GREATLY appreciated! I fear that I am overlooking something very simple.

I look forward to hearing from you.

new("multidna")
ITS_working_1Aug16<- read.dna(choose.files(), format="fasta")
ITS_working_1Aug16
trnL_working_1Aug16<- read.dna(choose.files(), format="fasta")
trnL_working_1Aug16
genes<-list(ITS=ITS_working_1Aug16,trnL=trnL_working_1Aug16)
Two_gene_supermatrix<-new("multidna",genes)
Two_gene_supermatrix
getNumInd(Two_gene_supermatrix) # The number of individuals
getNumLoci(Two_gene_supermatrix) # The number of loci
getLocusNames(Two_gene_supermatrix) # The names of the loci
getSequenceNames(Two_gene_supermatrix) # A list of the names of the sequences at each locus
getSequences(Two_gene_supermatrix) # A list of all loci

#now align this concatenated matrix####
Two_gene_supermatrix_R<-concatenate(Two_gene_supermatrix)#converts the multiDNA object into something that can be alignable 
Two_gene_supermatrix_R<-DNAStringSet(Two_gene_supermatrix_R)###convert to a DNA string set
Two_gene_supermatrix_R
writeXStringSet(Two_gene_supermatrix_R, file="C:\\Users\\Wesley Neil Hattingh\\Dropbox\\R statistics\\PhD_WN Hattingh_2016\\Data files\\Two_gene_supermatrix_R.fas")
@KlausVigo
Collaborator

Dear @Wesleyhattingh,
you can simplify your code by reading all the files into a multidna object with the read.multiFASTA function. After concatenation you will have a DNAbin object (a class from the ape package), not a multidna object, and you can export it using write.dna. So the code would look like:

genes <- read.multiFASTA(c("ITS.fas",  "trnL.fas"))
Two_gene_supermatrix<-concatenate(genes)
class(Two_gene_supermatrix)  # check the class
write.dna(Two_gene_supermatrix, file="result.fas", format = "fasta", colsep = "", nbcol = -1)
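
For anyone who wants to try these steps without the original FASTA files, here is a minimal sketch on toy data. The two small matrices, the individual names ind1 and ind2, and the temporary output file are all made up for illustration; only the multidna constructor, concatenate, write.dna and read.dna come from the thread and the ape/apex packages.

```r
# Sketch only: toy in-memory data stand in for the ITS and trnL FASTA files.
library(ape)
library(apex)

# Two small aligned loci for the same two (hypothetical) individuals
its <- as.DNAbin(matrix(c("a", "c", "g", "t",
                          "a", "c", "g", "a"),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("ind1", "ind2"), NULL)))
trnl <- as.DNAbin(matrix(c("t", "t", "a", "a",
                           "t", "t", "a", "g"),
                         nrow = 2, byrow = TRUE,
                         dimnames = list(c("ind1", "ind2"), NULL)))

# Combine the loci, then concatenate them into one alignment
genes <- new("multidna", list(ITS = its, trnL = trnl))
supermatrix <- concatenate(genes)
class(supermatrix)  # expected to be DNAbin rather than multidna

# Export as FASTA with each sequence on a single line
out_file <- tempfile(fileext = ".fas")
write.dna(supermatrix, file = out_file, format = "fasta",
          colsep = "", nbcol = -1)

# Read the file back to confirm that it holds both loci side by side
check <- read.dna(out_file, format = "fasta")
dim(check)
```

Reading the file back is just a sanity check: the number of columns should equal the summed lengths of the two loci.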

The Biostrings package seems to lack a method for converting from DNAbin objects; the other direction (from XStringSet to DNAbin) is available in ape.
Regards,
Klaus
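
If writeXStringSet is still preferred for the export, one possible workaround is to collapse each row of the DNAbin matrix into a plain character string and build the DNAStringSet from those strings. This is a sketch under assumptions, not a method provided by either package: the toy alignment and names are made up, and it assumes the alignment contains only symbols that Biostrings accepts in DNA sequences (anything else would need recoding first).

```r
# Sketch only: manual DNAbin -> DNAStringSet conversion via character strings.
library(ape)
library(Biostrings)

# Hypothetical toy alignment standing in for the concatenated DNAbin matrix
aln <- as.DNAbin(matrix(c("a", "c", "g", "t",
                          "-", "c", "g", "t"),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("ind1", "ind2"), NULL)))

# as.character() gives a character matrix; paste each row into one string.
# The row names become the names of the resulting character vector.
seqs <- apply(as.character(aln), 1, paste, collapse = "")

# Upper-case the strings and build the XStringSet
dna_set <- DNAStringSet(toupper(seqs))

# Export with Biostrings; FASTA is the default format
out_file <- tempfile(fileext = ".fas")
writeXStringSet(dna_set, filepath = out_file)
```

For a plain FASTA export, write.dna on the DNAbin object remains the shorter route; the conversion is only worth it when the rest of the workflow is built around Biostrings.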
