write.dna cuts off dna sequences when put in a list #131

Open
TheLaughingDuck opened this issue Nov 21, 2024 · 2 comments

@TheLaughingDuck

Essentially, when the data going into write.dna is of the DNAbin class, the output dnabin.fasta file is nicely formatted.

Reproducible example:

# This produces a fasta file with nicely formatted sequences
dnabin_sequences <- ape::read.GenBank(c("JF806202", "HM161150", "FJ356743"))
ape::write.dna(dnabin_sequences,
               file = "dnabin.fasta",
               format = "fasta")

When the data going into write.dna is a plain list of character strings, the output file test.fasta contains sequences that are not broken into space-separated blocks, and each one is cut off after 10 bases. I played around with the arguments, and the closest I got was setting the colw argument to a very large value so that the entire sequences were written.

# This produces a fasta file without separation, and it cuts off the sequences
list_sequences <- list("ggaggccatagagcagatgctgaggtgatagatggaacatga",
                       "ggaggccatagagcagatgctgaggtgatagatggaacatga",
                       "ggaggccatagagcagatgctgaggtgatagatggaacatga")

ape::write.dna(list_sequences,
               file = "test.fasta",
               format = "fasta")

@emmanuelparadis
Owner

Hi,

Your object list_sequences is not of the correct class. You can convert it with:

list_sequences <- as.DNAbin(sapply(list_sequences, strsplit, split = ""))

then it'll be usable by ape. See this doc for explanations about how "DNAbin" objects are coded.

Emmanuel
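
For reference, here is a minimal sketch of the conversion suggested above, assuming ape is installed. The labels seq1 to seq3 and the temporary output path are hypothetical and chosen only for illustration; read.FASTA is used here just to check that the full sequences were written.

```r
library(ape)

# Plain character strings, one per sequence (as in the original report)
list_sequences <- list("ggaggccatagagcagatgctgaggtgatagatggaacatga",
                       "ggaggccatagagcagatgctgaggtgatagatggaacatga",
                       "ggaggccatagagcagatgctgaggtgatagatggaacatga")

# Each element is one long string, so its length is 1, not the number of bases
length(list_sequences[[1]])

# Split every string into single-base characters
split_sequences <- lapply(list_sequences,
                          function(s) strsplit(s, split = "")[[1]])

# Hypothetical labels, used as the FASTA header lines
names(split_sequences) <- paste0("seq", seq_along(split_sequences))

# Convert the list of character vectors to the "DNAbin" class
dna <- as.DNAbin(split_sequences)

# Write to a temporary file (hypothetical path) and read it back as a check
out_file <- tempfile(fileext = ".fasta")
write.dna(dna, file = out_file, format = "fasta")
roundtrip <- read.FASTA(out_file)
```

After the split, each element holds one base per entry, which is the layout as.DNAbin expects for a list of sequences.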

@TheLaughingDuck
Author

Aaah I see, thank you!

TheLaughingDuck added a commit to TheLaughingDuck/bioinformatics_labs that referenced this issue Nov 22, 2024
sim_sequences were the wrong type, see emmanuelparadis/ape#131