This repository has been archived by the owner on Jun 2, 2022. It is now read-only.

augmenting available sequence for a specific mammalian protein with VGP/Tree of Life raw reads #7

Open
avilella opened this issue May 20, 2022 · 0 comments


Hi all,

I am trying to augment the available sequences for a handful of specific vertebrate/mammalian proteins. My idea is to use 'diamond blastx' to search FASTQ data from the VGP / Tree of Life raw reads against those proteins.
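A minimal sketch of that idea, assuming `diamond` and `samtools` are installed and that the reads come as BAM or CRAM files that `samtools fastq` can decode without extra options. The function names and all file names are hypothetical. The search step relies on DIAMOND reading the query from standard input when no query file is given.

```shell
# Build a DIAMOND protein database from a FASTA file of the target proteins.
# $1: protein FASTA (hypothetical, e.g. targets.faa)
# $2: database prefix to create
make_protein_db() {
  diamond makedb --in "$1" --db "$2"
}

# Stream the reads of one BAM/CRAM file as FASTQ and run a translated search
# against the protein database. Hits are written in BLAST tabular format.
# $1: database prefix
# $2: BAM or CRAM file with raw reads
# $3: output TSV for the hits
search_reads() {
  samtools fastq "$2" | diamond blastx --db "$1" --out "$3" --outfmt 6
}

# Example usage (hypothetical file names):
#   make_protein_db targets.faa targets
#   search_reads targets sample.cram sample.hits.tsv
```

Streaming through a pipe avoids writing an intermediate FASTQ file for every sample, which matters when looping over many species.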

I've seen the darwintreeoflife.data repo, which says it is discontinued, and this repo seems to be the one with more up-to-date information (up to the current month). Is there a way to get a full list of HTTP BAM or CRAM URLs for all the vertebrate/mammalian genomes in the Tree of Life project? E.g.

```
find darwintreeoflife.data/ -name "*data.tsv" | sort -V | xargs cat | grep -e 'bam$' -e 'cram$'
```

I am looking for something equivalent to the "*data.tsv" files in darwintreeoflife.data, but up to date with freshly generated data. Thanks in advance.
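For reference, the one-liner over a local darwintreeoflife.data checkout can be made a little more robust. The sketch below assumes GNU `find`, `sort`, and `xargs`, and assumes the `*data.tsv` files are tab-separated with full `http(s)://` or `ftp://` URLs in one of the columns. The function name is hypothetical. It prints only the URL fields rather than whole lines, and it tolerates directory names that contain spaces.

```shell
# List unique BAM/CRAM URLs found in all *data.tsv files below a directory.
# $1: root directory of the checkout (defaults to darwintreeoflife.data)
list_read_urls() {
  root="${1:-darwintreeoflife.data}"
  find "$root" -name '*data.tsv' -print0 \
    | sort -zV \
    | xargs -0 -r cat \
    | tr '\t' '\n' \
    | grep -E '^(https?|ftp)://[^[:space:]]+\.(bam|cram)$' \
    | sort -u
}

# Example usage:
#   list_read_urls darwintreeoflife.data > read_urls.txt
```

Splitting each row into one field per line before filtering means a URL is kept even when it is not the last column of its row.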