Missing FASTA file paths from bac120 and ar122 GTDB metadata #23

Open
dgolden96 opened this issue Jun 10, 2022 · 1 comment


@dgolden96

Hi there,

I've managed to replicate the Kraken2 database creation process with the toy dataset as described in the README, but I've run into a snag doing the same with metadata from the GTDB. The metadata files at https://data.gtdb.ecogenomic.org/releases/release202/202.0/ don't seem to contain the FASTA file paths needed to run the pipeline. Would you happen to know of a workaround that uses one of the other metadata fields to get the necessary file paths?

Thanks!

@nick-youngblut
Contributor

You have to add the FASTA file paths to the table yourself. The paths are specific to the genomes that you have downloaded locally. See https://github.com/leylabmpi/Struo2#downloading-genomes
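
For illustration, here is a minimal sketch of one way to add that column, not the project's own tooling. It assumes the metadata table has an `accession` column with values like `RS_GCF_000005845.2`, that the locally downloaded genome files start with the NCBI assembly accession (e.g. `GCF_000005845.2_ASM584v2_genomic.fna.gz`), and that the pipeline expects the paths in a column named `fasta_file_path`. The function names and the decision to drop rows without a local genome are hypothetical; adjust them to your setup.

```python
import csv
import re
from pathlib import Path

# Assumption: local genome file names start with an NCBI assembly accession,
# e.g. "GCF_000005845.2_ASM584v2_genomic.fna.gz".
ACCESSION_RE = re.compile(r"(GC[AF]_\d+\.\d+)")
FASTA_SUFFIXES = (".fna", ".fa", ".fasta", ".fna.gz", ".fa.gz", ".fasta.gz")


def index_local_genomes(genome_dir):
    """Map assembly accession -> absolute path of a local FASTA file."""
    index = {}
    for path in sorted(Path(genome_dir).rglob("*")):
        match = ACCESSION_RE.match(path.name)
        if path.is_file() and match and path.name.endswith(FASTA_SUFFIXES):
            # Keep the first file found for each accession.
            index.setdefault(match.group(1), str(path.resolve()))
    return index


def add_fasta_paths(metadata_tsv, genome_dir, out_tsv,
                    accession_col="accession", path_col="fasta_file_path"):
    """Write a copy of the metadata table with a FASTA path column added.

    Rows without a matching local genome are dropped (a choice made for
    this sketch); their accessions are returned so they can be reviewed.
    """
    index = index_local_genomes(genome_dir)
    missing = []
    with open(metadata_tsv, newline="") as src, \
            open(out_tsv, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst,
            fieldnames=list(reader.fieldnames) + [path_col],
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            # Metadata accessions carry an "RS_" or "GB_" prefix; searching
            # for the bare assembly accession skips over it.
            match = ACCESSION_RE.search(row[accession_col])
            key = match.group(1) if match else None
            if key in index:
                row[path_col] = index[key]
                writer.writerow(row)
            else:
                missing.append(row[accession_col])
    return missing
```

A call such as `add_fasta_paths("bac120_metadata_r202.tsv", "genomes/", "samples.tsv")` would then write the filtered table and return the accessions that had no local file (the file and directory names here are placeholders).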
