How to understand the levels in the result file? #5

Open
billzt opened this issue Jan 28, 2023 · 1 comment

billzt commented Jan 28, 2023

This is a row from my result file:

;SeqID	MislabeledLevel	OriginalLabel	ProposedLabel	Confidence	OriginalTaxonomyPath	ProposedTaxonomyPath	PerRankConfidence
MN952976	Domain		Pempheriformes	0.999	Metazoa;Chordata;Actinopteri;;Lactariidae;Lactarius;Lactarius lactarius	Metazoa;Chordata;Actinopteri;Pempheriformes	0.999;0.999;0.999;0.999

My questions are:
(1) Why is the MislabeledLevel "Domain"?
(2) How do I prepare the taxonomic annotations file for -t correctly? What should I do if a certain level is missing? Currently I just put an empty string there.

amkozlov (Owner) commented

Sorry for the late response.

SATIVA requires a balanced taxonomy, i.e. all sequences should ideally have the same number of taxonomic levels, or at least there should be no 'holes' or empty labels.
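A quick way to spot such problems before running SATIVA is a small check like the one below. This is only a sketch and not part of SATIVA: the function name `find_taxonomy_problems` is hypothetical, and it assumes you have already loaded your annotations into a dict that maps each sequence ID to its semicolon-separated path.

```python
from collections import Counter


def find_taxonomy_problems(taxonomy):
    """Return (seq_id, message) pairs for empty labels and unusual path depths.

    `taxonomy` maps a sequence ID to a ';'-separated taxonomy path.
    (Hypothetical helper for illustration, not a SATIVA function.)
    """
    problems = []
    depths = {}
    for seq_id, path in taxonomy.items():
        labels = [label.strip() for label in path.split(";")]
        depths[seq_id] = len(labels)
        # Report every 'hole', i.e. an empty label between two semicolons.
        for level, label in enumerate(labels, start=1):
            if not label:
                problems.append((seq_id, f"empty label at level {level}"))
    if depths:
        # Treat the most frequent depth as the expected one.
        expected = Counter(depths.values()).most_common(1)[0][0]
        for seq_id, depth in depths.items():
            if depth != expected:
                problems.append((seq_id, f"{depth} levels, expected {expected}"))
    return problems


example = {
    "MN952976": "Metazoa;Chordata;Actinopteri;;Lactariidae;Lactarius;Lactarius lactarius",
}
print(find_taxonomy_problems(example))
# [('MN952976', 'empty label at level 4')]
```

Using the most frequent depth as the reference is a design choice of this sketch; it only flags outliers and does not decide which depth is correct.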

In your example

Metazoa;Chordata;Actinopteri;;Lactariidae;Lactarius;Lactarius lactarius

should be replaced by

Metazoa;Chordata;Actinopteri;Perciformes;Lactariidae;Lactarius;Lactarius lactarius

according to Wikipedia

Should the label indeed be missing at some rank due to an unbalanced taxonomy, you could introduce an artificial (but non-empty!) label, e.g. PerciformesFamily1.
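If many paths have holes, the placeholders can be generated with a few lines of Python. The sketch below is an illustration only: `fill_empty_ranks` and the `RANKS` list are hypothetical names, the rank names are an assumption about a seven-level taxonomy, and the naming scheme (parent label + rank name + `1`) simply imitates the PerciformesFamily1 example. Where the real label is known, as with Perciformes above, prefer it over a placeholder.

```python
# Assumed rank names for a seven-level path; adjust to your own taxonomy.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def fill_empty_ranks(path, rank_names):
    """Replace empty labels in a ';'-separated path with artificial ones.

    A placeholder is built as <parent label><rank name>1.
    (Hypothetical helper for illustration, not a SATIVA function.)
    """
    labels = [label.strip() for label in path.split(";")]
    filled = []
    for i, label in enumerate(labels):
        if not label:
            # Use the closest label above; "Root" if the very first one is empty.
            parent = filled[-1] if filled else "Root"
            rank = rank_names[i] if i < len(rank_names) else f"Level{i + 1}"
            label = f"{parent}{rank}1"
        filled.append(label)
    return ";".join(filled)


path = "Metazoa;Chordata;Actinopteri;;Lactariidae;Lactarius;Lactarius lactarius"
print(fill_empty_ranks(path, RANKS))
# Metazoa;Chordata;Actinopteri;ActinopteriOrder1;Lactariidae;Lactarius;Lactarius lactarius
```

Note that all sequences sharing the same parent and the same missing rank get the same placeholder here, so they end up grouped under one artificial taxon. Whether that grouping is appropriate depends on your data.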

Finally, please add the -x zoo option to specify that you are using the zoological taxonomic code.
