Asteroid on non-randomly missing data #5

Open
LPDagallier opened this issue Jun 13, 2023 · 2 comments

Comments

@LPDagallier

Hi Benoit,

Thanks for Asteroid, it looks like a very promising tool!
This is not an issue with the program, but rather a question.
From what I understand of the paper, Asteroid performs well with a high proportion of missing data when the data are missing because of a stochastic process of data deletion (in the case of simulated datasets) or data absence (in the case of empirical datasets).

Do you have any idea how Asteroid performs when data are non-randomly missing?
For example, when a dataset combines a few species represented by many genes (e.g. a phylogenomic dataset) with many species represented by only a few genes (e.g. Sanger sequencing/barcode data) (see e.g. https://doi.org/10.1093/molbev/msad109).

Did you try simulating missing data in a non-random manner?

I'm curious to know whether Asteroid would perform similarly well with high levels of non-random missing data.

Thanks,
Léo-Paul

@BenoitMorel
Owner

BenoitMorel commented Jun 14, 2023 via email

@LPDagallier
Author

Hi Benoit,
OK, I see. Yes, Asteroid would surely be less affected than other tools.
I haven't tried it on such a dataset yet, but I'm planning to in the coming months. I will let you know how it goes :)
Cheers,
Léo-Paul
