Ignoring individuals in EIGENSOFT in IND file #14

Open
muhammadsohailraza opened this issue Nov 22, 2016 · 2 comments

Comments

@muhammadsohailraza

Dear All,
I am trying to run EIGENSOFT's SMARTPCA module for a PCA of the worldwide 1000 Genomes (1000G) populations together with my own samples. However, when I run SMARTPCA with the outlier-removal option, it removes some samples from the 1000G populations and from my data set.

So I ran SMARTPCA again (without the outlier-removal option), this time trying to ignore some of the outlier individuals by marking them in the "*.ind" file, like this:
HG04239 M SAS
NA06984 M ???
NA06985 F ???
NA06986 M ???
NA10847 F CEU
NA10851 M CEU
NA11829 M CEU
NA11830 F CEU

But the E-values are still calculated in the resulting output files for the individuals marked ???. Can anyone kindly explain how to format the "ind" file so that some individuals from a subset of populations are ignored?

Thank you very much!

--sohail

@bumblenick

The population label should be "Ignore" (no quotes).
Running convertf on an ind file with that label will also strip those individuals out.
In addition, if you specify the populations that you want to use
to make the axes (poplistname:), the ??? individuals will still appear in the
evec file, but this seems likely to be completely harmless.

Nick
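
For anyone who needs to relabel many samples, the edit described above can be scripted. The following is only a minimal sketch, assuming the ind file has three whitespace-separated columns (sample ID, sex, population label) as in the example posted in this issue. The function name `mark_ignored` and the chosen sample IDs are hypothetical and are not part of EIGENSOFT.

```python
def mark_ignored(ind_lines, ignore_ids, label="Ignore"):
    """Return ind-file lines with the population label replaced for selected IDs.

    Assumes each data line has three whitespace-separated columns:
    sample ID, sex, population label. Other lines are passed through unchanged.
    """
    updated = []
    for line in ind_lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] in ignore_ids:
            # Overwrite only the population column; keep ID and sex as they are.
            fields[2] = label
            updated.append(" ".join(fields))
        else:
            updated.append(line)
    return updated


# Example lines taken from the ind file posted in this issue.
ind_lines = [
    "HG04239 M SAS",
    "NA06984 M ???",
    "NA06985 F ???",
    "NA06986 M ???",
    "NA10847 F CEU",
]

# Hypothetical choice: the samples that were marked ??? in the question.
ignore_ids = {"NA06984", "NA06985", "NA06986"}

updated = mark_ignored(ind_lines, ignore_ids)
print("\n".join(updated))
```

To apply this to a real file, read its lines, pass them through the function, and write the result to a new ind file so that the original is left untouched.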

@muhammadsohailraza
Author

Issue resolved. Thank you, @nick.
