XTX has zero trace #41
Comments
1) Replace chr1 by 1 etc
2) I recommend using PLINK to make a packed PLINK file
or (better) convertf to make .snp .ind .geno files (Reich lab format).
PED is a horrible format and best to use it as little as possible
Nick
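The first suggestion can be sketched in Python. This is only a minimal sketch, assuming a whitespace-separated four-column .map file (chromosome, SNP ID, genetic distance, physical position) like the excerpt quoted below; the function name `strip_chr_prefix` is hypothetical and is not part of EIGENSOFT or PLINK.

```python
def strip_chr_prefix(map_lines):
    """Return .map lines with a leading 'chr' removed from the chromosome column."""
    fixed = []
    for line in map_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom = fields[0]
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]  # "chr1" becomes "1"
        fixed.append("\t".join([chrom] + fields[1:]))
    return fixed


# Two rows taken from the .map excerpt in the original post.
example = ["chr1\tsnp1\t121.2806\t99826920", "chr1\tsnp2\t122.1472\t100599380"]
for row in strip_chr_prefix(example):
    print(row)
```

Only the first column is rewritten, so the SNP order and the correspondence with the .ped genotype columns are left unchanged.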
…On Tue, Dec 17, 2019 at 3:56 AM asmariyaz23 ***@***.***> wrote:
Hello,
I want to run smartpca using .ped and .map files.
Prior to running the code, I had to change 2 things in the .map file
1. Replace 0's in Chromosome column with 40
2. Since it complained about ID too long I made fake ids in sequential
order (snp1-snp766221) and inserted this column in place of original one
Also, I replaced the 6th column value with 1 (it was 0 originally) in the
.ped file.
Here is the code I am running next:
EIG-7.2.1/bin/smartpca -p par.example
Here is the par.example file:
parameter file: par.example
### THE INPUT PARAMETERS
##PARAMETER NAME: VALUE
genotypename: 191101KS-01.ped
snpname: modified.191101KS-01.map
indivname: 191101KS-01.ped
evecoutname: 191101KS-01.evec
evaloutname: 191101KS-01.eval
altnormstyle: NO
numoutevec: 2
familynames: NO
grmoutname: grmjunk
## smartpca version: 16000
norm used
Here is a *portion* of the .map file (tab-separated)
chr1 snp1 121.2806 99826920
chr1 snp2 122.1472 100599380
chr1 snp3 124.0726 102914837
chr1 snp4 124.1985 103761094
chr1 snp5 124.2819 104321842
chr1 snp6 126.3073 106194696
chr1 snp7 128.415 108897058
chr1 snp8 129.4658 109685814
chr1 snp9 129.4659 109685883
chr1 snp10 129.466 109685993
...
Here is a *portion* of the .ped file (tab-separated)
013_00032 013_00032 0 0 0 1 A A A A G G G G C C T G A A 0 0 A C T T 0 0 G G A A 0 0 G G 0 0 T C A C 0 0 0 0 T T A A A C 0 0 G G G G A A A A 0 0 T T 0 0 0 0 T T C C 0 0 A A T T C C A A G G 0 0 T G A A T G T T G G 0 0 G G C C G G C ...
This is a portion of the output/error I get:
parameter file: par.example
### THE INPUT PARAMETERS
##PARAMETER NAME: VALUE
genotypename: 191101KS-01.ped
snpname: modified.191101KS-01.map
indivname: 191101KS-01.ped
evecoutname: 191101KS-01.evec
evaloutname: 191101KS-01.eval
altnormstyle: NO
numoutevec: 2
familynames: NO
grmoutname: grmjunk
## smartpca version: 16000
norm used
*** warning. genetic distances are in cM not Morgans
chr1 snp1 121.2806 99826920
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 58814
zzz 0 62410
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 629241
zzz 1 400267
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 629912
zzz 2 385321
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 630053
zzz 3 739666
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 630128
zzz 4 354357
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 631712
zzz 5 354862
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 632287
zzz 6 385313
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 632828
zzz 7 385323
snp order check fail; snp list not ordered: modified.191101KS-01.map (processing continues) 1 0 632828
zzz 8 385324
...
...
nodata: snp701442
nodata: snp269372
nodata: snp31100
nodata: snp352716
nodata: snp421014
nodata: snp206750
nodata: snp186242
nodata: snp20472
number of samples used: 0 number of snps used: 0
Using 1 thread, and partial sum lookup algorithm.
total number of snps killed in pass: 0 used: 0
fatalx:
XTX has zero trace (perhaps no data)
Abort trap: 6
Since none of snps or samples are used, I am wondering if the format of
the input files is still correct or not? Is it necessary that all snps
belonging to the same chromosome appear together? My .ped file contains
only 1 sample currently (for testing purposes), is it necessary to have
multiple? Also, I noticed the example.ped provided in the package is
different in that 7th column onwards I have a combination of A,G,C,T and
numbers and former has a bunch of numbers. Could you provide me any
pointers as to how I can go about this?
Thank you,
Asma
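The "snp order check fail" warnings above suggest the .map file is not sorted. A minimal Python sketch for locating where the order breaks follows. It assumes the four-column .map layout shown above and treats "ordered" as: each chromosome appears as one contiguous block, with non-decreasing physical position inside it. It is not a reimplementation of smartpca's own check, and `find_order_breaks` is a hypothetical helper name.

```python
def find_order_breaks(map_lines):
    """Return (line_number, reason) pairs where a .map file breaks the assumed order."""
    breaks = []
    seen = set()  # chromosomes already encountered
    prev_chrom, prev_pos = None, None
    for lineno, line in enumerate(map_lines, start=1):
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, pos = fields[0], int(fields[3])
        if chrom != prev_chrom:
            if chrom in seen:
                breaks.append((lineno, "chromosome block is split"))
            seen.add(chrom)
        elif pos < prev_pos:
            breaks.append((lineno, "position decreases"))
        prev_chrom, prev_pos = chrom, pos
    return breaks


rows = ["1 a 0 100", "1 b 0 50", "2 c 0 10", "1 d 0 200"]
print(find_order_breaks(rows))
```

The sketch only reports problems. Reordering a .map file by itself would break its correspondence with the genotype columns of the .ped file, so any sorting is better left to a tool that handles both files together.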
Hi Nick, Thank you for your reply. I did both. What helped me move forward was to include multiple samples (2 samples) in one .ped file, with the corresponding .map files combined into one, instead of just 1 sample. Once the XTX error went away, I combined all the .ped files into one and all the corresponding .map files into one, and then ran convertf to make .snp .ind .geno files, but I got this error:
fatalx:
bad number of fields 1532448 1400162
Aborted (core dumped)
Any pointers as to what could have gone wrong this time? Thank you,
Yes. This is an example of why ped files are horrid.
Your first line has 1400162 fields and some other line has 1532...
Nick
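The mismatch can be checked directly. The sketch below assumes the standard PLINK .ped layout of 6 leading columns (family ID, individual ID, father, mother, sex, phenotype) followed by two allele columns per SNP; the helper names are hypothetical. Under that assumption, 1532448 fields corresponds to 766221 SNPs, which matches the snp1-snp766221 IDs in the original post, while 1400162 fields corresponds to 700078 SNPs.

```python
from collections import Counter

LEADING_COLUMNS = 6  # family ID, individual ID, father, mother, sex, phenotype


def ped_field_counts(ped_lines):
    """Count how many non-blank lines have each number of whitespace-separated fields."""
    return Counter(len(line.split()) for line in ped_lines if line.strip())


def implied_snp_count(n_fields):
    """SNP count implied by a .ped line with n_fields fields, or None if inconsistent."""
    allele_fields = n_fields - LEADING_COLUMNS
    if allele_fields < 0 or allele_fields % 2:
        return None  # not 6 columns plus an even number of allele columns
    return allele_fields // 2


# The two field counts reported in the convertf error message.
for n_fields in (1532448, 1400162):
    print(n_fields, "fields ->", implied_snp_count(n_fields), "SNPs")
```

On a real file, `ped_field_counts` can be given an open file handle, since iterating over it yields one line at a time. More than one distinct count means the samples likely do not all cover the same SNP list, in which case the .ped files cannot simply be concatenated.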
…On Wed, Dec 18, 2019 at 9:04 AM asmariyaz23 ***@***.***> wrote:
I did both, what helped me move forward was to include multiple samples (2
samples) in one .ped file and corresponding .map files into one, instead of
just 1 sample.
Once the XTX error went away, I combined all the .ped files into one and
all corresponding .map files into one and then ran convertf to make .snp
.ind .geno files but I got this error:
fatalx:
bad number of fields 1532448 1400162
Aborted (core dumped)
Any pointers as to what could have gone wrong this time?
Thank you,
Asma
Does that mean this kind of ped file with unequal field numbers cannot be converted to Reich lab format? I did try to use convertf to convert it into the preferred format, but that gives an error as well. @bumblenick
What does unequal fields mean? Which genotypes go with which snps?
I think you should aim to get your data into packed ped format (PLINK),
which at least is standardized. You can then, if you wish, run convertf
on the packed file.
N
…On Wed, Dec 18, 2019 at 9:20 AM asmariyaz23 ***@***.***> wrote:
Does that mean this kind of ped file with unequal field numbers cannot be
converted to Reich lab format?