issue with running GADMA using the dadi output from realsfs2dadi #5
Hi Marie, the dadi-format route is a bit outdated, tbh; I need to update it soon. I assume GADMA accepts the other SFS formats that moments or dadi can read, such as the one with a header line giving the number of bins in each dimension, followed by the values for each bin. This can be generated by running angsd’s realSFS and then doing some reformatting, as shown here:
https://github.com/z0on/pipelines2020/blob/main/pipelines_day2_AFS.sh
(actually there is a whole moments modeling pipeline in that script)
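For illustration, here is a minimal Python sketch of the "header line plus values" layout described above. It is not the code from the linked script: the function name `add_sfs_header` is hypothetical, and it assumes the flattened SFS values have already been read into a list. It does not cover any extra lines (such as a mask line) that a particular reader may expect.

```python
from math import prod


def add_sfs_header(flat_values, dims):
    """Return SFS text: one header line with the number of bins per
    dimension, then one line with the flattened bin values.

    Hypothetical helper for illustration only.
    """
    expected = prod(dims)
    if len(flat_values) != expected:
        # The flattened SFS must have exactly one value per bin.
        raise ValueError(
            f"expected {expected} values for dims {tuple(dims)}, "
            f"got {len(flat_values)}"
        )
    header = " ".join(str(d) for d in dims)
    body = " ".join(f"{v:g}" for v in flat_values)
    return header + "\n" + body + "\n"


if __name__ == "__main__":
    # Toy 2 x 3 spectrum; real dimensions depend on the sample sizes.
    print(add_sfs_header([0.0, 1.5, 2.0, 0.5, 3.0, 1.0], (2, 3)), end="")
```

The length check is the useful part: if the number of values does not match the product of the dimensions, the header is wrong for that file.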
cheers
Misha
On Jul 18, 2023, at 8:43 AM, mariels wrote:
Hello,
I have been using your scripts to format and thin the data from angsd to the dadi format to run GADMA.
When I run GADMA (on the full or thinned file) it stops with the following error message:
raise SyntaxError("Construction of data_dict failed: " + str(e))
SyntaxError: Construction of data_dict failed: 'Allele2' is not in list
Allele2 is in the header of the input file; for example, the first lines look as follows:
REF OUT Allele1 West East NEG Allele2 West East NEG Gene Position
cag CGG a 0 0 0 T 16 14 18 NW_021703766.1 18085
The GADMA developer suggested that there may be a problem with the dadi format. I was wondering whether you had heard of similar issues and could advise on how to fix it.
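One way to narrow this down is to check how the header line tokenizes. The sketch below is a diagnostic only, not GADMA's or dadi's actual parsing code. It assumes the reader finds the `Allele1` and `Allele2` columns by splitting the header on whitespace, and the function name `inspect_dadi_header` is hypothetical.

```python
def inspect_dadi_header(header_line):
    """Split a dadi SNP-format header on whitespace and report where the
    Allele1 and Allele2 columns are (None if a token is missing).

    Hypothetical diagnostic helper; whitespace splitting is an assumption.
    """
    tokens = header_line.split()
    idx1 = tokens.index("Allele1") if "Allele1" in tokens else None
    idx2 = tokens.index("Allele2") if "Allele2" in tokens else None
    # Population names are the columns between Allele1 and Allele2.
    pops = tokens[idx1 + 1:idx2] if idx1 is not None and idx2 is not None else []
    return {
        "tokens": tokens,
        "allele1_index": idx1,
        "allele2_index": idx2,
        "populations": pops,
    }


if __name__ == "__main__":
    header = "REF OUT Allele1 West East NEG Allele2 West East NEG Gene Position"
    report = inspect_dadi_header(header)
    # repr() makes hidden characters or unexpected delimiters visible.
    print([repr(t) for t in report["tokens"]])
    print(report["allele1_index"], report["allele2_index"], report["populations"])
```

Running this on the first line of the real file (read with `open(path).readline()`) would show whether `Allele2` survives as its own token. If the file uses a different delimiter or has extra lines before the header, the token would be missing.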
Thank you very much,
Best wishes,
Marie
Yes, in that walkthrough the bootstrap happens at the realSFS stage and it is
done by chromosome: basically resampling whole chromosomes, or contigs, or
whatever you have in the genome reference as separate fasta entries. That takes
care of the linkage issue.
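To illustrate the idea of resampling whole contigs, here is a minimal Python sketch. It is not how the walkthrough does it (there the bootstrap happens at the realSFS stage). It assumes you already have one SFS vector per contig, and the name `bootstrap_sfs_by_contig` is hypothetical.

```python
import random


def bootstrap_sfs_by_contig(per_contig_sfs, seed=None):
    """Draw contigs with replacement and sum their SFS vectors.

    per_contig_sfs maps contig name -> list of bin values (same length
    for every contig). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    contigs = sorted(per_contig_sfs)
    # Resample as many contigs as there are, with replacement.
    drawn = rng.choices(contigs, k=len(contigs))
    n_bins = len(per_contig_sfs[contigs[0]])
    total = [0.0] * n_bins
    for name in drawn:
        for i, value in enumerate(per_contig_sfs[name]):
            total[i] += value
    return drawn, total


if __name__ == "__main__":
    toy = {"contig1": [5, 2, 1], "contig2": [3, 1, 0], "contig3": [8, 4, 2]}
    drawn, replicate = bootstrap_sfs_by_contig(toy, seed=1)
    print(drawn, replicate)
```

Because SNPs on the same contig are always kept or dropped together, linkage within a contig does not inflate the apparent number of independent observations across replicates.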
On Fri, Jul 21, 2023 at 12:11 PM mariels wrote:
Hi Misha,
Many thanks for your answer.
Would you run it on all SNPs, relying on the implemented bootstrapping
procedure to solve the issue of including linked SNPs?
Cheers,
Marie
--
cheers
Misha
matzlab.weebly.com