Missing data for mixed.gen.Burridge_Schneider_2017 #65

Closed
svengato opened this issue Dec 25, 2021 · 14 comments
@svengato

svengato commented Dec 25, 2021

For cowpea QTL data in v2/Vigna/unguiculata/genetic/mixed.gen.Burridge_Schneider_2017/

  1. Markers listed in vigun.mixed.gen.Burridge_Schneider_2017.qtlmrk.tsv.gz are not found in Vigna/unguiculata/markers/IT97K-499-35.gnm1.mrk.Cowpea1MSelectedSNPs/vigun.IT97K-499-35.gnm1.mrk.Cowpea1MSelectedSNPs.gff3.gz

  2. In vigun.mixed.gen.Burridge_Schneider_2017.obo.tsv.gz, some traits have no ontology code.
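
Point 2 can be checked mechanically. The following is a minimal sketch, assuming the obo.tsv file is tab-separated with the trait name in the first column and the ontology term in the second; that column layout, the function name `traits_missing_ontology`, and the placeholder term in the usage lines are assumptions for illustration, not taken from the datastore specification.

```python
import csv

def traits_missing_ontology(lines, trait_col=0, term_col=1):
    """Return trait names whose ontology-term column is empty or absent.

    Column positions are assumptions; adjust them to the real file layout.
    """
    missing = []
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Skip blank lines and comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        term = row[term_col].strip() if len(row) > term_col else ""
        if not term:
            missing.append(row[trait_col])
    return missing

# Hypothetical rows; "TO:XXXXXXX" is a placeholder, not a real term id.
example = [
    "#trait_name\tontology_term",
    "Root Tip Abundance\t",
    "Some Other Trait\tTO:XXXXXXX",
]
print(traits_missing_ontology(example))
```

For the gzipped file, the lines could be supplied with `gzip.open(path, "rt")`.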

@sammyjava
Contributor

I put in the parent terms for stuff that's missing in the TO like "Root Tip Abundance" (RTA). Unless there's an ontology term for the Rochester Transit Authority.

@svengato
Author

Point 1 (missing Burridge-Schneider markers) is still unresolved.

@sammyjava
Contributor

@cann0010 is there a source for the missing Cowpea1MSelectedSNPs marker maps to vigun.IT97K-499-35.gnm1? Otherwise we'll just close this; we can't create data that don't exist. Here's what's in the qtlmrk file:

13772_1075
5084_519
4836_807
10811_937
2326_226
12501_343
139_439
5061_428
7102_965
1004_587
8969_1386
2227_693
9645_589
5428_339
3211_511
4749_1972
11138_624
14604_737
13848_735
11851_914
4245_136
2391_614

@StevenCannon-USDA

Some sleuthing: It's not the IT97K-499-35.gnm1.div.Huynh_Ehlers_2018 diversity set, which has only 32k markers.
Skimming the paper ...
Burridge JD, Schneider HM, Huynh BL, Roberts PA, Bucksch A, Lynch JP. Genome-wide association mapping and agronomic impact of cowpea root architecture. Theor Appl Genet. 2017 Feb;130(2):419-431. doi: 10.1007/s00122-016-2823-y. Epub 2016 Nov 18. PMID: 27864597.
... the markers come from
Lucas MR, Diop N-N, Wanamaker S, Ehlers JD, Roberts PA, Close TJ (2011) Cowpea–soybean synteny clarified through an improved genetic map. Plant Genome 4:218–225
https://acsess.onlinelibrary.wiley.com/doi/full/10.3835/plantgenome2011.06.0019

... which traces back to Muchero et al, 2009
10.1073/pnas.0905886106

YES: Supplement for that Muchero et al paper (tab-separated text file attached; derived from manuscript supplement 0905886106_SD1.xls)
0905886106_SD1.txt

@adf-ncgr
Contributor

Thanks @cann0010! Just glancing at the file, it looks like at least some of the qtlmrk markers correspond to SNPs in Cowpea1MSelectedSNPs when matched via column 1 (SNP), e.g. 13772_1075 from the list above -> 1_0749, which is in the 1M set. Is it legit to substitute the SNP ids for the "marker" ids in the qtlmrk file? That seems better to me, but let me know if you agree with this change and I can make it so.
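
The substitution proposed here could be sketched roughly as follows. This assumes the supplement is a tab-separated table with the SNP id in column 1 (as noted above) and the older marker id in another column; the position of that second column (`marker_col`) and both function names are hypothetical and would need checking against the attached file.

```python
import csv

def load_id_map(lines, snp_col=0, marker_col=1):
    """Build a {marker id: SNP id} lookup from tab-separated rows.

    snp_col=0 follows the thread (column 1 = SNP); marker_col is an assumption.
    """
    id_map = {}
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Ignore rows too short to contain both columns.
        if len(row) <= max(snp_col, marker_col):
            continue
        marker, snp = row[marker_col].strip(), row[snp_col].strip()
        if marker and snp:
            id_map[marker] = snp
    return id_map

def substitute_ids(marker_ids, id_map):
    """Replace each marker id with its SNP id when one is known.

    Returns (renamed ids in input order, ids that had no match).
    """
    renamed, unmatched = [], []
    for marker in marker_ids:
        if marker in id_map:
            renamed.append(id_map[marker])
        else:
            renamed.append(marker)
            unmatched.append(marker)
    return renamed, unmatched
```

Markers returned in the second list are left unchanged in the first, so they remain visible as candidates for separate handling.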

@sammyjava
Contributor

What matters is that the identifier in the qtlmrk file matches the Name attribute of the marker in the GFF. That's how they're merged. Otherwise we'll end up with stranded markers that have no genomic location.
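
A minimal sketch of that check, assuming a standard nine-column GFF3 file with `key=value` attributes in the last column. The function names are hypothetical, and extracting the marker ids from the qtlmrk file is left to the caller because its column layout is not stated in this thread.

```python
import gzip
from urllib.parse import unquote

def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "r")

def gff_marker_names(lines):
    """Collect the Name attribute from every feature line of a GFF3 file."""
    names = set()
    for line in lines:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        for pair in cols[8].split(";"):
            key, _, value = pair.partition("=")
            if key.strip() == "Name":
                # GFF3 attribute values may be percent-encoded.
                names.add(unquote(value.strip()))
    return names

def stranded_markers(marker_ids, gff_names):
    """Return marker ids that have no matching Name in the GFF."""
    return [m for m in marker_ids if m not in gff_names]
```

Usage would look like `with open_text(gff_path) as fh: names = gff_marker_names(fh)`, followed by `stranded_markers(ids, names)`; an empty result means every QTL marker can be merged with a genomic location.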

@sammyjava
Contributor

(There are likely marker GFFs that will show up as issues because they don't have full-yuck ID attributes and short-name Name attributes, although I'll typically fix those myself.)

@adf-ncgr
Contributor

Only 17 of the 22 SNPs can be found in the 1M set, so we may have to get genomic locations for the others by mapping their sequences. Since they are not all in the 1M set, perhaps we should just treat them as their own little marker collection?

@sammyjava
Contributor

Fine with me, be sure to update README.genotyping_platform with your new name if/when you do.

@adf-ncgr
Contributor

OK, probably won't get to it for a while, but will re-assign to myself (though others are welcome to wrest it back)

@sammyjava
Contributor

ping @adf-ncgr

@sammyjava
Contributor

sammyjava commented Apr 7, 2022

But also, @adf-ncgr, you're really good at hijacking issues, and the issue tracker is not an instant messaging app. So please, if you have a new issue like "what to do with marker sequences", create a new one rather than hijacking this one. Much appreciated. I'll do so with these posts: legumeinfo/datastore-specifications#24

@legumeinfo legumeinfo deleted a comment from adf-ncgr Apr 7, 2022
@legumeinfo legumeinfo deleted a comment from StevenCannon-USDA Apr 7, 2022
@legumeinfo legumeinfo deleted a comment from adf-ncgr Apr 7, 2022
@adf-ncgr
Contributor

adf-ncgr commented Apr 7, 2022

thanks, you're right.

@sammyjava
Contributor

I think this got sorted out over time. If not, we'll create a fresh issue, but I've got VignaMine loaded and running.
