Fallback parsing for Genotype files (esp. 23andMe) #428
Comments
Thanks for the suggestions! The code is in two different In the Hope that's a good pointer for the start, let us know if we can offer further explanations. 🙂
Is this still something you want to tackle? 😃
It would be useful (though possibly CPU-expensive) to have a fallthrough tree for file types if the indicated one does not work.
My biggest request is that `23andme-format` would fall back to `23andMe-EXOME, VCF-format` (or the reverse), prior to emailing the user, in the case of: "We are sorry to inform you that there was at least one line in your genotyping file, excluding the header, that could not be parsed correctly."
This would greatly improve the user experience (UX), likely at a low file-parsing cost. Judging by the speed, the current check appears to be a regexp, so even a very simple regexp would do (I am not sure which regexp engine you are using).
If `23andme-format` fails, see whether `^1\` matches outside the header (heck, do a `head -n 10`), and if so try parsing with `23andMe-EXOME, VCF-format`. The reverse applies too: if `EXOME` fails, see whether `^rs[digits]\` matches on a head after the header.

It also saves bandwidth on both ends, so depending on hosting it may end up saving you money in transport costs versus CPU cycles.
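The sniffing step proposed above could look roughly like the following Python sketch. It is only an illustration, not the project's parser: the format keys, the function names `sniff_format` and `fallback_order`, the assumption that header lines start with `#`, and both regular expressions are hypothetical stand-ins (the patterns assume whitespace-delimited data lines and are not a reconstruction of the expressions quoted in this issue).

```python
import re
from itertools import islice

# Hypothetical patterns: an rsID in the first column suggests plain 23andMe,
# a chromosome name in the first column suggests the exome VCF variant.
SNIFFERS = {
    "23andme": re.compile(r"^rs\d+\s"),
    "23andme-exome-vcf": re.compile(r"^(chr)?(\d+|X|Y|MT?)\s"),
}


def sniff_format(lines, sample_size=10):
    """Return the format whose pattern matches every sampled data line, else None."""
    # Skip blank lines and header lines (assumed to start with '#').
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    sample = list(islice(data, sample_size))
    if not sample:
        return None
    for name, pattern in SNIFFERS.items():
        if all(pattern.match(ln) for ln in sample):
            return name
    return None


def fallback_order(declared, lines):
    """Formats to try: the user-declared one first, then the sniffed one if it differs."""
    order = [declared]
    guess = sniff_format(lines)
    if guess is not None and guess != declared:
        order.append(guess)
    return order
```

With something like this, the "could not be parsed" email would only be sent after every format in `fallback_order(...)` has failed. Sampling just the first few data lines keeps the extra cost small compared with a full parse.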
`deCODEme` has a unique header, as does `FamilyTreeDNA`, though I would think `23andMe` is the most likely fault case.

If you can point me to the right place in the src tree, I would be happy to write a PR for this feature.