Now supports extracting gencall scores (GCALL) as real numbers #74

Open: wants to merge 1 commit into base: develop

@Ahhgust Ahhgust commented Apr 27, 2023

I added support for exporting an extra field: the GenCall score (without Phred scaling, so as to retain better precision).
(And I use git forking mechanics once in a blue moon, so apologies in advance!)

While this is not the place for it, it would also be handy to export the GenTrain score, but I could not find programmatic access to it.

Thanks!
-August

@jzieve
Collaborator

jzieve commented Apr 27, 2023

This is essentially what it used to do. I'm thinking it may be better to just change GS back to what you have here with GCALL: the Phred-scaled score may be useful for parsing via bcftools, but it is not actually a good metric for GS.
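
A minimal sketch of the precision point, for readers following along. It assumes the textbook Phred convention Q = -10 * log10(1 - score), with the GenCall score treated as a confidence in [0, 1]. The helper names and the cap of 99 are hypothetical and are not taken from this tool's source.

```python
import math


def phred_scale(score: float, cap: int = 99) -> int:
    """Integer Phred value for a confidence-like score in [0, 1].

    Assumed convention: treat (1 - score) as an error probability.
    The cap is a hypothetical choice to keep score == 1.0 finite.
    """
    error = 1.0 - score
    if error <= 0.0:
        return cap
    return min(cap, int(round(-10.0 * math.log10(error))))


def phred_to_score(q: int) -> float:
    """Invert the assumed Phred convention back to a score in [0, 1]."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # Distinct raw scores can collapse onto the same integer Phred value,
    # so the original score cannot be recovered exactly afterwards.
    for raw in (0.80, 0.81, 0.82):
        q = phred_scale(raw)
        print(f"raw={raw:.2f}  phred={q}  recovered={phred_to_score(q):.4f}")
```

Under these assumptions, all three raw scores map to the same integer, which is why exporting the unscaled value as a real number keeps information that the integer field drops.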

The GenTrain score is encoded in the cluster file (EGT) and is out of scope for this tool. This script may help you out:
https://github.com/Illumina/BeadArrayFiles/blob/develop/examples/locus_summary.py

@Ahhgust
Author

Ahhgust commented Apr 27, 2023 via email

@jzieve
Collaborator

jzieve commented May 5, 2023

@Ahhgust Quick update: I was advised to point you to Illumina's C#/dotnet core implementation of GTCtoVCF here:
https://support.illumina.com/array/array_software/ima-array-analysis-cli.html (i.e. see latest README about semi-archived state).
The 2.0 version of that software should reflect the GS score as you prefer (i.e. not phred-scaled).
Hopefully that helps meet your needs.

@Ahhgust
Author

Ahhgust commented May 5, 2023 via email

@jzieve
Collaborator

jzieve commented May 5, 2023

Thanks for bringing this to my attention, but sorry, I'm not following... did you paste something? Where is the SampleID missing?
Also, how were the GTCs generated? That might help track down the bug. I think we've seen something similar with Beeline-generated GTCs, but it seemed to be OK when the IDAT->GTC conversion was done via Array Analysis CLI.
