Comparing Infinium Control probes with GenomeStudio #43

Open
fkokocinski opened this issue Nov 28, 2024 · 3 comments


@fkokocinski

We are seeing a disagreement between GenomeStudio and our Python-based software, which uses the GenotypeCalls and BeadPoolManifest modules, in the control probe results for the KaryoMap-v2-1 array. I might be reading the values incorrectly, so I would appreciate your feedback.
Reading the control config from the manifest shows the same address repeated 5 times for each of the 23 control types:

from IlluminaBeadArrayFiles import BeadPoolManifest
bpm = BeadPoolManifest(manifest_file)
bpm.control_config
 
 '0027630314:0027630314:0027630314:0027630314:0027630314,Staining,Red,DNP (High)',
 '0043603326:0043603326:0043603326:0043603326:0043603326,Staining,Purple,DNP (Bgnd)',
 '0041666334:0041666334:0041666334:0041666334:0041666334,Staining,Green,Biotin (High)',
 '0034648333:0034648333:0034648333:0034648333:0034648333,Staining,Blue,Biotin (Bgnd)',
 '0017616306:0017616306:0017616306:0017616306:0017616306,Extension,Red,Extension (A)',
...

with Staining-Red as the first entry, which we can use as an example.
Looking at the X intensities in a GTC file, I can see that the first entries have high values, whereas I expected them to be low, as I understand X to be the reading for the Green channel:

from IlluminaBeadArrayFiles import GenotypeCalls
gtc = GenotypeCalls(gtc_file)
gtc.get_control_x_intensities()
 
array([33242, 33242, 33242, 33242, 33242,   518,   518,   518,   518,
         518,   533,   533,   533,   533,   533,   441,   441,   441,
         441,   441, 29266, 29266, 29266, 29266, 29266, 34486, 34486,
       34486, 34486, 34486,  1608,  1608,  1608,  1608,  1608,  1898,
        1898,  1898,  1898,  1898,   975,   975,   975,   975,   975,
        2154,  2154,  2154,  2154,  2154,   549,   549,   549,   549,
         549,  1808,  1808,  1808,  1808,  1808, 23898, 23898, 23898,
       23898, 23898,  5512,  5512,  5512,  5512,  5512,   315,   315,
         315,   315,   315,   228,   228,   228,   228,   228,   296,
         296,   296,   296,   296,   258,   258,   258,   258,   258,
       17748, 17748, 17748, 17748, 17748, 16469, 16469, 16469, 16469,
       16469,   981,   981,   981,   981,   981,   794,   794,   794,
         794,   794,   277,   277,   277,   277,   277], dtype=uint16)

So I am wondering:
Is my approach to fetching the control data, or my interpretation of it, wrong?
Only one position and one unique intensity value are reported for each type. Is this expected?
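
To make the pairing of control entries and intensities concrete, here is a minimal sketch in plain Python. It uses only the five control_config entries and the first 25 X intensities quoted above. It assumes that the intensities are stored in the same order as the addresses listed in control_config, which fits the 23 x 5 = 115 values shown but is not confirmed anywhere in this thread. The helper names `parse_control_entry` and `group_intensities` are hypothetical and not part of the library.

```python
# First five control_config entries, copied from the output above.
control_config = [
    "0027630314:0027630314:0027630314:0027630314:0027630314,Staining,Red,DNP (High)",
    "0043603326:0043603326:0043603326:0043603326:0043603326,Staining,Purple,DNP (Bgnd)",
    "0041666334:0041666334:0041666334:0041666334:0041666334,Staining,Green,Biotin (High)",
    "0034648333:0034648333:0034648333:0034648333:0034648333,Staining,Blue,Biotin (Bgnd)",
    "0017616306:0017616306:0017616306:0017616306:0017616306,Extension,Red,Extension (A)",
]

# First 25 X intensities, copied from the array above.
x_intensities = [33242] * 5 + [518] * 5 + [533] * 5 + [441] * 5 + [29266] * 5


def parse_control_entry(entry):
    """Split one entry into its address list, control type, color and name."""
    addresses, control_type, color, name = entry.split(",", 3)
    return {
        "addresses": addresses.split(":"),
        "type": control_type,
        "color": color,
        "name": name,
    }


def group_intensities(entries, intensities):
    """Pair each control with its slice of intensities.

    ASSUMPTION: intensities follow the order of the addresses in the entries.
    """
    rows = []
    offset = 0
    for entry in entries:
        parsed = parse_control_entry(entry)
        count = len(parsed["addresses"])
        values = intensities[offset:offset + count]
        offset += count
        rows.append((parsed["type"], parsed["name"], parsed["color"], sorted(set(values))))
    return rows


rows = group_intensities(control_config, x_intensities)
for row in rows:
    print(row)
```

Under that ordering assumption, each control collapses to a single distinct X value, which is the pattern the second question asks about.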

@jzieve
Collaborator

jzieve commented Dec 2, 2024

What are you seeing in GenomeStudio? Depending on how old this manifest is, there could be a switch between the red/green channels and what is considered X/Y.

@fkokocinski
Author

fkokocinski commented Dec 5, 2024

Thanks, that switch would be important to know and would explain it. :-)
This is a custom-designed BeadArray, so there is no public info available. I'll contact you directly if that's ok.

@jzieve
Collaborator

jzieve commented Dec 24, 2024

@fkokocinski I was not able to reproduce a discordance between GenomeStudio and this library (or the data in the GTCs).
I replicated the GenomeStudio control dashboard report in this PR: #44
When I ran diff --strip-trailing-cr on both reports, the output was empty, i.e. the reports were identical.
There could be an issue with the assay or with how this manifest and its controls were designed, but I am not an SME for that, and it would be out of scope for this library.
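
For anyone who wants to repeat that comparison without the diff tool, the following is a rough Python sketch of the same idea: compare two reports line by line while ignoring a trailing carriage return on each line. The function names and the sample report text are hypothetical placeholders, not the actual format of the control dashboard report.

```python
def strip_trailing_cr(line):
    """Drop one trailing carriage return, if present."""
    return line[:-1] if line.endswith("\r") else line


def reports_match(text_a, text_b):
    """Return True if both texts are identical apart from CRLF vs LF line endings."""
    lines_a = [strip_trailing_cr(line) for line in text_a.split("\n")]
    lines_b = [strip_trailing_cr(line) for line in text_b.split("\n")]
    return lines_a == lines_b


# Hypothetical placeholder content; only the line endings differ.
report_crlf = "header\r\nrow one\r\nrow two\r\n"
report_lf = "header\nrow one\nrow two\n"

same = reports_match(report_crlf, report_lf)
different = reports_match(report_crlf, "header\nrow one\nrow 2\n")
print(same, different)
```

To use it on real files, read them in binary mode and decode them first, so that Python's newline translation does not hide the differences being tested.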
