compose Obs and Exp snipper classes -> ObsExpSnipper #496
sergpolly
started this conversation in
Technical details
Replies: 1 comment
here is a better version, i believe:

```python
# assumed import path: in cooltools both classes live in the snipping module
from cooltools.api.snipping import CoolerSnipper, ExpectedSnipper

# `_snip` and `array_to_csr_data` are helper functions; their definitions are not shown here


class ObsExpSnipper:
    def __init__(
        self,
        clr,
        expected,
        view_df,
        cooler_opts=None,
        min_diag=2,
        expected_value_col="balanced.avg",
    ):
        """Class for generating expected-normalised snips from a cooler"""
        self.exp_snipper = ExpectedSnipper(
            clr=clr,
            expected=expected,
            view_df=view_df,
            min_diag=min_diag,
            expected_value_col=expected_value_col,
        )
        self.clr_snipper = CoolerSnipper(
            clr=clr,
            view_df=view_df,
            cooler_opts=cooler_opts,
            min_diag=min_diag,
        )

    def select(self, region1, region2):
        """Select a portion of the cooler based on two regions in the view, normalize by expected"""
        observed = self.clr_snipper.select(region1, region2)  # CSR sparse matrix
        expected = self.exp_snipper.select(region1, region2)  # LazyToeplitz subscriptable object
        # generating O/E right away in a CSR format:
        expected_data = array_to_csr_data(expected, observed)
        observed.data = observed.data / expected_data
        return observed

    def snip(self, matrix, region1, region2, tup):
        """Extract an expected-normalised snippet from the matrix"""
        s1, e1, s2, e2 = tup
        offset1 = self.clr_snipper.offsets[region1]
        offset2 = self.clr_snipper.offsets[region2]
        binsize = self.clr_snipper.binsize
        # bins relative to start of respective chromosomes
        lo1, hi1 = s1 // binsize - offset1, e1 // binsize - offset1
        lo2, hi2 = s2 // binsize - offset2, e2 // binsize - offset2
        return _snip(
            matrix,
            offset=(offset1, offset2),
            bad_bins_masks=(self.clr_snipper._isnan1, self.clr_snipper._isnan2),
            min_diag=self.clr_snipper.min_diag,
            bbox=(lo1, hi1, lo2, hi2),
        )
```

definitions of |
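For readers without the helper definitions, here is a rough, dependency-free sketch of the idea behind the `select` step: look up the expected value only at the positions a CSR matrix actually stores, then divide the stored observed values by them. `expected_for_csr`, its arguments, and the per-diagonal list `expected_by_diag` are hypothetical names made up for this illustration; this is not the actual `array_to_csr_data`, which works on a `LazyToeplitz` object and a SciPy CSR matrix.

```python
def expected_for_csr(indptr, indices, expected_by_diag, row_start=0, col_start=0):
    """Return one expected value per stored entry of a CSR matrix.

    Hypothetical stand-in for `array_to_csr_data`: the expected matrix is
    assumed to be Toeplitz, so entry (i, j) depends only on |i - j| and can
    be read from a flat per-diagonal list.
    """
    out = []
    n_rows = len(indptr) - 1
    for i in range(n_rows):
        # stored entries of row i live in positions indptr[i]:indptr[i + 1]
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            diag = abs((row_start + i) - (col_start + j))
            out.append(expected_by_diag[diag])
    return out


# Observed 3x3 matrix in CSR form (zeros are simply not stored):
# [[4, 2, 0],
#  [2, 6, 3],
#  [0, 3, 8]]
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [4.0, 2.0, 2.0, 6.0, 3.0, 3.0, 8.0]

# made-up expected values for diagonals 0, 1, 2
expected_by_diag = [2.0, 1.0, 0.5]

expected_data = expected_for_csr(indptr, indices, expected_by_diag)
# O/E is computed on the stored values only, so the result stays sparse
oe_data = [o / e for o, e in zip(data, expected_data)]
```

The point of doing it this way is that the dense expected matrix is never materialised: only as many expected values are generated as there are stored observed values.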
Thinking about the base class for snippers, #227 (comment) ... it's not obvious what that would look like, i.e. how one would unify Obs and Exp...
However, one can compose Obs and Exp to give rise to an ObsExp one like so:
This might look inefficient, but I think the duplicated work is mostly in `snip` itself; `select` is probably fine as is ...

Some of that could be mitigated, I think, by making snippers work on pre-calculated `bin_id`-s instead of converting nucleotide coordinates into bins several times, as happens now (i.e. taking advantage of those `lo1, hi1, lo2, hi2`, perhaps keeping them absolute and keeping region offsets along the way). Just a thought anyhow.
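A minimal sketch of the pre-calculated `bin_id` idea, assuming nothing about the real snipper internals: convert the nucleotide window to absolute bin ids once, then let each snipper only subtract its region offsets. `BinWindow`, `to_bin_window`, and `relative_window` are hypothetical names introduced for this example, and the bin size, coordinates, and offsets are made up.

```python
from typing import NamedTuple


class BinWindow(NamedTuple):
    """A snipping window expressed in bin ids instead of nucleotides."""
    lo1: int
    hi1: int
    lo2: int
    hi2: int


def to_bin_window(tup, binsize):
    """Convert (s1, e1, s2, e2) in nucleotides to absolute bin ids, once."""
    s1, e1, s2, e2 = tup
    return BinWindow(s1 // binsize, e1 // binsize, s2 // binsize, e2 // binsize)


def relative_window(window, offset1, offset2):
    """Shift an absolute bin window by the region offsets (also in bins)."""
    return BinWindow(
        window.lo1 - offset1,
        window.hi1 - offset1,
        window.lo2 - offset2,
        window.hi2 - offset2,
    )


# the integer division happens a single time per window ...
absolute = to_bin_window((120_000, 180_000, 250_000, 310_000), binsize=10_000)
# ... and both the observed and expected snippers could reuse the result
relative = relative_window(absolute, offset1=10, offset2=20)
```

With something like this, a composed ObsExp snipper could hand the same `BinWindow` to both of its parts instead of having each one redo the coordinate arithmetic.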