Hi there! For a simple TemporalProblem, I've held out some genes (from the embedding computation, simple PCA) and computed the coupling. I would now like to use the coupling to predict expression values of the held-out genes (at either t_1 or t_2, both are possible), as a means of validation. However, when calling tp.push(source=8.0, target=8.5, data=gexp_sc, scale_by_marginals=True), where gexp_sc is the gene expression matrix of held-out genes on the source cells, my kernel dies. I assume that's because the matrix multiplication is carried out using a dense formulation, all at once. Is it somehow possible to do this in a batch-wise fashion, i.e. by only loading small chunks of the coupling into memory at a time?
Great, thanks @giovp! I guess this is also related to #569.
A workaround that works for me is to pass batch_size=x to the problem's solve method, even though batching isn't needed to solve the problem itself, as it's quite small. This seems to make downstream computations batched as well, since the following call now runs without any issues:
out = tp.push(source=8.0, target=8.5, data=gexp_src, scale_by_marginals=True, return_all=True, key_added=None)
However, this is a bit clumsy, as it requires me to solve the problem in a (slower) batch-wise fashion, even though I could solve it in offline mode. I think it would be nice to decouple the two batch sizes, so that a problem can be solved with one batch size while pull/push downstream use another.
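To make the request concrete, here is a rough NumPy sketch of what a batch-wise push could look like, independent of how the problem was solved. It is not the library's implementation: `push_in_batches` and the `coupling_block` callback are hypothetical names, and the callback is assumed to return a dense block of rows of the coupling on demand.

```python
import numpy as np


def push_in_batches(coupling_block, n_source, n_target, x_source, batch_size=1024):
    """Accumulate Y = P^T X over row blocks of the coupling P.

    `coupling_block(start, stop)` is a hypothetical callback that returns the
    dense rows P[start:stop, :], with shape (stop - start, n_target).
    """
    y = np.zeros((n_target, x_source.shape[1]), dtype=np.float64)
    for start in range(0, n_source, batch_size):
        stop = min(start + batch_size, n_source)
        p_block = coupling_block(start, stop)  # only this chunk is held in memory
        y += p_block.T @ x_source[start:stop]
    return y


# Toy check against the dense product (random data, for illustration only).
rng = np.random.default_rng(0)
P = rng.random((50, 30))
P /= P.sum()
X = rng.random((50, 4))
Y = push_in_batches(lambda s, e: P[s:e], 50, 30, X, batch_size=16)
```

The batch size used here only controls peak memory of the push and is unrelated to any batch size used while solving, which is the decoupling I have in mind.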
Sorry, partly unrelated: if I want to impute gene expression at the target using the source, would I have to use scale_by_marginals? Intuitively, I would say no, as all I want is Y = P^T X, where P is the coupling, X is the known gene expression in the source, and Y is the unknown gene expression in the target. So I just want this matrix multiplication, with no additional scaling.
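For reference, a small NumPy sketch of the two quantities in question, on a toy coupling. This only illustrates the linear algebra; it makes no claim about what scale_by_marginals does internally, and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_tgt, n_genes = 6, 4, 3

# Toy coupling with total mass 1 (rows: source cells, columns: target cells).
P = rng.random((n_src, n_tgt))
P /= P.sum()
X = rng.random((n_src, n_genes))  # known expression on the source cells

# Plain push: row j of Y_raw is sum_i P[i, j] * X[i].
Y_raw = P.T @ X

# The weights in column j sum to the target marginal b[j], not to 1,
# so Y_raw[j] is a weighted average of source expression scaled by b[j].
b = P.sum(axis=0)

# Dividing by b turns each row into a proper weighted average.
Y_avg = Y_raw / b[:, None]
```

So the plain product Y = P^T X gives values that are shrunk by the target marginal of each cell (about 1/n_tgt for uniform marginals), whereas the rescaled version stays on the same scale as X.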