I have a heterogeneous dataset consisting of stroma and immune cells. For now, I'm interested in the stroma cells, and I was wondering whether running scran again after subsetting to the compartment of interest would give more accurate size factor estimates, since heterogeneous data can produce negative estimates in some cases (which I witnessed but was able to address).
After subsetting, I have this many cells per sample:
And I would imagine that this would be a problem as well. Prior to subsetting I have this many cells per sample:
I've also read that a low number of cells per sample can be problematic for scran normalization, so I'd like the authors' opinion on which route is better: normalize before subsetting and then subset, or re-run normalization on the subset?
Thanks!
If you're considering the analysis of each sample, then yes, the small number of cells in some of the samples will make normalization difficult. More specifically, this will introduce some instability in the estimates; the question is whether or not this instability is offset by the (assumed) improvement in accuracy once heterogeneity is out of the picture.
Having said that, if you've already subsetted it down to stroma cells and the subpopulations within the stroma subset are reasonably similar, you could just go with library size normalization (e.g., scuttle::librarySizeFactors). The expectation would be that there is little composition bias left, so there is not much to motivate the use of scran's pooling normalization in the first place.
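To make the library size option concrete, here is a minimal language-agnostic sketch in plain Python of what a library size factor is: each cell's total count divided by the mean total across cells, so the factors average to 1. This is an illustration only, not the scuttle implementation; the function name and the toy counts are made up for the example.

```python
def library_size_factors(counts):
    """Return one size factor per cell, scaled so the factors average to 1.

    counts: list of cells, each a list of per-gene counts (hypothetical toy layout).
    """
    totals = [sum(cell) for cell in counts]
    if any(total <= 0 for total in totals):
        raise ValueError("every cell needs a positive total count")
    mean_total = sum(totals) / len(totals)
    # A cell with twice the total count gets twice the size factor.
    return [total / mean_total for total in totals]


# Three toy "stroma" cells measured over four genes (invented numbers).
stroma_counts = [
    [10, 0, 5, 5],
    [20, 2, 8, 10],
    [5, 1, 2, 2],
]
factors = library_size_factors(stroma_counts)
print(factors)
```

Because each factor depends only on that cell's own total, the estimate does not get less stable when a sample has few cells, which is why it is a reasonable fallback for a small, fairly homogeneous subset.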
Alternatively, if you're analyzing all samples together and the batch effects are modest, you could run pooledSizeFactors on the set of all stroma cells. Any composition biases introduced by minor DE between batches would then be handled by the pooling normalization, while ensuring you have enough cells to get stable estimates.
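The composition bias mentioned above can be shown with a toy calculation. The sketch below, in plain Python, compares a library size ratio with a median-of-ratios estimate for two invented count profiles that differ only in one strongly upregulated gene. The profiles stand in for counts summed over pools of cells; this is a simplified illustration of why a robust estimator helps, not the deconvolution that scran's pooling normalization performs, and all names and numbers are hypothetical.

```python
from statistics import median


def library_size_ratio(profile, reference):
    """Scaling estimate from total counts; sensitive to a few DE genes."""
    return sum(profile) / sum(reference)


def median_ratio(profile, reference):
    """Scaling estimate from the median per-gene ratio; robust to a minority of DE genes."""
    # Genes with a zero reference count are skipped to avoid division by zero.
    ratios = [p / r for p, r in zip(profile, reference) if r > 0]
    return median(ratios)


# Same capture efficiency in both profiles, but the last gene is upregulated 6-fold.
reference = [100, 100, 100, 100, 100]
profile = [100, 100, 100, 100, 600]

lib = library_size_ratio(profile, reference)
med = median_ratio(profile, reference)
print(lib, med)
```

Here the library size ratio is 2, even though the non-DE genes are unchanged, while the median ratio stays at 1. With single cells the per-gene ratios are dominated by zeros, which is the reason for summing counts across pools of cells first, and why having enough cells matters for stable estimates.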