RaggedExperiment continues to rule for all our 'omics related work! I did notice something interesting yesterday when running compactSummarizedExperiment(): when I attempt to access the names of the assays in a large RE, it is either near instantaneous using assayNames(), or requires 100s of GB of memory using names(assay(my_RE)). Do you know why this might be the case? I'll work on getting a smaller reproducible example if there is interest.
Thanks again for all that you do and RaggedExperiments!
I believe, without actually checking, that the names are stored independently of the underlying data representation, and the cost is associated with adding names and hence duplicating the underlying data. If it's 'easy' to simulate the data for a reproducible example that would be great.
Hi Ben, @biobenkj
I'm glad to hear you are making use of this data representation!
The trick behind RaggedExperiment involves providing a matrix representation from a GRangesList object. In the background, the stored representation is a GRangesList, so accessing the metadata is relatively straightforward. When using assay, the GRangesList representation has to be converted to a matrix. This involves creating quite a large sparse matrix from the mcols in the original GRangesList, which is a costly operation.
I agree, a minimal and reproducible example would be helpful. We'll see what we can do to increase the efficiency of this conversion. Thank you.
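As a rough illustration of the difference described above, here is a minimal sketch on toy data (the two samples and the `score` column are made up for illustration, not taken from the reporter's data). It assumes that `assayNames()` only reads the metadata column names, while `sparseAssay()` stands in for the kind of ranges-by-samples matrix that `assay()` has to build.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Toy samples (hypothetical data): each GRanges carries one metadata column
sample1 <- GRanges(
    c(A = "chr1:1-10:-", B = "chr1:8-14:+", C = "chr2:15-18:+"),
    score = 3:5)
sample2 <- GRanges(
    c(D = "chr1:1-10:-", E = "chr2:11-18:+"),
    score = 1:2)
re <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

# Cheap: only the metadata column names are looked up, no matrix is built
nms <- assayNames(re)

# Costly on large data: materializes one row per range and one column per
# sample, with NA wherever a range does not belong to that sample
m <- sparseAssay(re)
dim(m)
```

On real data with many samples and ranges, most cells of that matrix are NA, which would be consistent with the memory use reported above.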
@biobenkj Any updates on this?
Would a dgCMatrix representation help? Have you tested this?
We can create additional functionality to return this data representation.
If you can provide a reproducible example to help this move along, that would be great. Thanks!