fix the plot_control transcripts for large images #107

lopollar · 2023-06-26T11:31:33Z

For large images, too many rows need to be combined.

Maybe we can subsample by pulling locations together and summing over blocks of 25 pixels (5*5). This would reduce the plotting load a lot.
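A minimal sketch of what summing over 5*5 blocks could look like, assuming the transcript counts are already available as a 2D numpy array. The function name `block_sum` and the zero padding at the edges are assumptions for illustration, not part of napari_sparrow:

```python
import numpy as np


def block_sum(image: np.ndarray, block: int = 5) -> np.ndarray:
    """Sum non-overlapping block x block tiles of a 2D array (hypothetical helper)."""
    rows, cols = image.shape
    # Pad the bottom/right edges with zeros so both dimensions divide evenly.
    pad_rows = (-rows) % block
    pad_cols = (-cols) % block
    padded = np.pad(image, ((0, pad_rows), (0, pad_cols)))
    # Split each axis into (n_tiles, block) and sum over the within-tile axes.
    tiles = padded.reshape(
        padded.shape[0] // block, block, padded.shape[1] // block, block
    )
    return tiles.sum(axis=(1, 3))


# Toy example: a 12 x 10 image with one transcript per pixel.
counts = np.ones((12, 10), dtype=np.int64)
small = block_sum(counts)
print(small.shape)
```

The total count is preserved, while the number of pixels to plot drops by roughly a factor of 25.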

Zoom-ins should remain possible. We should add a threshold (or a hidden parameter) that states how big an image can be to still be plotted in full!

Question: do we save this image in the object? I don't think that is necessary at the moment, but maybe there is a good reason to do so (e.g. visualization in napari).

ArneDefauw · 2023-08-16T09:36:04Z

This is partially fixed by bfd5957.

Currently solved as follows (e.g. for VIZGEN):

A crd can now be given when calculating the transcript density, e.g.:

import napari_sparrow as nas
sdata = nas.im.transcript_density(sdata, crd=[20000, 30000, 20000, 30000])

The calculated transcript density for this crd is stored in the sdata object as an image layer (default name='transcript_density'), and also in the .zarr file if the object is backed by zarr.

This layer can then be plotted with the nas.pl.plot_shapes function, or with nas.pl.plot_image, as one would plot other image layers. Alternatively, one can use nas.pl.transcript_density, e.g.:

nas.pl.transcript_density(sdata, img_layer=['clahe', 'transcript_density'], crd=[20000, 30000, 20000, 30000])

We should decide whether we want to look at 'subsampling by pulling locations together' in the nas.im.transcript_density function, to be able to calculate the transcript density for large images, as you suggested, Lotte.

ArneDefauw · 2023-08-21T13:58:17Z

Fixed a bug (commit c82c236) in:

import napari_sparrow as nas
sdata = nas.im.transcript_density(sdata, ...)

The use of .unstack in:

image = np.array(counts_location_transcript.unstack(fill_value=0))

did not automatically result in an image with the same dimensions as the other image layers, because not all rows or columns are populated with transcripts.
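To illustrate the problem, here is a small sketch with a toy pandas table. The column names `y` and `x`, the image size, and the reindex-based fix are assumptions for illustration. The actual fix in the commit is dask-based and may well look different:

```python
import numpy as np
import pandas as pd

# Hypothetical transcript table: integer pixel coordinates inside a 4 x 5 image.
transcripts = pd.DataFrame({"y": [0, 0, 3, 3, 3], "x": [1, 1, 4, 0, 4]})
height, width = 4, 5

# Number of transcripts per (y, x) location.
counts = transcripts.groupby(["y", "x"]).size()

# unstack only creates rows/columns for coordinates that actually occur,
# so rows 1-2 and columns 2-3 are missing from this array.
sparse_image = np.array(counts.unstack(fill_value=0))

# Reindexing on the full pixel grid restores the expected dimensions.
full_image = (
    counts.unstack(fill_value=0)
    .reindex(index=range(height), columns=range(width), fill_value=0)
    .to_numpy()
)

print(sparse_image.shape, full_image.shape)
```

Here `sparse_image` is smaller than the image it should overlay, while `full_image` has one entry per pixel and zeros where no transcripts were found.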

Rewrote the code to fix this and added an option to sample transcripts before the transcript density is calculated. By default, sampling is now performed if the number of transcripts inside the crd exceeds 15 000 000.
Everything is now rewritten in dask/dask dataframe, so even for Vizgen it is relatively fast to calculate the density (approx. 3 min if we leave the n_sample parameter at its default value of 15 000 000 and set crd to None):

sdata = nas.im.transcript_density(sdata, crd=None)
nas.pl.plot_image(sdata, img_layer='transcript_density', crd=[0, 20000, 0, 20000])

One can also still provide a crd, and then sampling is not necessary:

sdata = nas.im.transcript_density(sdata, crd=[0, 20000, 0, 20000])
nas.pl.plot_image(sdata, img_layer='transcript_density', crd=[0, 20000, 0, 20000])
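For reference, the sampling rule described above (sample only when the number of transcripts exceeds n_sample) can be sketched in plain numpy. The helper `maybe_sample`, its `seed` argument, and the choice to sample exactly n_sample rows without replacement are assumptions for illustration. The real implementation works on a dask dataframe:

```python
from typing import Optional

import numpy as np


def maybe_sample(
    coords: np.ndarray, n_sample: Optional[int] = 15_000_000, seed: int = 0
) -> np.ndarray:
    """Return coords unchanged, or a random subset of n_sample rows (hypothetical helper)."""
    # No sampling requested, or few enough transcripts: keep everything.
    if n_sample is None or len(coords) <= n_sample:
        return coords
    # Otherwise draw n_sample distinct row indices at random.
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(coords), size=n_sample, replace=False)
    return coords[keep]


# Toy example: 10 transcripts with (y, x) coordinates, sampled down to 4.
coords = np.arange(20).reshape(10, 2)
subset = maybe_sample(coords, n_sample=4)
print(subset.shape)
```

Note that a density computed on a sample reflects relative rather than absolute transcript counts.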

ArneDefauw self-assigned this Aug 14, 2023
