diff --git a/dev/articles/bivariate.html b/dev/articles/bivariate.html
index a8555181..6ad12cde 100644
--- a/dev/articles/bivariate.html
+++ b/dev/articles/bivariate.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/bivariate.Rmd
bivariate.Rmd
diff --git a/dev/articles/create_sfe.html b/dev/articles/create_sfe.html
index 069c3d52..2e54f1b3 100644
--- a/dev/articles/create_sfe.html
+++ b/dev/articles/create_sfe.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/create_sfe.Rmd
create_sfe.Rmd
diff --git a/dev/articles/create_sfe_v2.html b/dev/articles/create_sfe_v2.html
index 70dd0afe..ff904b93 100644
--- a/dev/articles/create_sfe_v2.html
+++ b/dev/articles/create_sfe_v2.html
@@ -117,7 +117,7 @@

Kayla Jackson

-2024-05-02
+2024-05-03

Source: vignettes/create_sfe_v2.Rmd
create_sfe_v2.Rmd
diff --git a/dev/articles/localc.html b/dev/articles/localc.html
index ae394173..c97c254c 100644
--- a/dev/articles/localc.html
+++ b/dev/articles/localc.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/localc.Rmd
localc.Rmd
diff --git a/dev/articles/multispati.html b/dev/articles/multispati.html
index 57bd464c..7f1becc6 100644
--- a/dev/articles/multispati.html
+++ b/dev/articles/multispati.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/multispati.Rmd
multispati.Rmd
@@ -304,7 +304,7 @@

Quality control
style = "W") )
#> user system elapsed
-#> 34.506 0.224 34.753
+#> 35.006 0.249 35.303
 features_use <- c("nCounts", "nGenes", "volume")
 sfe <- colDataUnivariate(sfe, "moran.mc", features_use, 
@@ -424,10 +424,10 @@ 

Non-spatial PCA
scale = TRUE, BSPARAM = IrlbaParam()) )
#> user system elapsed
-#> 21.877 1.884 24.710
+#> 18.992 1.225 21.017
gc()
#> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
-#> Ncells 16055274 857.5 25942897 1385.6 NA 25942897 1385.6
+#> Ncells 16055263 857.5 25943217 1385.6 NA 25943217 1385.6
#> Vcells 239248882 1825.4 502470759 3833.6 16384 497137484 3792.9

That’s pretty quick for almost 400,000 cells, but there aren’t that many genes here. Use the elbow plot to see variance explained by each
@@ -476,7 +476,7 @@

MULTISPATI PCA
#> Warning in asMethod(object): sparse->dense coercion: allocating vector of size
#> 1.1 GiB
#> user system elapsed
-#> 185.093 15.141 211.931
+#> 176.516 20.441 212.961

Then plot the most positive and most negative eigenvalues. Note that the eigenvalues here are not variance explained. Instead, they are the product of variance explained and Moran’s I. So the most positive
@@ -670,7 +670,7 @@

Clustering with MULTISPATI PCA
#> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, :
#> detected tied distances to neighbors, see ?'BiocNeighbors-ties'
#> user system elapsed
-#> 552.609 6.863 560.613
+#> 947.058 9.214 959.522

See if clustering with the positive MULTISPATI PCs give more spatially coherent clusters

+#> 740.368 8.836 754.958

Plot the clusters in space:

 plotSpatialFeature(sfe, c("clusts_nonspatial", "clusts_multispati"), 
diff --git a/dev/articles/nonspatial.html b/dev/articles/nonspatial.html
index 1e0bf29e..966c44ed 100644
--- a/dev/articles/nonspatial.html
+++ b/dev/articles/nonspatial.html
@@ -117,7 +117,7 @@ 
                         

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/nonspatial.Rmd
nonspatial.Rmd
@@ -754,7 +754,7 @@

Moran’s I
= TRUE, BPPARAM = MulticoreParam(2)) })
#> user system elapsed
-#> 213.532 17.137 231.456
+#> 227.927 19.754 248.680
 plotCorrelogram(sfe, top_markers, swap_rownames = "Symbol")

diff --git a/dev/articles/preprocess_10xv3.html b/dev/articles/preprocess_10xv3.html
index be92ff2b..095b2979 100644
--- a/dev/articles/preprocess_10xv3.html
+++ b/dev/articles/preprocess_10xv3.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_10xv3.Rmd
preprocess_10xv3.Rmd
diff --git a/dev/articles/preprocess_atac.html b/dev/articles/preprocess_atac.html
index d9ab7d1f..3758636a 100644
--- a/dev/articles/preprocess_atac.html
+++ b/dev/articles/preprocess_atac.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_atac.Rmd
preprocess_atac.Rmd
diff --git a/dev/articles/preprocess_clicktag.html b/dev/articles/preprocess_clicktag.html
index ebbcef15..4971d50b 100644
--- a/dev/articles/preprocess_clicktag.html
+++ b/dev/articles/preprocess_clicktag.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_clicktag.Rmd
preprocess_clicktag.Rmd
diff --git a/dev/articles/preprocess_crispr.html b/dev/articles/preprocess_crispr.html
index 99a59c91..925b4fc6 100644
--- a/dev/articles/preprocess_crispr.html
+++ b/dev/articles/preprocess_crispr.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_crispr.Rmd
preprocess_crispr.Rmd
diff --git a/dev/articles/preprocess_multiome.html b/dev/articles/preprocess_multiome.html
index 0521d408..0ae322ce 100644
--- a/dev/articles/preprocess_multiome.html
+++ b/dev/articles/preprocess_multiome.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_multiome.Rmd
preprocess_multiome.Rmd
diff --git a/dev/articles/preprocess_nuclei.html b/dev/articles/preprocess_nuclei.html
index ddbb1c08..3b73c0e6 100644
--- a/dev/articles/preprocess_nuclei.html
+++ b/dev/articles/preprocess_nuclei.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_nuclei.Rmd
preprocess_nuclei.Rmd
diff --git a/dev/articles/preprocess_splitseq.html b/dev/articles/preprocess_splitseq.html
index 7e0c3652..b8e1f0b2 100644
--- a/dev/articles/preprocess_splitseq.html
+++ b/dev/articles/preprocess_splitseq.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_splitseq.Rmd
preprocess_splitseq.Rmd
diff --git a/dev/articles/preprocess_visium.html b/dev/articles/preprocess_visium.html
index 27be503e..b5bf275b 100644
--- a/dev/articles/preprocess_visium.html
+++ b/dev/articles/preprocess_visium.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/preprocess_visium.Rmd
preprocess_visium.Rmd
diff --git a/dev/articles/sfemethod.html b/dev/articles/sfemethod.html
index 333fbb61..b68b79fd 100644
--- a/dev/articles/sfemethod.html
+++ b/dev/articles/sfemethod.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/sfemethod.Rmd
sfemethod.Rmd
diff --git a/dev/articles/variogram.html b/dev/articles/variogram.html
index 524cbb87..82a8b0fc 100644
--- a/dev/articles/variogram.html
+++ b/dev/articles/variogram.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/variogram.Rmd
variogram.Rmd
diff --git a/dev/articles/vig10_10x_nuclei.html b/dev/articles/vig10_10x_nuclei.html
index bb12469f..274f558a 100644
--- a/dev/articles/vig10_10x_nuclei.html
+++ b/dev/articles/vig10_10x_nuclei.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig10_10x_nuclei.Rmd
vig10_10x_nuclei.Rmd
diff --git a/dev/articles/vig11_clicktags.html b/dev/articles/vig11_clicktags.html
index 72b828d6..25bc9520 100644
--- a/dev/articles/vig11_clicktags.html
+++ b/dev/articles/vig11_clicktags.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig11_clicktags.Rmd
vig11_clicktags.Rmd
diff --git a/dev/articles/vig12_crispr.html b/dev/articles/vig12_crispr.html
index 6d147e41..2397843d 100644
--- a/dev/articles/vig12_crispr.html
+++ b/dev/articles/vig12_crispr.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig12_crispr.Rmd
vig12_crispr.Rmd
diff --git a/dev/articles/vig13_10xatac.html b/dev/articles/vig13_10xatac.html
index e6b6601c..50ad2882 100644
--- a/dev/articles/vig13_10xatac.html
+++ b/dev/articles/vig13_10xatac.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig13_10xatac.Rmd
vig13_10xatac.Rmd
diff --git a/dev/articles/vig14_10xmultiome.html b/dev/articles/vig14_10xmultiome.html
index bef38a3d..5a1cf335 100644
--- a/dev/articles/vig14_10xmultiome.html
+++ b/dev/articles/vig14_10xmultiome.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig14_10xmultiome.Rmd
vig14_10xmultiome.Rmd
diff --git a/dev/articles/vig1_visium_basic.html b/dev/articles/vig1_visium_basic.html
index 83a0210c..6738a3d7 100644
--- a/dev/articles/vig1_visium_basic.html
+++ b/dev/articles/vig1_visium_basic.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/vig1_visium_basic.Rmd
vig1_visium_basic.Rmd
diff --git a/dev/articles/vig2_visium.html b/dev/articles/vig2_visium.html
index f04f11f0..01bc6cbb 100644
--- a/dev/articles/vig2_visium.html
+++ b/dev/articles/vig2_visium.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/vig2_visium.Rmd
vig2_visium.Rmd
diff --git a/dev/articles/vig3_slideseq_v2.html b/dev/articles/vig3_slideseq_v2.html
index 187f438b..2b46a343 100644
--- a/dev/articles/vig3_slideseq_v2.html
+++ b/dev/articles/vig3_slideseq_v2.html
@@ -117,7 +117,7 @@

Kayla Jackson

-2024-05-02
+2024-05-03

Source: vignettes/vig3_slideseq_v2.Rmd
vig3_slideseq_v2.Rmd
diff --git a/dev/articles/vig4_cosmx.html b/dev/articles/vig4_cosmx.html
index 2f99cb6d..a9e737f8 100644
--- a/dev/articles/vig4_cosmx.html
+++ b/dev/articles/vig4_cosmx.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/vig4_cosmx.Rmd
vig4_cosmx.Rmd
@@ -598,7 +598,7 @@

Spatial autocorrelation in QC met
#> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, :
#> detected tied distances to neighbors, see ?'BiocNeighbors-ties'
#> user system elapsed
-#> 5.473 0.024 5.499
+#> 5.755 0.029 5.791

Now compute Moran’s I for some cell QC metrics

 features_use <- c("nCounts", "nGenes", "Area", "AspectRatio")
@@ -636,7 +636,7 @@ 

Spatial autocorrelation in QC met
BPPARAM = MulticoreParam(2)) )
#> user system elapsed
-#> 419.966 10.410 216.121
+#> 470.737 12.785 244.024

Note that MulticoreParam() doesn’t work on Windows; this vignette was built on Linux. Use SnowParam() or DoparParam() for Windows. See
diff --git a/dev/articles/vig5_xenium.html b/dev/articles/vig5_xenium.html
index dcabb8e5..8ff96ab2 100644
--- a/dev/articles/vig5_xenium.html
+++ b/dev/articles/vig5_xenium.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/vig5_xenium.Rmd
vig5_xenium.Rmd
@@ -882,7 +882,7 @@

Spatial autocorrelation of QC met
style = "W") )
#> user system elapsed
-#> 7.258 0.033 7.297
+#> 6.706 0.028 6.738
 sfe <- colDataMoransI(sfe, c("nCounts", "nGenes", "cell_area", "nucleus_area"),
                       colGraphName = "knn5")
@@ -946,7 +946,7 @@

Moran’s I
sfe <- runMoransI(sfe, colGraphName = "knn5", BPPARAM = MulticoreParam(2)) )
#> user system elapsed
-#> 35.226 4.670 21.965
+#> 38.816 5.136 24.308
 rowData(sfe)$is_neg <- is_any_neg
diff --git a/dev/articles/vig6_merfish.html b/dev/articles/vig6_merfish.html
index be6d34b2..b4ddeb99 100644
--- a/dev/articles/vig6_merfish.html
+++ b/dev/articles/vig6_merfish.html
@@ -117,7 +117,7 @@ 
                         

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/vig6_merfish.Rmd
vig6_merfish.Rmd
@@ -238,7 +238,7 @@

Quality control
)

#>    user  system elapsed 
-#>  25.775   2.265  28.125
+#> 19.499 0.900 20.442

Here nCounts kind of looks like salt and pepper. Using the scattermore package can speed up plotting a large number of points. In this non-interactive plot, the cell polygons are too small to see anyway, so
@@ -250,7 +250,7 @@

Quality control
})

#>    user  system elapsed 
-#>   1.468   0.369   1.838
+#> 1.625 0.194 1.823

When run on our server, plotting almost 400,000 polygons took around 23 seconds, while using geom_scattermore() (scattermore = TRUE) took about 2 seconds. Since
@@ -610,7 +610,7 @@

Spatial autocorrelation of QC met
style = "W") )
#> user system elapsed
-#> 56.014 0.818 57.663
+#> 39.365 0.177 39.630

With the spatial neighborhood graph, we can compute Moran’s I for QC metrics.

@@ -654,7 +654,7 @@ 

Moran’s I
sfe <- runMoransI(sfe, BPPARAM = MulticoreParam(2)) )
#> user system elapsed
-#> 132.531 36.719 88.560
+#> 114.823 11.824 67.100

It’s actually not as slow as I thought for almost 400,000 cells. How are Moran’s I’s distributed for real genes and blank probes?

@@ -911,11 +911,11 @@ 

PCA for larger datasets
scale = TRUE, BSPARAM = IrlbaParam()) )
#> user system elapsed
-#> 22.968 1.253 24.254
+#> 21.014 1.319 22.367
gc()
#> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
-#> Ncells 16650376 889.3 32047497 1711.6 NA 32047497 1711.6
-#> Vcells 250335884 1910.0 490341574 3741.1 16384 490268204 3740.5
+#> Ncells 16650376 889.3 32047514 1711.6 NA 32047514 1711.6
+#> Vcells 250335884 1910.0 490341574 3741.1 16384 490265901 3740.5

That’s pretty quick for almost 400,000 cells, but there aren’t that many genes here. Use the elbow plot to see variance explained by each PC:

diff --git a/dev/articles/vig7_seqfish.html b/dev/articles/vig7_seqfish.html
index e977c492..f2700887 100644
--- a/dev/articles/vig7_seqfish.html
+++ b/dev/articles/vig7_seqfish.html
@@ -117,7 +117,7 @@

Kayla Jackson

-2024-05-02
+2024-05-03

Source: vignettes/vig7_seqfish.Rmd
vig7_seqfish.Rmd
diff --git a/dev/articles/vig8_codex.html b/dev/articles/vig8_codex.html
index 6c029c9a..6c10293c 100644
--- a/dev/articles/vig8_codex.html
+++ b/dev/articles/vig8_codex.html
@@ -117,7 +117,7 @@

Kayla Jackson

-2024-05-02
+2024-05-03

Source: vignettes/vig8_codex.Rmd
vig8_codex.Rmd
diff --git a/dev/articles/vig9_splitseq.html b/dev/articles/vig9_splitseq.html
index e01f52dc..df2b907f 100644
--- a/dev/articles/vig9_splitseq.html
+++ b/dev/articles/vig9_splitseq.html
@@ -117,7 +117,7 @@

Kayla Jackson and A. Sina Booeshaghi

-2024-05-02
+2024-05-03

Source: vignettes/vig9_splitseq.Rmd
vig9_splitseq.Rmd
diff --git a/dev/articles/visium_10x.html b/dev/articles/visium_10x.html
index 5c3e7926..e7fbbf39 100644
--- a/dev/articles/visium_10x.html
+++ b/dev/articles/visium_10x.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/visium_10x.Rmd
visium_10x.Rmd
diff --git a/dev/articles/visium_10x_spatial.html b/dev/articles/visium_10x_spatial.html
index 7c14cbc7..a8b7e908 100644
--- a/dev/articles/visium_10x_spatial.html
+++ b/dev/articles/visium_10x_spatial.html
@@ -117,7 +117,7 @@

Lambda Moses

-2024-05-02
+2024-05-03

Source: vignettes/visium_10x_spatial.Rmd
visium_10x_spatial.Rmd
diff --git a/dev/pkgdown.yml b/dev/pkgdown.yml
index 7ab152c7..742f04e7 100644
--- a/dev/pkgdown.yml
+++ b/dev/pkgdown.yml
@@ -60,7 +60,7 @@ articles:
visium_10x: visium_10x.html
visium_landing: visium_landing.html
xenium_landing: xenium_landing.html
-last_built: 2024-05-02T19:40Z
+last_built: 2024-05-03T19:37Z
urls:
reference: https://pachterlab.github.io/voyager/reference
article: https://pachterlab.github.io/voyager/articles
diff --git a/dev/reference/plotLocalResult-1.png b/dev/reference/plotLocalResult-1.png
index 85f6932e..3fcdf209 100644
Binary files a/dev/reference/plotLocalResult-1.png and b/dev/reference/plotLocalResult-1.png differ
diff --git a/dev/search.json b/dev/search.json
index 517e6f34..a36a6657 100644
--- a/dev/search.json
+++ b/dev/search.json
@@ -1 +1 @@
-[{"path":"https://pachterlab.github.io/voyager/dev/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"Artistic License 2.0","title":"Artistic License 2.0","text":"Copyright (c) 2000-2006, Perl Foundation. Everyone permitted copy distribute verbatim copies license document, changing allowed. Preamble ******** license establishes terms given free software Package may copied, modified, distributed, /redistributed. intent Copyright Holder maintains artistic control development Package still keeping Package available open source free software. always permitted make arrangements wholly outside license directly Copyright Holder given Package. terms license permit full use propose make Package, contact Copyright Holder seek different licensing arrangement. Definitions *********** “Copyright Holder” means individual(s) organization(s) named copyright notice entire Package. “Contributor” means party contributed code material Package, accordance Copyright Holder’s procedures. “” “” means person like copy, distribute, modify Package. “Package” means collection files distributed Copyright Holder, derivatives collection /files. 
given Package may consist either Standard Version, Modified Version. “Distribute” means providing copy Package making accessible anyone else, case company organization, others outside company organization. “Distributor Fee” means fee charge Distributing Package providing support Package another party. mean licensing fees. “Standard Version” refers Package modified, modified ways explicitly requested Copyright Holder. “Modified Version” means Package, changed, changes explicitly requested Copyright Holder. “Original License” means Artistic License Distributed Standard Version Package, current version may modified Perl Foundation future. “Source” form means source code, documentation source, configuration files Package. “Compiled” form means compiled bytecode, object code, binary, form resulting mechanical transformation translation Source form. Permission Use Modification Without Distribution ******************************************************** permitted use Standard Version create use Modified Versions purpose without restriction, provided Distribute Modified Version. Permissions Redistribution Standard Version ****************************************************** may Distribute verbatim copies Source form Standard Version Package medium without restriction, either gratis Distributor Fee, provided duplicate original copyright notices associated disclaimers. discretion, verbatim copies may may include Compiled form Package. may apply bug fixes, portability changes, modifications made available Copyright Holder. resulting Package still considered Standard Version, subject Original License. 
Distribution Modified Versions Package Source ********************************************************** may Distribute Modified Version Source (either gratis Distributor Fee, without Compiled form Modified Version) provided clearly document differs Standard Version, including, limited , documenting non-standard features, executables, modules, provided least ONE following: make Modified Version available Copyright Holder Standard Version, Original License, Copyright Holder may include modifications Standard Version. ensure installation Modified Version prevent user installing running Standard Version. addition, Modified Version must bear name different name Standard Version. allow anyone receives copy Modified Version make Source form Modified Version available others Original License license permits licensee freely copy, modify redistribute Modified Version using licensing terms apply copy licensee received, requires Source form Modified Version, works derived , made freely available license fees prohibited Distributor Fees allowed. Distribution Compiled Forms Standard Version Modified ****************************************************************** Versions without Source *************************** may Distribute Compiled forms Standard Version without Source, provided include complete instructions get Source Standard Version. instructions must valid time distribution. instructions, time carrying distribution, become invalid, must provide new instructions demand cease distribution. provide valid instructions cease distribution within thirty days become aware instructions invalid, forfeit rights license. may Distribute Modified Version Compiled form without Source, provided comply Section 4 respect Source Modified Version. Aggregating Linking Package ********************************** may aggregate Package (either Standard Version Modified Version) packages Distribute resulting aggregation provided charge licensing fee Package. 
Distributor Fees permitted, licensing fees components aggregation permitted. terms license apply use Distribution Standard Modified Versions included aggregation. permitted link Modified Standard Versions works, embed Package larger work , build stand-alone binary bytecode versions applications include Package, Distribute result without restriction, provided result expose direct interface Package. Items Considered Part Modified Version ******************************************************** Works (including, limited , modules scripts) merely extend make use Package, , , cause Package Modified Version. addition, works considered parts Package , subject terms license. General Provisions ****************** use, modification, distribution Standard Modified Versions governed Artistic License. using, modifying distributing Package, accept license. use, modify, distribute Package, accept license. Modified Version derived Modified Version made someone , nevertheless required ensure Modified Version complies requirements license. license grant right use trademark, service mark, tradename, logo Copyright Holder. license includes non-exclusive, worldwide, free--charge patent license make, made, use, offer sell, sell, import otherwise transfer Package respect patent claims licensable Copyright Holder necessarily infringed Package. institute patent litigation (including cross-claim counterclaim) party alleging Package constitutes direct contributory patent infringement, Artistic License shall terminate date litigation filed. Disclaimer Warranty: PACKAGE PROVIDED COPYRIGHT HOLDER CONTRIBUTORS “’ WITHOUT EXPRESS IMPLIED WARRANTIES. IMPLIED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE, NON-INFRINGEMENT DISCLAIMED EXTENT PERMITTED LOCAL LAW. 
UNLESS REQUIRED LAW, COPYRIGHT HOLDER CONTRIBUTOR LIABLE DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL DAMAGES ARISING WAY USE PACKAGE, EVEN ADVISED POSSIBILITY DAMAGE.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xatac_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell ATAC Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xatac_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell ATAC Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xcrispr_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell CRISPR Screening Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. 
process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xcrispr_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell CRISPR Screening Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xmultiome_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell Multiome Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xmultiome_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell Multiome Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. 
Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"Pros: Applicable frozen samples Capture nascent transcripts Sensible tissues multiple nuclei one cell Cons: Lose spatial information expensive less flexible open source technologies","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Bivariate spatial statistics","text":"Consider two variables correlated, say Pearson correlation 0.8. observations spatially referenced. locations observations can permuted without affecting Pearson correlation. purpose bivariate spatial statistics indicate correlation value (Pearson correlation), spatial autocorrelation co-patterning. 
One bivariate methods implemented Voyager cross variogram, shown variogram vignette. vignette demonstrates bivariate spatial statistics, use spatial neighborhood graph, mouse skeletal muscle Visium dataset. load packages used: list bivariate global methods can seen : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods(). QC performed another vignette, vignette plot QC metrics. image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(pheatmap) library(scico) theme_set(theme_bw()) listSFEMethods(variate = \"bi\", scope = \"global\") #> name description #> 1 lee Lee's bivariate statistic #> 2 lee.mc Lee's bivariate static with permutation testing #> 3 lee.test Lee's L test #> 4 cross_variogram Cross variogram #> 5 cross_variogram_map Cross variogram map (sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\") sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] sfe_tissue <- logNormCounts(sfe_tissue) colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"lees-l","dir":"Articles","previous_headings":"","what":"Lee’s L","title":"Bivariate spatial statistics","text":"Lee’s L (Lee 2001) developed relating Moran’s Pearson correlation, defined \\[ L_{X,Y} = \\frac{n}{\\sum_{=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{=1}^n \\left[ \\sum_{j=1}^n w_{ij} (x_j - \\bar{x}) \\right] \\left[ \\sum_{j=1}^n w_{ij} (y_j - \\bar{y}) \\right]}{\\sqrt{\\sum_{=1}^n (x_i - \\bar{x})^2}\\sqrt{\\sum_{=1}^n (y_i - \\bar{y})^2} }, \\] \\(n\\) number spots locations, \\(\\) \\(j\\) different locations, spots Visium context, \\(x\\) \\(y\\) variables values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood. 
compute Lee’s L top highly variagle genes (HVGs) dataset: bivariate global results can different formats (matrix Lee’s L lists many methods), results stored SFE object. gives spatially informed correlation matrix among genes, can plotted heatmap: coexpression blocks can seen. Note unlike Pearson correlation, diagonal 1, \\[ L_{X,X} = \\frac{\\sum_i (\\tilde x_i - \\bar x)^2}{\\sum_i (x_i - \\bar x)^2} = \\mathrm{SSS}_X, \\] approximated ratio variance spatially lagged \\(x\\) variance \\(x\\). spatial lag introduces smoothing, spatial lag reduced variance, making diagonal less 1. spatial smoothing scalar (SSS), Moran’s approximately Pearson correlation \\(X\\) spatially lagged \\(X\\) (\\(\\tilde X\\)) multiplied SSS: \\[ \\approx \\mathrm{SSS}_X \\cdot \\rho_{X, \\tilde X} \\] Similarly Lee’s L, shown (Lee 2001), \\[ L_{X, Y} = \\sqrt{\\mathrm{SSS}_X}\\sqrt{\\mathrm{SSS}_Y} \\cdot \\rho_{\\tilde X, \\tilde Y} \\] spatial clustering, variance less reduced spatial lag, leading larger SSS. Hence \\(X\\) \\(Y\\) spatially distributed like salt pepper strongly correlated, Lee’s L low lack spatial autocorrelation leads small SSS. Weighted correlation network analysis (WGCNA) (Langfelder Horvath 2008) time honored method find gene co-expression modules, can take correlation matrix. 
It would be interesting to apply WGCNA to the Lee’s L matrix to identify spatially informed gene co-expression modules.","code":"hvgs <- getTopHVGs(sfe_tissue, fdr.threshold = 0.01) res <- calculateBivariate(sfe_tissue, type = \"lee\", feature1 = hvgs) pal_rng <- getDivergeRange(res) pal <- scico(256, begin = pal_rng[1], end = pal_rng[2], palette = \"vik\") pheatmap(res, color = pal, show_rownames = FALSE, show_colnames = FALSE, cellwidth = 1, cellheight = 1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"local-lee","dir":"Articles","previous_headings":"","what":"Local Lee","title":"Bivariate spatial statistics","text":"Local Lee’s L (Lee 2001) is defined as \\[ L_i = \\frac{n\\left[ \\sum_{j=1}^n w_{ij} (x_j - \\bar{x}) \\right] \\left[ \\sum_{j=1}^n w_{ij} (y_j - \\bar{y}) \\right]}{\\sqrt{\\sum_{i=1}^n (x_i - \\bar{x})^2}\\sqrt{\\sum_{i=1}^n (y_i - \\bar{y})^2} } \\] Compare this to the global L in the previous section: local L omits the outer sum over locations \\(i\\). It is the contribution of each location to global L, and can show spatial heterogeneity in the relationship between two variables. The bivariate local methods in Voyager are listed here: Here we compute local L for two myofiber marker genes and one gene highly expressed in the injury site: Bivariate local results are stored in the localResults field, and the feature names are pairwise combinations of the features supplied. When only feature1 is specified, the bivariate method is applied to all pairwise combinations of feature1. For Lee’s L, both \\(L_{X,Y}\\) and \\(L_{Y,X}\\) are computed although they are the same. However, some bivariate methods are not symmetric (see the next section). In the next release (Bioconductor 3.18), we may introduce another argument to indicate whether the method is symmetric, so as not to compute both \\(L_{X,Y}\\) and \\(L_{Y,X}\\). First plot the three genes individually: Then plot the local L’s: We see regions where Myh1 and Myh2 are co-expressed, and that the myosins and Ftl1 are negatively correlated. 
Since \\(L_{X,X}\\) is also computed, we can plot the local SSS for the three genes: See how local SSS compares to local Moran’s I: The patterns are qualitatively the same, but local Moran’s I is negative in some heterogeneous regions, while SSS can’t be negative.","code":"listSFEMethods(\"bi\", \"local\") #> name description #> 1 locallee Local Lee's bivariate statistic #> 2 localmoran_bv Local bivariate Moran's I sfe_tissue <- runBivariate(sfe_tissue, \"locallee\", swap_rownames = \"symbol\", feature1 = c(\"Myh2\", \"Myh1\", \"Ftl1\")) localResultFeatures(sfe_tissue, \"locallee\") #> [1] \"Myh2__Myh2\" \"Myh1__Myh2\" \"Ftl1__Myh2\" \"Myh2__Myh1\" \"Myh1__Myh1\" #> [6] \"Ftl1__Myh1\" \"Myh2__Ftl1\" \"Myh1__Ftl1\" \"Ftl1__Ftl1\" plotSpatialFeature(sfe_tissue, c(\"Myh2\", \"Myh1\", \"Ftl1\"), swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4) plotLocalResult(sfe_tissue, \"locallee\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\"), colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"locallee\", c(\"Myh1__Myh1\", \"Myh2__Myh2\", \"Ftl1__Ftl1\"), colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- runUnivariate(sfe_tissue, \"localmoran\", c(\"Myh2\", \"Myh1\", \"Ftl1\"), swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"localmoran\", c(\"Myh1\", \"Myh2\", \"Ftl1\"), colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"bivariate-local-moran","dir":"Articles","previous_headings":"","what":"Bivariate local Moran","title":"Bivariate spatial statistics","text":"The spdep package implements a bivariate version of local Moran, which is basically \\[ I_{X_i,Y_i} = (n-1)\\frac{(x_i - \\bar{x})\\sum_{j=1}^n w_{ij} (y_j - \\bar{y})}{\\sqrt{\\sum_{i=1}^n (x_i - \\bar{x})^2} \\sqrt{\\sum_{i=1}^n (y_i - \\bar{y})^2}}. \\] Note that this is not symmetric, i.e. 
\\(I_{X_i,Y_i} \\neq I_{Y_i,X_i}\\). Permutation testing is performed to get a pseudo p-value. First plot the bivariate local Moran’s I values; the first row of plots is XY and the second row is YX; note that they are similar, but not the same. What does bivariate local Moran mean? It’s kind of like the contribution of each location to the correlation between \\(x\\) and spatially lagged \\(y\\), while \\(x\\) itself is not smoothed. In contrast, Lee’s L is a scaled Pearson correlation between spatially lagged \\(x\\) and spatially lagged \\(y\\). Since permutation testing was performed, we can plot the pseudo-p-value, after correcting for multiple testing based on the spatial neighborhood graph: Note that the p-values are asymmetric because, according to the source code of localmoran_bv(), only \\(y\\) is permuted, not \\(x\\). It’s also related to Wartenberg’s spatial PCA (Wartenberg 1985), where Moran’s I is expressed in matrix form: \\[ \\mathbf{I} = \\frac{\\mathbf{Z}^T\\mathbf{WZ}}{\\mathbf 1^T \\mathbf{W1}}, \\] where \\(\\mathbf Z\\) is the data matrix with scaled and centered variables in columns, \\(\\mathbf W\\) is the spatial weights matrix, and \\(\\mathbf 1\\) is a vector of 1’s, so the denominator is in effect \\(\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}\\). The diagonal entries are Moran’s I’s of the variables, and the off-diagonal entries are global versions computed from the sum of bivariate local Moran’s I’s divided by the sum of all spatial weights. Because \\(\\mathbf W\\) doesn’t have to be symmetric, this matrix may not be symmetric. Wartenberg diagonalized this matrix in place of the covariance matrix for spatial PCA. When using scaled and centered data and a row normalized spatial weights matrix, MULTISPATI PCA is equivalent to Wartenberg’s approach (Dray, Saïd, and Débias 2008). Lee considered the asymmetry an inadequacy of Wartenberg’s approach as a bivariate association measure (Lee 2001). 
’m sure bivariate local Moran’s helps data analysis, interesting piece history.","code":"sfe_tissue <- runBivariate(sfe_tissue, \"localmoran_bv\", c(\"Myh1\", \"Myh2\", \"Ftl1\"), swap_rownames = \"symbol\", nsim = 1000) localResultFeatures(sfe_tissue, \"localmoran_bv\") #> [1] \"Myh1__Myh1\" \"Myh2__Myh1\" \"Ftl1__Myh1\" \"Myh1__Myh2\" \"Myh2__Myh2\" #> [6] \"Ftl1__Myh2\" \"Myh1__Ftl1\" \"Myh2__Ftl1\" \"Ftl1__Ftl1\" localResultAttrs(sfe_tissue, \"localmoran_bv\", \"Myh1__Myh2\") #> [1] \"Ibvi\" \"E.Ibvi\" \"Var.Ibvi\" #> [4] \"Z.Ibvi\" \"Pr(z != E(Ibvi))\" \"Pr(z != E(Ibvi)) Sim\" #> [7] \"Pr(folded) Sim\" \"-log10p Sim\" \"-log10p_adj Sim\" plotLocalResult(sfe_tissue, \"localmoran_bv\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\", \"Myh2__Myh1\", \"Ftl1__Myh2\", \"Ftl1__Myh1\"), colGeometryName = \"spotPoly\", attribute = \"Ibvi\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"localmoran_bv\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\", \"Myh2__Myh1\", \"Ftl1__Myh2\", \"Ftl1__Myh1\"), colGeometryName = \"spotPoly\", attribute = \"-log10p_adj Sim\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Bivariate spatial statistics","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics 
grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] scico_1.5.0 pheatmap_1.0.12 #> [3] scran_1.30.2 scater_1.30.1 #> [5] ggplot2_3.5.1 scuttle_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] SpatialFeatureExperiment_1.3.0 SFEData_1.4.0 #> [19] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 sf_1.0-16 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 httpuv_1.6.15 #> [17] sp_2.1-4 DBI_1.2.2 #> [19] RColorBrewer_1.1-3 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] dqrng_0.3.2 pkgdown_2.0.9 #> [33] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [35] DelayedArray_0.28.0 tidyselect_1.2.1 #> [37] farver_2.1.1 ScaledMatrix_1.10.0 #> [39] viridis_0.6.5 BiocFileCache_2.10.2 #> [41] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [43] e1071_1.7-14 systemfonts_1.0.6 #> [45] dbscan_1.1-12 tools_4.3.3 #> [47] ggnewscale_0.4.10 ragg_1.3.0 #> [49] Rcpp_1.0.12 glue_1.7.0 #> [51] gridExtra_2.3 SparseArray_1.2.4 #> [53] xfun_0.43 dplyr_1.1.4 #> [55] HDF5Array_1.30.1 withr_3.0.0 #> [57] BiocManager_1.30.22 fastmap_1.1.1 #> [59] boot_1.3-30 rhdf5filters_1.14.1 #> [61] bluster_1.12.0 fansi_1.0.6 #> [63] spData_2.3.0 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] RSQLite_2.3.6 utf8_1.2.4 #> [73] generics_0.1.3 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 spdep_1.3-3 #> [79] 
pkgconfig_2.0.3 gtable_0.3.5 #> [81] blob_1.2.4 XVector_0.42.0 #> [83] htmltools_0.5.8.1 scales_1.3.0 #> [85] png_0.1-8 SpatialExperiment_1.12.0 #> [87] knitr_1.45 rjson_0.2.21 #> [89] curl_5.2.1 proxy_0.4-27 #> [91] cachem_1.0.8 rhdf5_2.46.1 #> [93] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [95] parallel_4.3.3 vipor_0.4.7 #> [97] AnnotationDbi_1.64.1 desc_1.4.3 #> [99] s2_1.1.6 pillar_1.9.0 #> [101] grid_4.3.3 vctrs_0.6.5 #> [103] promises_1.3.0 BiocSingular_1.18.0 #> [105] dbplyr_2.5.0 beachmat_2.18.1 #> [107] xtable_1.8-4 cluster_2.1.6 #> [109] beeswarm_0.4.0 evaluate_0.23 #> [111] magick_2.8.3 cli_3.6.2 #> [113] locfit_1.5-9.9 compiler_4.3.3 #> [115] rlang_1.1.3 crayon_1.5.2 #> [117] labeling_0.4.3 classInt_0.4-10 #> [119] fs_1.6.4 ggbeeswarm_0.7.2 #> [121] viridisLite_0.4.2 deldir_2.0-4 #> [123] BiocParallel_1.36.0 munsell_0.5.1 #> [125] Biostrings_2.70.3 Matrix_1.6-5 #> [127] ExperimentHub_2.10.0 patchwork_1.2.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] interactiveDisplayBase_1.40.0 highr_0.10 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with Voyager","text":"Pros: Widely used, tested many cell types tissues Many existing datasets, including many 10X website High throughput, applied atlases millions cells Cons: Lose spatial information expensive open source technologies Less flexible tissues skeletal muscles challenging dissociate","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with 
Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"PhenoCycler Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution Formalin fixed, paraffin embedded (FFPE) tissue compatible Cons: Requires panels proteins usually dozens antibodies, standard highly multiplexed immunofluorescence. Akoya sells curated panels.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"PhenoCycler Processing Workflows with Voyager","text":"Several CODEX datasets generated HuBMAP Consortium available download data portal. Raw processed data typically available several fields view can readily combined single SpatialFeatureExperiment(SFE) object. 
tutorial processing output various spatial transcriptomics technologies SFE object use Voyager available vignette linked .","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"PhenoCycler Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated CODEX technology.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"CosMX Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible histological staining including DAPI 100 proteins can quantified CosMX along side RNAs Cons: curated panel usually hundred genes required. However, Nanostring provides curated gene panels common applications oncology, neuroscience, immunology, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"CosMX Processing Workflows with Voyager","text":"Nanostring released CosMX FFPE dataset website. tutorial processing output various spatial transcriptomics technologies, including CosMX, SpatialFeatureExperiment(SFE) object use Voyager available . vignette provides technology specific notes data downloaded Nanostring. 
Nanostring provides cell segmentation data images, cell centroid coordinates provided metadata.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"CosMX Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated CosMX SMI. publicly available CosMX dataset profiles 960 genes across 8 non-small-cell lung cancer (NSCLC) samples.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"visium-space-ranger-output","dir":"Articles","previous_headings":"","what":"Visium Space Ranger output","title":"Create a SpatialFeatureExperiment object","text":"10x Genomics Space Ranger output Visium experiment can read similar manner SpatialExperiment; SpatialFeatureExperiment SFE object spotPoly column geometry spot polygons. filtered matrix (.e. spots tissue) read , column graph called visium also present spatial neighborhood graph Visium spots tissue. graph computed spots read regardless whether tissue. results tissue capture outs directory. Inside outs directory two directories: raw_feature_bc_matrix unfiltered gene count matrix, spatial spatial information. DropletUtils package function read10xCounts() reads gene count matrix. SPE reads spatial information, SFE uses spatial information construct Visium spot polygons spatial neighborhood graphs. Inside spatial directory: tissue_lowres_image.png low resolution image tissue. Inside scalefactors_json.json file: spot_diameter_fullres diameter Visium spot full resolution H&E image pixels. tissue_hires_scalef tissue_lowres_scalef ratio size high resolution (full resolution) low resolution H&E image full resolution image. fiducial_diameter_fullres diameter fiducial spot used align spots H&E image pixels full resolution image. 
tissue_positions_list.csv file contains information spatial coordinates spots whether spot tissue automatically detected Space Ranger manually annotated Loupe browser. polygon tissue boundary available, whether image processing manual annotation, geometric operations supported SFE package, based sf package, can used find spots intersect tissue spots contained tissue. Geometric operations can also find polygons intersections spots tissue, results can get messy since intersections can polygons also points lines. Now read toy data Space Ranger output format. Since Bioconductor version 3.17 (Voyager version 1.2.0), image read SpatRaster object terra package, loaded memory unless necessary. plotting large image, downsampled thus fully loaded memory. unit can set unit argument, can either pixels full resolution image microns. latter calculated former based spacing spots, known 100 microns. Space Ranger output includes gene count matrix, spot coordinates, spot diameter. Space Ranger output include nuclei segmentation pathologist annotation histological regions. 
Extra image processing, ImageJ QuPath, required geometries.","code":"# Example from SpatialExperiment dir <- system.file( file.path(\"extdata\", \"10xVisium\"), package = \"SpatialExperiment\") sample_ids <- c(\"section1\", \"section2\") (samples <- file.path(dir, sample_ids, \"outs\")) #> [1] \"/Users/runner/work/_temp/Library/SpatialExperiment/extdata/10xVisium/section1/outs\" #> [2] \"/Users/runner/work/_temp/Library/SpatialExperiment/extdata/10xVisium/section2/outs\" list.files(samples[1]) #> [1] \"raw_feature_bc_matrix\" \"spatial\" list.files(file.path(samples[1], \"spatial\")) #> [1] \"scalefactors_json.json\" \"tissue_lowres_image.png\" #> [3] \"tissue_positions_list.csv\" fromJSON(file = file.path(samples[1], \"spatial\", \"scalefactors_json.json\")) #> $spot_diameter_fullres #> [1] 89.44476 #> #> $tissue_hires_scalef #> [1] 0.1701114 #> #> $fiducial_diameter_fullres #> [1] 144.4877 #> #> $tissue_lowres_scalef #> [1] 0.05103343 (sfe3 <- read10xVisiumSFE(samples, dirs = samples, sample_id = sample_ids, type = \"sparse\", data = \"raw\", images = \"lowres\", unit = \"full_res_image_pixel\")) #> class: SpatialFeatureExperiment #> dim: 50 99 #> metadata(0): #> assays(1): counts #> rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ... #> ENSMUSG00000005886 ENSMUSG00000101476 #> rowData names(1): symbol #> colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... 
#> AAAGTCGACCCTCAGT-1-1 AAAGTGCCATCAATTA-1-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> section1: #> section2:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"vizgen-merfish-output","dir":"Articles","previous_headings":"Visium Space Ranger output","what":"Vizgen MERFISH output","title":"Create a SpatialFeatureExperiment object","text":"commercialized MERFISH Vizgen standard output format, can read SFE readVizgen(). cell segmentation field view (FOV) separate HDF5 file MERFISH dataset can hundreds FOVs, strongly recommend reading MERFISH output server large number CPU cores. Alternatively, MERFISH datasets store cell segmentation parquet file, can easily read R. read toy dataset first FOV real dataset: unit always microns.","code":"dir_use <- system.file(file.path(\"extdata\", \"vizgen\"), package = \"SpatialFeatureExperiment\") (sfe_mer <- readVizgen(dir_use, z = 0L, image = \"PolyT\", use_cellpose = FALSE)) #> class: SpatialFeatureExperiment #> dim: 385 100 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(0): #> colnames(100): 103327291694389284070574461648020091166 #> 105028411815552368766949841604861213395 ... #> 99103300832376657987379734140330816574 #> 99471994882184799235845481075474519252 #> colData names(7): fov volume ... 
max_y sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: micron #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"create-sfe-object-from-scratch","dir":"Articles","previous_headings":"","what":"Create SFE object from scratch","title":"Create a SpatialFeatureExperiment object","text":"SFE object can constructed scratch assay matrices metadata. toy example, dgCMatrix used, since SFE inherits SingleCellExperiment (SCE), types arrays supported SCE delayed arrays also work. sufficient create SPE object, SFE object, even though sf data frame constructed geometries. constructor behaves similarly SPE constructor. centroid coordinates Visium spots example can converted spot polygons spotDiameter argument, can also relevant technologies round spots beads, Slide-seq. Spot diameter pixels full resolution images can found scalefactors_json.json file Space Ranger output. geometries spatial graphs can added calling constructor. 
Geometries can also supplied constructor.","code":"# Visium barcode location from Space Ranger data(\"visium_row_col\") coords1 <- visium_row_col[visium_row_col$col < 6 & visium_row_col$row < 6,] coords1$row <- coords1$row * sqrt(3) # Random toy sparse matrix set.seed(29) col_inds <- sample(1:13, 13) row_inds <- sample(1:5, 13, replace = TRUE) values <- sample(1:5, 13, replace = TRUE) mat <- sparseMatrix(i = row_inds, j = col_inds, x = values) colnames(mat) <- coords1$barcode rownames(mat) <- sample(LETTERS, 5) sfe3 <- SpatialFeatureExperiment(list(counts = mat), colData = coords1, spatialCoordsNames = c(\"col\", \"row\"), spotDiameter = 0.7) # Convert regular data frame with coordinates to sf data frame cg <- df2sf(coords1[,c(\"col\", \"row\")], c(\"col\", \"row\"), spotDiameter = 0.7) rownames(cg) <- colnames(mat) sfe3 <- SpatialFeatureExperiment(list(counts = mat), colGeometries = list(foo = cg))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"technology-specific-notes","dir":"Articles","previous_headings":"Create SFE object from scratch","what":"Technology specific notes","title":"Create a SpatialFeatureExperiment object","text":"commercial technologies function directly read outputs. may implement functions next version SpatialFeatureExperiment. now show example code read output CosMX Xenium.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"gene-count-matrix-and-cell-metadata","dir":"Articles","previous_headings":"Create SFE object from scratch > Technology specific notes","what":"Gene count matrix and cell metadata","title":"Create a SpatialFeatureExperiment object","text":"gene count matrix cell metadata (including cell centroid coordinates) example datasets technologies CosMX Vizgen CSV files. recommend vroom package quickly read large CSV files. CSV files read data frames. gene count matrix, can converted matrix sparse dgCMatrix. 
matrix may need transposed genes rows cells columns. smFISH based data tend less sparse scRNA-seq data, using sparse matrix worthwhile since matrix still 50% zero. 10x Genomics’ new single cell resolution technology Xenium, gene count matrix h5 file, can read R SCE object DropletUtils::read10xCounts(). can converted SpatialExperiment, SpatialFeatureExperiment. gene count matrix DelayedArray, data loaded memory operations matrix performed chunks. DelayedArray converted dgCMatrix memory. cell metadata available CSV format, ’s also parquet format compact disk, can read R data frame arrow package. Example code:","code":"# Download data from https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast system(\"curl -O https://cf.10xgenomics.com/samples/xenium/1.0.1/Xenium_FFPE_Human_Breast_Cancer_Rep1/Xenium_FFPE_Human_Breast_Cancer_Rep1_outs.zip\") system(\"unzip Xenium_FFPE_Human_Breast_Cancer_Rep1_outs.zip\") system(\"mv outs outs_R1\") system(\"curl -O https://cf.10xgenomics.com/samples/xenium/1.0.1/Xenium_FFPE_Human_Breast_Cancer_Rep2/Xenium_FFPE_Human_Breast_Cancer_Rep2_outs.zip\") system(\"unzip Xenium_FFPE_Human_Breast_Cancer_Rep2_outs.zip\") system(\"mv outs outs_R2\") library(SpatialExperiment) library(DropletUtils) #library(arrow) sce <- read10xCounts(\"outs_R1/cell_feature_matrix.h5\") cell_info <- read_parquet(\"outs_R1/cells.parquet\") # Add the centroid coordinates to colData colData(sce) <- cbind(colData(sce), cell_info[,-1]) spe <- toSpatialExperiment(sce, spatialCoordsNames = c(\"x_centroid\", \"y_centroid\")) sfe <- toSpatialFeatureExperiment(spe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"cell-polygons","dir":"Articles","previous_headings":"Create SFE object from scratch > Technology specific notes","what":"Cell polygons","title":"Create a SpatialFeatureExperiment object","text":"File format cell polygons (available) different formats different technology. 
cell polygons sf data frames put colGeometries() SFE object. section explains number smFISH-based technologies. Xenium, cell polygons come CSV parquet files can directly read R data frame, 2 columns x y coordinates, one indicating cell coordinates belong . Change name cell ID column “ID”, use SpatialFeatureExperiment::df2sf() convert data frame sf data frame POLYGON geometry. Example code: CosMX, cell polygons CSV files. Besides two coordinates columns, ’s column field view (FOV) another cell ID. However, unlike Xenium, cell IDs unique FOV, concatenated FOV make unique. df2sf() can also used convert regular data frame sf. Example code: See code used construct example datasets SFEData examples. Use sf::st_is_valid() check polygons valid. Polygons self-intersection valid, throw error geometric operations. common reason polygons invalid protruding line, can eliminated sf::st_buffer(cell_sf, dist = 0). Use sf::st_is_valid(cell_sf, reason = TRUE), plot invalid polygons, find polygons valid.","code":"#library(arrow) cell_poly <- read_parquet(\"outs_R2/cell_boundaries.parquet\") # Here the first column is cell ID names(cell_poly)[1] <- \"ID\" # \"vertex_x\" and \"vertex_y\" are the column names for coordinates here cell_sf <- df2sf(cell_poly, c(\"vertex_x\", \"vertex_y\"), geometryType = \"POLYGON\") # Download data from https://nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/ # stored here: https://www.dropbox.com/s/hl3peavrx92bluy/Lung5_Rep1-polygons.csv?dl=0 system(\"wget https://www.dropbox.com/s/hl3peavrx92bluy/Lung5_Rep1-polygons.csv?dl=1\") system(\"mv Lung5_Rep1-polygons.csv?dl=1 Lung5_Rep1-polygons.csv\") library(vroom) library(tidyr) cell_poly <- vroom(\"Lung5_Rep1-polygons.csv\") cell_poly <- cell_poly |> unite(\"ID\", fov:cellID) cell_sf <- df2sf(cell_poly, spatialCoordsNames = c(\"x_global_px\", \"y_global_px\"), geometryType = 
\"POLYGON\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Create a SpatialFeatureExperiment object","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages: #> [1] Matrix_1.6-5 rjson_0.2.21 #> [3] SpatialFeatureExperiment_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] DBI_1.2.2 bitops_1.0-7 #> [3] deldir_2.0-4 s2_1.1.6 #> [5] rlang_1.1.3 magrittr_2.0.3 #> [7] matrixStats_1.3.0 e1071_1.7-14 #> [9] compiler_4.3.3 DelayedMatrixStats_1.24.0 #> [11] systemfonts_1.0.6 vctrs_0.6.5 #> [13] pkgconfig_2.0.3 SpatialExperiment_1.12.0 #> [15] wk_0.9.1 crayon_1.5.2 #> [17] fastmap_1.1.1 magick_2.8.3 #> [19] XVector_0.42.0 scuttle_1.12.0 #> [21] utf8_1.2.4 rmarkdown_2.26 #> [23] tzdb_0.4.0 ragg_1.3.0 #> [25] bit_4.0.5 purrr_1.0.2 #> [27] xfun_0.43 bluster_1.12.0 #> [29] beachmat_2.18.1 zlibbioc_1.48.2 #> [31] cachem_1.0.8 GenomeInfoDb_1.38.8 #> [33] jsonlite_1.8.8 rhdf5filters_1.14.1 #> [35] DelayedArray_0.28.0 scico_1.5.0 #> [37] Rhdf5lib_1.24.2 BiocParallel_1.36.0 #> [39] terra_1.7-71 parallel_4.3.3 #> [41] cluster_2.1.6 R6_2.5.1 #> [43] bslib_0.7.0 limma_3.58.1 #> [45] boot_1.3-30 GenomicRanges_1.54.1 #> [47] jquerylib_0.1.4 Rcpp_1.0.12 #> [49] SummarizedExperiment_1.32.0 knitr_1.45 #> [51] R.utils_2.12.3 IRanges_2.36.0 #> [53] igraph_2.0.3 
tidyselect_1.2.1 #> [55] abind_1.4-5 yaml_2.3.8 #> [57] codetools_0.2-20 lattice_0.22-6 #> [59] tibble_3.2.1 Biobase_2.62.0 #> [61] evaluate_0.23 desc_1.4.3 #> [63] sf_1.0-16 units_0.8-5 #> [65] spData_2.3.0 proxy_0.4-27 #> [67] pillar_1.9.0 MatrixGenerics_1.14.0 #> [69] KernSmooth_2.23-22 stats4_4.3.3 #> [71] generics_0.1.3 vroom_1.6.5 #> [73] sp_2.1-4 RCurl_1.98-1.14 #> [75] S4Vectors_0.40.2 ggplot2_3.5.1 #> [77] sparseMatrixStats_1.14.0 munsell_0.5.1 #> [79] scales_1.3.0 class_7.3-22 #> [81] glue_1.7.0 tools_4.3.3 #> [83] ggnewscale_0.4.10 BiocNeighbors_1.20.2 #> [85] RSpectra_0.16-1 locfit_1.5-9.9 #> [87] fs_1.6.4 rhdf5_2.46.1 #> [89] grid_4.3.3 spdep_1.3-3 #> [91] DropletUtils_1.22.0 edgeR_4.0.16 #> [93] colorspace_2.1-0 SingleCellExperiment_1.24.0 #> [95] patchwork_1.2.0 GenomeInfoDbData_1.2.11 #> [97] HDF5Array_1.30.1 cli_3.6.2 #> [99] textshaping_0.3.7 fansi_1.0.6 #> [101] S4Arrays_1.2.1 dplyr_1.1.4 #> [103] gtable_0.3.5 R.methodsS3_1.8.2 #> [105] sass_0.4.9 digest_0.6.35 #> [107] BiocGenerics_0.48.1 classInt_0.4-10 #> [109] dqrng_0.3.2 SparseArray_1.2.4 #> [111] htmlwidgets_1.6.4 R.oo_1.26.0 #> [113] memoise_2.0.1 htmltools_0.5.8.1 #> [115] pkgdown_2.0.9 lifecycle_1.0.4 #> [117] statmod_1.5.0 bit64_4.0.5"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"downloading-the-data","dir":"Articles","previous_headings":"","what":"Downloading the data","title":"How to create a SpatialFeatureExperiment object","text":"data used recent publication, High Resolution Slide-seqV2 Spatial Transcriptomics Enables Discovery Disease-Specific Cell Neighborhoods Pathways available download GEO (Accession Number: GSE190094. demonstrate use ffq access FTP links downloading relevant data. download data single WT sample. commented line shows install ffq R terminal. output command metadata GSM5713341. can use curl wget download files FTP links one--one. Files beginning ftp:// can read directly R package vroom. Files uncompressed reading. 
files automatically downloaded uncompressed. use method , commented lines show download files using curl.","code":"# system(\"pip install ffq\") system(\"ffq -l1 GSM5713341\") # system(\"curl -O ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_MappedDGEForR.csv.gz\") # system(\"curl -O ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_BeadLocationsForR.csv.gz\") # list.files(pattern = \"*.gz\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"reading-in-the-data","dir":"Articles","previous_headings":"","what":"Reading in the data","title":"How to create a SpatialFeatureExperiment object","text":"","code":"mtx <- vroom(\"ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_MappedDGEForR.csv.gz\") centroids <- vroom(\"ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_BeadLocationsForR.csv.gz\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"construct-a-sfe-object","dir":"Articles","previous_headings":"","what":"Construct a SFE object","title":"How to create a SpatialFeatureExperiment object","text":"count matrix bead locations provided authors. pass constructor SpatialFeatureExperiment object. files read data frames. convert gene count matrix matrix sparse dgCMatrix. , spot locations provided CSV file. two columns particular interest, namely xcoord ycoord. barcode column corresponds barcodes count matrix. calling SpatialFeatureExperiment constructor, spatial coordinates must converted sf data frame using df2sf(). coordinates centroid positions, indicate geometryType=\"POINT\". Now ingredients create SFE object. 
values assays colGeometries arguments must passed list shown .","code":"# Note: if using Google Colab, this step might run out of RAM # If this happens, please upgrade to Colab Pro rn <- mtx$Row mtx <- as.matrix(mtx[,-1]) rownames(mtx) <- rn mtx <- as(mtx, \"dgCMatrix\") colnames(centroids)[1] <- \"ID\" centroids <- df2sf( centroids, geometryType = \"POINT\", spatialCoordsNames=c(\"xcoord\",\"ycoord\")) sfe <- SpatialFeatureExperiment( assays = list(counts = mtx), colGeometries = list(centroids = centroids) ) sfe"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"How to create a SpatialFeatureExperiment object","text":"","code":"sessionInfo()"},{"path":[]},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Multivariate local Geary's C","text":"Local Geary’s C (Anselin 1995) defined : \\[ c_i = \\sum_jw_{ij}(x_i - x_j)^2, \\] \\(w_{ij}\\)s spatial weights location \\(i\\) location \\(j\\) \\(x\\) variable spatial location. generalized multiple variables (Anselin 2019): \\[ c_{k,i} = \\sum_{v=1}^k c_{v,i}, \\] \\(k\\) variables. essentially spatially weighted sum squared distances locations feature space. vignette demonstrates usage multivariate local Geary’s C. load packages used: QC performed another vignette, vignette plot QC metrics. 
image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(spdep) theme_set(theme_bw()) (sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\") sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] sfe_tissue <- logNormCounts(sfe_tissue) colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"gene-expression","dir":"Articles","previous_headings":"","what":"Gene expression","title":"Multivariate local Geary's 
C","text":"compute multivariate local C top highly variable genes (HVGs) dataset: results stored reducedDim although ’s really dimension reduction. can also go colData dest = \"colData\". test two sided, alternative argument can set “greater” test positive spatial autocorrelation “less” negative spatial autocorrelation. Geary’s C, value 1 indicates positive spatial autocorrelation value 1 indicates negative spatial autocorrelation. Local Geary’s C scaled, square difference expression, low value means homogeneous neighborhood high value means heterogeneous neighborhood. considering 341 top HVGs, muscle tendon junction injury site heterogeneous, detected negative cluster. Permutation testing performed, although Anselin noted pseudo-p-values taken indicative interesting regions interpreted strict sense. Warm colors indicate adjusted p < 0.05. interpreted along clusters. dataset, interestingly homogeneous regions myofibers, interestingly heterogeneous region injury site. significant regions positive cluster, center injury site significant negative cluster.","code":"hvgs <- getTopHVGs(sfe_tissue, fdr.threshold = 0.01) sfe_tissue <- runMultivariate(sfe_tissue, \"localC_perm_multi\", subset_row = hvgs) names(reducedDim(sfe_tissue, \"localC_perm_multi\")) #> [1] \"localC_perm_multi\" \"E.Ci\" \"Var.Ci\" #> [4] \"Z.Ci\" \"Pr(z != E(Ci))\" \"Pr(z != E(Ci)) Sim\" #> [7] \"Pr(folded) Sim\" \"Skewness\" \"Kurtosis\" #> [10] \"-log10p Sim\" \"-log10p_adj Sim\" \"cluster\" spatialReducedDim(sfe_tissue, \"localC_perm_multi\", c(1, 12), image_id = \"lowres\", maxcell = 5e4) spatialReducedDim(sfe_tissue, \"localC_perm_multi\", c(11, 12), image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"top-principal-components","dir":"Articles","previous_headings":"","what":"Top principal components","title":"Multivariate local Geary's C","text":"multivariate local Geary’s C 
spatially weighted sum squared distances locations feature space, ’s affected curse dimensionality used large number features, uniformly distributed data points higher dimensions become equidistant increasing number dimensions. However, real data uniformly distributed can much smaller effective dimension number features, many genes co-regulated. Anselin suggested using main principal components, issue curse dimensionality remains investigated. Furthermore, cosine Manhattan distances suggested mitigate curse dimensionality, wonder use instead Euclidean distance feature space multivariate local Geary’s C. perform multivariate local Geary’s C top PCs: percentage variance explained top 20 PCs? area seem significant permutation test larger HVGs, area considered negative clusters smaller. significant regions pretty much positive cluster. differences results anything curse dimensionality? Twenty dimensions can still exhibit curse dimensionality, 300 HVGs worse. lose lot information, including negative spatial autocorrelation, using 20 PCs?","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 20, scale = TRUE) ElbowPlot(sfe_tissue) sum(attr(reducedDim(sfe_tissue, \"PCA\"), \"percentVar\")) #> [1] 38.8627 out <- localC_perm(reducedDim(sfe_tissue, \"PCA\"), listw = colGraph(sfe_tissue, \"visium\")) out <- Voyager:::.localCpermmulti2df(out, nb = colGraph(sfe_tissue, \"visium\")$neighbours, p.adjust.method = \"BH\") reducedDim(sfe_tissue, \"localC_PCs\", withDimnames = FALSE) <- out spatialReducedDim(sfe_tissue, \"localC_PCs\", c(1, 12), image_id = \"lowres\", maxcell = 5e4) spatialReducedDim(sfe_tissue, \"localC_PCs\", c(11, 12), image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Multivariate local Geary's C","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> 
Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] spdep_1.3-3 sf_1.0-16 #> [3] spData_2.3.0 scran_1.30.2 #> [5] scater_1.30.1 ggplot2_3.5.1 #> [7] scuttle_1.12.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [19] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] DBI_1.2.2 RColorBrewer_1.1-3 #> [19] abind_1.4-5 zlibbioc_1.48.2 #> [21] purrr_1.0.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 dbscan_1.1-12 #> [45] tools_4.3.3 ggnewscale_0.4.10 #> [47] ragg_1.3.0 Rcpp_1.0.12 #> [49] glue_1.7.0 
gridExtra_2.3 #> [51] SparseArray_1.2.4 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 bluster_1.12.0 #> [61] fansi_1.0.6 digest_0.6.35 #> [63] rsvd_1.0.5 R6_2.5.1 #> [65] mime_0.12 textshaping_0.3.7 #> [67] colorspace_2.1-0 wk_0.9.1 #> [69] RSQLite_2.3.6 utf8_1.2.4 #> [71] generics_0.1.3 class_7.3-22 #> [73] httr_1.4.7 htmlwidgets_1.6.4 #> [75] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [77] scico_1.5.0 gtable_0.3.5 #> [79] blob_1.2.4 XVector_0.42.0 #> [81] htmltools_0.5.8.1 scales_1.3.0 #> [83] png_0.1-8 SpatialExperiment_1.12.0 #> [85] knitr_1.45 rjson_0.2.21 #> [87] curl_5.2.1 proxy_0.4-27 #> [89] cachem_1.0.8 rhdf5_2.46.1 #> [91] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [93] parallel_4.3.3 vipor_0.4.7 #> [95] AnnotationDbi_1.64.1 desc_1.4.3 #> [97] s2_1.1.6 pillar_1.9.0 #> [99] grid_4.3.3 vctrs_0.6.5 #> [101] promises_1.3.0 BiocSingular_1.18.0 #> [103] dbplyr_2.5.0 beachmat_2.18.1 #> [105] xtable_1.8-4 cluster_2.1.6 #> [107] beeswarm_0.4.0 evaluate_0.23 #> [109] magick_2.8.3 cli_3.6.2 #> [111] locfit_1.5-9.9 compiler_4.3.3 #> [113] rlang_1.1.3 crayon_1.5.2 #> [115] labeling_0.4.3 classInt_0.4-10 #> [117] fs_1.6.4 ggbeeswarm_0.7.2 #> [119] viridisLite_0.4.2 deldir_2.0-4 #> [121] BiocParallel_1.36.0 munsell_0.5.1 #> [123] Biostrings_2.70.3 Matrix_1.6-5 #> [125] ExperimentHub_2.10.0 patchwork_1.2.0 #> [127] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [129] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [131] statmod_1.5.0 shiny_1.8.1.1 #> [133] interactiveDisplayBase_1.40.0 highr_0.10 #> [135] AnnotationHub_3.10.1 igraph_2.0.3 #> [137] memoise_2.0.1 bslib_0.7.0 #> [139] bit_4.0.5"},{"path":[]},{"path":[]},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"MERFISH Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell 
resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible histological staining including DAPI Protein co-detection supported Cons: curated panel usually hundred genes required. However, Vizgen provides curated gene panels neuroscience oncology, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"MERFISH Processing Workflows with Voyager","text":"Several MERFISH datasets generated MERSCOPE Platform publicly available Vizgen website. provide examples available processing output various spatial transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager vignette . vignette provides technology specific notes data downloaded Vizgen. Briefly, Vizgen provides cell metadata gene count matrix CSV files can read quickly vroom package. Cell segmentation data provided HDF5 files delineated field view.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"MERFISH Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated MERFISH technology. publicly available MERFISH datasets profile hundreds genes hundreds thousands millions cells. 
Thus, vignettes linked can provide context capabilities Voyager moderate large datasets.","code":""},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"Due large number genes quantified single cell spatial transcriptomics, dimension reduction part standard workflow analyze data, visualize, help interpreting data, distill relevant information reduce noise, facilitate downstream analyses clustering pseudotime, project different samples shared latent space data integration, . first dimension reduction methods learn , good old principal component analysis (PCA), tSNE, UMAP, don’t use spatial information. rise spatial transcriptomics, dimension reduction methods take spatial dependence account written. , SpatialPCA (Shang Zhou 2022), NSF (Townes Engelhardt 2023), MEFISTO (Velten et al. 2022) use factor analysis probabilistic PCA related factor analysis, model factors Gaussian processes, spatial kernel covariance matrix, factors positive spatial autocorrelation can used downstream clustering clusters can spatially coherent. use graph convolution networks spatial neighborhood graph find spatially informed embeddings cells, conST (Zong et al. 2022) SpaceFlow (Ren et al. 2022). SpaSRL (Zhang et al. 2023) finds low dimension projection spatial neighborhood augmented data. Spatially informed dimension reduction actually new, dates back least 1985, Wartenberg’s crossover Moran’s PCA (Wartenberg 1985), generalized developed MULTISPATI PCA (Dray, Saïd, Débias 2008), implemented adespatial package CRAN. short, PCA tries maximize variance explained PC, MULTISPATI maximizes product Moran’s variance explained. 
Also, eigenvalues PCA non-negative, covariance matrix positive semidefinite, MULTISPATI can give negative eigenvalues, represent negative spatial autocorrelation, can present interesting common positive spatial autocorrelation often masked latter (Griffith 2019). single cell -omics conventions, let \\(X\\) denote gene count matrix whose columns cells Visium spots whose rows genes, \\(n\\) columns. Let \\(W\\) denote row normalized \\(n\\times n\\) adjacency matrix spatial neighborhood graph cells Visium spots, symmetric. MULTISPATI diagonalizes symmetric matrix \\[ H = \\frac 1 {2n} X(W^t+W)X^t \\] However, implementation adespatial general can used multivariate analyses duality diagram paradigm, correspondence analysis; equation simplified just PCA, without introduce duality diagram . Voyager 1.2.0 (Bioconductor 3.17) much faster implementation MULTISPATI PCA based RSpectra. See benchmark . vignette, perform MULTISPATI PCA MERFISH mouse liver dataset. See first vignette using dataset . load packages used: MULTISPATI PCA one multivariate methods introduced Voyager 1.2.0. multivariate methods Voyager listed : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods().","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(scuttle) library(ggplot2) library(stringr) library(tidyr) library(tibble) library(bluster) library(BiocSingular) library(BiocParallel) library(sf) library(patchwork) theme_set(theme_bw()) (sfe <- VizgenLiverData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 385 395215 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... 
Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(395215): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(9): fov volume ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: listSFEMethods(variate = \"multi\") #> name description #> 1 multispati MULTISPATI PCA #> 2 localC_multi Multivariate local Geary's C #> 3 localC_perm_multi Multivariate local Geary's C permutation testing"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"QC already performed first vignette. QC , see first vignette details. Remove outliers empty cells: still 390,000 cells left removing outliers. Next compute Moran’s QC metrics, requires spatial neighborhood graph: Moran’s little negative, permutation testing, significant, though can also large number cells. lower bound Moran’s given spatial neighborhood graph usually closer -0.5 -1, upper bound usually around 1. bounds given specific spatial neighborhood graph can found moranBounds(), double centers adjacency matrix, hence making dense, isn’t enough memory use entire dataset. can look Moran bounds small subset data, might generalizable whole dataset given tissue appears quite homogeneous space. considering bounds, Moran’s values QC metrics like whose magnitudes seem substantial nCounts nGenes ’s positive spatial autocorrelation. 
may mild moderate negative spatial autocorrelation.","code":"is_blank <- str_detect(rownames(sfe), \"^Blank-\") sfe <- addPerCellQCMetrics(sfe, subset = list(blank = is_blank)) get_neg_ctrl_outliers <- function(col, sfe, nmads = 3, log = FALSE) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\", nmads = nmads, log = log) outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } sfe <- get_neg_ctrl_outliers(\"subsets_blank_percent\", sfe, log = TRUE) inds <- !sfe$is_blank_outlier & sfe$nCounts > 0 (sfe <- sfe[, inds]) #> class: SpatialFeatureExperiment #> dim: 385 390348 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(390348): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(16): fov volume ... 
total is_blank_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 34.506 0.224 34.753 features_use <- c(\"nCounts\", \"nGenes\", \"volume\") sfe <- colDataUnivariate(sfe, \"moran.mc\", features_use, colGraphName = \"knn5\", nsim = 49, BPPARAM = MulticoreParam(2)) plotMoranMC(sfe, features_use) bbox_use <- c(xmin = 6000, xmax = 7000, ymin = 4000, ymax = 5000) inds2 <- st_intersects(cellSeg(sfe), st_as_sfc(st_bbox(bbox_use)), sparse = FALSE)[,1] sfe_sub <- sfe[, inds2] (mb <- moranBounds(colGraph(sfe_sub, \"knn5\"))) #> Imin Imax #> -0.6079436 1.0608389 setNames(colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"volume\"), \"moran.mc_statistic_sample01\"] / mb[\"Imin\"], features_use) #> nCounts nGenes volume #> 0.17839356 0.15168017 0.03211427 # Normalize data sfe <- logNormCounts(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"hepatic-zonation","dir":"Articles","previous_headings":"","what":"Hepatic zonation","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"dataset comes relatively large piece tissue need zoom smaller region better see local structures. specify bounding box. portal triad shown near top right bounding box. two large vessels left bottom right central veins. portal triad consists hepatic artery, portal vein brings blood intestine, bile duct, ’s oxygenated. regions around central vein deoxygenated. different oxygen nutrient contents mean hepatocytes play different metabolic roles zones portal triad central vein. plot zonation marker genes (Halpern et al. 2017). 3 marker genes present dataset. 
first two pericentral (near central vein), last one periportal (near portal triad). Besides hepatocytes, liver also many endothelial cells Kupffer cells (macrophages). Marker genes cells (Bonnardel et al. 2019) plotted visualize cell types space: one Kupffer cell markers available dataset. Expression gene seem spatially coherent. 3 endothelial cell marker genes available dataset. Wnt2 seems pericentral, Ltbp4 Efnb2 seem periportal. marker genes show top PC loadings non-spatial spatial PCA.","code":"bbox_use <- c(xmin = 6100, xmax = 7100, ymin = 7500, ymax = 8500) markers <- c(\"Axin2\", \"Cyp1a2\", \"Gstm3\", \"Psmd4\", # Pericentral \"Cyp2e1\", \"Asl\", \"Alb\", \"Ass1\", # Monotonic but has intermediate \"Hamp\", \"Igfbp2\", \"Cyp8b1\", \"Mup3\", # Non-monotonic \"Arg1\", \"Pck1\", \"C2\", \"Sdhd\") # Periportal (inds <- which(markers %in% rownames(sfe))) #> [1] 1 2 14 plotSpatialFeature(sfe, markers[inds], colGeometryName = \"cellSeg\", ncol = 3, bbox = bbox_use) # Kupffer cells kc_genes <- c(\"Timd4\", \"Vsig4\", \"Clec4f\", \"Clec1b\", \"Il18bp\", \"C6\", \"Irf7\", \"Slc40a1\", \"Cdh5\", \"Nr1h3\", \"Dmpk\", \"Paqr9\", \"Pcolce2\", \"Kcna2\", \"Gbp8\", \"Iigp1\", \"Helz2\", \"Cd207\", \"Icos\", \"Adcy4\", \"Slc1a2\", \"Rsad2\", \"Slc16a9\", \"Cd209f\", \"Oasl1\", \"Fam167a\") which(kc_genes %in% rownames(sfe)) #> [1] 9 plotSpatialFeature(sfe, kc_genes[9], colGeometryName = \"cellSeg\", bbox = bbox_use) # Endothelial cells lec_genes <- c(\"Rspo3\", \"Wnt2\", \"Wnt9b\", \"Pcdhgc5\", \"Ecm1\", \"Ltbp4\", \"Efnb2\") (inds_lec <- which(lec_genes %in% rownames(sfe))) #> [1] 2 6 7 plotSpatialFeature(sfe, lec_genes[inds_lec], colGeometryName = \"cellSeg\", bbox = bbox_use, ncol = 3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"non-spatial-pca","dir":"Articles","previous_headings":"","what":"Non-spatial PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"First run non-spatial PCA, compare MULTISPATI. 
’s pretty quick almost 400,000 cells, aren’t many genes . Use elbow plot see variance explained PC: Plot top gene loadings PC Many genes seem related endothelium. PC1 PC4 concern Kupffer cells well, Kupffer cell marker gene Cdh5 high loading. Plot first 4 PCs space PC1 PC4 highlight major blood vessels, PC2 PC3 less spatial structure. CosMX Xenium datasets website, top PCs clear spatial structures despite absence spatial information non-spatial PCA clear spatial compartments cell types, seem case dataset except blood vessels. seen genes strong spatial structures. PC2 PC3 don’t seem large scale spatial structure, may local spatial structure obvious plotting entire section, zoom bounding box shows hepatic zonation. ’s spatial structure smaller scale, perhaps negative spatial autocorrelation.","code":"set.seed(29) system.time( sfe <- runPCA(sfe, ncomponents = 20, subset_row = !is_blank, exprs_values = \"logcounts\", scale = TRUE, BSPARAM = IrlbaParam()) ) #> user system elapsed #> 21.877 1.884 24.710 gc() #> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb) #> Ncells 16055274 857.5 25942897 1385.6 NA 25942897 1385.6 #> Vcells 239248882 1825.4 502470759 3833.6 16384 497137484 3792.9 ElbowPlot(sfe) plotDimLoadings(sfe) spatialReducedDim(sfe, \"PCA\", 4, colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0) spatialReducedDim(sfe, \"PCA\", ncomponents = 4, colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"multispati-pca","dir":"Articles","previous_headings":"","what":"MULTISPATI PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"plot positive negative eigenvalues. Note eigenvalues variance explained. Instead, product variance explained Moran’s . positive eigenvalues correspond eigenvectors simultaneously explain variance large positive Moran’s . 
negative eigenvalues correspond eigenvectors simultaneously explain variance negative Moran’s . positive eigenvalues drop sharply PC1 PC4, one negative eigenvalue might interesting, unsurprising given moderately negative Moran’s nCounts nGenes. However, first MERFISH vignette, none genes negative Moran’s . Perhaps negative eigenvalue comes negative spatial autocorrelation gene program “eigengene” obvious individual genes. beauty multivariate analysis. components mean? component linear combination genes maximize product variance explained Moran’s . second component maximizes product provided ’s orthogonal first component, . loss variance explained usually huge, components can considered axes along spatially coherent groups spots separated much possible according expression highly variable genes, theory, clustering positive MULTISPATI components give spatially coherent clusters. spatial coherence, MULTISPATI might robust outliers. gene loadings, PC40 seems separate endothelial cells Kupffer cells hepatocytes. Plot PCs: first two PCs pick zoning. PC3 seems smaller scale spatial structure. PC”40” (really 300 something) example negative spatial autocorrelation biology. Kupffer cells endothelial cells scattered among hepatocytes may play functional role. mean non-spatial PCA bad. MULTISPATI tends lose much variance explained per PC positive eigenvalues, identifies co-expressed genes spatially structured expression patterns. MULTISPATI tells different story non-spatial PCA. PCA cell embeddings often used downstream analysis. 
Whether use MULTISPATI embeddings instead many PCs use depend questions asked downstream analyses.","code":"system.time({ sfe <- runMultivariate(sfe, \"multispati\", colGraphName = \"knn5\", nfposi = 20, nfnega = 20) }) #> Warning in asMethod(object): sparse->dense coercion: allocating vector of size #> 1.1 GiB #> user system elapsed #> 185.093 15.141 211.931 ElbowPlot(sfe, nfnega = 20, reduction = \"multispati\") plotDimLoadings(sfe, dims = c(1:3, 40), reduction = \"multispati\") spatialReducedDim(sfe, \"multispati\", components = c(1:3, 40), colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"morans-i","dir":"Articles","previous_headings":"Spatial autocorrelation of principal components","what":"Moran’s I","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"compare Moran’s cell embeddings non-spatial MULTISPATI PC: MULTISPATI, Moran’s high PC1 PC2, sharply drops. Moran’s PC negative eigenvalues negative, means large magnitude eigenvalue comes explaining variance. However, considering lower bound Moran’s around -0.6 instead -1, magnitude Moran’s PC negative eigenvalue trivial. Non-spatial PCs sorted Moran’s ; PC5 surprising large Moran’s . PC5 must zonation. 
Also show larger scale:","code":"# non-spatial sfe <- reducedDimMoransI(sfe, dimred = \"PCA\", components = 1:20, BPPARAM = MulticoreParam(2)) # spatial sfe <- reducedDimMoransI(sfe, dimred = \"multispati\", components = 1:40, BPPARAM = MulticoreParam(2)) df_moran <- tibble(PCA = reducedDimFeatureData(sfe, \"PCA\")$moran_sample01[1:20], MULTISPATI_pos = reducedDimFeatureData(sfe, \"multispati\")$moran_sample01[1:20], MULTISPATI_neg = reducedDimFeatureData(sfe,\"multispati\")$moran_sample01[21:40] |> rev(), index = 1:20) data(\"ditto_colors\") df_moran |> pivot_longer(cols = -index, values_to = \"value\", names_to = \"name\") |> ggplot(aes(index, value, color = name)) + geom_line() + scale_color_manual(values = ditto_colors) + geom_hline(yintercept = 0, color = \"gray\") + geom_hline(yintercept = mb, linetype = 2, color = \"gray\") + scale_y_continuous(breaks = scales::breaks_pretty()) + scale_x_continuous(breaks = scales::breaks_width(5)) + labs(y = \"Moran's I\", color = \"Type\", x = \"Component\") min(df_moran$MULTISPATI_neg) / mb[1] #> Imin #> 0.1374483 spatialReducedDim(sfe, \"PCA\", component = 5, colGeometryName = \"cellSeg\", divergent = TRUE, diverge_center = 0, bbox = bbox_use) spatialReducedDim(sfe, \"PCA\", components = 5, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"moran-scatter-plot","dir":"Articles","previous_headings":"Spatial autocorrelation of principal components","what":"Moran scatter plot","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"Local positive negative spatial autocorrelation can average global Moran’s . zoomed plots gene loadings , PCs endothelial cells. Moran scatter plot can help discovering local heterogeneity. PCs 1-3 fainter clusters outside main cluster, indicating heterogeneous spatial autocorrelation. 
Also make Moran scatter plots MULTISPATI interesting clusters.","code":"sfe <- reducedDimUnivariate(sfe, \"moran.plot\", dimred = \"PCA\", components = 1:6) plts <- lapply(seq_len(6), function(i) { moranPlot(sfe, paste0(\"PC\", i), binned = TRUE, hex = TRUE, plot_influential = FALSE) }) wrap_plots(plts, widths = 1, heights = 1) + plot_layout(ncol = 3) + plot_annotation(tag_levels = \"1\", title = \"Moran scatter plot for non-spatial PCs\") & theme(legend.position = \"none\") sfe <- reducedDimUnivariate(sfe, \"moran.plot\", dimred = \"multispati\", components = c(1:5, 40), # Not to overwrite non-spatial PCA moran plots name = \"moran.plot2\") plts2 <- lapply(c(1:5, 40), function(i) { moranPlot(sfe, paste0(\"PC\", i), binned = TRUE, hex = TRUE, plot_influential = FALSE, name = \"moran.plot2\") }) wrap_plots(plts2, widths = 1, heights = 1) + plot_layout(ncol = 3) + plot_annotation(tag_levels = \"1\", title = \"Moran scatter plot for MULTISPATI PCs\") & theme(legend.position = \"none\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"clustering-with-multispati-pca","dir":"Articles","previous_headings":"","what":"Clustering with MULTISPATI PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"standard scRNA-seq data analysis workflow, k nearest neighbor graph found PCA space, used graph based clustering Louvain Leiden, used perform differential expression. Spatial dimension reductions can similarly used perform clustering, identify spatial regions tissue, done (Shang Zhou 2022; Ren et al. 2022; Zhang et al. 2023). type studies often use manual segmentation ground truth compare different methods identify spatial regions. problem spatial region methods meant help us identify novel spatial regions based new -omics data, might reveal ’s previously unknown manual annotations. output method doesn’t match manual annotations, might simply pointing previously unknown aspect tissue rather wrong. 
Depending questions asked, can simultaneously multiple spatial partitions. happens geographical space. instance, ’s land use neighborhood boundaries, equally valid watershed boundaries types rock formation. one relevant depends questions asked. perform Leiden clustering non-spatial MULTISPATI PCA compare results. k nearest neighbor graph, used default k = 10. See clustering positive MULTISPATI PCs give spatially coherent clusters Plot clusters space: MULTISPATI clusters look somewhat spatially structured clusters non-spatial PCA. Also zoom small area: clusters mean? Clusters supposed groups different spots similar within group, sharing characteristics. Non-spatial MULTISPATI PCA use different characteristics clustering. Non-spatial PCA finds genes good telling cell types apart, although genes may happen spatially structured. Non-spatial clustering aims find groups gene expression, cells similar gene expression can surrounded cells types histological space. just like mapping Art Deco buildings, often near Spanish revival Beaux Art buildings whose styles quite different perform different functions, thus necessarily forming coherent spatial region. contrast, MULTISPATI’s positive components find genes must characterize spatial regions addition distinguishing different cell types. genes involved MULTISPATI component may interesting clusters. interesting perform gene set enrichment analysis, interpret sort spatial patterns spatially variable genes. like mapping buildings built, Art Deco, Spanish revival, Beaux Art popular 1920s 1930s end cluster form spatially coherent region, can found DTLA Historical Core Jewelry District, Old Pasadena. Hence non-spatial clustering spatial data isn’t necessarily bad. 
Rather, tells different story reveals different aspects data spatial clustering.","code":"system.time({ set.seed(29) sfe$clusts_nonspatial <- clusterCells(sfe, use.dimred = \"PCA\", BLUSPARAM = NNGraphParam( cluster.fun = \"leiden\", cluster.args = list( objective_function = \"modularity\", resolution_parameter = 1 ) )) }) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 552.609 6.863 560.613 system.time({ set.seed(29) sfe$clusts_multispati <- clusterRows(reducedDim(sfe, \"multispati\")[,1:20], BLUSPARAM = NNGraphParam( cluster.fun = \"leiden\", cluster.args = list( objective_function = \"modularity\", resolution_parameter = 1 ) )) }) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 649.927 6.052 658.774 plotSpatialFeature(sfe, c(\"clusts_nonspatial\", \"clusts_multispati\"), colGeometryName = \"centroids\", scattermore = TRUE) & guides(colour = guide_legend(override.aes = list(size=2), ncol = 2)) plotSpatialFeature(sfe, c(\"clusts_nonspatial\", \"clusts_multispati\"), colGeometryName = \"cellSeg\", bbox = bbox_use) & guides(fill = guide_legend(ncol = 2))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] 
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] patchwork_1.2.0 sf_1.0-16 #> [3] BiocParallel_1.36.0 BiocSingular_1.18.0 #> [5] bluster_1.12.0 tibble_3.2.1 #> [7] tidyr_1.3.1 stringr_1.5.1 #> [9] scran_1.30.2 scater_1.30.1 #> [11] ggplot2_3.5.1 scuttle_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SpatialFeatureExperiment_1.3.0 SFEData_1.4.0 #> [25] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] RColorBrewer_1.1-3 DBI_1.2.2 #> [19] abind_1.4-5 zlibbioc_1.48.2 #> [21] purrr_1.0.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 tools_4.3.3 #> [45] ggnewscale_0.4.10 ragg_1.3.0 #> [47] Rcpp_1.0.12 glue_1.7.0 #> [49] gridExtra_2.3 SparseArray_1.2.4 #> [51] mgcv_1.9-1 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 fansi_1.0.6 #> [61] spData_2.3.0 digest_0.6.35 #> [63] 
rsvd_1.0.5 R6_2.5.1 #> [65] mime_0.12 textshaping_0.3.7 #> [67] colorspace_2.1-0 wk_0.9.1 #> [69] scattermore_1.2 RSQLite_2.3.6 #> [71] hexbin_1.28.3 utf8_1.2.4 #> [73] generics_0.1.3 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 spdep_1.3-3 #> [79] pkgconfig_2.0.3 scico_1.5.0 #> [81] gtable_0.3.5 blob_1.2.4 #> [83] XVector_0.42.0 htmltools_0.5.8.1 #> [85] scales_1.3.0 png_0.1-8 #> [87] SpatialExperiment_1.12.0 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 fs_1.6.4 #> [121] ggbeeswarm_0.7.2 stringi_1.8.3 #> [123] viridisLite_0.4.2 deldir_2.0-4 #> [125] munsell_0.5.1 Biostrings_2.70.3 #> [127] Matrix_1.6-5 ExperimentHub_2.10.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] highr_0.10 interactiveDisplayBase_1.40.0 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"areal spatial data, spatial neighborhood graph used indicate proximity, required spatial analysis methods package spdep. 
One methods find spatial neighborhood graph k nearest neighbors, also commonly used gene expression PCA space graph-based clustering cells non-spatial scRNA-seq data. use k nearest neighbors graph PCA space rather histological space “spatial” analyses non-spatial scRNA-seq data? try analysis human peripheral blood mononuclear cells (PBMC) scRNA-seq dataset, doesn’t originally histological spatial organization. packages loaded analysis: download filtered Cell Ranger gene count matrix 10X website. empty droplets already removed. loaded R SingleCellExperiment (SCE) object.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SpatialExperiment) library(DropletUtils) library(BiocNeighbors) library(scater) library(scran) library(bluster) library(BiocParallel) library(scuttle) library(stringr) library(BiocSingular) library(spdep) library(patchwork) library(dplyr) library(reticulate) theme_set(theme_bw()) # Specify Python version to use gget PY_PATH <- Sys.which(\"python\") use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function # Load gget gget <- import(\"gget\") if (!dir.exists(\"filtered_feature_bc_matrix\")) { download.file(\"https://cf.10xgenomics.com/samples/cell-exp/3.0.2/5k_pbmc_v3_nextgem/5k_pbmc_v3_nextgem_filtered_feature_bc_matrix.tar.gz\", destfile = \"5kpbmc.tar.gz\", quiet = TRUE) system(\"tar -xzf 5kpbmc.tar.gz\") } (sce <- read10xCounts(\"filtered_feature_bc_matrix/\")) #> class: SingleCellExperiment #> dim: 33538 5155 #> metadata(1): 
Samples #> assays(1): counts #> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 #> ENSG00000268674 #> rowData names(3): ID Symbol Type #> colnames: NULL #> colData names(2): Sample Barcode #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): colnames(sce) <- sce$Barcode"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"perform basic QC, remove low quality cells high proportion mitochondrially encoded counts. addPerCellQCMetrics() function computes total UMI counts detected per cell (sum), number genes detected per cell (detected), sum detected mitochondrial counts, percentage mitochondrial counts per cell. 2D histogram plotted better show point density plot. Remove cells >20% mitochondrial counts","code":"is_mito <- str_detect(rowData(sce)$Symbol, \"^MT-\") sum(is_mito) #> [1] 13 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"Sample\" \"Barcode\" \"sum\" #> [4] \"detected\" \"subsets_mito_sum\" \"subsets_mito_detected\" #> [7] \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) plotColData(sce, x = \"sum\", y = \"subsets_mito_percent\", bins = 100) sce <- sce[, sce$subsets_mito_percent < 20] sce <- sce[rowSums(counts(sce)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"basic-non-spatial-analyses","dir":"Articles","previous_headings":"","what":"Basic non-spatial analyses","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"normalize data, perform PCA, cluster cells, find marker genes clusters. Use highly variable genes PCA: many PCs shall use analyses? 
Variance explained drops sharply PC1 PC4 levels . Plot genes largest loadings top 4 PCs: keep little information, use 10 PCs, variance explained levels even . plot cells first 4 PCs matrix plot. diagonals density plots number cells projected PC. x axis correspond columns matrix plot, y axis correspond rows, plot row 1 column 2 PC2 x axis PC1 y axis. cells colored clusters found previous code chunk. many cells cluster? use conventional Wilcoxon rank sum test find marker genes cluster. test compares cluster rest cells, genes highly expressed cluster compared cells considered. result list data frames, data frame corresponds one cluster. Areas receiver operator curve (AUC), distinguishing cluster vs. cluster, also included. closer 1 better, 0.5 means better random guessing. false discovery rate (FDR) column contains Benjamini-Hochberg corrected p-values. Genes data frames already sorted p-values. See specific top markers cluster: can use gget info module gget package get additional information marker genes. 
example, NCBI description:","code":"#clusts <- quickCluster(sce) #sce <- computeSumFactors(sce, cluster = clusts) #sce <- sce[, sizeFactors(sce) > 0] sce <- logNormCounts(sce) dec <- modelGeneVar(sce, lowess = FALSE) hvgs <- getTopHVGs(dec, n = 2000) set.seed(29) sce <- runPCA(sce, ncomponents = 30, BSPARAM = IrlbaParam(), subset_row = hvgs, scale = TRUE) ElbowPlot(sce, ndims = 30) plotDimLoadings(sce, swap_rownames = \"Symbol\") sce$cluster <- clusterRows(reducedDim(sce, \"PCA\")[,1:10], BLUSPARAM = SNNGraphParam(cluster.fun = \"leiden\", k = 10, cluster.args = list( resolution=0.5, objective_function = \"modularity\" ))) plotPCA(sce, ncomponents = 4, color_by = \"cluster\") table(sce$cluster) #> #> 1 2 3 4 5 6 7 8 #> 1057 1029 1278 590 415 207 27 26 markers <- findMarkers(sce, groups = colData(sce)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[4]] #> DataFrame with 21932 rows and 10 columns #> p.value FDR summary.AUC AUC.1 AUC.2 #> #> ENSG00000105369 2.91320e-18 6.38923e-14 1.000000 0.999965 0.999489 #> ENSG00000007312 7.06943e-18 7.75234e-14 0.994068 0.990666 0.990071 #> ENSG00000104894 1.32243e-17 9.66782e-14 0.989896 0.994232 0.988447 #> ENSG00000196735 4.83219e-17 2.16009e-13 0.981063 0.998555 0.938579 #> ENSG00000156738 4.92451e-17 2.16009e-13 0.972316 0.999639 0.998674 #> ... ... ... ... ... ... #> ENSG00000184274 1 1 0.5 0.500000 0.5 #> ENSG00000273796 1 1 0.5 0.500000 0.5 #> ENSG00000274248 1 1 0.5 0.500000 0.5 #> ENSG00000160282 1 1 0.5 0.499527 0.5 #> ENSG00000228137 1 1 0.5 0.500000 0.5 #> AUC.3 AUC.5 AUC.6 AUC.7 AUC.8 #> #> ENSG00000105369 0.999977 0.999534 0.997208 0.999937 1.000000 #> ENSG00000007312 0.991763 0.987639 0.986535 0.989077 0.994068 #> ENSG00000104894 0.994081 0.992534 0.992201 0.998305 0.989896 #> ENSG00000196735 0.998990 0.996269 0.998657 0.996798 0.981063 #> ENSG00000156738 0.999930 0.995512 0.999664 0.972316 1.000000 #> ... ... ... ... ... ... 
#> ENSG00000184274 0.499609 0.5 0.500000 0.5 0.5 #> ENSG00000273796 0.499218 0.5 0.500000 0.5 0.5 #> ENSG00000274248 0.498826 0.5 0.500000 0.5 0.5 #> ENSG00000160282 0.499609 0.5 0.500000 0.5 0.5 #> ENSG00000228137 0.500000 0.5 0.497585 0.5 0.5 top_markers <- unlist(lapply(markers, function(x) head(rownames(x), 1))) top_markers_symbol <- rowData(sce)[top_markers, \"Symbol\"] plotExpression(sce, top_markers_symbol, x = \"cluster\", swap_rownames = \"Symbol\", point_fun = function(...) list()) gget_info <- gget$info(top_markers) rownames(gget_info) <- gget_info$ensembl_gene_name select(gget_info, ncbi_description) #> ncbi_description #> TRAC T cell receptors recognize foreign antigens which have been processed as small peptides and bound to major histocompatibility complex (MHC) molecules at the surface of antigen presenting cells (APC). Each T cell receptor is a dimer consisting of one alpha and one beta chain or one delta and one gamma chain. In a single cell, the T cell receptor loci are rearranged and expressed in the order delta, gamma, beta, and alpha. If both delta and gamma rearrangements produce functional chains, the cell expresses delta and gamma. If not, the cell proceeds to rearrange the beta and alpha loci. This region represents the germline organization of the T cell receptor alpha and delta loci. Both the alpha and delta loci include V (variable), J (joining), and C (constant) segments and the delta locus also includes diversity (D) segments. The delta locus is situated within the alpha locus, between the alpha V and J segments. During T cell development, the delta chain is synthesized by a recombination event at the DNA level joining a D segment with a J segment; a V segment is then joined to the D-J gene. The alpha chain is synthesized by recombination joining a single V segment with a J segment. For both chains, the C segment is later joined by splicing at the RNA level. 
Recombination of many different V segments with several J segments provides a wide range of antigen recognition. Additional diversity is attained by junctional diversity, resulting from the random additional of nucleotides by terminal deoxynucleotidyltransferase. Five variable segments can be used in either alpha or delta chains and are described by TRAV/DV symbols. Several V and J segments of the alpha locus are known to be incapable of encoding a protein and are considered pseudogenes. [provided by RefSeq, Aug 2016] #> MNDA The myeloid cell nuclear differentiation antigen (MNDA) is detected only in nuclei of cells of the granulocyte-monocyte lineage. A 200-amino acid region of human MNDA is strikingly similar to a region in the proteins encoded by a family of interferon-inducible mouse genes, designated Ifi-201, Ifi-202, and Ifi-203, that are not regulated in a cell- or tissue-specific fashion. The 1.8-kb MNDA mRNA, which contains an interferon-stimulated response element in the 5-prime untranslated region, was significantly upregulated in human monocytes exposed to interferon alpha. MNDA is located within 2,200 kb of FCER1A, APCS, CRP, and SPTA1. In its pattern of expression and/or regulation, MNDA resembles IFI16, suggesting that these genes participate in blood cell-specific responses to interferons. [provided by RefSeq, Jul 2008] #> RPL32 Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L32E family of ribosomal proteins. It is located in the cytoplasm. Although some studies have mapped this gene to 3q13.3-q21, it is believed to map to 3p25-p24. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. 
Alternatively spliced transcript variants encoding the same protein have been observed for this gene. [provided by RefSeq, Jul 2008] #> CD79A The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-alpha protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. [provided by RefSeq, Jul 2008] #> NKG7 Predicted to be integral component of plasma membrane. Predicted to be active in plasma membrane. [provided by Alliance of Genome Resources, Apr 2022] #> MALAT1 This gene produces a precursor transcript from which a long non-coding RNA is derived by RNase P cleavage of a tRNA-like small ncRNA (known as mascRNA) from its 3' end. The resultant mature transcript lacks a canonical poly(A) tail but is instead stabilized by a 3' triple helical structure. This transcript is retained in the nucleus where it is thought to form molecular scaffolds for ribonucleoprotein complexes. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer metastasis and cell migration, and it is involved in cell cycle regulation. Its upregulation in multiple cancerous tissues has been associated with the proliferation and metastasis of tumor cells. [provided by RefSeq, Mar 2015] #> CLU The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. 
Alternate splicing results in both coding and non-coding variants.[provided by RefSeq, May 2011] #> BCL11A This gene encodes a C2H2 type zinc-finger protein by its similarity to the mouse Bcl11a/Evi9 protein. The corresponding mouse gene is a common site of retroviral integration in myeloid leukemia, and may function as a leukemia disease gene, in part, through its interaction with BCL6. During hematopoietic cell differentiation, this gene is down-regulated. It is possibly involved in lymphoma pathogenesis since translocations associated with B-cell malignancies also deregulates its expression. Multiple transcript variants encoding several different isoforms have been found for this gene. [provided by RefSeq, Jul 2008]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"spatial-analyses-for-qc-metrics","dir":"Articles","previous_headings":"","what":"“Spatial” analyses for QC metrics","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"Find k nearest neighbor graph PCA space Moran’s : using spdep since nb2listwdist() function distance based edge weighting requires 2-3 dimensional spatial coordinates coordinates 10 dimensions. , inverse distance weighting used edge weights. histological space, convert SCE object SpatialFeatureExperiment (SFE) use spatial analysis plotting functions Voyager, pretend first 2 PCs histological space. 
Add k nearest neighbor graph SFE object:","code":"foo <- findKNN(reducedDim(sce, \"PCA\")[,1:10], k=10, BNPARAM=AnnoyParam()) # Split by row foo_nb <- asplit(foo$index, 1) dmat <- 1/foo$distance # Row normalize the weights dmat <- sweep(dmat, 1, rowSums(dmat), FUN = \"/\") glist <- asplit(dmat, 1) # Sort based on index ord <- lapply(foo_nb, order) foo_nb <- lapply(seq_along(foo_nb), function(i) foo_nb[[i]][ord[[i]]]) class(foo_nb) <- \"nb\" glist <- lapply(seq_along(glist), function(i) glist[[i]][ord[[i]]]) listw <- list(style = \"W\", neighbours = foo_nb, weights = glist) class(listw) <- \"listw\" attr(listw, \"region.id\") <- colnames(sce) (sfe <- toSpatialFeatureExperiment(sce, spatialCoords = reducedDim(sce, \"PCA\")[,1:2], spatialCoordsNames = NULL)) #> class: SpatialFeatureExperiment #> dim: 21932 4629 #> metadata(1): Samples #> assays(2): counts logcounts #> rownames(21932): ENSG00000238009 ENSG00000239945 ... ENSG00000275063 #> ENSG00000271254 #> rowData names(3): ID Symbol Type #> colnames(4629): AAACCCAAGACAGCTG-1 AAACCCAAGTTAACGA-1 ... #> TTTGTTGTCACGGACC-1 TTTGTTGTCCACACCT-1 #> colData names(11): Sample Barcode ... cluster sample_id #> reducedDimNames(1): PCA #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : PC1 PC2 #> imgData names(0): #> #> unit: #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: colGraph(sfe, \"knn10\") <- listw"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"morans-i","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"total UMI counts (sum) genes detected (detected), Moran’s quite strong, ’s positive weaker percentage mitochondrial counts. 
second column, K, kurtosis feature interest.","code":"sfe <- colDataMoransI(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\")) colFeatureData(sfe)[c(\"sum\", \"detected\", \"subsets_mito_percent\"),] #> DataFrame with 3 rows and 2 columns #> moran_sample01 K_sample01 #> #> sum 0.655173 16.44603 #> detected 0.750133 6.01002 #> subsets_mito_percent 0.438934 6.13555"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"moran-plot","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Moran plot","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"local variations k nearest neighbors graph? Moran plot, x axis value cell, y axis average value among neighboring cells graph weighted edge weights. slope fitted line Moran’s . Sometimes clusters plot, showing different kinds neighborhoods. dashed lines averages x y axes. cells cluster around average, cluster cells lower total counts whose neighbors also lower total counts. also cluster cells higher total counts whose neighbors also higher total counts. clusters seem somewhat related gene expression based clusters. one main cluster plot number genes detected percentage mitochondrial counts. However, cells somewhat separated gene expression clusters. surprising gene expression clusters also based k nearest neighbor graph. 
Cluster 4 cells higher percentage mitochondrial counts neighbors.","code":"sfe <- colDataUnivariate(sfe, \"moran.plot\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) moranPlot(sfe, \"sum\", color_by = \"cluster\") moranPlot(sfe, \"detected\", color_by = \"cluster\") moranPlot(sfe, \"subsets_mito_percent\", color_by = \"cluster\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-morans-i","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Local Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"Also see local Moran’s 3 QC metrics: , don’t histological space. can visualize local “spatial” statistics? UMAP bad, case PCA can somewhat separate clusters. can use first 2 PCs histological space. reference, plot metrics clusters first 2 PCs. Plot local Moran’s metrics first 2 PCs: However, good 2D representation data easy plotting? Remember k nearest neighbor graph computed first 10 PCs rather first 2 PCs. graph tied 2D representation. can still plot histograms show distribution scatter plots compare local metric different variables, can colored another variable cluster. may added next release Voyager. now, add results interest colData(sfe) use existing colData plotting functions scater Voyager. y axis log transformed (hence warning bins cells), color cells long tail can seen cells don’t strong local Moran’s . Cells cluster 7 high local Moran’s total UMI counts genes detected, means tend homogeneous QC metrics. local Moran’s QC metrics relate ? Cells locally homogeneous total UMI counts also homogeneous number genes detected, surprising given correlation two. local Moran’s , sum vs percentage mitochondrial counts shows interesting pattern, highlighting clusters 4 7 Moran plots. local Moran’s relate value ? case, generally cells higher total counts also tend higher local Moran’s total counts. 
However, another wing cells lower total counts slightly higher local Moran’s total counts ’s central value total counts near 0 local Moran’s . density contour shows cells concentrated central value.","code":"sfe <- colDataUnivariate(sfe, \"localmoran\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) plotSpatialFeature(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\", \"cluster\")) plotLocalResult(sfe, \"localmoran\", c(\"sum\", \"detected\", \"subsets_mito_percent\"), colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2) localResultAttrs(sfe, \"localmoran\", \"sum\") #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" sfe$sum_localmoran <- localResult(sfe, \"localmoran\", \"sum\")[,\"Ii\"] sfe$detected_localmoran <- localResult(sfe, \"localmoran\", \"detected\")[,\"Ii\"] sfe$pct_mito_localmoran <- localResult(sfe, \"localmoran\", \"subsets_mito_percent\")[,\"Ii\"] # Colorblind friendly palette data(\"ditto_colors\") plotColDataFreqpoly(sfe, c(\"sum_localmoran\", \"detected_localmoran\", \"pct_mito_localmoran\"), bins = 50, color_by = \"cluster\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. plotColData(sfe, x = \"sum_localmoran\", y = \"detected_localmoran\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. plotColData(sfe, x = \"sum_localmoran\", y = \"pct_mito_localmoran\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
plotColData(sfe, x = \"sum\", y = \"sum_localmoran\", color_by = \"cluster\") + geom_density2d(data = as.data.frame(colData(sfe)), mapping = aes(x = sum, y = sum_localmoran), color = \"blue\", linewidth = 0.3) + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-spatial-heteroscedasticity-losh","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Local spatial heteroscedasticity (LOSH)","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"LOSH indicates heterogeneity around cell k nearest neighbor graph. make non-spatial plots LOSH local Moran’s . , clusters 2 6 tend locally heterogeneous. total counts genes detected relate LOSH? generally cells higher LOSH total counts also higher LOSH genes detected, outliers high , heterogeneous neighborhoods. Absolute distance neighbors taken account adjacency matrix row normalized. interesting see outliers tend away 10 nearest neighbors, region PCA space cells apart. total counts relate LOSH? 
seem clear relationship case.","code":"sfe <- colDataUnivariate(sfe, \"LOSH\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) plotLocalResult(sfe, \"LOSH\", c(\"sum\", \"detected\", \"subsets_mito_percent\"), colGeometryName = \"centroids\", ncol = 2) localResultAttrs(sfe, \"LOSH\", \"sum\") #> [1] \"Hi\" \"E.Hi\" \"Var.Hi\" \"Z.Hi\" \"x_bar_i\" \"ei\" sfe$sum_losh <- localResult(sfe, \"LOSH\", \"sum\")[,\"Hi\"] sfe$detected_losh <- localResult(sfe, \"LOSH\", \"detected\")[,\"Hi\"] sfe$pct_mito_losh <- localResult(sfe, \"LOSH\", \"subsets_mito_percent\")[,\"Hi\"] plotColDataFreqpoly(sfe, c(\"sum_losh\", \"detected_losh\", \"pct_mito_losh\"), bins = 50, color_by = \"cluster\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. plotColData(sfe, x = \"sum_losh\", y = \"detected_losh\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. plotColData(sfe, x = \"sum\", y = \"sum_losh\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. 
#> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"spatial-analyses-for-gene-expression","dir":"Articles","previous_headings":"","what":"“Spatial” analyses for gene expression","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"First, need reorganize differential expression results:","code":"top_markers_df <- lapply(seq_along(markers), function(i) { out <- markers[[i]][markers[[i]]$FDR < 0.05, c(\"FDR\", \"summary.AUC\")] if (nrow(out)) out$cluster <- i out }) top_markers_df <- do.call(rbind, top_markers_df) top_markers_df$symbol <- rowData(sce)[rownames(top_markers_df), \"Symbol\"]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"morans-i-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"results added rowData(sfe). NA’s non-highly variable genes, Moran’s computed highly variable genes . Moran’s ’s highly variable genes distributed? Also, top cluster marker genes distribution? top marker genes quite positive Moran’s k nearest neighbor graph. also interesting color histogram gene sets. Since k nearest neighbor graph found PCA space, based gene expression, expected, Moran’s graph mostly positive, although often strong. small number genes slightly negative Moran’s . top genes look like PCA? marker genes cluster, cluster 9. Perhaps genes high Moran’s specific cell type. Moran’s relate cluster AUC cluster differential expression p-value? differential expression p-value relate Moran’s ? Generally, significant marker genes tend higher Moran’s . surprising clusters Moran’s based k nearest neighbor graph. Similarly, genes higher AUC tend higher Moran’s . clusters, generally speaking, genes specific cluster tend higher Moran’s . 
Let’s use permutation testing see Moran’s statistically significant: seem significant. correlogram finds Moran’s higher order neighbors can proxy distance. see different patterns decay spatial autocorrelation different length scales spatial autocorrelation. CLU marker gene specific smallest cluster, higher order neighbors likely clusters. Marker genes larger clusters hundreds cells nevertheless display different patterns correlogram.","code":"sfe <- runMoransI(sfe, features = hvgs, BPPARAM = MulticoreParam(2)) rowData(sfe) #> DataFrame with 21932 rows and 5 columns #> ID Symbol Type moran_sample01 #> #> ENSG00000238009 ENSG00000238009 AL627309.1 Gene Expression NA #> ENSG00000239945 ENSG00000239945 AL627309.3 Gene Expression NA #> ENSG00000241599 ENSG00000241599 AL627309.4 Gene Expression NA #> ENSG00000229905 ENSG00000229905 AL669831.2 Gene Expression NA #> ENSG00000237491 ENSG00000237491 AL669831.5 Gene Expression NA #> ... ... ... ... ... #> ENSG00000278817 ENSG00000278817 AC007325.4 Gene Expression NA #> ENSG00000278384 ENSG00000278384 AL354822.1 Gene Expression NA #> ENSG00000277856 ENSG00000277856 AC233755.2 Gene Expression NA #> ENSG00000275063 ENSG00000275063 AC233755.1 Gene Expression NA #> ENSG00000271254 ENSG00000271254 AC240274.1 Gene Expression NA #> K_sample01 #> #> ENSG00000238009 NA #> ENSG00000239945 NA #> ENSG00000241599 NA #> ENSG00000229905 NA #> ENSG00000237491 NA #> ... ... #> ENSG00000278817 NA #> ENSG00000278384 NA #> ENSG00000277856 NA #> ENSG00000275063 NA #> ENSG00000271254 NA plotRowDataHistogram(sfe, \"moran_sample01\", bins = 50) + geom_vline(data = as.data.frame(rowData(sfe)[top_markers,]) |> mutate(index = seq_along(top_markers)), aes(xintercept = moran_sample01, color = index)) + scale_color_continuous(breaks = scales::breaks_width(2)) #> Warning: Removed 19932 rows containing non-finite outside the scale range #> (`stat_bin()`). 
top_moran <- head(rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)], 4) plotSpatialFeature(sfe, top_moran, ncol = 2) top_moran_symbol <- rowData(sfe)[top_moran, \"Symbol\"] plotExpression(sfe, top_moran_symbol, swap_rownames = \"Symbol\") # See if markers are unique to clusters anyDuplicated(rownames(top_markers_df)) #> [1] 0 top_markers_df$moran <- rowData(sfe)[rownames(top_markers_df), \"moran_sample01\"] top_markers_df$log_p_adj <- -log10(top_markers_df$FDR) top_markers_df$cluster <- factor(top_markers_df$cluster, levels = seq_len(length(unique(top_markers_df$cluster)))) as.data.frame(top_markers_df) |> ggplot(aes(log_p_adj, moran)) + geom_point(aes(color = cluster)) + geom_smooth(method = \"lm\") + scale_color_manual(values = ditto_colors) #> `geom_smooth()` using formula = 'y ~ x' #> Warning: Removed 574 rows containing non-finite outside the scale range #> (`stat_smooth()`). #> Warning: Removed 574 rows containing missing values or values outside the scale range #> (`geom_point()`). as.data.frame(top_markers_df) |> ggplot(aes(summary.AUC, moran)) + geom_point(aes(color = cluster)) + geom_smooth(method = \"lm\") + scale_color_manual(values = ditto_colors) #> `geom_smooth()` using formula = 'y ~ x' #> Warning: Removed 574 rows containing non-finite outside the scale range #> (`stat_smooth()`). #> Warning: Removed 574 rows containing missing values or values outside the scale range #> (`geom_point()`). 
sfe <- runUnivariate(sfe, \"moran.mc\", features = top_markers, nsim = 200) top_markers_symbol #> [1] \"TRAC\" \"MNDA\" \"RPL32\" \"CD79A\" \"NKG7\" \"MALAT1\" \"CLU\" \"BCL11A\" plotMoranMC(sfe, top_markers, swap_rownames = \"Symbol\") system.time({ sfe <- runUnivariate(sfe, \"sp.correlogram\", top_markers, order = 6, zero.policy = TRUE, BPPARAM = MulticoreParam(2)) }) #> user system elapsed #> 213.532 17.137 231.456 plotCorrelogram(sfe, top_markers, swap_rownames = \"Symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-morans-i-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Local Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"also plot histograms, now results need added colData first. , y axis log transformed make tail visible. clusters, top marker gene’s local Moran’s forms peak cells cluster higher local Moran’s cells. However, sometimes cells within cluster form long tail shared cells clusters. local Moran’s another method differential expression. since local Moran’s Leiden clustering use k nearest neighbor graph PCA space, local Moran’s marker genes perhaps eigengenes signifying gene programs cell type k nearest neighbor graph can validate criticize Leiden clusters. Furthermore, interestingly, genes, tallest peak histogram away 0. scatter plots shown “spatial” analyses QC metrics section can made see local Moran’s relates expression gene . gene, just like total UMI counts, two wings central value local Moran’s around 0. Generally, cells higher expression gene higher local Moran’s gene well. density contours show cells concentrate around 0 expression weaker positive local Moran. streak cells 0 expression means many cells don’t express gene, neighbors low slightly homogeneous expression gene. pattern may different different genes. Also, p-values cell local Moran’s available corrected multiple hypothesis testing, can plotted. 
p-values based z score local Moran statistic, although statistic distributed gene expression data warrants investigation. p-value can also computed permutation (see localmoran_perm()).","code":"sfe <- runUnivariate(sfe, \"localmoran\", features = top_markers) plotLocalResult(sfe, \"localmoran\", top_markers, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 3, swap_rownames = \"Symbol\") new_colname <- paste0(\"cluster\", seq_along(top_markers), \"_\", top_markers_symbol, \"_localmoran\") for (i in seq_along(top_markers)) { g <- top_markers[i] colData(sfe)[[new_colname[i]]] <- localResult(sfe, \"localmoran\", g)[,\"Ii\"] } plotColDataFreqpoly(sfe, new_colname, color_by = \"cluster\") + ggtitle(\"Local Moran's I\") + theme(legend.position = \"top\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. i <- 6 # Change if running this notebook plotExpression(sfe, top_markers_symbol[i], x = new_colname[i], color_by = \"cluster\", swap_rownames = \"Symbol\") + scale_color_manual(values = ditto_colors) + coord_flip() + # comment out in case of error after changing i geom_density2d(data = as.data.frame(colData(sfe)) |> mutate(gene = logcounts(sfe)[top_markers[i],]), mapping = aes(x = .data[[new_colname[i]]], y = gene), color = \"blue\", linewidth = 0.3) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
localResultAttrs(sfe, \"localmoran\", top_markers[1]) #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\""},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"losh","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"LOSH","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"two genes right, ’s interesting see higher LOSH middle cluster. two genes left outliers throwing dynamic range, seems high LOSH regions different. , plot histograms: relationship expression LOSH complicated. genes, top marker gene cluster 1 LYAR, cells cluster higher expression also higher LOSH - much like Poisson negative binomial distributions, higher mean also means higher variance. However, genes, top marker gene cluster 2 CTSS, lower LOSH among cells higher expression, means expression gene homogeneous within cluster, consistent local Moran. gene, density contour indicates many cells don’t express gene homogeneous neighborhoods also low expression. streak around 0 expression means neighbors cells don’t express gene different levels heterogeneity gene.","code":"sfe <- runUnivariate(sfe, \"LOSH\", top_markers) plotLocalResult(sfe, \"LOSH\", top_markers, colGeometryName = \"centroids\", ncol = 3, swap_rownames = \"Symbol\") new_colname2 <- paste0(\"cluster\", seq_along(top_markers), \"_\", top_markers_symbol, \"_losh\") for (i in seq_along(top_markers)) { g <- top_markers[i] colData(sfe)[[new_colname2[i]]] <- localResult(sfe, \"LOSH\", g)[,\"Hi\"] } plotColDataFreqpoly(sfe, new_colname2, color_by = \"cluster\") + ggtitle(\"Local heteroscedasticity\") + theme(legend.position = \"top\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. 
i <- 6 # Change if running this notebook plotExpression(sfe, top_markers_symbol[i], x = new_colname2[i], color_by = \"cluster\", swap_rownames = \"Symbol\") + scale_color_manual(values = ditto_colors) + coord_flip() + # comment out in case of error after changing i geom_density2d(data = as.data.frame(colData(sfe)) |> mutate(gene = logcounts(sfe)[top_markers[i],]), mapping = aes(x = .data[[new_colname2[i]]], y = gene), color = \"blue\", linewidth = 0.3) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"moran-plot-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Moran plot","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"make Moran plots top marker genes. reference, show Moran’s top marker genes, slope line fitted Moran scatter plot. significant marker gene cluster 7. plots shown sequence genes, points concentrated around origin aren’t “enough” points elsewhere plot density contours. cells express genes, clusters plot. genes expressed many cells, cells neighbors express gene, hence vertical streak x = 0. tutorial, applied univariate spatial statistics k nearest neighbor graph gene expression PCA space rather histological space. 
Just like histological space, impractical examine statistics gene gene, multivariate analyses incorporate k nearest neighbor graph may interesting.","code":"sfe <- runUnivariate(sfe, \"moran.plot\", features = top_markers, colGraphName = \"knn10\") top_markers_df[top_markers,] #> DataFrame with 8 rows and 6 columns #> FDR summary.AUC cluster symbol moran #> #> ENSG00000277734 3.38982e-13 0.975227 1 TRAC 0.768167 #> ENSG00000163563 2.71016e-14 0.999028 2 MNDA 0.955553 #> ENSG00000144713 8.43108e-15 0.999609 3 RPL32 0.789326 #> ENSG00000105369 6.38923e-14 1.000000 4 CD79A 0.944921 #> ENSG00000105374 6.08330e-14 1.000000 5 NKG7 0.931310 #> ENSG00000251562 9.26308e-09 0.930695 6 MALAT1 0.811310 #> ENSG00000120885 6.88523e-08 1.000000 7 CLU 0.902698 #> ENSG00000119866 3.82513e-08 1.000000 8 BCL11A 0.648106 #> log_p_adj #> #> ENSG00000277734 12.46982 #> ENSG00000163563 13.56701 #> ENSG00000144713 14.07412 #> ENSG00000105369 13.19455 #> ENSG00000105374 13.21586 #> ENSG00000251562 8.03324 #> ENSG00000120885 7.16208 #> ENSG00000119866 7.41735 plts <- lapply(top_markers, moranPlot, sfe = sfe, color_by = \"cluster\", swap_rownames = \"Symbol\") #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. 
wrap_plots(plts, widths = 1, heights = 1) + plot_layout(ncol = 3, guides = \"collect\") + plot_annotation(tag_levels = \"1\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 dplyr_1.1.4 #> [3] patchwork_1.2.0 spdep_1.3-3 #> [5] sf_1.0-16 spData_2.3.0 #> [7] BiocSingular_1.18.0 stringr_1.5.1 #> [9] BiocParallel_1.36.0 bluster_1.12.0 #> [11] scran_1.30.2 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] BiocNeighbors_1.20.2 DropletUtils_1.22.0 #> [17] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [19] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [21] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [23] IRanges_2.36.0 S4Vectors_0.40.2 #> [25] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [27] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [29] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] memoise_2.0.1 DelayedMatrixStats_1.24.0 #> [15] RCurl_1.98-1.14 terra_1.7-71 #> [17] 
htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] Rhdf5lib_1.24.2 s2_1.1.6 #> [21] SparseArray_1.2.4 rhdf5_2.46.1 #> [23] sass_0.4.9 KernSmooth_2.23-22 #> [25] bslib_0.7.0 htmlwidgets_1.6.4 #> [27] desc_1.4.3 cachem_1.0.8 #> [29] igraph_2.0.3 lifecycle_1.0.4 #> [31] pkgconfig_2.0.3 rsvd_1.0.5 #> [33] Matrix_1.6-5 R6_2.5.1 #> [35] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [37] digest_0.6.35 colorspace_2.1-0 #> [39] ggnewscale_0.4.10 dqrng_0.3.2 #> [41] RSpectra_0.16-1 irlba_2.3.5.1 #> [43] textshaping_0.3.7 beachmat_2.18.1 #> [45] labeling_0.4.3 fansi_1.0.6 #> [47] mgcv_1.9-1 abind_1.4-5 #> [49] compiler_4.3.3 proxy_0.4-27 #> [51] withr_3.0.0 viridis_0.6.5 #> [53] DBI_1.2.2 highr_0.10 #> [55] HDF5Array_1.30.1 R.utils_2.12.3 #> [57] MASS_7.3-60.0.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] tools_4.3.3 units_0.8-5 #> [63] vipor_0.4.7 beeswarm_0.4.0 #> [65] R.oo_1.26.0 glue_1.7.0 #> [67] nlme_3.1-164 rhdf5filters_1.14.1 #> [69] grid_4.3.3 cluster_2.1.6 #> [71] generics_0.1.3 isoband_0.2.7 #> [73] gtable_0.3.5 R.methodsS3_1.8.2 #> [75] class_7.3-22 metapod_1.10.1 #> [77] ScaledMatrix_1.10.0 sp_2.1-4 #> [79] utf8_1.2.4 XVector_0.42.0 #> [81] ggrepel_0.9.5 pillar_1.9.0 #> [83] limma_3.58.1 splines_4.3.3 #> [85] lattice_0.22-6 deldir_2.0-4 #> [87] tidyselect_1.2.1 locfit_1.5-9.9 #> [89] knitr_1.45 gridExtra_2.3 #> [91] edgeR_4.0.16 xfun_0.43 #> [93] statmod_1.5.0 stringi_1.8.3 #> [95] yaml_2.3.8 boot_1.3-30 #> [97] evaluate_0.23 codetools_0.2-20 #> [99] tibble_3.2.1 cli_3.6.2 #> [101] systemfonts_1.0.6 munsell_0.5.1 #> [103] jquerylib_0.1.4 Rcpp_1.0.12 #> [105] png_0.1-8 parallel_4.3.3 #> [107] pkgdown_2.0.9 sparseMatrixStats_1.14.0 #> [109] bitops_1.0-7 viridisLite_0.4.2 #> [111] scales_1.3.0 e1071_1.7-14 #> [113] purrr_1.0.2 crayon_1.5.2 #> [115] scico_1.5.0 rlang_1.1.3 #> [117] 
cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium v3 preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium v3 preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"preprocessing-for-chromium-v3-chemistry","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium V3 Chemistry","title":"10X Chromium v3 preprocessing with cellatlas","text":"data example located cellatlas/examples/rna-10xv3/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-10xv3/* .\") system(\"gunzip 3M-february-2018.txt.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Fetch the references","title":"10X Chromium v3 preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Build the pipeline","title":"10X Chromium v3 preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Run the pipeline","title":"10X Chromium v3 preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Inspect the output","title":"10X Chromium v3 preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"preprocessing-for-chromium-single-cell-atac-seq","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell ATAC-seq","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"data example located cellatlas/examples/atac-10xatac/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/atac-10xatac/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Fetch the references","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Build the pipeline","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m atac\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz fastqs/I2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Run the pipeline","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_trim(cmds) cmds <- str_remove_all(cmds, '\\\\\\\",$|\\\\\\\"$|^\\\\\\\"') cmds <- str_replace_all(cmds, 
fixed(\"\\\\\\\"\"), \"\\\"\") cmds <- str_replace_all(cmds, fixed(\"\\\\t\"), \"\\t\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Inspect the output","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"ClickTags preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"ClickTags preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"preprocessing-for-clicktags","dir":"Articles","previous_headings":"","what":"Preprocessing for ClickTags","title":"ClickTags preprocessing with cellatlas","text":"data example located cellatlas/examples/tag-clicktag/* directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/tag-clicktag/* .\") system(\"gunzip 737K-august-2016.txt.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Fetch the references","title":"ClickTags preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Build the pipeline","title":"ClickTags preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m tag\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Run the pipeline","title":"ClickTags preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Inspect the output","title":"ClickTags preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"preprocessing-for-chromium-single-cell-crispr-screening","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell CRISPR Screening","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"data example located cellatlas/examples/crispr-10xcrispr/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/crispr-10xcrispr/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Fetch the references","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Build the pipeline","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m crispr\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Run the pipeline","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds 
lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Inspect the output","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Multiome ATAC preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Multiome ATAC preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"preprocessing-for-chromium-single-cell-atac-multiome-atac","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","title":"10X Multiome ATAC preprocessing with cellatlas","text":"data example located cellatlas/examples/atac-10xmultiome/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/atac-10xmultiome/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Fetch the references","title":"10X Multiome ATAC preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Build the pipeline","title":"10X Multiome ATAC preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m atac\", \"-fa\", FA, \"-g\", GTF, \"fastqs/atac_R1.fastq.gz fastqs/atac_R2.fastq.gz fastqs/atac_I2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Run the pipeline","title":"10X Multiome ATAC preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_trim(cmds) cmds <- str_remove_all(cmds, 
'\\\\\\\",$|\\\\\\\"$|^\\\\\\\"') cmds <- str_replace_all(cmds, fixed(\"\\\\\\\"\"), \"\\\"\") cmds <- str_replace_all(cmds, fixed(\"\\\\t\"), \"\\t\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Inspect the output","title":"10X Multiome ATAC preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium nuclei preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium nuclei preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"preprocessing-for-chromium-nuclei-isolation","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Nuclei Isolation","title":"10X Chromium nuclei preprocessing with cellatlas","text":"data example located cellatlas/examples/rna-10xv3-nuclei/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-10xv3-nuclei/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Fetch the references","title":"10X Chromium nuclei preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Build the pipeline","title":"10X Chromium nuclei preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fb feature_barcodes.txt\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Run the pipeline","title":"10X Chromium nuclei preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Inspect the output","title":"10X Chromium nuclei preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"Split-seq preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"Split-seq preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"preprocessing-for-split-seq","dir":"Articles","previous_headings":"","what":"Preprocessing for SPLiT-seq","title":"Split-seq preprocessing with cellatlas","text":"Note: move relevant data working directory gunzip barcode onlist. data example located cellatlas/examples/rna-splitseq/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. Note names nodes seqspec must match names FASTQ files. seqspec SPLiT-seq contains specification multiple split-pool rounds. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-splitseq/* .\") system(\"gunzip barcode*\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Fetch the references","title":"Split-seq preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Build the pipeline","title":"Split-seq preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Run the pipeline","title":"Split-seq preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Inspect the output","title":"Split-seq preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"Visium preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"Visium preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"examine-the-spec","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Examine the spec","title":"Visium preprocessing with cellatlas","text":"Note: move relevant data working directory gunzip barcode onlist. first use seqspec print check read structure matches expect. command prints ordered tree representation sequenced elements contained FASTQ files. Note names nodes seqspec must match names FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-visium-spatial/* .\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Fetch the references","title":"Visium preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Build the pipeline","title":"Visium preprocessing with cellatlas","text":"now supply relevant objects cellatlas build produce appropriate commands run build pipeline. includes reference building step read counting quantification step performed kallisto bustools part kb-python package.","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Run the pipeline","title":"Visium preprocessing with cellatlas","text":"can extract view commands pipeline using jq. 
Now can run commands /cellatlas_info.json command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Inspect the output","title":"Visium preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/seqfish_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"seqFISH Processing Workflows with Voyager","text":"Pros: Single cell resolution High detection efficiency Commercial kit coming Get subcellular transcript localization information Compatible histological features DAPI membrane staining Cons: Need pre-select panel usually hundred genes","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/seqfish_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"seqFISH Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Use your own spatial method in Voyager","text":"multiple different ways certain things. different ways pros cons, sometimes can tell somewhat different stories. 
Often different ways come different syntaxes, increasing learning curve users. Voyager took inspiration caret tidymodels (Kuhn Wickham 2020) machine learning, foreach, future, BiocParallel parallel processing different backends, bluster different clustering algorithms, BiocNeighbors different algorithms find nearest neighbors. packages provide uniform user interfaces different methods achieve given goal. caret tidymodels, users can make uniform user interface fit custom models included package eliminate lot duplicate code. Voyager, done SFEMethod S4 class. vignette shows use SFEMethod class use Voyager’s uniform user interface custom methods. load packages used: Voyager categorizes exploratory spatial data analysis (ESDA) methods number variables whether method gives one result entire dataset (global) gives results location (local). process create SFEMethod object mostly across categories, category specific arguments. Also, make SFEMethod object, see method interest already Voyager. methods can listed listSFEMethods() function. 
calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods() Voyager search S4 object name matching string.","code":"library(Voyager) library(spdep) #> Loading required package: spData #> To access larger datasets in this package, install the spDataLarge #> package with: `install.packages('spDataLarge', #> repos='https://nowosad.github.io/drat/', type='source')` #> Loading required package: sf #> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"global","dir":"Articles","previous_headings":"Univariate","what":"Global","title":"Use your own spatial method in Voyager","text":"univariate global methods Voyager: code used create SFEMethod object run Moran’s , SFEMethod() constructor: package argument used check package installed method run. function run method fun argument. univariate methods use spatial neighborhood graph (use_graph = TRUE) must arguments: x vector input listw spatial neighborhood graph listw object, zero.policy cells spots don’t spatial neighbors. See spdep documentation (e.g. spdep::moran()) zero.policy argument behaves. case wrote think wrapper fill confusing arguments may confuse users. function running method another package different arguments, write thin wrapper make required arguments. Extra arguments can passed fun .... reorganize_fun argument takes function reorganize output fun form DataFrame results genes can added rowData(sfe). Moran’s , function univariate bivariate global methods, function must : argument take output fun multiple genes features name take name results stored case method run genes different parameters don’t want overwrite previous results. name name specified SFEMethod() constructor default, can set user calling calculate*variate() run*variate(), ... 
reorganize_fun univariate global methods Voyager, sp.correlogram, needs arguments. spatial methods use spatial distances rather graphs, variogram. code used create SFEMethod object variogram: function fun univariate methods don’t use spatial neighborhood graph must arguments x coords_df (sf data frame spatial coordinates) arguments allowed. .variogram function: rule reorganize_fun remains , .other2df function:","code":"listSFEMethods(\"uni\", \"global\") #> name description #> 1 moran Moran's I #> 2 geary Geary's C #> 3 moran.mc Moran's I with permutation testing #> 4 geary.mc Geary's C with permutation testing #> 5 sp.mantel.mc Mantel-Hubert spatial general cross product statistic #> 6 moran.test Moran's I test #> 7 geary.test Geary's C test #> 8 globalG.test Global G test #> 9 sp.correlogram Correlogram #> 10 variogram Variogram with model #> 11 variogram_map Variogram map moran <- SFEMethod( name = \"moran\", title = \"Moran's I\", package = \"spdep\", variate = \"uni\", scope = \"global\", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), use_graph = TRUE, reorganize_fun = .moran2df ) .moran2df <- function(out, name, ...) { rns <- names(out) out <- lapply(out, unlist, use.names = TRUE) out <- Reduce(rbind, out) if (!is.matrix(out)) out <- t(as.matrix(out)) rownames(out) <- rns out <- DataFrame(out) names(out)[1] <- name out } variogram <- SFEMethod(package = \"automap\", variate = \"uni\", scope = \"global\", default_attr = NA, name = \"variogram\", title = \"Variogram\", fun = .variogram, reorganize_fun = .other2df, use_graph = FALSE) .variogram <- function(x, coords_df, formula = x ~ 1, scale = TRUE, ...) { coords_df$x <- x if (scale) coords_df$x <- scale(coords_df$x) dots <- list(...) 
# Deal with alpha myself and fit a global variogram to avoid further gstat warnings have_alpha <- \"alpha\" %in% names(dots) if (have_alpha) { empirical <- gstat::variogram(formula, data = coords_df, alpha = dots$alpha) dots$alpha <- NULL } out <- do.call(automap::autofitVariogram, c(list(formula = formula, input_data = coords_df, map = FALSE, cloud = FALSE), dots)) if (have_alpha) { out$exp_var <- empirical } out } .other2df <- function(out, name, ...) { if (!is.atomic(out)) out <- I(out) out_df <- DataFrame(res = out) names(out_df) <- name rownames(out_df) <- names(out) out_df }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"local","dir":"Articles","previous_headings":"Univariate","what":"Local","title":"Use your own spatial method in Voyager","text":"univariate local methods Voyager: code used create SFEMethod object localmoran: spdep::localmoran already right arguments, including x, listw, zero.policy. local methods, title default_attr arguments important, used plotLocalResults() plot title. Many local methods return matrix data frame results gene, default_attr specifies column use default plotting, local Moran’s values (Ii) case. fields results can p-values adjusted p-values. reorganize_fun different univariate global methods local results organized differently. .localmoran2df function: function must arguments: results fun genes, list element results one gene. nb neighbor object class nb, part listw object spatial neighborhood graphs. used correct multiple hypothesis testing p.adjustSP() p.adjust.method specify method correct multiple testing. See p.adjust() available methods. 
output list organized results, element one gene, converted DataFrame added localResults(sfe).","code":"listSFEMethods(\"uni\", \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot localmoran <- SFEMethod( name = \"localmoran\", title = \"Local Moran's I\", package = \"spdep\", scope = \"local\", default_attr = \"Ii\", fun = spdep::localmoran, use_graph = TRUE, reorganize_fun = .localmoran2df ) .localmoran2df <- function(out, nb, p.adjust.method) { lapply(out, function(o) { o1 <- as.data.frame(o) quadr <- attr(o, \"quadr\") I(.add_log_p(cbind(o1, quadr), nb, p.adjust.method)) }) }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"bivariate","dir":"Articles","previous_headings":"","what":"Bivariate","title":"Use your own spatial method in Voyager","text":"bivariate global methods Voyager: bivariate local methods Voyager: SFEMethod construction bivariate methods similar univariate methods, except function fun must argument y x. code used create SFEMethod object lee, Lee’s L: Note use_matrix argument, specific bivariate methods. means whether method can take matrix argument compute statistic pairwise combinations matrix’s rows. way computation can expressed matrix operations much efficient R loops loops pushed underlying C Fortran code BLAS Matrix package sparse matrices. ’s .lee_mat function: Due matrix operation, listw can sparse dense adjacency matrix spatial neighborhood graph. conform scRNA-seq conventions, x y genes rows matrices. 
reorganize_fun bivariate global methods don’t return DataFrame, bivariate global results can’t stored SFE object. However, reorganize_fun bivariate local methods follow rules univariate local methods results also go localResults(sfe).","code":"listSFEMethods(\"bi\", \"global\") #> name description #> 1 lee Lee's bivariate statistic #> 2 lee.mc Lee's bivariate static with permutation testing #> 3 lee.test Lee's L test #> 4 cross_variogram Cross variogram #> 5 cross_variogram_map Cross variogram map listSFEMethods(\"bi\", \"local\") #> name description #> 1 locallee Local Lee's bivariate statistic #> 2 localmoran_bv Local bivariate Moran's I lee <- SFEMethod(name = \"lee\", fun = .lee_mat, title = \"Lee's bivariate statistic\", reorganize_fun = function(out, name, ...) out, package = \"Voyager\", variate = \"bi\", scope = \"global\", use_matrix = TRUE) .lee_mat <- function(x, y = NULL, listw, zero.policy = TRUE, ...) { # X has genes in rows if (is(listw, \"listw\")) W <- listw2sparse(listw) else W <- listw x <- .scale_n(x) if (!is.null(y)) { y <- .scale_n(y) } else y <- x n <- ncol(x) # dimension of y is checked in calculateBivariate out <- x %*% (t(W) %*% W) %*% t(y)/sum(rowSums(W)^2) * n if (all(dim(out) == 1L)) out <- out[1,1] out }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"multivariate","dir":"Articles","previous_headings":"","what":"Multivariate","title":"Use your own spatial method in Voyager","text":"multivariate methods Voyager: SFEMethod construction bivariate methods similar univariate methods, except two arguments: joint indicate whether makes sense run method multiple samples jointly just like non-spatial PCA, dest indicate whether results go reducedDims(sfe) colData(sfe). code multivariate generalization local Geary’s C (Anselin 2019) permutation testing: results, single vector, goes colData(sfe), make sense run across multiple samples jointly sample separate spatial neighborhood graph, run sample separately. 
function reorganize_fun return vector, matrix, data frame ready added reducedDims(sfe) colData(sfe). results can go colData, rules arguments univariate local methods, permutation testing multivariate local Geary’s C, multiple testing correction performed reorganize_fun. results go reducedDims, needs one argument output.","code":"listSFEMethods(\"multi\") #> name description #> 1 multispati MULTISPATI PCA #> 2 localC_multi Multivariate local Geary's C #> 3 localC_perm_multi Multivariate local Geary's C permutation testing .localC_multi_fun <- function(perm = FALSE) { function(x, listw, ..., zero.policy) { x <- as.matrix(x) fun <- if (perm) spdep::localC_perm else spdep::localC fun(x, listw = listw, zero.policy = zero.policy, ...) } } .localCpermmulti2df <- function(out, nb, p.adjust.method) { .attrmat2df(list(out), \"pseudo-p\", \"localC_perm_multi\", nb, p.adjust.method)[[1]] } localC_perm_multi <- SFEMethod( name = \"localC_perm_multi\", title = \"Multivariate local Geary's C permutation testing\", package = \"spdep\", variate = \"multi\", default_attr = \"localC\", fun = .localC_multi_fun(TRUE), reorganize_fun = .localCpermmulti2df, dest = \"colData\" )"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Slide-seqV2 Processing Workflows with Voyager","text":"Pros: Higher resolution Visium, beads 10 \\(\\mu\\)m diameter Transcriptome wide Recently commercialized Curio, commercial kit coming Cons: Still single cell resolution two cells can occupy bead Relatively low detection efficiency transcripts Existing datasets may come histology image","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"dowload-data-and-create-a-spatialfeatureexperiment-object","dir":"Articles","previous_headings":"Getting Started","what":"Dowload Data and Create a SpatialFeatureExperiment 
object","title":"Slide-seqV2 Processing Workflows with Voyager","text":"vignettes demonstrate convert sequencing data spatial transcriptomics experiment SpatialFeatureExperiment object R. Many technologies yet standardized output formats, vignettes provide examples generate SFE object various output file types.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Slide-seqV2 Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated Slide-seqV2 platform. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"SPLiT-seq Processing Workflows with Voyager","text":"Pros: Commercial kit Low cost Single well capture randomly primed polyT oligos library Cons: * Fewer datasets available compared single cell technologies","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"SPLiT-seq Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. 
process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"SPLiT-seq Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Variogram","text":"geostatistical data, underlying spatial process sampled known locations. Kriging uses Gaussian process model interpolate values sample locations, semivariogram used model spatial dependency locations covariance Gaussian process. kriging, semivariogram can used exploratory data analysis tool find length scale anisotropy spatial autocorrelation. semivariogram defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Var}(X_t - X_0), \\] \\(X\\) value gene expression, \\(t\\) spatial vector. \\(X_0\\) value location interest, \\(X_t\\) value lagged \\(t\\). positive spatial autocorrelation, variance smaller among nearby values, variogram increase distance, eventually leveling distance beyond length scale spatial autocorrelation. “semi” comes 1/2, comes assumption Gaussian process weakly stationary, .e. covariance two locations depends spatial lag : \\[\\begin{align} \\mathrm{Var}(X_{t_2} - X_{t_1}) &= \\mathrm{Var}(X_{t_2}) + \\mathrm{Var}(X_{t_1}) - 2\\mathrm{Cov}(X_{t_2}, X_{t_1}) \\\\ &= 2\\rho(0) - 2\\rho(t_2 - t_1), \\end{align}\\] \\(\\rho\\) covariance function \\(t_1\\) \\(t_2\\) spatial locations. model can fitted empirical semivariogram, model \\(\\rho\\). 
variance differences value across locations depends spatial lag means intrinsically stationary, even weaker generalizable weakly stationary. weaker assumption used kriging. vignette demonstrates variogram ESDA tool, including interpretation univariate variogram, anisotropic variograms (variograms different directions), variogram maps, bivariate cross variograms. load packages: Slide-seq melanoma metastasis data (Biermann et al. 2022) used demonstration. QC performed another vignette. Variograms demonstrated top highly variable genes (HVGs)","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(BiocParallel) library(bluster) library(dplyr) theme_set(theme_bw()) (sfe <- BiermannMelaMetasData(dataset = \"MBM05_rep1\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 27566 29536 #> metadata(0): #> assays(1): counts #> rownames(27566): A1BG A1BG-AS1 ... ZZZ3 snoZ196 #> rowData names(3): means vars cv2 #> colnames(29536): ACCACTCATTTCTC-1 GTTCANTCCACGTA-1 ... ACGCGCAATCGTAG-1 #> TTGTTCCGTTCATA-1 #> colData names(4): sample_id nCounts nGenes prop_mito #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : xcoord ycoord #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: sfe <- sfe[, colData(sfe)$prop_mito < 0.1] sfe <- sfe[rowSums(counts(sfe)) > 0,] sfe <- logNormCounts(sfe) dec <- modelGeneVar(sfe) hvgs <- getTopHVGs(dec, n = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"variogram","dir":"Articles","previous_headings":"","what":"Variogram","title":"Variogram","text":"user interface used run Moran’s can used compute variograms. 
However, since variogram uses spatial distances instead spatial neighborhood graph, colGraph need specified. Instead, colGeometry can specified, geometry POINT, spatialCoords(sfe) used compute distances. Behind scene, automap package used, fits number different variogram models empirical variogram chooses one fits best. automap package user friendly wrapper gstat, time honored package geostatistics. data binned distance spots variance computed bin. gstat’s plotting functions say “semivariance”, data scaled variance 1, think variance rather semivariance plotted. numbers points plot indicate number pairs spots bin. “Ste” means Matern model M. Stein’s parameterization fitted points. Nugget variance distance 0, variance within first distance bin. data scaled default prior variogram computation make variograms multiple genes comparable. Spatial autocorrelation makes variance smaller shorter distances. variogram levels , means spatial autocorrelation longer effect distance. Sill variance variogram levels . Range distance variogram levels . first 4 genes, IGHG3 IGKC seem stronger spatial autocorrelation dissipate 100 200 units (whether ’s microns pixels unclear publication), whereas spatial autocorrelation B2M MT-RNR1 much weaker longer length scale. genes plotted space: length scales spatial autocorrelation genes quite obvious just plotting genes. ’s point plotting variograms ESDA? can also compute variograms larger number genes cluster variograms patterns spatial autocorrelation length scales, compare variograms genes across different samples. cluster variograms top highly variable genes (HVGs): BLUSPARAM argument used specify methods clustering, implemented bluster package. use hierarchical clustering. plot clusters: seems many genes, like MT-RNR1, weak spatial autocorrelation longer length scales, genes stronger shorter range spatial autocorrelation (around 150 200 units) like IGKC, genes somewhat longer length scale spatial autocorrelation (around 400 units). 
Plot one gene cluster space: MT-RNR1 widely expressed. IGKC IGHG3 restricted smaller areas, IGHM restricted even smaller areas. Note genes variograms cluster don’t co-expressed; need similar length scales strengths spatial autocorrelation.","code":"sfe <- runUnivariate(sfe, \"variogram\", hvgs, BPPARAM = SnowParam(2), model = \"Ste\") #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogram(sfe, hvgs[1:4], name = \"variogram\") plotSpatialFeature(sfe, hvgs[1:4], size = 0.3) & theme_bw() # To show the length units clusts <- clusterVariograms(sfe, hvgs, BLUSPARAM = HclustParam()) plotVariogram(sfe, hvgs, color_by = clusts, group = \"feature\", use_lty = FALSE, show_np = FALSE) genes_clusts <- clusts |> group_by(cluster) |> slice_head(n = 1) |> pull(feature) plotSpatialFeature(sfe, genes_clusts, size = 0.3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"anisotropy","dir":"Articles","previous_headings":"","what":"Anisotropy","title":"Variogram","text":"Anisotropy means different different directions. example cerebral cortex, layered structure. variogram can computed different directions.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"anisotropic-variogram","dir":"Articles","previous_headings":"Anisotropy","what":"Anisotropic variogram","title":"Variogram","text":"directions compute variograms can explicitly specified, alpha argument. However, since gstat fit anisotropic variograms, model fitted directions empirical variograms angle plotted separately. compute anisotropic variograms 4 genes : line variogram model fitted directions text describes model. points show angles different colors. 
Zero degree points north (), angles go clockwise.","code":"sfe <- runUnivariate(sfe, \"variogram\", genes_clusts, alpha = c(0, 45, 90, 135), # To not to overwrite omnidirectional variogram results name = \"variogram_anis\", model = \"Ste\", BPPARAM = SnowParam(2)) #> gstat does not fit anisotropic variograms. Variogram model is fitted to the whole dataset. #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogram(sfe, genes_clusts, group = \"angle\", name = \"variogram_anis\", show_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"variogram-map","dir":"Articles","previous_headings":"Anisotropy","what":"Variogram map","title":"Variogram","text":"variogram map another way visualize spatial autocorrelation different directions. bins distances x distances y, grid distances variance computed. Just like variograms , origin usually low value, spatial autocorrelation reduces variance short distance, values increase increasing distance origin, can increase quickly directions others. compute variogram maps 4 genes : width argument width bins, cutoff maximum distance.","code":"sfe <- runUnivariate(sfe, \"variogram_map\", genes_clusts, width = 100, cutoff = 800, BPPARAM = SnowParam(2), name = \"variogram_map2\") #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogramMap(sfe, genes_clusts, name = \"variogram_map2\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"cross-variogram","dir":"Articles","previous_headings":"","what":"Cross variogram","title":"Variogram","text":"cross variogram used cokriging, uses multiple variables spatial interpolation model. cross variogram defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Cov}(X_t - X_0, Y_t - Y_0), \\] \\(Y\\) another variable. cross variogram also nugget, sill, range. shows covariance two variables changes distance. Voyager supports multiple bivariate spatial methods, cross variogram one . 
Just like univariate spatial methods, Voyager provides uniform user interface bivariate methods. However, bivariate local methods can’t stored SFE object present tend different formats outputs (e.g. correlation matrix Lee’s L list methods) may straightforward store SFE object. facets shown matrix, whose diagonal variogram gene, diagonal entries cross variograms. IGKC IGHG3, length scale covariance similar spatial autocorrelation. also cross variogram map show cross variogram different directions:","code":"cross_v <- calculateBivariate(sfe, \"cross_variogram\", feature1 = \"IGKC\", feature2 = \"IGHG3\") plotCrossVariogram(cross_v, show_np = FALSE) cross_v_map <- calculateBivariate(sfe, \"cross_variogram_map\", feature1 = \"IGKC\", feature2 = \"IGHG3\", width = 100, cutoff = 800) plotCrossVariogramMap(cross_v_map)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Variogram","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] dplyr_1.1.4 bluster_1.12.0 #> [3] BiocParallel_1.36.0 scran_1.30.2 #> [5] scater_1.30.1 ggplot2_3.5.1 #> [7] scuttle_1.12.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 
MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [19] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] xts_0.13.2 lifecycle_1.0.4 #> [7] sf_1.0-16 edgeR_4.0.16 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] RColorBrewer_1.1-3 DBI_1.2.2 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 units_0.8-5 #> [31] RSpectra_0.16-1 dqrng_0.3.2 #> [33] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [35] codetools_0.2-20 DelayedArray_0.28.0 #> [37] gstat_2.1-1 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 snow_0.4-4 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] xfun_0.43 HDF5Array_1.30.1 #> [57] withr_3.0.0 BiocManager_1.30.22 #> [59] fastmap_1.1.1 ggh4x_0.2.8 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 spData_2.3.0 #> [65] digest_0.6.35 rsvd_1.0.5 #> [67] R6_2.5.1 mime_0.12 #> [69] textshaping_0.3.7 colorspace_2.1-0 #> [71] wk_0.9.1 RSQLite_2.3.6 #> [73] intervals_0.15.4 utf8_1.2.4 #> [75] generics_0.1.3 FNN_1.1.4 #> [77] class_7.3-22 httr_1.4.7 #> [79] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [81] spdep_1.3-3 pkgconfig_2.0.3 #> [83] scico_1.5.0 gtable_0.3.5 #> [85] blob_1.2.4 XVector_0.42.0 #> [87] htmltools_0.5.8.1 automap_1.1-9 #> [89] scales_1.3.0 png_0.1-8 #> [91] SpatialExperiment_1.12.0 knitr_1.45 #> [93] rjson_0.2.21 spacetime_1.3-1 #> [95] curl_5.2.1 proxy_0.4-27 #> [97] cachem_1.0.8 zoo_1.8-12 #> [99] rhdf5_2.46.1 BiocVersion_3.18.1 #> [101] 
KernSmooth_2.23-22 parallel_4.3.3 #> [103] vipor_0.4.7 AnnotationDbi_1.64.1 #> [105] desc_1.4.3 s2_1.1.6 #> [107] reshape_0.8.9 pillar_1.9.0 #> [109] grid_4.3.3 vctrs_0.6.5 #> [111] promises_1.3.0 BiocSingular_1.18.0 #> [113] dbplyr_2.5.0 beachmat_2.18.1 #> [115] xtable_1.8-4 cluster_2.1.6 #> [117] beeswarm_0.4.0 evaluate_0.23 #> [119] magick_2.8.3 cli_3.6.2 #> [121] locfit_1.5-9.9 compiler_4.3.3 #> [123] rlang_1.1.3 crayon_1.5.2 #> [125] labeling_0.4.3 classInt_0.4-10 #> [127] plyr_1.8.9 fs_1.6.4 #> [129] ggbeeswarm_0.7.2 viridisLite_0.4.2 #> [131] deldir_2.0-4 stars_0.6-5 #> [133] munsell_0.5.1 Biostrings_2.70.3 #> [135] Matrix_1.6-5 ExperimentHub_2.10.0 #> [137] patchwork_1.2.0 sparseMatrixStats_1.14.0 #> [139] bit64_4.0.5 Rhdf5lib_1.24.2 #> [141] KEGGREST_1.42.0 statmod_1.5.0 #> [143] shiny_1.8.1.1 highr_0.10 #> [145] interactiveDisplayBase_1.40.0 AnnotationHub_3.10.1 #> [147] igraph_2.0.3 memoise_2.0.1 #> [149] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig10_10x_nuclei.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Chromium nuclei isolation basic quality control","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10x_nuclei.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10x_nuclei.rds\", destfile = \"10x_nuclei.rds\") sce <- readRDS(\"10x_nuclei.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 37 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") #> Warning: Removed 2931 rows containing non-finite outside the scale range #> (`stat_ydensity()`). #> Warning: Removed 2931 rows containing missing values or values outside the scale range #> (`position_quasirandom()`). plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$subsets_mito_percent < 20)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 5260 9091 #> metadata(0): #> assays(1): counts #> rownames(5260): ENSG00000142611.17 ENSG00000142655.13 ... #> ENSG00000225685.2 ENSG00000291031.1 #> rowData names(1): gene_name #> colnames(9091): AAACCCAAGACCATAA AAACCCAAGGTTTGAA ... 
TTTGTTGTCATCTGTT #> TTTGTTGTCCTCCACA #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] Matrix_1.6-5 stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] abind_1.4-5 compiler_4.3.3 #> [51] proxy_0.4-27 withr_3.0.0 #> [53] BiocParallel_1.36.0 viridis_0.6.5 #> [55] DBI_1.2.2 highr_0.10 #> [57] HDF5Array_1.30.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] bluster_1.12.0 tools_4.3.3 #> [63] units_0.8-5 vipor_0.4.7 #> [65] beeswarm_0.4.0 glue_1.7.0 #> [67] rhdf5filters_1.14.1 grid_4.3.3 #> [69] sf_1.0-16 cluster_2.1.6 #> [71] generics_0.1.3 gtable_0.3.5 #> [73] class_7.3-22 BiocSingular_1.18.0 #> [75] ScaledMatrix_1.10.0 sp_2.1-4 #> [77] utf8_1.2.4 XVector_0.42.0 #> [79] ggrepel_0.9.5 pillar_1.9.0 #> [81] limma_3.58.1 dplyr_1.1.4 #> [83] lattice_0.22-6 deldir_2.0-4 #> [85] tidyselect_1.2.1 locfit_1.5-9.9 #> [87] knitr_1.45 gridExtra_2.3 #> [89] edgeR_4.0.16 xfun_0.43 #> [91] statmod_1.5.0 stringi_1.8.3 #> [93] yaml_2.3.8 boot_1.3-30 #> [95] evaluate_0.23 codetools_0.2-20 #> [97] tibble_3.2.1 cli_3.6.2 #> [99] systemfonts_1.0.6 munsell_0.5.1 #> [101] jquerylib_0.1.4 Rcpp_1.0.12 #> [103] parallel_4.3.3 pkgdown_2.0.9 #> [105] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [107] viridisLite_0.4.2 scales_1.3.0 #> [109] e1071_1.7-14 purrr_1.0.2 #> [111] crayon_1.5.2 scico_1.5.0 #> [113] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig11_clicktags.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic quality control on scRNA-seq data with ClickTag barcodes","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"clicktags.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/clicktags.rds\", destfile = \"clicktags.rds\") sce <- readRDS(\"clicktags.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. bcrank <- barcodeRanks(counts(sce)) knee <- metadata(bcrank)$knee inflection <- metadata(bcrank)$inflection plot(bcrank$rank, bcrank$total, log=\"xy\", xlab=\"Rank\", ylab=\"Total ClickTags count\", cex.lab=1.2) #> Warning in xy.coords(x, y, xlabel, ylabel, log): 1 y value <= 0 omitted from #> logarithmic plot abline(h=inflection, col=\"darkgreen\", lty=2) abline(h=knee, col=\"dodgerblue\", lty=2) sce <- sce[, colSums(counts(sce)) > inflection] sce #> class: SingleCellExperiment #> dim: 20 3368 #> metadata(0): #> assays(1): counts #> rownames(20): ClickTag1 ClickTag2 ... ClickTag19 ClickTag20 #> rowData names(1): feature_name #> colnames(3368): AAACCTGCAAACTGCT AAACCTGGTAGCTTGT ... 
TTTGTCAGTCACCCAG #> TTTGTCATCTCTTATG #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig12_crispr.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Quality control on Chromium CRISPR Guide Capture libraries","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(Matrix) library(DropletUtils) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xcrispr.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xcrispr.rds\", destfile = \"10xcrispr.rds\") sce <- readRDS(\"10xcrispr.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 0 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. #> Warning: Computation failed in `stat_bin2d()`. #> Caused by error in `bin2d_breaks()`: #> ! `origin` must be a number, not `NaN`. 
bcrank <- barcodeRanks(counts(sce)) knee <- metadata(bcrank)$knee inflection <- metadata(bcrank)$inflection plot(bcrank$rank, bcrank$total, log=\"xy\", xlab=\"Rank\", ylab=\"Total ClickTags count\", cex.lab=1.2) #> Warning in xy.coords(x, y, xlabel, ylabel, log): 3 y values <= 0 omitted from #> logarithmic plot abline(h=inflection, col=\"darkgreen\", lty=2) abline(h=knee, col=\"dodgerblue\", lty=2) sce <- sce[, which(sce$total > inflection)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 89 293 #> metadata(0): #> assays(1): counts #> rownames(89): Non-Targeting-5 Non-Targeting-7 ... HDAC1-1 HDAC1-2 #> rowData names(1): feature_name #> colnames(293): AAAGAACAGAAACGAA AAAGAACGTTTGTCGA ... TTTGATCCAGGAGAAA #> TTTGATCGTGGTAGTG #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] DropletUtils_1.22.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 Matrix_1.6-5 #> [19] stringr_1.5.1 #> #> loaded via a namespace 
(and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> 
[109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig13_10xatac.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"10X ATAC-seq basic quality control ","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xatac.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xatac.rds\", destfile = \"10xatac.rds\") sce <- readRDS(\"10xatac.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$total > 0)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 209 166 #> metadata(0): #> assays(1): counts #> rownames(209): 1:9410718-9410885 1:14968574-14969617 ... #> X:119775524-119775794 X:154317937-154318131 #> rowData names(0): #> colnames(166): AAACTCGCATTCTCGC AAAGGGCGTTGGCTTA ... 
TTGTCTACAGGTCCTG #> TTTGTGTCATCGTACA #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig14_10xmultiome.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"10X Multiome ATAC-seq basic quality control ","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xmultiome.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xmultiome.rds\", destfile = \"10xmultiome.rds\") sce <- readRDS(\"10xmultiome.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$total > 0)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 277 198 #> metadata(0): #> assays(1): counts #> rownames(277): 1:39574808-39575296 1:43131572-43131673 ... #> X:152281428-152281521 X:166010316-166010375 #> rowData names(0): #> colnames(198): AAACGGTTCATTAGCT AAACGGTTCCGAAACG ... 
TTTCAAGGTACTAACC #> TTTCCGGCATTAGCAG #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic Visium exploratory data analysis","text":"introductory vignette SpatialFeatureExperiment data representation Voyager analysis package, demonstrate basic exploratory data analysis (EDA) spatial transcriptomics data. Basic knowledge R SingleCellExperiment assumed. vignette showcases packages Visium spatial gene expression system dataset. 
technology chosen due popularity, therefore availability numerous publicly available datasets analysis (Moses Pachter 2022). Voyager developed goal facilitating use geospatial methods spatial genomics, introductory vignette restricted non-spatial scRNA-seq EDA Visium dataset. vignette illustrating univariate spatial analysis dataset, see advanced exploratory spatial data analysis vignette dataset. load packages used vignette.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(scran) library(patchwork) library(bluster) library(SFEData) library(BiocParallel) library(stringr) library(ggplot2) library(sparseMatrixStats) library(dplyr) library(reticulate) library(concordexR) library(BiocNeighbors) theme_set(theme_bw(10)) # Specify Python version to use gget PY_PATH <- Sys.which(\"python\") use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function gget <- import(\"gget\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"mouse-skeletal-muscle-dataset","dir":"Articles","previous_headings":"","what":"Mouse skeletal muscle dataset","title":"Basic Visium exploratory data analysis","text":"dataset used vignette paper Large-scale integration single-cell transcriptomic data captures transitional progenitor states mouse skeletal muscle regeneration (McKellar et al. 2021). 
Notexin injected tibialis anterior muscle mice induce injury, healing muscle collected 2, 5, 7 days post injury Visium analysis. dataset vignette timepoint day 2. vignette starts SpatialFeatureExperiment (SFE) object. gene count matrix directly downloaded GEO. 4992 spots, whether tissue , included. tissue boundary found thresholding H&E image OpenCV, small polygons removed likely debris. Spot polygons constructed spot centroid coordinates diameter Space Ranger output. in_tissue column colData indicates spot polygons intersect tissue polygons, based st_intersects(). Tissue boundary, nuclei, myofiber, Visium spot polygons stored sf data frames SFE object. Visium spot polygons called “spotPoly” SFE object. SpatialFeatureExperiment package convenience wrappers get set common types geometries, including spotPoly() Visium (technologies relevant) spot polygons, cellSeg() cell segmentation, nucSeg() nuclei segmentation, centroids() cell centroids. Behind scene specially named sf data frames. See vignette SpatialFeatureExperiment details structure SFE object. SFE object dataset provided SFEData package; begin downloading data loading R. authors provided full resolution hematoxylin eosin (H&E) image GEO, downsized facilitate display: image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"(sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"spots","dir":"Articles","previous_headings":"Quality control","what":"Spots","title":"Basic Visium exploratory data analysis","text":"begin quality control (QC) plotting various metrics violin plots space. QC metrics pre-computed stored colData (spots) rowData SFE object. plot total unique molecular identifier (UMI) counts per spot. commented line code shows compute total UMI counts. maxcell argument maximum number pixels plot image; image downsampled pixels maxcells. can speed plotting plotting image multiple facets. spots injury site leukocyte infiltration high total counts. Spatial autocorrelation total counts apparent, discussed later section vignette. Next find number genes detected per spot. commented line code shows find number genes detected. commonly done scRNA-seq data, plot nCounts vs. nGenes plot two branches spots tissue, turn related myofiber size. See exploratory spatial data analysis (ESDA) Visium vignette. commonly done scRNA-seq data, plot proportion mitochondrially encoded counts. 
commented code shows find proportion: expected, spots outside tissue higher proportion mitochondrial counts, tissue lysed, mitochondrial transcripts less likely degrade cytosolic transcripts protected double membrane. However, spots myofibers also high proportion mitochondrial counts, function myofibers. injury site leukocyte infiltration lower proportion mitochondrial counts. see relationship proportion mitochondrial counts total UMI counts, plot commonly done scRNA-seq analysis identify low quality cells, .e. cells UMI counts high proportion mitochondrial counts. two clusters spots tissue, also turn related myofiber size. See ESDA Visium vignette. far haven’t seen spots obvious outliers QC metrics. following analyses use spots tissue, selected follows:","code":"names(colData(sfe)) #> [1] \"barcode\" \"col\" \"row\" \"x\" \"y\" \"dia\" #> [7] \"tissue\" \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" \"in_tissue\" # colData(sfe)$nCounts <- colSums(counts(sfe)) violin <- plotColData(sfe, \"nCounts\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial # colData(sfe)$nGenes <- colSums(counts(sfe) > 0) violin <- plotColData(sfe, \"nGenes\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial plotColData(sfe, x = \"nCounts\", y = \"nGenes\", colour_by = \"in_tissue\") # mito_ind <- str_detect(rowData(sfe)$symbol, \"^Mt-\") # colData(sfe)$prop_mito <- colSums(counts(sfe)[mito_ind,]) / colData(sfe)$nCounts violin <- plotColData(sfe, 
\"prop_mito\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"prop_mito\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial plotColData(sfe, x = \"nCounts\", y = \"prop_mito\", colour_by = \"in_tissue\") sfe_tissue <- sfe[, colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"Basic Visium exploratory data analysis","text":"scRNA-seq, gene expression variance Visium measurements overdispersed compared variance counts Poisson distributed. understand mean-variance relationship, compute mean, variance, coefficient variance (CV2) gene among spots tissue: avoid overplotting better show point density plot, use 2D histogram. color bin indicates number points bin. red line, \\(y = x\\) expected Poisson distributed data, find variance higher highly expressed genes expected Poisson distributed counts. coefficient variation shows .","code":"rowData(sfe_tissue)$means <- rowMeans(counts(sfe_tissue)) rowData(sfe_tissue)$vars <- rowVars(counts(sfe_tissue)) # Coefficient of variance rowData(sfe_tissue)$cv2 <- rowData(sfe_tissue)$vars/rowData(sfe_tissue)$means^2 plotRowData(sfe, x = \"means\", y = \"vars\", bins = 50) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + scale_fill_distiller(palette = \"Blues\", direction = 1) + annotation_logticks() + coord_equal() #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
plotRowData(sfe, x = \"means\", y = \"cv2\", bins = 50) + geom_abline(slope = -1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + scale_fill_distiller(palette = \"Blues\", direction = 1) + annotation_logticks() + coord_equal() #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"normalize-data","dir":"Articles","previous_headings":"","what":"Normalize data","title":"Basic Visium exploratory data analysis","text":"demonstrate use scater normalization , although note necessarily best approach normalizing spatial transcriptomics data. problem normalize spatial transcriptomics data non-trivial , nCounts plot space shows , spatial autocorrelation evident. Furthermore, Visium, reverse transcription occurs situ spots, PCR amplification occurs cDNA dissociated spots. Artifacts may subsequently introduced amplification step, associated spatial origin. Spatial artifacts may arise diffusion transcripts tissue permeabilization. However, given total counts seem correspond histological regions, total counts may biological component hence treated technical artifact normalized away scRNA-seq data normalization methods. words, issue normalization spatial transcriptomics data, Visium particular, complex currently unsolved. one way normalize non-spatial scRNA-seq data. commented code implements scran method (Lun, Bach, Marioni 2016). simplify matter, perform logNormCounts() introductory vignette. Note scater’s logNormCounts() quite different Seurat. Let \\(N\\) denote total UMI count one Visium spot, \\(\\bar N\\) average total UMI count spots dataset, \\(x\\) denote UMI count one gene Visium spot interest. Seurat performs log normalization \\(\\mathrm{log}\\left( \\frac{x}{N/10000} + 1 \\right)\\), natural log used. 
contrast, default parameters, scater uses \\(\\mathrm{log_2}\\left( \\frac{x}{N/\\bar N} + 1 \\right)\\). pseudocount (default 1), library size factors (default \\(N/\\bar N\\)), transform (default log2) can changed. Log 2 used differences values can interpreted log fold change. Next, identify highly variable genes (HVGs), used principal component analysis (PCA) dimensionality reduction. , different ways identify HVGs, scater differently Seurat. frameworks, log normalized data used default. summary, Seurat, default parameters, Loess curve fitted log transformed data (log normalized data log transformed fitting purposes), fitted values exponentiated expected variance gene. expected variance mean used standardize log normalized gene expression; standardized values used calculate standardized variance gene. top HVGs genes largest standardized variance. scater, default parameters, parametric non-linear curve variance vs. mean gene log normalized data. log ratio actual variance fitted variance curve calculated, Loess curve fitted log ratio vs. mean gene. “technical” component variance fitted values Loess curve. “biological” component difference actual variance Loess fitted variance. top HVGs genes largest biological component. See documentation modelGeneVar(), fitTrendVar(), getTopHVGs() details. differences can lead different downstream results. 
don’t comment way better vignette, ’s important aware differences.","code":"# clusters <- quickCluster(sfe_tissue) # sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) # sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension reduction and clustering","title":"Basic Visium exploratory data analysis","text":"clustering show dimension reduction plots principal components (PCs) can plotted space. Due spatial autocorrelation many genes spatial regions different histological characters, even though spatial information used PCA procedure, PCs may show spatial structure. PC1, explains far variance PC2, separates injury site leukocytes myofibers close site Visium myofibers. PC2 highlights center injury site myofibers near edge. PC3 highlights muscle tendon junctions. PC4 seem informative; might picked outlier. also possible run UMAP following PCA, done scRNA-seq. recommend producing UMAP since procedure distorts distances, respect either local global structure data (Chari, Banerjee, Pachter 2021). However, completeness, show compute UMAP : UMAP often used visualize clusters. alternative UMAP concordex, quantitatively shows proportion neighbors k nearest neighbor graph cluster label. consistent default igraph Leiden clustering, use k = 10. cluster labels permuted estimate null distribution, observed values can compared simulated values: observed value much higher simulated values, indicating good clustering. single number average clusters. Values different clusters can plotted heatmap: diagonal represents proportion neighbors cells cluster cluster. diagonal entries low indicate good clustering. 
interesting spatial transcriptomics, locate clusters space, can done follows: spatial information explicitly used clustering, due spatial autocorrelation gene expression histological regions, clusters spatially contiguous. many methods find spatially informed clusters, BayesSpace (E. Zhao et al. 2021), Bioconductor. Remark spatial regions: geographical space, usually one single way define spatial regions. example, influenced sociology geology, LA county can partitioned regions Eastside, Westside, South Central, San Fernando Valley, San Gabriel Valley, Pomona Valley, Gateway Cities, South Bay, etc., containing multiple smaller cities parts LA City, can divided many neighborhoods, Koreatown, Highland Park, Lincoln Heights, etc. Definitions regions subject dispute. Meanwhile, LA county can also partitioned watersheds LA River, San Gabriel River, Ballona Creek, etc., well different rock formations. kind spatial region resolution relevant depends question asked. also gray areas spatial regions. example, Whittier Narrows dam intercepts San Gabriel River Rio Hondo (large tributary LA River), whether dam area belongs watershed San Gabriel River LA River unclear. Similarly, spatial transcriptomics, methods identifying spatial regions currently generally aim give one result, multiple results different resolutions depending question asked may relevant. Furthermore, methods spatial region demarcation used spatial -omics ideally provide uncertainty assessments assignment cells Visium spots. existing geospatial method accounts uncertainty geocmeans (F. Zhao, Jiao, Liu 2013), CRAN. geographical histological space, conflicting views spatial variation. one hand, methods identify spatially variable genes SpatialDE often assume gene expression vary smoothly continuously space. hand, methods identifying spatial regions attempt identify discrete regions. continuous variation features might definitions geographical neighborhoods often subject dispute. 
existing methods attempt harmonize two views. example, spatially variable gene method belayer (Ma et al. 2022) takes discrete tissue layers account.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat ElbowPlot(sfe_tissue, ndims = 30) plotDimLoadings(sfe_tissue, dims = 1:4, swap_rownames = \"symbol\") colData(sfe_tissue)$cluster <- clusterRows(reducedDim(sfe_tissue, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotPCA(sfe_tissue, ncomponents = 3, colour_by = \"cluster\") spatialReducedDim(sfe_tissue, \"PCA\", ncomponents = 4, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image = \"lowres\", maxcell = 5e4) set.seed(29) sfe_tissue <- runUMAP(sfe_tissue, dimred = \"PCA\", n_dimred = 3) plotUMAP(sfe_tissue, colour_by = \"cluster\") g <- findKNN(reducedDim(sfe_tissue, \"PCA\")[,1:3], k = 10) res <- calculateConcordex(g$index, labels = sfe_tissue$cluster, k = 10, return.map = TRUE) plotConcordexSim(res) heatConcordex(res, angle_col = 0, cluster_rows = FALSE, cluster_cols = FALSE) plotSpatialFeature(sfe_tissue, \"cluster\", colGeometryName = \"spotPoly\", image = \"lowres\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"non-spatial-differential-expression","dir":"Articles","previous_headings":"","what":"Non-spatial differential expression","title":"Basic Visium exploratory data analysis","text":"Cluster marker genes can found using differential analysis methods commonly done scRNA-seq. example Wilcoxon rank sum test: result sorted p-values: can use gget enrichr module gget package perform gene enrichment analysis. can choose >200 enrichment databases listed Enrichr website. 
, analyzing top 20 genes cluster 1 using default ontology database GO_Biological_Process_2021: Significant markers cluster can obtained follows: ’ll use module gget info get additional information genes, descriptions, synonyms, transcripts collection reference databases including Ensembl, UniProt, NCBI. , showing gene descriptions NCBI: genes interesting view spatial context:","code":"markers <- findMarkers(sfe_tissue, groups = colData(sfe_tissue)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[1]] #> DataFrame with 15043 rows and 8 columns #> p.value FDR summary.AUC AUC.2 AUC.3 #> #> ENSMUSG00000051747 5.28233e-10 5.46992e-06 0.686982 0.686982 0.909644 #> ENSMUSG00000064360 7.27238e-10 5.46992e-06 0.685410 0.685410 0.978921 #> ENSMUSG00000019787 2.37596e-09 1.19139e-05 0.693069 0.706395 0.812064 #> ENSMUSG00000030730 4.06144e-09 1.52740e-05 0.676723 0.676723 0.898037 #> ENSMUSG00000064341 5.90896e-07 1.77777e-03 0.648920 0.648920 0.969773 #> ... ... ... ... ... ... #> ENSMUSG00000087095 1 1 0.5 0.497525 0.5 #> ENSMUSG00000043969 1 1 0.5 0.500000 0.5 #> ENSMUSG00000091378 1 1 0.5 0.500000 0.5 #> ENSMUSG00000072437 1 1 0.5 0.500000 0.5 #> ENSMUSG00000094649 1 1 0.5 0.497525 0.5 #> AUC.4 AUC.5 AUC.6 #> #> ENSMUSG00000051747 0.971508 0.808830 0.857509 #> ENSMUSG00000064360 0.993854 0.751129 0.889987 #> ENSMUSG00000019787 0.913713 0.693069 0.809057 #> ENSMUSG00000030730 0.976749 0.796210 0.810742 #> ENSMUSG00000064341 0.989804 0.785362 0.909585 #> ... ... ... ... 
#> ENSMUSG00000087095 0.496212 0.496644 0.5 #> ENSMUSG00000043969 0.496212 0.500000 0.5 #> ENSMUSG00000091378 0.496212 0.500000 0.5 #> ENSMUSG00000072437 0.492424 0.500000 0.5 #> ENSMUSG00000094649 0.500000 0.500000 0.5 enrichr_genes <- rownames(markers[[1]])[1:20] gget_e <- gget$enrichr(enrichr_genes, ensembl=TRUE, database = \"ontology\") # Plot results of gene enrichment analysis # Count number of overlapping genes gget_e$overlapping_genes_count <- lapply(gget_e$overlapping_genes, length) |> as.numeric() # Only keep the top 10 results gget_e <- gget_e[1:10,] gget_e |> ggplot() + geom_bar(aes( x = -log10(adj_p_val), y = reorder(path_name, -adj_p_val) ), stat = \"identity\", fill = \"lightgrey\", width = 0.5, color = \"black\") + geom_text( aes( y = path_name, x = (-log10(adj_p_val)), label = overlapping_genes_count ), nudge_x = 0.25, show.legend = NA, color = \"red\" ) + geom_text( aes( y = Inf, x = Inf, hjust = 1, vjust = 1, label = \"# of overlapping genes\" ), show.legend = NA, size=4, color = \"red\" ) + geom_vline(linetype = \"dashed\", linewidth = 0.5, xintercept = -log10(0.05)) + ylab(\"Pathway name\") + xlab(\"-log10(adjusted P value)\") #> Warning in geom_text(aes(y = Inf, x = Inf, hjust = 1, vjust = 1, label = \"# of overlapping genes\"), : All aesthetics have length 1, but the data has 10 rows. #> ℹ Please consider using `annotate()` or provide this layer with data containing #> a single row. genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe_tissue, rowData(sfe_tissue)[genes_use, \"symbol\"], x = \"cluster\", colour_by = \"cluster\", swap_rownames = \"symbol\") gget_info <- gget$info(genes_use) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> Ttn Enables ankyrin binding activity. Involved in regulation of relaxation of cardiac muscle. 
Acts upstream of or within several processes, including chordate embryonic development; forward locomotion; and heart development. Located in M band and Z disc. Is expressed in several structures, including diaphragm; embryo mesenchyme; heart; musculature; and tarsus. Used to study autosomal recessive limb-girdle muscular dystrophy type 2J; dilated cardiomyopathy 1G; and tibial muscular dystrophy. Human ortholog(s) of this gene implicated in intrinsic cardiomyopathy (multiple) and myopathy (multiple). Orthologous to human TTN (titin). [provided by Alliance of Genome Resources, Apr 2022] #> Gapdh This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The encoded protein was originally identified as a key glycolytic enzyme that converts D-glyceraldehyde 3-phosphate (G3P) into 3-phospho-D-glyceroyl phosphate. Subsequent studies have assigned a variety of additional functions to the protein including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Alternative splicing results in multiple transcript variants. Many pseudogenes similar to this locus are found throughout the mouse genome. [provided by RefSeq, Jan 2014] #> Hsp90ab1 Enables protein folding chaperone; protein kinase binding activity; and tau protein binding activity. Contributes to protein kinase regulator activity. Involved in several processes, including axonogenesis; positive regulation of protein kinase B signaling; and regulation of cellular protein metabolic process. Acts upstream of or within cellular response to interleukin-4; negative regulation of apoptotic process; and placenta development. Located in growth cone; neuronal cell body; and perinuclear region of cytoplasm. Part of HSP90-CDC37 chaperone complex. 
Is expressed in several structures, including branchial arch; central nervous system; eye; limb; and placenta. Human ortholog(s) of this gene implicated in multiple sclerosis. Orthologous to human HSP90AB1 (heat shock protein 90 alpha family class B member 1). [provided by Alliance of Genome Resources, Apr 2022] #> Tmsb4x Predicted to enable actin monomer binding activity and enzyme binding activity. Acts upstream of or within regulation of cell migration. Located in cytosol and nucleus. Is expressed in several structures, including alimentary system; central nervous system; eye; heart; and somite. Orthologous to several human genes including TMSB4X (thymosin beta 4 X-linked). [provided by Alliance of Genome Resources, Apr 2022] #> Des This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane and are essential for maintaining the strength and integrity of skeletal, cardiac and smooth muscle fibers. Mutations in this gene affect assembly of intermediate filaments. Mice lacking this gene are able to develop and reproduce but exhibit abnormal muscle fibers. Mutations in the human gene are associated with myofibrillar myopathy, dilated cardiomyopathy, neurogenic scapuloperoneal syndrome and autosomal recessive limb-girdle muscular dystrophy, type 2R. [provided by RefSeq, Jan 2014] plotSpatialFeature(sfe_tissue, genes_use, colGeometryName = \"spotPoly\", ncol = 3, image = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"Basic Visium exploratory data analysis","text":"Tobler’s first law geography (Tobler 1970) states Everything related everything else. near things related distant things. observation motivates examination spatial autocorrelation. 
Positive spatial autocorrelation evident nearby things tend similar, weather Pasadena downtown Los Angeles (opposed weather Pasadena San Francisco). Negative spatial autocorrelation evident nearby things tend dissimilar, like squares chessboard. Spatial autocorrelation can arise intrinsic process diffusion communication physical contact, result covariate intrinsic process, areal data, areal units observation smaller scale spatial process. commonly used measure spatial autocorrelation Moran’s (Moran 1950), defined \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\] \\(n\\) number spots locations, \\(i\\) \\(j\\) different locations, spots Visium context, \\(x\\) variable values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood whether normalize number neighbors. spdep package uses neighborhood. Moran’s similar Pearson correlation value location average value neighbors (identical, see (Lee 2001)). Just like Pearson correlation, Moran’s generally bound -1 1, positive value indicates positive spatial autocorrelation negative value indicates negative spatial autocorrelation. Spatial dependence analysis spdep requires spatial neighborhood graph. graph adjacent Visium spot can found mentioned spatial autocorrelation apparent total UMI counts. ’s Moran’s shows: K means kurtosis. 
positive values Moran’s indicate positive spatial autocorrelation.","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) calculateMoransI(t(colData(sfe_tissue)[,c(\"nCounts\", \"nGenes\")]), listw = colGraph(sfe_tissue, \"visium\")) #> DataFrame with 2 rows and 2 columns #> moran K #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"spatially-variable-genes","dir":"Articles","previous_headings":"Moran’s I","what":"Spatially variable genes","title":"Basic Visium exploratory data analysis","text":"spatially variable gene gene whose expression depends spatial locations, rather spatially random, like salt grains spread soup. Spatially variable genes can identified spatial autocorrelation signatures, sometimes Moran’s used compare assess spatially variable genes identified different methods. BPPARAM used parallelize computation Moran’s 2000 highly variable genes, 2 cores used SNOW backend. results stored rowData NA’s genes highly variable Moran’s computed genes. rank genes Moran’s plot space follows: see genes strong positive spatial autocorrelation, don’t observe strong negative spatial autocorrelation. Let’s get additional information genes strongest positive spatial autocorrelation space using gget info : Let’s plot genes: genes indeed look spatially variable. However, spatial variability can simply due histological regions space, words, spatial distribution different cell types. many methods identify spatially variable genes, often involving Gaussian process modeling, far complex Moran’s , SpatialDE (Svensson, Teichmann, Stegle 2018). However, methods usually don’t account histological regions, except C-SIDE (Cable et al. 2022), identifies spatially variable genes within cell types. leads question really meant “cell type”. 
remains see spatial methods made specifically identifying spatially variable genes compare methods don’t explicitly use spatial information simply perform differential analysis cell types often spatially defined histological regions. Another consideration using Moran’s extent strength spatial autocorrelation varies space. gene exhibits strong spatial autocorrelation one region, another? different histological regions analyzed separately cases? ways see whether Moran’s statistically significant, many methods explore spatial autocorrelation. discussed advanced ESDA Visium vignette.","code":"sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, colGraphName = \"visium\", BPPARAM = SnowParam(2)) #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' rowData(sfe_tissue) #> DataFrame with 15043 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000025902 ENSMUSG00000025902 Sox17 Gene Expression 0.03969957 #> ENSMUSG00000096126 ENSMUSG00000096126 Gm22307 Gene Expression 0.00107296 #> ENSMUSG00000033845 ENSMUSG00000033845 Mrpl15 Gene Expression 0.38197425 #> ENSMUSG00000025903 ENSMUSG00000025903 Lypla1 Gene Expression 0.28755365 #> ENSMUSG00000033813 ENSMUSG00000033813 Tcea1 Gene Expression 0.26502146 #> ... ... ... ... ... #> ENSMUSG00000064360 ENSMUSG00000064360 mt-Nd3 Gene Expression 56.445279 #> ENSMUSG00000064363 ENSMUSG00000064363 mt-Nd4 Gene Expression 123.991416 #> ENSMUSG00000064367 ENSMUSG00000064367 mt-Nd5 Gene Expression 14.645923 #> ENSMUSG00000064368 ENSMUSG00000064368 mt-Nd6 Gene Expression 0.109442 #> ENSMUSG00000064370 ENSMUSG00000064370 mt-Cytb Gene Expression 121.273605 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000025902 0.04460915 28.30429 NA NA #> ENSMUSG00000096126 0.00107296 932.00000 NA NA #> ENSMUSG00000033845 0.47048031 3.22458 NA NA #> ENSMUSG00000025903 0.34686963 4.19497 NA NA #> ENSMUSG00000033813 0.32388797 4.61140 0.0489758 19.2181 #> ... ... ... ... ... 
#> ENSMUSG00000064360 2.47976e+03 0.778314 0.410657 11.31069 #> ENSMUSG00000064363 1.45282e+04 0.944991 0.546964 13.62886 #> ENSMUSG00000064367 2.34858e+02 1.094895 0.480634 3.75345 #> ENSMUSG00000064368 1.31941e-01 11.015664 NA NA #> ENSMUSG00000064370 1.48225e+04 1.007833 0.621060 10.71784 df <- rowData(sfe_tissue)[hvgs,] ord <- order(df$moran_Vis5A, decreasing = TRUE) df[ord, c(\"symbol\", \"moran_Vis5A\")] #> DataFrame with 2000 rows and 2 columns #> symbol moran_Vis5A #> #> ENSMUSG00000064351 mt-Co1 0.764044 #> ENSMUSG00000050335 Lgals3 0.741474 #> ENSMUSG00000029304 Spp1 0.734937 #> ENSMUSG00000021939 Ctsb 0.708362 #> ENSMUSG00000004207 Psap 0.706552 #> ... ... ... #> ENSMUSG00000039911 Spsb1 -0.0333357 #> ENSMUSG00000015711 Prune -0.0354638 #> ENSMUSG00000042675 Ypel3 -0.0369055 #> ENSMUSG00000090262 Mpv17 -0.0412250 #> ENSMUSG00000020964 Sel1l -0.0443975 gget_info2 <- gget$info(rownames(df)[1:6]) rownames(gget_info2) <- gget_info2$primary_gene_name select(gget_info2, ncbi_description) #> ncbi_description #> Spp1 Enables extracellular matrix binding activity. Acts upstream of or within several processes, including cellular ion homeostasis; cellular response to leukemia inhibitory factor; and neutrophil chemotaxis. Located in apical part of cell and cytoplasm. Is expressed in several structures, including alimentary system; brain; metanephros; reproductive system; and skeleton. Human ortholog(s) of this gene implicated in several diseases, including autoimmune disease (multiple); biliary atresia; coronary artery disease (multiple); disease of cellular proliferation (multiple); and hepatitis. Orthologous to human SPP1 (secreted phosphoprotein 1). [provided by Alliance of Genome Resources, Apr 2022] #> Ftl1 Predicted to enable ferric iron binding activity; ferrous iron binding activity; and identical protein binding activity. Predicted to be involved in intracellular sequestering of iron ion. Predicted to be located in autolysosome. 
Predicted to be part of intracellular ferritin complex. Predicted to be active in cytoplasm. Is expressed in several structures, including central nervous system; ciliary body; liver; and retina nuclear layer. Human ortholog(s) of this gene implicated in basal ganglia disease; hyperferritinemia-cataract syndrome; neurodegeneration with brain iron accumulation 3; and neurodegenerative disease. Orthologous to human FTL (ferritin light chain). [provided by Alliance of Genome Resources, Apr 2022] #> Lgals3 Predicted to enable several functions, including IgE binding activity; advanced glycation end-product receptor activity; and signaling receptor binding activity. Involved in negative regulation of T cell receptor signaling pathway; negative regulation of endocytosis; and negative regulation of lymphocyte activation. Acts upstream of or within extracellular matrix organization and skeletal system development. Located in several cellular components, including external side of plasma membrane; glial cell projection; and immunological synapse. Is expressed in several structures, including alimentary system; genitourinary system; respiratory system; skeleton; and skin. Used to study fatty liver disease. Human ortholog(s) of this gene implicated in asthma. Orthologous to human LGALS3 (galectin 3). [provided by Alliance of Genome Resources, Apr 2022] #> Ctsb This gene encodes a member of the peptidase C1 family and preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to generate the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. Homozygous knockout mice for this gene exhibit reduced pancreatic damage following induced pancreatitis and reduced hepatocyte apoptosis in a model of liver injury. 
Pseudogenes of this gene have been identified in the genome. [provided by RefSeq, Aug 2015] #> Lgmn This gene encodes a member of the cysteine peptidase family C13 that plays an important role in the endosome/lysosomal degradation system. The encoded inactive preproprotein undergoes autocatalytic removal of the C-terminal inhibitory propeptide to generate the active endopeptidase that cleaves protein substrates on the C-terminal side of asparagine residues. Mice lacking the encoded protein exhibit defects in the lysosomal processing of proteins resulting in their accumulation in the lysosomes, and develop symptoms resembling hemophagocytic lymphohistiocytosis. [provided by RefSeq, Aug 2016] #> Mb Predicted to enable oxygen binding activity. Acts upstream of or within several processes, including brown fat cell differentiation; enucleate erythrocyte differentiation; and response to hypoxia. Is expressed in brown fat; heart; skeletal muscle; and somite. Human ortholog(s) of this gene implicated in acute kidney failure. Orthologous to human MB (myoglobin). 
[provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, rownames(df)[1:6], colGeometryName = \"spotPoly\", image = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Basic Visium exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocNeighbors_1.20.2 concordexR_1.2.0 #> [3] reticulate_1.36.1 dplyr_1.1.4 #> [5] sparseMatrixStats_1.14.0 stringr_1.5.1 #> [7] BiocParallel_1.36.0 SFEData_1.4.0 #> [9] bluster_1.12.0 patchwork_1.2.0 #> [11] scran_1.30.2 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [17] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [19] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [21] IRanges_2.36.0 S4Vectors_0.40.2 #> [23] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [25] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [27] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 sf_1.0-16 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 
httpuv_1.6.15 #> [17] sp_2.1-4 cowplot_1.1.3 #> [19] RColorBrewer_1.1-3 DBI_1.2.2 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 pheatmap_1.0.12 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 e1071_1.7-14 #> [45] systemfonts_1.0.6 dbscan_1.1-12 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 snow_0.4-4 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] xfun_0.43 HDF5Array_1.30.1 #> [57] withr_3.0.0 BiocManager_1.30.22 #> [59] fastmap_1.1.1 boot_1.3-30 #> [61] rhdf5filters_1.14.1 fansi_1.0.6 #> [63] spData_2.3.0 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] RSQLite_2.3.6 utf8_1.2.4 #> [73] generics_0.1.3 FNN_1.1.4 #> [75] class_7.3-22 httr_1.4.7 #> [77] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [79] spdep_1.3-3 uwot_0.2.2 #> [81] pkgconfig_2.0.3 scico_1.5.0 #> [83] gtable_0.3.5 blob_1.2.4 #> [85] XVector_0.42.0 htmltools_0.5.8.1 #> [87] scales_1.3.0 png_0.1-8 #> [89] knitr_1.45 rjson_0.2.21 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 BiocSingular_1.18.0 #> [107] dbplyr_2.5.0 beachmat_2.18.1 #> [109] xtable_1.8-4 cluster_2.1.6 #> [111] beeswarm_0.4.0 evaluate_0.23 #> [113] magick_2.8.3 cli_3.6.2 #> [115] locfit_1.5-9.9 compiler_4.3.3 #> [117] rlang_1.1.3 crayon_1.5.2 #> [119] labeling_0.4.3 classInt_0.4-10 #> [121] fs_1.6.4 ggbeeswarm_0.7.2 #> [123] stringi_1.8.3 viridisLite_0.4.2 #> [125] 
deldir_2.0-4 munsell_0.5.1 #> [127] Biostrings_2.70.3 Matrix_1.6-5 #> [129] ExperimentHub_2.10.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] highr_0.10 interactiveDisplayBase_1.40.0 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Spatial Visium exploratory data analysis","text":"vignette provides introduction exploratory spatial data analysis methods via Voyager package context Visium dataset.","code":"library(Voyager) library(SpatialFeatureExperiment) library(scater) library(scran) library(SFEData) library(sf) library(ggplot2) library(scales) library(patchwork) library(BiocParallel) library(bluster) library(dplyr) library(reticulate) theme_set(theme_bw(10)) # Specify Python version to use gget PY_PATH <- system(\"which python\", intern = TRUE) use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function # Load gget gget <- import(\"gget\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"Spatial Visium exploratory data analysis","text":"dataset used vignette paper Large-scale integration single-cell transcriptomic data captures transitional progenitor states mouse skeletal 
muscle regeneration (McKellar et al. 2021). Notexin injected tibialis anterior muscle mice induce injury, healing muscle collected 2, 5, 7 days post injury Visium analysis. dataset vignette timepoint day 2. vignette starts SpatialFeatureExperiment (SFE) object. gene count matrix directly downloaded GEO. 4992 spots, whether tissue , included. H&E image used nuclei myofiber segmentation. subset nuclei randomly selected regions 3 timepoints manually annotated train StarDist model segment rest nuclei, myofibers manually segmented. tissue boundary found thresholding OpenCV, small polygons removed likely debris. Spot polygons constructed spot centroid coordinates diameter Space Ranger output. in_tissue column colData indicates spot polygons intersect tissue polygons, based st_intersects(). Tissue boundary, nuclei, myofiber, Visium spot polygons stored sf data frames SFE object. See vignette SpatialFeatureExperiment details structure SFE object. SFE object dataset provided SFEData package; begin downloading data loading R. H&E image section: image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"(sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"spots-in-tissue","dir":"Articles","previous_headings":"Exploratory data analysis","what":"Spots in tissue","title":"Spatial Visium exploratory data analysis","text":"example dataset Visium spots whether tissue , spots intersect tissue used analyses. Total UMI counts (nCounts), number genes detected per spot (nGenes), proportion mitochondrially encoded counts (prop_mito) precomputed colData(sfe). plotSpatialFeature function can used visualize various attributes space: expression gene, colData values, geometry attributes colGeometry annotGeometry. Visium spots plotted polygons reflecting actual size relative tissue, rather points, case packages plot Visium data. plotting geometries performed hood geom_sf. tissue boundary found thresholding H&E image removing small polygons likely debris. in_tissue column colData(sfe) indicates Visium spot polygon intersects tissue polygon; can found SpatialFeatureExperiment::annotPred(). demonstrate use scran (Lun, Bach, Marioni 2016) normalization , although note necessarily best approach normalizing spatial transcriptomics data. 
problem normalize spatial transcriptomics data non-trivial , nCounts plot space shows , spatial autocorrelation evident. Furthermore, Visium, reverse transcription occurs situ spots, PCR amplification occurs cDNA dissociated spots. Artifacts may subsequently introduced amplification step, associated spatial origin. Spatial artifacts may arise diffusion transcripts tissue permeabilization. However, given total counts seem correspond histological regions, total counts may biological component hence treated technical artifact normalized away scRNA-seq data normalization methods. words, issue normalization spatial transcriptomics data, Visium particular, complex currently unsolved. Myofiber nuclei segmentation polygons available dataset annotGeometries field. Myofibers manually segmented, nuclei segmented StarDist trained manually segmented subset.","code":"names(colData(sfe)) #> [1] \"barcode\" \"col\" \"row\" \"x\" \"y\" \"dia\" #> [7] \"tissue\" \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" \"in_tissue\" sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] #clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) annotGeometryNames(sfe_tissue) #> [1] \"tissueBoundary\" \"myofiber_full\" \"myofiber_simplified\" #> [4] \"nuclei\" \"nuclei_centroid\""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"from-myofibers-and-nuclei-to-visium-spots","dir":"Articles","previous_headings":"Exploratory data analysis > Spots in tissue","what":"From myofibers and nuclei to Visium spots","title":"Spatial Visium exploratory data analysis","text":"plotSpatialFeature() function can also used plot attributes geometries, .e. non-geometry columns sf data frames rowGeometries, colGeometries, annotGeometries fields SFE object. 
rowGeometries colGeometries, columns associated sf data frames rather rowData colData, allowed one can specify columns associate geometries (see st_agr documentation st_sf). attribute annotGeometry plotted along side gene expression colData colGeometry attribute, annotGeometry attribute plotted different color palette distinguish column associated values. myofiber polygons annotGeometries can plotted shown , colored cross section area observed tissue section. aes_use argument set color rather fill (default polygons) plot Visium spot outlines make myofiber polygons visible. fill argument set NA make Visium spots look hollow, size argument controls thickness outlines. annot_aes argument specifies column annotGeometry use specify values aesthetic, just like aes ggplot2 (aes_string precise, since tidyeval used ). annot_fixed argument (used ) can set fixed size, alpha, color, etc. annotGeometry. larger myofibers seem fewer total counts, possibly larger size myofibers dilutes transcripts. hints need normalization procedure. SpatialFeatureExperiment, can find number myofibers nuclei intersect Visium spot. predicate can anything implemented sf, example, number nuclei fully covered Visium spot can also found. default predicate st_intersects(). one--one mapping Visium spots myofibers. However, can relate attributes myofibers gene expression detected Visium spots. One way summarize attributes myofibers intersect (choose another better predicate implemented sf) spot, calculate mean, median, sum. can done annotSummary() function SpatialFeatureExperiment. default predicate st_intersects(), default summary function mean(). reveals relationship mean area myofibers intersecting Visium spot aspects spots, total counts gene expression. NAs designate spots intersecting myofibers, e.g. inflammatory region. Basic Visium vignette, encountered two mysterious branches two clusters nGenes vs. nCounts plot proportion mitochondrial counts vs. nCounts plot. 
Now see two clusters seem related myofiber size.","code":"plotSpatialFeature(sfe_tissue, features = \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", aes_use = \"color\", linewidth = 0.5, fill = NA, annot_aes = list(fill = \"area\")) colData(sfe_tissue)$n_myofibers <- annotNPred(sfe_tissue, colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\") plotSpatialFeature(sfe_tissue, features = \"n_myofibers\", colGeometryName = \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1) colData(sfe_tissue)$mean_myofiber_area <- annotSummary(sfe_tissue, \"spotPoly\", \"myofiber_simplified\", annotColNames = \"area\")[,1] # it always returns a data frame # The gray spots don't intersect any myofiber plotSpatialFeature(sfe_tissue, \"mean_myofiber_area\", \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1) plotColData(sfe_tissue, x = \"nCounts\", y = \"nGenes\", colour_by = \"mean_myofiber_area\") plotColData(sfe_tissue, x = \"nCounts\", y = \"prop_mito\", colour_by = \"mean_myofiber_area\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"myofiber-types","dir":"Articles","previous_headings":"Exploratory data analysis > Spots in tissue","what":"Myofiber types","title":"Spatial Visium exploratory data analysis","text":"Marker genes: Myh7 (Type , slow twitch, aerobic), Myh2 (Type IIa, fast twitch, somewhat aerobic), Myh4 (Type IIb, fast twitch, anaerobic), Myh1 (Type IIx, fast twitch, anaerobic), protocol (Wang, Yue, Kuang 2017) can use gget search gget info modules gget package get Ensembl IDs additional information (example NCBI description) marker genes: first examine Type myofibers. fast twitch muscle, don’t expect many slow twitch Type myofibers. Row names sfe_tissue Ensembl IDs order avoid ambiguity sometimes multiple Ensembl IDs gene symbol genes aliases. However, gene symbols shorter human readable Ensembl IDs, better suited display plots. 
plotSpatialFeature() function functions Voyager, even row names recorded Ensembl IDs, features argument can take gene symbols swap_rownames argument indicating column rowData(sfe) stores gene symbols. Gene symbols also shown plots instead Ensembl IDs. one gene symbol matches multiple Ensembl IDs dataset, warning given. exprs_values argument specifies assay use, default “logcounts”, .e. log normalized data. default may may suitable practice given total UMI counts may biological relevance spatial data. Therefore, plot raw counts log normalized counts: marker gene type IIa myofibers shown . straightforward modify plotting display markers type IIb type IIx myofibers: Type IIa myofibers also tend clustered together left side tissue. SFE inherits SCE, non-spatial EDA plots scater package can also used: Plotting proportion mitochondrial counts vs. mean myofiber area, see two clusters, one higher proportion mitochondrial counts smaller area, another lower proportion mitochondrial counts average slightly larger area. 
Type IIa myofibers tend smaller area larger proportion mitochondrial counts.","code":"markers <- c(I = \"Myh7\", IIa = \"Myh2\", IIb = \"Myh4\", IIx = \"Myh1\") gget_search <- gget$search(list(\"Myh7\", \"Myh2\", \"Myh4\", \"Myh1\"), species=\"mouse\") gget_search <- gget_search[gget_search$gene_name %in% list(\"Myh7\", \"Myh2\", \"Myh4\", \"Myh1\"), ] gget_search #> ensembl_id gene_name #> 4 ENSMUSG00000033196 Myh2 #> 5 ENSMUSG00000053093 Myh7 #> 6 ENSMUSG00000056328 Myh1 #> 7 ENSMUSG00000057003 Myh4 #> ensembl_description #> 4 myosin, heavy polypeptide 2, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339710] #> 5 myosin, heavy polypeptide 7, cardiac muscle, beta [Source:MGI Symbol;Acc:MGI:2155600] #> 6 myosin, heavy polypeptide 1, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339711] #> 7 myosin, heavy polypeptide 4, skeletal muscle [Source:MGI Symbol;Acc:MGI:1339713] #> ext_ref_description biotype #> 4 myosin, heavy polypeptide 2, skeletal muscle, adult protein_coding #> 5 myosin, heavy polypeptide 7, cardiac muscle, beta protein_coding #> 6 myosin, heavy polypeptide 1, skeletal muscle, adult protein_coding #> 7 myosin, heavy polypeptide 4, skeletal muscle protein_coding #> synonym #> 4 MHC2A, M.... #> 5 B-MHC, M.... #> 6 A530084A.... #> 7 MHC2B, M.... #> url #> 4 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000033196 #> 5 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000053093 #> 6 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000056328 #> 7 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000057003 gget_info <- gget$info(gget_search$ensembl_id) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> Myh2 Acts upstream of or within actin-mediated cell contraction; plasma membrane repair; and response to activity. Located in several cellular components, including A band; Golgi apparatus; and actomyosin contractile ring. 
Is expressed in several structures, including alimentary system; forelimb bud mesenchyme; and skeletal musculature. Human ortholog(s) of this gene implicated in inclusion body myositis and proximal myopathy and ophthalmoplegia. Orthologous to human MYH2 (myosin heavy chain 2). [provided by Alliance of Genome Resources, Apr 2022] #> Myh7 Predicted to enable several functions, including ATP binding activity; ATP hydrolysis activity; and identical protein binding activity. Acts upstream of or within cardiac muscle hypertrophy in response to stress and transition between fast and slow fiber. Located in Z disc and stress fiber. Part of myosin complex. Is expressed in several structures, including diaphragm; eye; heart; musculature; and somite. Human ortholog(s) of this gene implicated in cardiomyopathy (multiple); congenital heart disease (multiple); distal myopathy 1; and hyaline body myopathy (multiple). Orthologous to human MYH7 (myosin heavy chain 7). [provided by Alliance of Genome Resources, Apr 2022] #> Myh1 Predicted to enable several functions, including ATP binding activity; actin filament binding activity; and calmodulin binding activity. Located in A band and intercalated disc. Is expressed in several structures, including gonad; gut; hemolymphoid system gland; integumental system; and skeletal musculature. Orthologous to human MYH1 (myosin heavy chain 1). [provided by Alliance of Genome Resources, Apr 2022] #> Myh4 Predicted to enable double-stranded RNA binding activity. Acts upstream of or within response to activity. Predicted to be located in myofibril. Predicted to be part of myosin complex. Is expressed in several structures, including brown fat; diaphragm; heart; limb segment; and skeletal musculature. Orthologous to human MYH4 (myosin heavy chain 4). 
[provided by Alliance of Genome Resources, Apr 2022] # Function specific for this vignette, with some hard coded values plot_counts_logcounts <- function(sfe, feature) { p1 <- plotSpatialFeature(sfe, feature, \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\"), swap_rownames = \"symbol\", exprs_values = \"counts\", aes_use = \"color\", linewidth = 0.5, fill = NA) + ggtitle(\"Raw counts\") p2 <- plotSpatialFeature(sfe, feature, \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\"), swap_rownames = \"symbol\", exprs_values = \"logcounts\", aes_use = \"color\", linewidth = 0.5, fill = NA) + ggtitle(\"Log normalized counts\") p1 + p2 + plot_annotation(title = feature) } plot_counts_logcounts(sfe_tissue, markers[\"I\"]) plot_counts_logcounts(sfe_tissue, markers[\"IIa\"]) plotColData(sfe_tissue, x = \"mean_myofiber_area\", y = \"prop_mito\", colour_by = markers[\"IIa\"], by_exprs_values = \"logcounts\", swap_rownames = \"symbol\") #> Warning: Removed 36 rows containing missing values or values outside the scale range #> (`geom_point()`)."},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"spatial-neighborhood-graphs","dir":"Articles","previous_headings":"","what":"Spatial neighborhood graphs","title":"Spatial Visium exploratory data analysis","text":"spatial neighborhood graph required compute spatial dependency metrics Moran’s Geary’s C. SpatialFeatureExperiment package wraps methods spdep find spatial neighborhood graphs, stored within SFE object (see spdep documentation gabrielneigh(), knearneigh(), poly2nb(), tri2nb()). Voyager package uses graphs spatial dependency analyses, based spdep first version, methods geospatial packages, also use spatial neighborhood graphs, may added later. Visium, spots hexagonal grid, spatial neighborhood graph straightforward. However, spatial technologies single cell resolution, e.g. 
MERFISH, different methods can used find spatial neighborhood graph. example, method “poly2nb” used myofibers, identifies myofiber polygons physically touch . zero.policy = TRUE allow singletons, .e. nodes without neighbors graph; inflamed region, singletons. yet benchmarked spatial neighborhood construction methods determine “best” different technologies; particular method used demonstration purposes may best practice: plotColGraph() function plots graph space associated colGeometry, along geometry interest. Similarly, plotAnnotGraph() function plots graph associated annotGeometry, along geometry interest. plotRowGraph yet since haven’t worked dataset spatial graphs related genes relevant, although SFE object supports row graphs.","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) annotGraph(sfe_tissue, \"myofiber_poly2nb\") <- findSpatialNeighbors(sfe_tissue, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE) plotColGraph(sfe_tissue, colGraphName = \"visium\", colGeometryName = \"spotPoly\") + theme_void() plotAnnotGraph(sfe_tissue, annotGraphName = \"myofiber_poly2nb\", annotGeometryName = \"myofiber_simplified\") + theme_void()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"exploratory-spatial-data-analysis","dir":"Articles","previous_headings":"","what":"Exploratory spatial data analysis","title":"Spatial Visium exploratory data analysis","text":"spatial autocorrelation metrics package can computed directly vector matrix rather SFE object. user interface emulates dimension reductions scater package (e.g. calculateUMAP() takes matrix SCE object returns matrix, runUMAP() takes SCE object adds results reducedDims field SCE object). calculate* functions take matrix SFE object directly return results (format results depends structure results), run* functions take SFE object add results object. addition, colData* functions compute metrics numeric variables colData. 
colGeometry* functions compute metrics numeric columns colGeometry. annotGeometry* functions compute metrics numeric columns annotGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"univariate-global","dir":"Articles","previous_headings":"","what":"Univariate global","title":"Spatial Visium exploratory data analysis","text":"Voyager supports many univariate global spatial autocorrelation implemented spdep ESDA: Moran’s Geary’s C, permutation testing Moran’s Geary’s C, Moran plot, correlograms. addition, beyond spdep, Voyager can cluster Moran plots correlograms. Plotting functions taking SFE objects implemented plot results ggplot2 customization options spdep plotting functions. functions calculateUnivariate(), runUnivariate(), colDataUnivariate(), colGeometryUnivariate(), annotGeometryUnivariate() compute univariate spatial statistics. argument type, indicates corresponding function names spdep, determines spatial statistics computed. univariate global methods Voyager listed : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object (see SFEMethod() vignette SFEMethod) string matches entry name column data frame returned listSFEMethods(). demonstrate spatial autocorrelation gene expression, top highly variable genes (HVGs) used. HVGs found scran method. 
global statistic yields one result entire dataset.","code":"listSFEMethods(variate = \"uni\", scope = \"global\") #> name description #> 1 moran Moran's I #> 2 geary Geary's C #> 3 moran.mc Moran's I with permutation testing #> 4 geary.mc Geary's C with permutation testing #> 5 sp.mantel.mc Mantel-Hubert spatial general cross product statistic #> 6 moran.test Moran's I test #> 7 geary.test Geary's C test #> 8 globalG.test Global G test #> 9 sp.correlogram Correlogram #> 10 variogram Variogram with model #> 11 variogram_map Variogram map dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"morans-i","dir":"Articles","previous_headings":"Univariate global","what":"Moran’s I","title":"Spatial Visium exploratory data analysis","text":"several ways quantify spatial autocorrelation, common Moran’s : \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\] \\(n\\) number spots locations, \\(i\\) \\(j\\) different locations, spots Visium context, \\(x\\) variable values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood whether normalize number neighbors. spdep package uses neighborhood. Moran’s can understood Pearson correlation value location average value neighbors. Just like Pearson correlation, Moran’s generally bound -1 1, positive value indicates positive spatial autocorrelation negative value indicates negative spatial autocorrelation. Upon visual inspection, total UMI counts per spot seem spatial autocorrelation. spatial neighborhood graph required compute Moran’s , specified listw argument. matrices, rows features, gene count matrix. “moran” Moran’s , K sample kurtosis. 
add results SFE object, specifically colData: colData, results added colFeatureData(sfe), features Moran’s calculated NA. column names featureData distinguishes different samples (’s one sample dataset), parsed plotting functions. add results SFE object, specifically geometries: “area” area cross section myofiber seen tissue section “eccentricity” eccentricity ellipse fitted myofiber. non-geometry column colGeometry, colGeometryUnivariate() like annotGeometryUnivariate() , none colGeometries dataset extra columns. gene expression, logcounts assay used default (use exprs_values argument change assay), though may may best practice. metrics computed large number features, parallel computing supported, BiocParallel, BPPARAM argument.","code":"# Directly use vector or matrix, and multiple features can be specified at once calculateUnivariate(t(colData(sfe_tissue)[,c(\"nCounts\", \"nGenes\")]), type = \"moran\", listw = colGraph(sfe_tissue, \"visium\")) #> DataFrame with 2 rows and 2 columns #> moran K #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036 sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"moran\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036 # Remember zero.policy = TRUE since there're singletons sfe_tissue <- annotGeometryUnivariate(sfe_tissue, type = \"moran\", features = c(\"area\", \"eccentricity\"), annotGeometryName = \"myofiber_simplified\", annotGraphName = \"myofiber_poly2nb\", zero.policy = TRUE) head(attr(annotGeometry(sfe_tissue, \"myofiber_simplified\"), \"featureData\")) #> DataFrame with 6 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> lyr.1 NA NA #> area 0.327888 4.95675 #> perimeter NA NA #> eccentricity 0.110938 3.26913 #> theta NA NA #> sine_theta NA NA sfe_tissue <- runUnivariate(sfe_tissue, type = \"moran\", features = hvgs, 
colGraphName = \"visium\", BPPARAM = MulticoreParam(2)) rowData(sfe_tissue)[head(hvgs),] #> DataFrame with 6 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000029304 ENSMUSG00000029304 Spp1 Gene Expression 1.63722 #> ENSMUSG00000050708 ENSMUSG00000050708 Ftl1 Gene Expression 2.37981 #> ENSMUSG00000050335 ENSMUSG00000050335 Lgals3 Gene Expression 1.43189 #> ENSMUSG00000021939 ENSMUSG00000021939 Ctsb Gene Expression 2.73117 #> ENSMUSG00000021190 ENSMUSG00000021190 Lgmn Gene Expression 1.11278 #> ENSMUSG00000018893 ENSMUSG00000018893 Mb Gene Expression 2.11118 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000029304 60.1583 22.4430 0.734937 1.63516 #> ENSMUSG00000050708 162.1931 28.6384 0.665563 1.81841 #> ENSMUSG00000050335 48.0739 23.4471 0.741474 1.68098 #> ENSMUSG00000021939 131.6232 17.6455 0.708362 1.86896 #> ENSMUSG00000021190 21.4505 17.3228 0.659916 1.66838 #> ENSMUSG00000018893 74.1782 16.6428 0.675840 1.82510"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"gearys-c","dir":"Articles","previous_headings":"Univariate global","what":"Geary’s C","title":"Spatial Visium exploratory data analysis","text":"Another spatial autocorrelation metric Geary’s C, defined : \\[ C = \\frac{(n-1)}{2\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}(x_i - x_j)^2}{{\\sum_{i=1}^n (x_i - \\bar{x})^2}} \\] Geary’s C below 1 indicates positive spatial autocorrelation, above 1 indicates negative spatial autocorrelation. compute Geary’s C features interest replace type = \"moran\" previous section type = \"geary\", add results SFE object. example, colData ’s one column K since ’s Moran’s Geary’s C. Moran’s Geary’s C suggest positive spatial autocorrelation nCounts nGenes. univariate global methods, including permutation testing Moran’s Geary’s C, correlograms, Moran scatter plot can also called functions runUnivariate, specifying type argument. 
See documentation runUnivariate see available methods see documentation corresponding spdep functions see extra arguments required method.","code":"sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"geary\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 3 columns #> moran_Vis5A K_Vis5A geary_Vis5A #> #> nCounts 0.528705 3.00082 0.474892 #> nGenes 0.384028 3.88036 0.605797"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"permutation-testing","dir":"Articles","previous_headings":"Univariate global","what":"Permutation testing","title":"Spatial Visium exploratory data analysis","text":"establish whether spatial autocorrelation statistically significant, moran.test() function spdep can used. provides p-value, p-value may accurate data normally distributed. gene expression data generally normally distributed data normalization doesn’t always work well, use permutation testing test significance Moran’s Geary’s C, wrapping moran.mc() spdep. “mc” stands Monte Carlo. nsim argument specifies number simulations. following adds results SFE object: Note test performed multiple features, p-values corrected multiple hypothesis testing. results can plotted: default, colorblind friendly palette dittoSeq used categorical variables. density plot Moran’s simulations values permuted disconnected spatial locations, vertical line actual Moran’s value. simulation indicates actual Moran’s much higher simulations values dissociated spatial locations permuted among locations, indicating spatial autocorrelation significant. Use type = \"geary.mc\" permutation testing Geary’s C. spdep package can also compute p-values Moran’s analytically, theory behind mean variance null distribution Moran’s assumes normal distribution data, gene expression data generally non-normal. 
However, according (Griffith 2010), large sample size (“preferably least 100”), mean variance Moran’s several iid non-normal simulated datasets (including negative binomial, commonly used model gene expression data) don’t seem deviate much values expected normally distributed data. Spatial transcriptomics datasets typically thousands spots cells, sample size likely large enough. Hence using analytical test non-normal data might bad. However, large sample size, minuscule difference can create significant p-values. perform analytical test Moran’s : Now compare p-values permutation analytical test; cases , default alternative hypothesis positive spatial autocorrelation: p-values permutation limited number permutations (1000 ). Either way, permutation analytical tests indicate significant positive spatial autocorrelation. limitation permutation testing Moran’s assumes permutation values among locations equally likely, necessarily true. instance, epidemiology, disease rate regions small population likely assumes extreme values (Assunção Reis 1999), analogous rare cell types lowly expressed genes histological space given divide total UMI counts per spot. extent happens may depend tissue, gene interest, technology, data normalization method.","code":"set.seed(29) sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", nsim = 1000, type = \"moran.mc\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 9 columns #> moran_Vis5A K_Vis5A geary_Vis5A moran.mc_statistic_Vis5A #> #> nCounts 0.528705 3.00082 0.474892 0.528705 #> nGenes 0.384028 3.88036 0.605797 0.384028 #> moran.mc_parameter_Vis5A moran.mc_p.value_Vis5A #> #> nCounts 1001 0.000999001 #> nGenes 1001 0.000999001 #> moran.mc_alternative_Vis5A moran.mc_method_Vis5A #> #> nCounts greater Monte-Carlo simulati.. #> nGenes greater Monte-Carlo simulati.. #> moran.mc_res_Vis5A #> #> nCounts -0.02610680, 0.00305305,-0.01996753,... 
#> nGenes 0.02274607,-0.02127688, 0.00705138,... plotMoranMC(sfe_tissue, c(\"nCounts\", \"nGenes\")) sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"moran.test\") names(colFeatureData(sfe_tissue)) #> [1] \"moran_Vis5A\" \"K_Vis5A\" #> [3] \"geary_Vis5A\" \"moran.mc_statistic_Vis5A\" #> [5] \"moran.mc_parameter_Vis5A\" \"moran.mc_p.value_Vis5A\" #> [7] \"moran.mc_alternative_Vis5A\" \"moran.mc_method_Vis5A\" #> [9] \"moran.mc_res_Vis5A\" \"moran.test_statistic_Vis5A\" #> [11] \"moran.test_p.value_Vis5A\" \"moran.test_alternative_Vis5A\" #> [13] \"moran.test_method_Vis5A\" \"moran.test_Moran.I.statistic_Vis5A\" #> [15] \"moran.test_Expectation_Vis5A\" \"moran.test_Variance_Vis5A\" # permutation colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"), c(\"moran.mc_p.value_Vis5A\", \"moran.test_p.value_Vis5A\")] #> DataFrame with 2 rows and 2 columns #> moran.mc_p.value_Vis5A moran.test_p.value_Vis5A #> #> nCounts 0.000999001 5.41958e-163 #> nGenes 0.000999001 2.82666e-87"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"correlogram","dir":"Articles","previous_headings":"Univariate global","what":"Correlogram","title":"Spatial Visium exploratory data analysis","text":"correlogram, spatial autocorrelation higher orders neighbors (e.g. second order neighbors neighbors neighbors) calculated see decays orders. Visium, regular hexagonal grid, order neighbors proxy distance. irregular patterns single cells, different methods find spatial neighbors may give different results. colData, Moran’s correlogram computed results can plotted plotCorrelogram: error bars twice standard deviation Moran’s value. standard deviation p-values (null hypothesis Moran’s 0) come moran.test() (Geary’s C correlogram, geary.test()); taken grain salt data normally distributed. p-values corrected multiple hypothesis testing across orders features. usual, . 
means p < 0.1, * means p < 0.05, ** means p < 0.01, *** means p < 0.001. , can done Geary’s C, colData, annotGeometry, etc.","code":"sfe_tissue <- runUnivariate(sfe_tissue, hvgs[1:2], colGraphName = \"visium\", order = 10, type = \"sp.correlogram\") plotCorrelogram(sfe_tissue, hvgs[1:2], swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"univariate-local","dir":"Articles","previous_headings":"","what":"Univariate local","title":"Spatial Visium exploratory data analysis","text":"Local statistics yield result location rather whole dataset, global statistics may obscure local heterogeneity. See (Fotheringham 2009) interesting discussion relationships global local spatial statistics. Local statistics stored localResults field SFE object, can accessed localResult() localResults() functions SpatialFeatureExperiment package. univariate local methods Voyager listed :","code":"listSFEMethods(variate = \"uni\", scope = \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"moran-scatter-plot","dir":"Articles","previous_headings":"Univariate local","what":"Moran scatter plot","title":"Spatial Visium exploratory data analysis","text":"Moran scatter plot (Anselin 1996), x axis value spot, y axis average value neighbors. slope fitted line Moran’s . Sometimes clusters appear plot, showing different kinds neighborhoods. 
gene expression, use one gene (log normalized value) demonstrate: dashed lines mark mean Myh2 spatially lagged Myh2. singletons . Visium spots lower Myh2 expression neighbors don’t express Myh2 spots don’t express Myh2 usually least neighbors . two main clusters spots whose neighbors express Myh2: high (average) expression whose neighbors also high expression, low expression whose neighbors also low expression. features may show different kinds clusters. can use k-means clustering identify clusters, though clustering method supported bluster package can used. can use gget search module get Ensembl ID Myh2: Plot clusters space can also done colData, annotGeometry, etc. Moran’s permutation testing.","code":"sfe_tissue <- runUnivariate(sfe_tissue, \"Myh2\", colGraphName = \"visium\", type = \"moran.plot\", swap_rownames = \"symbol\") moranPlot(sfe_tissue, \"Myh2\", graphName = \"visium\", swap_rownames = \"symbol\") set.seed(29) clusts <- clusterMoranPlot(sfe_tissue, \"Myh2\", BLUSPARAM = KmeansParam(2), swap_rownames = \"symbol\") gget$search(\"Myh2\", species=\"mouse\") #> ensembl_id gene_name #> 1 ENSMUSG00000033196 Myh2 #> ensembl_description #> 1 myosin, heavy polypeptide 2, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339710] #> ext_ref_description biotype #> 1 myosin, heavy polypeptide 2, skeletal muscle, adult protein_coding #> synonym #> 1 MHC2A, M.... 
#> url #> 1 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000033196 moranPlot(sfe_tissue, \"Myh2\", graphName = \"visium\", color_by = clusts$ENSMUSG00000033196, swap_rownames = \"symbol\") colData(sfe_tissue)$Myh2_moranPlot_clust <- clusts$ENSMUSG00000033196 plotSpatialFeature(sfe_tissue, \"Myh2_moranPlot_clust\", colGeometryName = \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"local-morans-i","dir":"Articles","previous_headings":"Univariate local","what":"Local Moran’s I","title":"Spatial Visium exploratory data analysis","text":"recap, global Moran’s defined \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\] Local Moran’s (Anselin 1995) defined \\[ I_i = (n-1)\\frac{(x_i - \\bar{x})\\sum_{j=1}^n w_{ij} (x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\] ’s similar global Moran’s , values locations \\(i\\) not summed ’s no normalization sum spatial weights. useful plot log normalized Myh2 gene expression context interpret local results: see regions higher Myh2 expression also stronger spatial autocorrelation. interesting see spatial autocorrelation relates gene expression level, much finding variance relates mean expression gene, usually indicates overdispersion compared Poisson scRNA-seq Visium data: gene, Visium spots higher expression also tend higher local Moran’s , may may apply genes. Local spatial analyses often return matrix data frame. plotLocalResult() function default column local spatial method, columns can plotted well. Use localResultAttrs() function see columns present, use attribute argument specify column plot. local spatial methods return p-values location, column name like Pr(z != E(Ii)), test two sided (default, can changed alternative argument runUnivariate() passed relevant underlying function spdep). 
Negative log p-value computed facilitate visualization, p-value corrected multiple hypothesis testing p.adjustSP() spdep, number tests number neighbors location rather total number locations (-log10p_adj). plot following plots p-values, divergent palette used show locations significant adjusting multiple testing significant different colors. center divergent palette p = 0.05, brown spots significant dark blue means really significant. “pysal” column displays quadrants relative means Moran plot. result similar k-means clustering shown .","code":"sfe_tissue <- runUnivariate(sfe_tissue, type = \"localmoran\", features = \"Myh2\", colGraphName = \"visium\", swap_rownames = \"symbol\") plotSpatialFeature(sfe_tissue, features = \"Myh2\", colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\", color = \"black\", linewidth = 0.1) plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\",divergent = TRUE, diverge_center = 0, image_id = \"lowres\", swap_rownames = \"symbol\", color = \"black\", linewidth = 0.1) df <- data.frame(myh2 = logcounts(sfe_tissue)[rowData(sfe_tissue)$symbol == \"Myh2\",], Ii = localResult(sfe_tissue, \"localmoran\", \"Myh2\", swap_rownames = \"symbol\")[,\"Ii\"]) ggplot(df, aes(myh2, Ii)) + geom_point(alpha = 0.3) + labs(x = \"Myh2 (log counts)\", y = \"localmoran\") localResultAttrs(sfe_tissue, \"localmoran\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\", attribute = \"-log10p_adj\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\", color = \"black\", linewidth = 0.1) plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\", attribute = \"pysal\", swap_rownames = \"symbol\", 
image_id = \"lowres\", color = \"black\", linewidth = 0.1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"getis-ord-gi","dir":"Articles","previous_headings":"Univariate local","what":"Getis-Ord Gi*","title":"Spatial Visium exploratory data analysis","text":"Getis-Ord Gi* used find hotspots coldspots feature space. hotspot cluster high values space, coldspot cluster low values space. Getis-Ord Gi* essentially z-score spatially lagged value feature location \\(i\\) (\\(\\sum_j w_{ij}x_j\\)), \\(w_{ij}\\) spatial weight. original publication Getis-Ord Gi* 1992 (Getis Ord 1992), spatial weight distance-based binary weight indicating whether another location within certain distance location \\(i\\). Getis-Ord Gi excludes location \\(i\\) computation mean variance lagged value, Gi* includes location \\(i\\) . Usually Gi Gi* yield similar results. mean variance used z-score differ Gi Gi* described paper 1995 (J. K. Ord Getis 1995) derived (Getis Ord 1992). Binary weights recommended Getis-Ord Gi*. High values Gi* indicate hotspots, low values Gi* indicate coldspots. Plot pseudo-p-values simulation hotspots expected. warm color indicates adjusted \\(p < 0.05\\). Local results can also computed annotation geometries. hotspots coldspots expected. 
Warm color indicates adjusted \\(p < 0.05\\).","code":"colGraph(sfe_tissue, \"visium_B\") <- findVisiumGraph(sfe_tissue, style = \"B\") sfe_tissue <- runUnivariate(sfe_tissue, type = \"localG_perm\", features = \"Myh2\", colGraphName = \"visium_B\", include_self = TRUE, swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"localG_perm\", features = \"Myh2\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image_id = \"lowres\", swap_rownames = \"symbol\", color = \"black\", linewidth = 0.1) localResultAttrs(sfe_tissue, \"localG_perm\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"localG\" \"Gi\" \"E.Gi\" #> [4] \"Var.Gi\" \"StdDev.Gi\" \"Pr(z != E(Gi))\" #> [7] \"Pr(z != E(Gi)) Sim\" \"Pr(folded) Sim\" \"Skewness\" #> [10] \"Kurtosis\" \"-log10p Sim\" \"-log10p_adj Sim\" #> [13] \"cluster\" plotLocalResult(sfe_tissue, \"localG_perm\", features = \"Myh2\", attribute = \"-log10p_adj Sim\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\") annotGraph(sfe_tissue, \"myofiber_poly2nb_B\") <- findSpatialNeighbors(sfe_tissue, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE, style = \"B\") sfe_tissue <- annotGeometryUnivariate(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", annotGraphName = \"myofiber_poly2nb_B\", include_self = TRUE, zero.policy = TRUE) plotLocalResult(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", attribute = \"-log10p_adj Sim\", divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"local-spatial-heteroscedasticity-losh","dir":"Articles","previous_headings":"Univariate local","what":"Local spatial heteroscedasticity 
(LOSH)","title":"Spatial Visium exploratory data analysis","text":"LOSH (J. Keith Ord Getis 2012) defined \\[ H_i = \\frac{\\sum_j w_{ij}\\left| e_j \\right|^a}{h_1\\sum_j w_{ij}} \\] \\(h_1 = \\sum_i \\left| e_i \\right|^a/n\\), \\(e_j = x_j - \\bar{x}_j\\), \\[ \\bar{x}_j = \\frac{\\sum_k w_{jk}x_k}{\\sum_k w_{jk}}. \\] default, \\(a = 2\\) LOSH like local variance. See (J. Keith Ord Getis 2012) details interpretation. gene, isn’t clear whether LOSH relates gene expression levels. Voyager wrap LOSH.mc() perform permutation testing LOSH, time consuming. chi-squared approximation described 2012 LOSH paper account non-normality data approximate mean variance permutation distributions, p-values LOSH can quickly computed, LOSH.cs(). gene, local conditions mostly homogeneous, except spots injury site. Warm color indicates adjusted \\(p < 0.05\\).","code":"sfe_tissue <- runUnivariate(sfe_tissue, \"LOSH.cs\", \"Myh2\", colGraphName = \"visium\", swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"LOSH.cs\", features = \"Myh2\", colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\") localResultAttrs(sfe_tissue, \"LOSH.cs\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"Hi\" \"E.Hi\" \"Var.Hi\" \"Z.Hi\" \"x_bar_i\" #> [6] \"ei\" \"Pr()\" \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe_tissue, \"LOSH.cs\", features = \"Myh2\", attribute = \"-log10p_adj\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\") + theme_void()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"caveats","dir":"Articles","previous_headings":"","what":"Caveats","title":"Spatial Visium exploratory data analysis","text":"H&E image can alter perception colors geometries. 2D data supported present, although principle, sf GEOS support 3D data. Spatial neighborhoods make sense within tissue section. 
multiple tissue sections, biological replica, different conditions? mouse brain, different biological replica can registered Allen Common Coordinate Framework (CCF) spatially comparable. Indeed, interesting see biological variability healthy wild type gene expression fine scaled region brain. However, CCF tissues without stereotypical structure, adipose skeletal muscle. don’t good solution spatially compare different tissue sections yet. Perhaps global spatial statistics whole section histological regions within section can compared. problem remains select informative metrics compare. Perhaps spatially-informed dimension reduction method, taking gene count matrix, also adjacency matrices spatial neighborhood graphs (different sections different blocks matrix) projecting cells Visium spots different sections shared low dimensional space can facilitate comparison. batch effect must corrected, dimension reduction interpretable, scalable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Spatial Visium exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 dplyr_1.1.4 #> [3] bluster_1.12.0 BiocParallel_1.36.0 #> [5] patchwork_1.2.0 scales_1.3.0 #> [7] sf_1.0-16 SFEData_1.4.0 #> [9] scran_1.30.2 
scater_1.30.1 #> [11] ggplot2_3.5.1 scuttle_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SpatialFeatureExperiment_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] edgeR_4.0.16 MASS_7.3-60.0.1 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] cowplot_1.1.3 DBI_1.2.2 #> [21] RColorBrewer_1.1-3 abind_1.4-5 #> [23] zlibbioc_1.48.2 purrr_1.0.2 #> [25] RCurl_1.98-1.14 rappdirs_0.3.3 #> [27] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [29] irlba_2.3.5.1 terra_1.7-71 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] dbscan_1.1-12 tools_4.3.3 #> [49] ggnewscale_0.4.10 ragg_1.3.0 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] mgcv_1.9-1 xfun_0.43 #> [57] HDF5Array_1.30.1 withr_3.0.0 #> [59] BiocManager_1.30.22 fastmap_1.1.1 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 spData_2.3.0 #> [65] digest_0.6.35 rsvd_1.0.5 #> [67] R6_2.5.1 mime_0.12 #> [69] textshaping_0.3.7 colorspace_2.1-0 #> [71] wk_0.9.1 RSQLite_2.3.6 #> [73] utf8_1.2.4 generics_0.1.3 #> [75] class_7.3-22 httr_1.4.7 #> [77] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [79] spdep_1.3-3 pkgconfig_2.0.3 #> [81] scico_1.5.0 gtable_0.3.5 #> [83] blob_1.2.4 XVector_0.42.0 #> [85] htmltools_0.5.8.1 png_0.1-8 #> [87] 
SpatialExperiment_1.12.0 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 BiocSingular_1.18.0 #> [107] dbplyr_2.5.0 beachmat_2.18.1 #> [109] xtable_1.8-4 cluster_2.1.6 #> [111] beeswarm_0.4.0 evaluate_0.23 #> [113] isoband_0.2.7 magick_2.8.3 #> [115] cli_3.6.2 locfit_1.5-9.9 #> [117] compiler_4.3.3 rlang_1.1.3 #> [119] crayon_1.5.2 labeling_0.4.3 #> [121] classInt_0.4-10 fs_1.6.4 #> [123] ggbeeswarm_0.7.2 viridisLite_0.4.2 #> [125] deldir_2.0-4 munsell_0.5.1 #> [127] Biostrings_2.70.3 Matrix_1.6-5 #> [129] ExperimentHub_2.10.0 sparseMatrixStats_1.14.0 #> [131] bit64_4.0.5 Rhdf5lib_1.24.2 #> [133] KEGGREST_1.42.0 statmod_1.5.0 #> [135] shiny_1.8.1.1 highr_0.10 #> [137] interactiveDisplayBase_1.40.0 AnnotationHub_3.10.1 #> [139] igraph_2.0.3 memoise_2.0.1 #> [141] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Slide-seq V2 spatial transcriptomic tool measures genome-wide expression using DNA-barcoded beads patterned slide non-regular array. beads used current protocol diameter \\(10 \\mu m\\) thus larger single cell, number detected transcripts order magnitude higher compared previous iteration technology. vignette, use Voyager analyze dataset generated using Slide-Seq V2 technology. data described Dissecting treatment-naive ecosystem human melanoma brain metastasis (Biermann et al. 2022). raw counts cell metadata publicly available GEO. focus one human melanoma brain metastasis (MBM) samples provided SFEData package SpatialFeatureExperiment(SFE) object. 
SFE object contains raw counts, QC metrics number UMIs genes detected per barcode, centroid coordinates barcode sf POINT geometry. SFE object SFEData package includes information 27,566 features 29,536 beads/barcodes.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(scran) library(bluster) library(ggplot2) library(patchwork) library(spdep) library(BiocParallel) theme_set(theme_bw()) (sfe <- BiermannMelaMetasData(dataset = \"MBM05_rep1\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 27566 29536 #> metadata(0): #> assays(1): counts #> rownames(27566): A1BG A1BG-AS1 ... ZZZ3 snoZ196 #> rowData names(3): means vars cv2 #> colnames(29536): ACCACTCATTTCTC-1 GTTCANTCCACGTA-1 ... ACGCGCAATCGTAG-1 #> TTGTTCCGTTCATA-1 #> colData names(4): sample_id nCounts nGenes prop_mito #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : xcoord ycoord #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Slide-Seq V2 Exploratory Data Analysis","text":"begin performing exploratory data analysis barcodes tissue. pre-computed QC measures stored object. Total UMI counts (nCounts), number genes detected per spot (nGenes), proportion mitochondrially encoded counts (prop_mito). , plot total number UMI counts per barcode violin plot space. latter task, leverage function plotSpatialFeature() uses geom_sf() plot geometries applicable. first lines compute average number UMI counts per barcode average plotted red line violin plot. 
barcode represented sf POINT geometry plot , note many beads quite low UMI counts, small regions throughout tissue appear high counts. perhaps due high cellular density melanoma cells, can speculate without image tissue. Interestingly, barcodes zero counts. contrast many scRNA-seq dataset many cells zero counts. Given density points, may choose aggregate points hexagonal grid avoid overplotting. hexagon colored total number UMI counts space hexagon may represent one barcode. worthwhile note cell segmentation data included dataset. Even though Slide-Seq V2 profile gene expression single cell resolution, cell segmentation data can flexibly stored annotGeometries SFE object. geometries can plotted barcode-level data can used sf operations like finding number barcodes localized single cell. plot visualizes number UMI counts per barcode log scale. appears barcodes higher counts co-localized regions throughout tissue, however, regions rather small may suggest spatial autocorrelation. Next find number genes detected per barcode. , QC feature provided nGenes colData attribute barcodes. Similar number UMI counts per barcode, seem small regions higher number genes throughout tissue. may correspond regions cellular diversity high cellular density, might expected context melanoma. can compute degree number UMI counts per barcode depends spatial location measurement. relationship, spatial autocorrelation, can quantified using Moran’s index spatial autocorrelation, Moran’s . computation Moran’s requires first definition constitutes objects “near” . simply, represented spatial weights matrix. One possible representation adjacency matrix. matrix can computed polygonal data resulting matrix can binary, entries 1 polygons share border, 0 elsewhere (including diagonal). entries can weighted different ways, including length border shared two polygons. 
schema necessarily lend well spatial transcriptomic technologies, polygonal boundaries cell objects may correspond measurements count matrix, individual spots barcodes may correspond multiple neighborhoods cells. Certainly, interpretation spatial weights matrix change depending technology. case, can generate putative spatial graph using k-nearest neighbors algorithm. implemented findSpatialNeighbors() function argument method = \"knearneigh\" . store result colGraphs() slot SFE object. Now compute Moran's barcode QC metrics using colDataMoransI(). results substantiate visual check spatial autocorrelation. continue investigating QC metrics. proportion UMIs mapping mitochondrial genes useful metric assessing cell quality scRNA-seq data. examine QC metric plotting versus total number UMI counts barcode. keeping expectations, barcodes associated fewer counts appear associated higher proportions mitochondrial reads. exclude barcodes containing >10% mitochondrial reads subsequent analysis. second line removes barcodes zero counts, necessary dataset barcodes zero counts. 
keep just demonstrate method.","code":"names(colData(sfe)) #> [1] \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" avg <- as.data.frame(colData(sfe)) |> dplyr::summarise(across(-sample_id, mean)) violin <- plotColData(sfe, \"nCounts\") + geom_hline(aes(yintercept = nCounts), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"nCounts\", colGeometryName = \"centroids\", size = 0.2) + theme_void() violin + spatial as.data.frame(cbind(spatialCoords(sfe), colData(sfe))) |> ggplot(aes(xcoord, ycoord, z=nCounts)) + stat_summary_hex(fun = function(x) sum(x), bins=100) + scale_fill_distiller(palette = \"Blues\", direction = 1) + labs(fill='nCounts') + theme_bw() + coord_equal() + scale_x_continuous(expand = expansion()) + scale_y_continuous(expand = expansion()) + theme_void() colData(sfe)$log_nCounts <- log(colData(sfe)$nCounts) avg <- as.data.frame(colData(sfe)) |> dplyr::summarise(across(-sample_id, mean)) violin <- plotColData(sfe, \"log_nCounts\") + geom_hline(aes(yintercept = log_nCounts), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"log_nCounts\", colGeometryName = \"centroids\", size = 0.2) violin + spatial violin <- plotColData(sfe, \"nGenes\") + geom_hline(aes(yintercept = nGenes), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"nGenes\", colGeometryName = \"centroids\", size = 0.2) violin + spatial colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' features_use <- c(\"nCounts\", \"nGenes\") sfe <- colDataMoransI(sfe, features_use, colGraphName = \"knn5\") colFeatureData(sfe)[features_use,] #> DataFrame with 2 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.0965909 48.6328 
#> nGenes 0.0957030 11.2037 violin <- plotColData(sfe, \"prop_mito\") + geom_hline(aes(yintercept = prop_mito), avg, color=\"red\") + theme(legend.position = \"top\") mito <- plotColData(sfe, x = \"nCounts\", y = \"prop_mito\") violin + mito # Spatial neighborhood graph is reconstructed when subsetting columns # Use drop = TRUE to drop the graph without reconstruction, whose indices are # no longer valid sfe_filt <- sfe[, colData(sfe)$prop_mito < 0.1] #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' sfe_filt <- sfe_filt[rowSums(counts(sfe_filt)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"data-normalization","dir":"Articles","previous_headings":"","what":"Data Normalization","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Normalization spatial transcriptomics data non-trivial requires thoughtful consideration. Similarly scRNA-seq data analysis, goal normalization remove effects technical variation derive quantity reflects biological variation. However, several questions arise considering best practices spatial data normalization. example, spatial methods average detect fewer UMIs single-cell counterparts, may preclude use normalization techniques log transformation shown . ’s , always evident whether spatial autocorrelation genes (QC measures) artifact technology, thus, whether normalization methods preserve spatial autocorrelation architecture. questions provide avenues active research development, currently unresolved. 
end, log-normalize data cell identify variable genes subsequent analysis.","code":"sfe_filt <- logNormCounts(sfe_filt) dec <- modelGeneVar(sfe_filt) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension Reduction and Clustering","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Much like scRNA-seq analysis, perform principal component analysis (PCA) clustering. note method use spatial information. can plot variance explained PC. see first components explain variance data. principal components (PCs) can plotted space. notice PCs may show spatial structure correlates biological niches cells. Without cellular overlays, can speculate potential relevance barcodes seem separated PC, PC does seem separate distinct neighborhoods barcodes. Now can cluster barcodes using graph-based clustering algorithm plot space. plot colored cluster id. naive interpretation plot shows distinct niches barcodes separated abundant, intervening types. 
may indicative biological processes hand, namely melanoma metastasis, ‘hotspots’ melanoma proliferation separated unaffected normal tissue.","code":"set.seed(29) sfe_filt <- runPCA(sfe_filt, ncomponents = 30, subset_row = hvgs, scale = TRUE, BSPARAM = BiocSingular::IrlbaParam()) # scale as in Seurat ElbowPlot(sfe_filt, ndims = 30) + theme_bw() spatialReducedDim(sfe_filt, \"PCA\", ncomponents = 4, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, scattermore = TRUE, pointsize = 0.5) colData(sfe_filt)$cluster <- clusterRows(reducedDim(sfe_filt, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotSpatialFeature(sfe_filt, \"cluster\", colGeometryName = \"centroids\") + guides(colour = guide_legend(override.aes = list(size=3)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"morans-i","dir":"Articles","previous_headings":"Dimension Reduction and Clustering","what":"Moran’s I","title":"Slide-Seq V2 Exploratory Data Analysis","text":"One avenue future analysis includes identifying genes differentially expressed cluster, can interrogated findMarkers() non-spatial context calculateMoransI() spatial context. spatial case, consideration given whether differences seen across tissue represent biological difference artifacts field view. run global Moran’s log normalized gene expression. Now, might ask: genes display spatial autocorrelation? Spatial variability can also investigated using differential expression testing known anatomical regions complemented spatial location. One potential drawback approach variability induced melanoma, rather native tissue architecture, may preclude identification typical structures. analyses can done stage: gene expression patterns, , differentiate neighborhoods melanoma cells? 
genes differentially expressed cluster?","code":"sfe_filt <- runMoransI(sfe_filt, features = hvgs, BPPARAM = MulticoreParam(2)) top_moran <- rownames(sfe_filt)[order(rowData(sfe_filt)$moran_sample01, decreasing = TRUE)[1:4]] plotSpatialFeature(sfe_filt, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, pointsize = 0.5)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Slide-Seq V2 Exploratory Data Analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 BiocParallel_1.36.0 #> [3] spdep_1.3-3 sf_1.0-16 #> [5] spData_2.3.0 patchwork_1.2.0 #> [7] bluster_1.12.0 scran_1.30.2 #> [9] scater_1.30.1 ggplot2_3.5.1 #> [11] scuttle_1.12.0 SpatialExperiment_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] 
yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] cowplot_1.1.3 DBI_1.2.2 #> [19] RColorBrewer_1.1-3 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] dqrng_0.3.2 pkgdown_2.0.9 #> [33] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [35] DelayedArray_0.28.0 tidyselect_1.2.1 #> [37] farver_2.1.1 ScaledMatrix_1.10.0 #> [39] viridis_0.6.5 BiocFileCache_2.10.2 #> [41] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [43] e1071_1.7-14 systemfonts_1.0.6 #> [45] tools_4.3.3 ggnewscale_0.4.10 #> [47] ragg_1.3.0 Rcpp_1.0.12 #> [49] glue_1.7.0 gridExtra_2.3 #> [51] SparseArray_1.2.4 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 fansi_1.0.6 #> [61] digest_0.6.35 rsvd_1.0.5 #> [63] R6_2.5.1 mime_0.12 #> [65] textshaping_0.3.7 colorspace_2.1-0 #> [67] wk_0.9.1 scattermore_1.2 #> [69] RSQLite_2.3.6 hexbin_1.28.3 #> [71] utf8_1.2.4 generics_0.1.3 #> [73] class_7.3-22 httr_1.4.7 #> [75] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [77] pkgconfig_2.0.3 scico_1.5.0 #> [79] gtable_0.3.5 blob_1.2.4 #> [81] XVector_0.42.0 htmltools_0.5.8.1 #> [83] scales_1.3.0 png_0.1-8 #> [85] knitr_1.45 rjson_0.2.21 #> [87] curl_5.2.1 proxy_0.4-27 #> [89] cachem_1.0.8 rhdf5_2.46.1 #> [91] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [93] parallel_4.3.3 vipor_0.4.7 #> [95] AnnotationDbi_1.64.1 desc_1.4.3 #> [97] s2_1.1.6 pillar_1.9.0 #> [99] grid_4.3.3 vctrs_0.6.5 #> [101] promises_1.3.0 BiocSingular_1.18.0 #> [103] dbplyr_2.5.0 beachmat_2.18.1 #> [105] xtable_1.8-4 cluster_2.1.6 #> [107] beeswarm_0.4.0 evaluate_0.23 #> [109] magick_2.8.3 cli_3.6.2 #> [111] locfit_1.5-9.9 compiler_4.3.3 #> [113] rlang_1.1.3 crayon_1.5.2 #> [115] labeling_0.4.3 classInt_0.4-10 #> [117] fs_1.6.4 ggbeeswarm_0.7.2 #> [119] viridisLite_0.4.2 deldir_2.0-4 #> [121] 
munsell_0.5.1 Biostrings_2.70.3 #> [123] Matrix_1.6-5 ExperimentHub_2.10.0 #> [125] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [127] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [129] statmod_1.5.0 shiny_1.8.1.1 #> [131] interactiveDisplayBase_1.40.0 highr_0.10 #> [133] AnnotationHub_3.10.1 igraph_2.0.3 #> [135] memoise_2.0.1 bslib_0.7.0 #> [137] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"CosMX non-small cell lung cancer data","text":"Nanostring GeoMX DSP popular spatial transcriptomics technology formalin fixed paraffin embedded (FFPE) tissues, doesn’t single cell resolution. CosMX FISH based technology FFPE tissue (He2021-oy?) single cell resolution, vignette provides example analyze CosMX data voyager. Note FFPE common way preserve archive tissue, cases, samples available may FFPE. CosMX dataset non-small cell lung cancer used described (He2021-oy?). processed data available download Nanostring website. gene count matrix, cell metadata, cell segmentation polygon coordinates downloaded Nanostring website CSV files read R data frames. gene count matrix converted sparse matrix. cell metadata contains centroid coordinates cells. cell polygon data frames converted sf data frame df2sf() function SpatialFeatureExperiment (SFE). used construct SFE object. Cell segmentation available one z-plane. first biological replicate included SFEData package. biological replicate 980 features 100,290 cells. Take look cells space: single cell resolution, lot details can seen, although ’s artifact borders fields view (FOVs). 
Plot cell density","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) # devel version of plotExpression library(scran) library(bluster) library(ggplot2) library(patchwork) library(stringr) library(spdep) library(BiocParallel) library(BiocSingular) theme_set(theme_bw()) (sfe <- HeNSCLCData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 980 100290 #> metadata(0): #> assays(1): counts #> rownames(980): AATK ABL1 ... NegPrb22 NegPrb23 #> rowData names(3): means vars cv2 #> colnames(100290): 1_1 1_2 ... 30_4759 30_4760 #> colData names(17): Area AspectRatio ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : CenterX_global_px CenterY_global_px #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: plotGeometry(sfe, MARGIN = 2L, type = \"cellSeg\") plotCellBin2D(sfe, hex = TRUE)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"cells","dir":"Articles","previous_headings":"Quality control (QC)","what":"Cells","title":"CosMX non-small cell lung cancer data","text":"Single cell RNA-seq (scRNA-seq) technologies typically don’t quantify cell morphology, gene expression Visium doesn’t single cell resolution. single cell resolution smFISH based data, cell gene expression related QC metrics total number transcripts detected number genes detected, also cell morphology area (z-plane segmentation polygons provided) aspect ratio. Area relevant QC since can flag falsely undersegmented cells, .e. several cells falsely considered one cell segmentation program. 
However, since pre-defined gene panel used mitochondrially encoded genes quantified, scRNA-seq QC metric proportion mitochondrially encoded counts applicable. QC metrics precomputed stored colData Cell area, aspect ratio, marker stain intensities, .e. columns “sample_id” come Nanostring’s website. sf package can compute areas cell polygons. R, EBImage package can compute morphological metrics aspect ratio, eccentricity, orientation, etc., requires data converted raster. OpenCV can compute morphological metrics polygons without converting raster, needs called Python C++. Since math behind many basic morphological metrics pretty simple, may add Voyager future version. Since plotting 100,000 polygons slow plot isn’t large enough us see polygons anyway, use scattermore rasterize plot speed plotting. Instead plotting every single point, now ggplot merely displays rasterized image. Number transcript spots detected per cell make nCounts nGenes comparable across datasets, divide number genes probed. dataset, 960 genes, 20 negative controls. However, different genes may probed different datasets, can different tissues, make nCounts nGenes completely comparable across datasets. However, may still somewhat comparable, since genes highly expressed major cell types tissue tend selected gene panel. means cells mostly less 1 transcript count per gene average, surprising since cells express genes. cells detected express less 30% genes probed. Number genes (980) detected per cell Based spatial plot, seems nCounts nGenes biologically relevant, cells transcripts detected. nCounts relates nGenes ’s nature cells without transcripts? cells without transcripts central cavity. “empty” cells tend smaller cells also really large ones. Cell area distribution Larger cells likely found certain areas tissue. biological, -segmentation likely cell type tissue region. area relate total counts? may vaguely seem cells total counts tend larger (least z-plane), cells large low total counts. 
Negative control probes used dataset QC. calculate proportion transcripts attributed negative controls. NA’s empty cells, proportion low except outliers. prop_neg relate nCounts? looks kind like proportion mitochondrial counts vs. nCounts plot scRNA-seq, cells fewer total counts tend higher proportion mitochondrial counts. distribution obviously bimodal, since x-axis log transformed better visualize distribution, 0’s removed. ’s kind arbitrary; now ’ll remove cells 10% transcripts negative controls. removing low quality cells, 100,095 cells left.","code":"names(colData(sfe)) #> [1] \"Area\" \"AspectRatio\" \"Width\" #> [4] \"Height\" \"Mean.MembraneStain\" \"Max.MembraneStain\" #> [7] \"Mean.PanCK\" \"Max.PanCK\" \"Mean.CD45\" #> [10] \"Max.CD45\" \"Mean.CD3\" \"Max.CD3\" #> [13] \"Mean.DAPI\" \"Max.DAPI\" \"sample_id\" #> [16] \"nCounts\" \"nGenes\" # Function to plot violin plot for distribution and spatial at once plot_violin_spatial <- function(sfe, feature) { violin <- plotColData(sfe, feature, point_fun = function(...) list()) spatial <- plotSpatialFeature(sfe, feature, colGeometryName = \"centroids\", scattermore = TRUE) violin + spatial + plot_layout(widths = c(1, 2)) } plot_violin_spatial(sfe, \"nCounts\") summary(sfe$nCounts) #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 0.0 135.0 248.0 302.8 409.0 2475.0 n_panel <- 960 colData(sfe)$nCounts_normed <- sfe$nCounts/n_panel colData(sfe)$nGenes_normed <- sfe$nGenes/n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plot_violin_spatial(sfe, \"nGenes\") summary(sfe$nGenes) #> Min. 1st Qu. Median Mean 3rd Qu. Max. 
#> 0.0 75.0 119.0 127.1 171.0 500.0 plotColData(sfe, x = \"nCounts\", y = \"nGenes\", bins = 100) colData(sfe)$is_empty <- colData(sfe)$nCounts < 1 plotSpatialFeature(sfe, \"is_empty\", \"cellSeg\") plotColData(sfe, x = \"Area\", y = \"is_empty\") plot_violin_spatial(sfe, \"Area\") plotColData(sfe, x = \"nCounts\", y = \"Area\", bins = 100) + theme_bw() neg_inds <- str_detect(rownames(sfe), \"^NegPrb\") # Number of negative control probes sum(neg_inds) #> [1] 20 colData(sfe)$prop_neg <- colSums(counts(sfe)[neg_inds,])/colData(sfe)$nCounts plot_violin_spatial(sfe, \"prop_neg\") #> Warning: Removed 142 rows containing non-finite outside the scale range #> (`stat_ydensity()`). plotColData(sfe, x = \"nCounts\",y = \"prop_neg\", bins = 100) #> Warning: Removed 142 rows containing non-finite outside the scale range #> (`stat_bin2d()`). # The zeros are removed plotColDataHistogram(sfe, \"prop_neg\") + scale_x_log10() #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 59213 rows containing non-finite outside the scale range #> (`stat_bin()`). # Remove low quality cells (sfe <- sfe[,!sfe$is_empty & sfe$prop_neg < 0.1]) #> class: SpatialFeatureExperiment #> dim: 980 100095 #> metadata(0): #> assays(1): counts #> rownames(980): AATK ABL1 ... NegPrb22 NegPrb23 #> rowData names(3): means vars cv2 #> colnames(100095): 1_1 1_2 ... 30_4759 30_4760 #> colData names(21): Area AspectRatio ... 
is_empty prop_neg #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : CenterX_global_px CenterY_global_px #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"markers","dir":"Articles","previous_headings":"Quality control (QC) > Cells","what":"Markers","title":"CosMX non-small cell lung cancer data","text":"Nanostring provides cell stain marker intensities cell metadata. plot aspect ratio mean intensity cells stains markers, plotted . PanCK marker epithelial cells. CD45 leukocyte marker. CD3 T cell marker. Since takes quite plot 100,000 cells 6 times, scattermore really helps.","code":"names(colData(sfe)) #> [1] \"Area\" \"AspectRatio\" \"Width\" #> [4] \"Height\" \"Mean.MembraneStain\" \"Max.MembraneStain\" #> [7] \"Mean.PanCK\" \"Max.PanCK\" \"Mean.CD45\" #> [10] \"Max.CD45\" \"Mean.CD3\" \"Max.CD3\" #> [13] \"Mean.DAPI\" \"Max.DAPI\" \"sample_id\" #> [16] \"nCounts\" \"nGenes\" \"nCounts_normed\" #> [19] \"nGenes_normed\" \"is_empty\" \"prop_neg\" plotSpatialFeature(sfe, c(\"AspectRatio\", \"Mean.DAPI\", \"Mean.MembraneStain\", \"Mean.PanCK\", \"Mean.CD45\", \"Mean.CD3\"), colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"genes","dir":"Articles","previous_headings":"Quality control (QC)","what":"Genes","title":"CosMX non-small cell lung cancer data","text":"red line \\(y = x\\) expected Poisson data. Gene expression dataset variance expected Poisson, even gene lower expression. Zoom negative controls Among “high quality” cells, negative controls still higher variance relative mean compared Poisson. Negative controls vs. 
real genes negative controls lower mean “expression” vast majority real genes.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_neg <- neg_inds plotRowData(sfe, x = \"means\", y = \"vars\", bins = 50) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() as.data.frame(rowData(sfe)[neg_inds,]) |> ggplot(aes(means, vars)) + geom_point() + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() plotRowData(sfe, x = \"means\", y = \"is_neg\") + scale_y_log10() + annotation_logticks(sides = \"b\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"spatial-autocorrelation-in-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation in QC metrics","title":"CosMX non-small cell lung cancer data","text":"spatial neighborhood graph required spatial dependence analyses spdep. Without benchmark, don’t yet know type neighborhood graph best purpose. Methods find spatial neighborhood graphs spdep knearneigh() (k nearest neighbors), dnearneigh() (find cells within certain distance), poly2nb() (polygon contiguity) recommended larger datasets. cell-cell contact may biologically relevant, cell segmentation imperfect, leading non-contiguous cell segmentation polygons cells appear contiguous H&E, using poly2nb() find polygon contiguity neighbors without supplementing another kind neighborhood problematic. Delaunay triangulation deldir package, used spdep (tri2nb()), takes 4 5 minutes dataset size, run time increases much drastically linearly number cells increases. Sphere Interest (SOI) graph (soi.graph()) prunes edges triangulation long, take long . triangulation SOI graph, slower knearneigh(), dnearneigh(), poly2nb(), somewhat practical considerations. 
implementation gabrielneigh() relativeneigh() take impracticably long (hour terminated R session impatience) dataset recommended. Methods find approximate nearest neighbors Annoy (AnnoyParam()) HNSW (HnswParam()), supported bluster BiocNeighbors packages might speed finding graphs, haven’t formally benchmarked . See Chapter 14 Spatial Data Science proximity areal data detailed discussion different neighborhood graphs spdep. methods areal data first wrapped Voyager much spatial transcriptomics data analogous areal geospatial data, data several cells aggregated areas, happens Visium spots. Just like geospatial areal data, Visium aggregation areas arbitrary represent underlying spatial process. Although sometimes geographical areal units arbitrary, tissues generally hexagonal grids means Visium spot polygons arbitrary context. Regions interest (ROI) selection spatial transcriptomics methods, laser capture microdissection (LCM) GeoMX DSP obviously analogous geospatial areal data. aggregation also happens analyze smFISH-based data cell level, basic unit observation individual transcript spots. spdep caters areal data, gstat caters geostatistical data, continuous spatial process sampled point locations. ways, spatial transcriptomics data analogous geostatistical data. Visium samples supposed spatial biological process regular hexagonal grid, pretend Visium spots points. smFISH-based single cell resolution data, cells observed can thought sample underlying spatial biological process supervening specific locations cells. sense, cells samples, since smFISH based technologies attempt visualize cells tissue section. However, biological function tissue depend particular spatial arrangement individual cells (.e. supervenes particular spatial arrangement), cell types, specific cell locations observed can thought samples process, consider cell basic unit spatial process. 
Voyager 1.2.0 (Bioconductor 3.17), added semivariograms (gstat package) exploratory tool identify presence spatial autocorrelation, length scale, anisotropy (.e. different different directions). Covariates can specified computing variogram account spatial trends adjust another spatial variable. However, unlike Moran’s , semivariogram can’t identify negative spatial autocorrelation, although since spatial neighborhood graph typically encode spatial directions, spdep autocorrelation metrics can’t identify anisotropy. Another problem semivariogram assumes data intrinsically stationary, .e. semivariogram holds entire dataset, similarity two cells depends distance , may case spatial autocorrelation varies space evident genes local spatial analyses. Single cell smFISH based data also dissimilar areal geostatistical data important ways. geospatial areal data, data numerous basic units spatial process (e.g. people epidemiology) aggregated areas (e.g. cities), whereas histological space, cell arguably sensible basic unit biological spatial process individual mRNA molecules. Unlike geostatistical data, cells seen tissue section often polygons tessellating tissue section rather points. Furthermore, ideally samples underlying spatial process affect spatial process geostatistical data, cells play active roles biological spatial process. However, data analysis methods areal geostatistical data can still relevant EDA descriptive models (causal mechanistic) single cell smFISH data. Different types spatial neighborhood graphs cells may relevant different processes. instance, contiguity cell segmentation polygons relevant contact involved cell signaling, although cell segmentation imperfect. Positive spatial autocorrelation can arise contact activation, negative autocorrelation can arise contact inhibition. However, cells may also influenced longer range factors secreted ligands, morphogens, simpler spatial trends like distance artery vein. 
case, perhaps semivariogram using Euclidean distance cells spatial weights spatial autocorrelation metrics relevant EDA. interesting compare results different spatial neighborhood graphs spatial weights, spdep gstat. Perhaps one best method, different methods reveal different phenomena. problem choosing spatial neighborhood matrix long history far predating spatial transcriptomics. See (Getis 2009) brief discussion decades work around issue. Spatial autocorrelation metrics seek measure nearby things tend similar dissimilar, neighborhood graph edge weights define mean “nearby” areal data. Note Visium spot can contain several dozens cells, spatial neighborhood graphs Visium spots describe neighborhood relationships much longer length scales spatial neighborhood graphs single cells, spatial autocorrelation metrics using Visium graph different meanings cellular neighborhood graphs. now, just demonstrate software usage, use k nearest neighborhood graph distance based edge weights, commonly done graph based clustering scRNA-seq, although don’t yet know best value k scenario. purpose vignette, say use \\(k = 5\\), execution time isn’t outrageous. argument style = \"W\" row normalize adjacency matrix spatial neighborhood graph necessary Moran scatter plot. Inverse distance edge weights can take small values matter relative rather absolute values distance arbitrary unit; row normalizing adjacency matrix makes weighted average value neighbors comparable value cell . tissue, many cells appear contiguous, since cell segmentation imperfect, many false singletons, makes polygon contiguity neighbors poly2nb() problematic without modification. based distribution number neighbors based contiguity, \\(k = 5\\) doesn’t seem bad approximate contiguity. Now compute Moran’s cell QC metrics Positive spatial autocorrelation suggested, stronger nCounts nGenes. length scales spatial autocorrelation QC metrics? 
nice lagged neighborhood graphs can stored reused features rather recomputed feature spdep::sp.correlogram() called behind scene . takes minutes run, long typical song. Another way find length scale spatial autocorrelation bin cells bins different sizes find spatial autocorrelation bin size, probably faster finding lagged values higher higher neighborhoods since geom_bin2d() geom_hex() ggplot2 run pretty fast even large datasets. use semivariogram; gstat also bins data estimating semivariogram calculating semivariogram long distance much faster correlogram cell-cell neighborhood graphs. Note MulticoreParam() doesn’t work Windows; vignette built Linux. Use SnowParam() DoparParam() Windows. See ?BiocParallelParam available parallel processing backends. notice significant performance differences SnowParam() MulticoreParam() context. seem similar length scales, aspect ratios tend decay quickly. Moran’s scatter plot nCounts. first panel, density points plot, second, points influential fitting line highlighted red, still 2D histogram avoid overplotting. obvious clusters plot. 
Local Moran’s nCounts Cool, appears epithelial regions tend homogenous nCounts.","code":"system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 5.473 0.024 5.499 features_use <- c(\"nCounts\", \"nGenes\", \"Area\", \"AspectRatio\") sfe <- colDataMoransI(sfe, features_use, colGraphName = \"knn5\") colFeatureData(sfe)[features_use,] #> DataFrame with 4 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.386655 6.80818 #> nGenes 0.434639 3.19599 #> Area 0.198152 8.96966 #> AspectRatio 0.256211 43.05666 system.time( sfe <- colDataUnivariate(sfe, \"sp.correlogram\", features = features_use, colGraphName = \"knn5\", order = 6, zero.policy = TRUE, BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 419.966 10.410 216.121 plotCorrelogram(sfe, features_use) sfe <- colDataUnivariate(sfe, \"moran.plot\", \"nCounts\", colGraphName = \"knn5\") p1 <- moranPlot(sfe, \"nCounts\", binned = TRUE, plot_influential = FALSE) p2 <- moranPlot(sfe, \"nCounts\", binned = TRUE) p1 / p2 + plot_layout(guides = \"collect\") sfe <- colDataUnivariate(sfe, \"localmoran\", \"nCounts\", colGraphName = \"knn5\") plotLocalResult(sfe, \"localmoran\", \"nCounts\", colGeometryName = \"cellSeg\", divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"data-normalization","dir":"Articles","previous_headings":"","what":"Data normalization","title":"CosMX non-small cell lung cancer data","text":"Given may relationship cell size total counts, total counts may biological thus purely treated technical, questions raised data normalization different standard scRNA-seq practices. instance, technical contributions total counts kind data? 
Furthermore, cell area, since part technical, z-plane cell segmentation polygons intersects cell, cell types, biological? Also, different methods data normalization affect spatial autocorrelation? spatial autocorrelation used ways normalizing data? Besides correcting technical effects making gene expression cells different total counts comparable, data normalization stabilizes variance tries make data normally distributed since many statistical methods assume normally distributed data. don’t know best practice normalize kind data, still normalize data downstream analyses.","code":"sfe <- logNormCounts(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"CosMX non-small cell lung cancer data","text":"run global Moran’s log normalized gene expression. real genes tend spatial autocorrelation negative controls? seems least shorter length scale captured k nearest neighbor graph, genes don’t strong spatial autocorrelation strong positive spatial autocorrelation. contrast, Moran’s negative controls closely packed around 0, indicating lack spatial autocorrelation, good sign, evidence technical artifact manifests spatial trend manifest negative controls. genes highest Moran’s ? highlight epithelial regions. regions spatially organized, short length scale used Moran’s correlogram shows Moran’s decays first order neighbors. wonder using longer length scale change results.","code":"# Note: on your computer, you can put progressbar = TRUE inside MulticoreParam() # to show progress bar. This applies to any BiocParallelParam. 
sfe <- runMoransI(sfe, features = rownames(sfe), BPPARAM = MulticoreParam(2)) plotRowData(sfe, x = \"moran_sample01\", y = \"is_neg\") + geom_hline(yintercept = 0, linetype = 2) top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"non-spatial-dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Non-spatial dimension reduction and clustering","title":"CosMX non-small cell lung cancer data","text":"first PC highlights epithelium. PC2 highlights T cells. PC4 might highlight leukocytes. Need check genes highest loadings find PCs mean. Non-spatial clustering locating clusters space analyses can done stage: many cell types neighborhood cell? subject different definitions neighborhood. cell types tend co-localize ? Find spatial regions based cell type colocalization, can done R package spicyR (Canete et al. 2022)","code":"set.seed(29) sfe <- runPCA(sfe, ncomponents = 30, scale = TRUE, BSPARAM = IrlbaParam()) ElbowPlot(sfe, ndims = 30) plotDimLoadings(sfe, dims = 1:6) spatialReducedDim(sfe, \"PCA\", 6, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2, scattermore = TRUE) colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\")[,1:15], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' data(\"ditto_colors\") plotPCA(sfe, ncomponents = 4, colour_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
plotSpatialFeature(sfe, \"cluster\", colGeometryName = \"cellSeg\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential expression","title":"CosMX non-small cell lung cancer data","text":"Cluster marker genes found Wilcoxon rank sum test commonly done scRNA-seq. ’s already sorted p-values. Get significant marker cluster plot. Since ’re many points, used development version scater plot points, uninformative due overplotting make plot really slow. Plot top marker genes heatmap","code":"markers <- findMarkers(sfe, groups = colData(sfe)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[6]] #> DataFrame with 980 rows and 12 columns #> p.value FDR summary.AUC AUC.1 AUC.2 AUC.3 #> #> SERPINA1 0.00000e+00 0.00000e+00 0.900457 0.882702 0.917169 0.927197 #> LTF 0.00000e+00 0.00000e+00 0.885649 0.898378 0.896631 0.889493 #> SOD2 1.34767e-268 4.40240e-266 0.763204 0.708843 0.740953 0.759093 #> CXCL2 1.62703e-179 3.98621e-177 0.646570 0.739327 0.700563 0.752507 #> LAMP3 3.01063e-172 5.90084e-170 0.710243 0.718463 0.715373 0.716238 #> ... ... ... ... ... ... ... #> TPSAB1 1 1 0.01337636 0.541769 0.517367 0.530969 #> TPSB2 1 1 0.00641018 0.534170 0.519936 0.530758 #> VIM 1 1 0.16441559 0.588792 0.164416 0.398639 #> VWF 1 1 0.24200118 0.517581 0.242001 0.510094 #> XBP1 1 1 0.21892716 0.676944 0.618364 0.632037 #> AUC.4 AUC.5 AUC.7 AUC.8 AUC.9 AUC.10 #> #> SERPINA1 0.872056 0.928391 0.918108 0.924688 0.900457 0.860946 #> LTF 0.903540 0.905416 0.904550 0.902354 0.885649 0.897075 #> SOD2 0.703673 0.790085 0.690380 0.747167 0.763204 0.824461 #> CXCL2 0.731505 0.754922 0.743084 0.752041 0.738496 0.646570 #> LAMP3 0.714841 0.720568 0.722422 0.715502 0.710243 0.716974 #> ... ... ... ... ... ... ... 
#> TPSAB1 0.527268 0.532545 0.525186 0.522748 0.01337636 0.518232 #> TPSB2 0.520962 0.507907 0.524842 0.522807 0.00641018 0.524115 #> VIM 0.360806 0.469073 0.344926 0.360144 0.31454577 0.633386 #> VWF 0.511037 0.508046 0.515044 0.509359 0.50754675 0.508200 #> XBP1 0.616943 0.218927 0.611995 0.614587 0.59660992 0.493690 genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe, genes_use, x = \"cluster\", point_fun = function(...) list()) genes_use2 <- unique(unlist(lapply(markers, function(x) rownames(x)[1:5]))) plotGroupedHeatmap(sfe, genes_use2, group = \"cluster\", colour = scales::viridis_pal()(100))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"local-spatial-statistics-of-marker-genes","dir":"Articles","previous_headings":"","what":"Local spatial statistics of marker genes","title":"CosMX non-small cell lung cancer data","text":"Plot genes space Moran’s marker genes Local Moran’s marker genes seems histological regions tend spatially homogenous gene expression others. epithelial region tends homogenous. Run local spatial heteroscedasticity (LOSH) marker genes find local heterogeneity genes heterogeneous also highly expressed, COLA1 IGKC. However case genes. example, MZT2A quite ubiquitously expressed, heterogeneous regions others, KRT19 seem much heterogeneous ’s highly expressed. MZT2A, LOSH picked artifact edges FOVs, although apparent genes plotted . don’t information cell belongs FOV, FOV edge effects considered data normalization. 
interesting systematically see LOSH relates gene expression across genes, differs cell types gene functions.","code":"plotSpatialFeature(sfe, genes_use, colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE) rowData(sfe)[genes_use, \"moran_sample01\", drop = FALSE] #> DataFrame with 10 rows and 1 column #> moran_sample01 #> #> MZT2A 0.199130 #> COL4A1 0.213595 #> IGHM 0.293282 #> HLA-DPA1 0.242441 #> IGKC 0.425192 #> SERPINA1 0.254077 #> COL1A1 0.394780 #> IL7R 0.177655 #> TPSB2 0.206115 #> KRT19 0.770433 sfe <- runUnivariate(sfe, \"localmoran\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = genes_use, colGeometryName = \"centroids\", ncol = 2, divergent = TRUE, diverge_center = 0, scattermore = TRUE) sfe <- runUnivariate(sfe, \"LOSH\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"LOSH\", features = genes_use, colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"CosMX non-small cell lung cancer data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 BiocSingular_1.18.0 #> [3] BiocParallel_1.36.0 spdep_1.3-3 #> [5] sf_1.0-16 
spData_2.3.0 #> [7] stringr_1.5.1 patchwork_1.2.0 #> [9] bluster_1.12.0 scran_1.30.2 #> [11] scater_1.30.1 ggplot2_3.5.1 #> [13] scuttle_1.12.0 SpatialExperiment_1.12.0 #> [15] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [17] Biobase_2.62.0 GenomicRanges_1.54.1 #> [19] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [21] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [23] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [25] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 httpuv_1.6.15 #> [17] sp_2.1-4 cowplot_1.1.3 #> [19] DBI_1.2.2 RColorBrewer_1.1-3 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 pheatmap_1.0.12 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 Rcpp_1.0.12 #> [51] glue_1.7.0 gridExtra_2.3 #> [53] SparseArray_1.2.4 mgcv_1.9-1 #> [55] xfun_0.43 dplyr_1.1.4 #> [57] HDF5Array_1.30.1 withr_3.0.0 #> [59] BiocManager_1.30.22 fastmap_1.1.1 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] scattermore_1.2 RSQLite_2.3.6 #> [73] hexbin_1.28.3 utf8_1.2.4 #> [75] generics_0.1.3 class_7.3-22 #> [77] httr_1.4.7 htmlwidgets_1.6.4 #> [79] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [81] scico_1.5.0 
gtable_0.3.5 #> [83] blob_1.2.4 XVector_0.42.0 #> [85] htmltools_0.5.8.1 scales_1.3.0 #> [87] png_0.1-8 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 fs_1.6.4 #> [121] ggbeeswarm_0.7.2 stringi_1.8.3 #> [123] viridisLite_0.4.2 deldir_2.0-4 #> [125] munsell_0.5.1 Biostrings_2.70.3 #> [127] Matrix_1.6-5 ExperimentHub_2.10.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] interactiveDisplayBase_1.40.0 highr_0.10 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Xenium breast cancer dataset","text":"Xenium new technology 10X genomics single cell resolution smFISH based spatial transcriptomics. first Xenium dataset formalin fixed paraffin embedded (FFPE) human breast tumor, reported (Janesick et al. 2022) downloaded 10X website. gene count matrix downloaded HDF5 file read R SingleCellExperiment (SCE) object DropletUtils::read10xCounts(). gene count matrix originally DelayedArray, data loaded memory. now, matrix converted memory dgCMatrix. However, next release, like write another vignette disk analyses. challenge representing sf data frames disk, perhaps sedona SQLDataFrame. 
cell metadata (including centroid coordinates) cell segmentation polygons downloaded parquet files, compact way store columnar data CSV, read R data frames read_parquet arrow package. cell polygons converted sf data frame SpatialFeatureExperiment::df2sf(). SCE object converted SpatialFeatureExperiment (SFE) polygon geometry added SFE object, SFEData package. load packages used vignette. 118708 cells dataset, little CosMX dataset. SFE object doesn’t column names (.e. cell IDs). assign cell IDs. tissue, cell outlines, looks like Plot cell density space","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(ggplot2) library(stringr) library(scater) library(scuttle) library(BiocParallel) library(BiocSingular) library(bluster) library(scran) library(patchwork) theme_set(theme_bw()) (sfe <- JanesickBreastData(dataset = \"rep2\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 541 118708 #> metadata(1): Samples #> assays(1): counts #> rownames(541): ABCC11 ACTA2 ... BLANK_0497 BLANK_0499 #> rowData names(6): ID Symbol ... vars cv2 #> colnames: NULL #> colData names(10): Sample Barcode ... 
nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : x_centroid y_centroid #> imgData names(1): sample_id #> #> unit: #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON), nucSeg (GEOMETRY) #> #> Graphs: #> sample01: colnames(sfe) <- seq_len(ncol(sfe)) plotGeometry(sfe, \"cellSeg\") plotCellBin2D(sfe, hex = TRUE)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"cells","dir":"Articles","previous_headings":"Quality control","what":"Cells","title":"Xenium breast cancer dataset","text":"QC metrics precomputed stored colData Since ’re cells, better plot tissue larger, ’ll plot histogram QC metrics spatial plots separately, unlike CosMx vignette. divided nCounts total number genes probed, histogram comparable smFISH-based datasets. Compared FFPE CosMX non-small cell lung cancer dataset, transcripts per gene average larger proportion genes detected dataset, also FFPE. However, interpreted care, since two datasets different tissues different gene panels, may may indicate Xenium better detection efficiency CosMX. seem FOV artifacts. However, cell ID FOV information unavailable examine . standard examination look relationship nCounts nGenes: appear two branches. plot distribution cell area pixels. ’s long tail. nuclei much smaller cells. cell area distributed space? Cells sparse region tend larger dense region. may biological artifact cell segmentation algorithm . nuclei segmentations plotted instead cell segmentation. nuclei much smaller extent difficult see. ’s outlier near right edge section, throwing dynamic range plot. Upon inspection H&E image, outlier bit tissue debris doesn’t look like cell. can still cells dense, gland like regions tend larger nuclei. may biological, nuclei densely packed regions likely undersegmented, .e. multiple nuclei counted one nuclei segmentation program, . 
observations motivate examination relationship cell area nuclei area: , two branches, probably related cell density cell type. nucleus outlier also large cell area, though much outlier cell area. However, spatial outlier ’s unusually large compared neighbors (scroll two plots back). Next calculate proportion cell z-plane taken nucleus, examine distribution: distribution generated two peaks combined. histogram, seem cells without nuclei segmentation artifacts nucleus larger cell. However, many cells dataset possible just cells visible histogram. double check: cells without nuclei nuclei larger cells. plot nuclei proportion space: Cells histological regions larger proportions occupied nuclei. interesting check, controlling cell type, cell area, nucleus area, proportion cell occupied nucleus relate gene expression. However, problem performing analysis cell segmentation available one z-plane areas also relate z-plane intersects cell. plot 2D histogram better show density points plot: Smaller cells tend higher proportion occupied nucleus. can related cell type, limitation small nuclei can tissue. also examine relationship nucleus area proportion cell occupied nucleus: outlier obvious. 
cells small nuclei low proportion area occupied nucleus.","code":"names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" n_panel <- 313 colData(sfe)$nCounts_normed <- sfe$nCounts/n_panel colData(sfe)$nGenes_normed <- sfe$nGenes/n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\") plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"cellSeg\") plotColData(sfe, x=\"nCounts\", y=\"nGenes\", bins = 100) plotColDataHistogram(sfe, c(\"cell_area\", \"nucleus_area\"), scales = \"free_y\") plotSpatialFeature(sfe, \"cell_area\", colGeometryName = \"cellSeg\") plotSpatialFeature(sfe, \"nucleus_area\", colGeometryName = \"nucSeg\") plotColData(sfe, x=\"cell_area\", y=\"nucleus_area\", bins = 100) colData(sfe)$prop_nuc <- sfe$nucleus_area / sfe$cell_area plotColDataHistogram(sfe, \"prop_nuc\") # No nucleus sum(sfe$nucleus_area < 1) #> [1] 0 # Nucleus larger than cell sum(sfe$nucleus_area > sfe$cell_area) #> [1] 0 plotSpatialFeature(sfe, \"prop_nuc\", colGeometryName = \"cellSeg\") plotColData(sfe, x=\"cell_area\", y=\"prop_nuc\") plotColData(sfe, x=\"nucleus_area\", y=\"prop_nuc\", bins = 100)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"negative-controls","dir":"Articles","previous_headings":"Quality control","what":"Negative controls","title":"Xenium breast cancer dataset","text":"Since hundred genes plus negative control probes, row names SFE object can printed find negative control probes called. According Xenium paper (Janesick et al. 2022), 3 types controls: probe controls assess non-specific binding RNA, decoding controls assess misassigned genes, genomic DNA (gDNA) controls ensure signal RNA. paper explain detail control probes designed, explain blank probes . 
blank probes can used negative control. number 1, probe control number 2, decoding control must number 3, gDNA control Also make indicator whether feature sort negative control addPerCellQCMetrics() function scuttle package can conveniently add transcript counts, proportion total counts, number features detected subset features SCE object. SFE object, SFE inherits SCE. Next plot proportion transcript counts coming negative control. histogram dominated bin zero extreme outliers seen evident scale x axis. also plot histogram cells least 1 count negative control. NA’s come cells got segmented transcripts detected. vast majority cells less 1% transcript counts negative controls, outliers 50%. Next plot distribution number negative control counts per cell: counts low, mostly zero, outliers 10 counts types aggregated. outlier 50% counts negative controls must low total real transcript counts begin . scuttle package can detect outliers, default assigns anything zero outlier, since 3 median absolute deviations (MADs) away median, 0, MAD 0 since vast majority cells don’t negative control count. makes sense allow small proportion negative controls. use distribution just cells least 1 negative control count find outliers. distribution long tail definite outliers. code extracts outliers, based cells least one negative control count examine outliers located space: find outliers difficult see: analysis reveals outliers seem smaller. Outliers negative probe controls negative codeword controls also hard see plot, plots skipped . top left region tissue tends counts antisense controls. Now identified outliers, can remove along empty cells proceeding analysis: 1000 cells removed. Next check many negative control features detected per cell: 3 counts per cell per type. 
non-outliers, type around 1%, data looks good.","code":"rownames(sfe) #> [1] \"ABCC11\" \"ACTA2\" #> [3] \"ACTG2\" \"ADAM9\" #> [5] \"ADGRE5\" \"ADH1B\" #> [7] \"ADIPOQ\" \"AGR3\" #> [9] \"AHSP\" \"AIF1\" #> [11] \"AKR1C1\" \"AKR1C3\" #> [13] \"ALDH1A3\" \"ANGPT2\" #> [15] \"ANKRD28\" \"ANKRD29\" #> [17] \"ANKRD30A\" \"APOBEC3A\" #> [19] \"APOBEC3B\" \"APOC1\" #> [21] \"AQP1\" \"AQP3\" #> [23] \"AR\" \"AVPR1A\" #> [25] \"BACE2\" \"BANK1\" #> [27] \"BASP1\" \"BTNL9\" #> [29] \"C15orf48\" \"C1QA\" #> [31] \"C1QC\" \"C2orf42\" #> [33] \"C5orf46\" \"C6orf132\" #> [35] \"CAV1\" \"CAVIN2\" #> [37] \"CCDC6\" \"CCDC80\" #> [39] \"CCL20\" \"CCL5\" #> [41] \"CCL8\" \"CCND1\" #> [43] \"CCPG1\" \"CCR7\" #> [45] \"CD14\" \"CD163\" #> [47] \"CD19\" \"CD1C\" #> [49] \"CD247\" \"CD27\" #> [51] \"CD274\" \"CD3D\" #> [53] \"CD3E\" \"CD3G\" #> [55] \"CD4\" \"CD68\" #> [57] \"CD69\" \"CD79A\" #> [59] \"CD79B\" \"CD80\" #> [61] \"CD83\" \"CD86\" #> [63] \"CD8A\" \"CD8B\" #> [65] \"CD9\" \"CD93\" #> [67] \"CDC42EP1\" \"CDH1\" #> [69] \"CEACAM6\" \"CEACAM8\" #> [71] \"CENPF\" \"CLCA2\" #> [73] \"CLDN4\" \"CLDN5\" #> [75] \"CLEC14A\" \"CLEC9A\" #> [77] \"CLECL1\" \"CLIC6\" #> [79] \"CPA3\" \"CRHBP\" #> [81] \"CRISPLD2\" \"CSF3\" #> [83] \"CTH\" \"CTLA4\" #> [85] \"CTSG\" \"CTTN\" #> [87] \"CX3CR1\" \"CXCL12\" #> [89] \"CXCL16\" \"CXCL5\" #> [91] \"CXCR4\" \"CYP1A1\" #> [93] \"CYTIP\" \"DAPK3\" #> [95] \"DERL3\" \"DMKN\" #> [97] \"DNAAF1\" \"DNTTIP1\" #> [99] \"DPT\" \"DSC2\" #> [101] \"DSP\" \"DST\" #> [103] \"DUSP2\" \"DUSP5\" #> [105] \"EDN1\" \"EDNRB\" #> [107] \"EGFL7\" \"EGFR\" #> [109] \"EIF4EBP1\" \"ELF3\" #> [111] \"ELF5\" \"ENAH\" #> [113] \"EPCAM\" \"ERBB2\" #> [115] \"ERN1\" \"ESM1\" #> [117] \"ESR1\" \"FAM107B\" #> [119] \"FAM49A\" \"FASN\" #> [121] \"FBLIM1\" \"FBLN1\" #> [123] \"FCER1A\" \"FCER1G\" #> [125] \"FCGR3A\" \"FGL2\" #> [127] \"FLNB\" \"FOXA1\" #> [129] \"FOXC2\" \"FOXP3\" #> [131] \"FSTL3\" \"GATA3\" #> [133] \"GJB2\" \"GLIPR1\" #> [135] \"GNLY\" \"GPR183\" #> 
[137] \"GZMA\" \"GZMB\" #> [139] \"GZMK\" \"HAVCR2\" #> [141] \"HDC\" \"HMGA1\" #> [143] \"HOOK2\" \"HOXD8\" #> [145] \"HOXD9\" \"HPX\" #> [147] \"IGF1\" \"IGSF6\" #> [149] \"IL2RA\" \"IL2RG\" #> [151] \"IL3RA\" \"IL7R\" #> [153] \"ITGAM\" \"ITGAX\" #> [155] \"ITM2C\" \"JUP\" #> [157] \"KARS\" \"KDR\" #> [159] \"KIT\" \"KLF5\" #> [161] \"KLRB1\" \"KLRC1\" #> [163] \"KLRD1\" \"KLRF1\" #> [165] \"KRT14\" \"KRT15\" #> [167] \"KRT16\" \"KRT23\" #> [169] \"KRT5\" \"KRT6B\" #> [171] \"KRT7\" \"KRT8\" #> [173] \"LAG3\" \"LARS\" #> [175] \"LDHB\" \"LEP\" #> [177] \"LGALSL\" \"LIF\" #> [179] \"LILRA4\" \"LPL\" #> [181] \"LPXN\" \"LRRC15\" #> [183] \"LTB\" \"LUM\" #> [185] \"LY86\" \"LYPD3\" #> [187] \"LYZ\" \"MAP3K8\" #> [189] \"MDM2\" \"MEDAG\" #> [191] \"MKI67\" \"MLPH\" #> [193] \"MMP1\" \"MMP12\" #> [195] \"MMP2\" \"MMRN2\" #> [197] \"MNDA\" \"MPO\" #> [199] \"MRC1\" \"MS4A1\" #> [201] \"MUC6\" \"MYBPC1\" #> [203] \"MYH11\" \"MYLK\" #> [205] \"MYO5B\" \"MZB1\" #> [207] \"NARS\" \"NCAM1\" #> [209] \"NDUFA4L2\" \"NKG7\" #> [211] \"NOSTRIN\" \"NPM3\" #> [213] \"OCIAD2\" \"OPRPN\" #> [215] \"OXTR\" \"PCLAF\" #> [217] \"PCOLCE\" \"PDCD1\" #> [219] \"PDCD1LG2\" \"PDE4A\" #> [221] \"PDGFRA\" \"PDGFRB\" #> [223] \"PDK4\" \"PECAM1\" #> [225] \"PELI1\" \"PGR\" #> [227] \"PIGR\" \"PIM1\" #> [229] \"PLD4\" \"POLR2J3\" #> [231] \"POSTN\" \"PPARG\" #> [233] \"PRDM1\" \"PRF1\" #> [235] \"PTGDS\" \"PTN\" #> [237] \"PTPRC\" \"PTRHD1\" #> [239] \"QARS\" \"RAB30\" #> [241] \"RAMP2\" \"RAPGEF3\" #> [243] \"REXO4\" \"RHOH\" #> [245] \"RORC\" \"RTKN2\" #> [247] \"RUNX1\" \"S100A14\" #> [249] \"S100A4\" \"S100A8\" #> [251] \"SCD\" \"SCGB2A1\" #> [253] \"SDC4\" \"SEC11C\" #> [255] \"SEC24A\" \"SELL\" #> [257] \"SERHL2\" \"SERPINA3\" #> [259] \"SERPINB9\" \"SFRP1\" #> [261] \"SFRP4\" \"SH3YL1\" #> [263] \"SLAMF1\" \"SLAMF7\" #> [265] \"SLC25A37\" \"SLC4A1\" #> [267] \"SLC5A6\" \"SMAP2\" #> [269] \"SMS\" \"SNAI1\" #> [271] \"SOX17\" \"SOX18\" #> [273] \"SPIB\" \"SQLE\" #> [275] \"SRPK1\" 
\"SSTR2\" #> [277] \"STC1\" \"SVIL\" #> [279] \"TAC1\" \"TACSTD2\" #> [281] \"TCEAL7\" \"TCF15\" #> [283] \"TCF4\" \"TCF7\" #> [285] \"TCIM\" \"TCL1A\" #> [287] \"TENT5C\" \"TFAP2A\" #> [289] \"THAP2\" \"TIFA\" #> [291] \"TIGIT\" \"TIMP4\" #> [293] \"TMEM147\" \"TNFRSF17\" #> [295] \"TOMM7\" \"TOP2A\" #> [297] \"TPD52\" \"TPSAB1\" #> [299] \"TRAC\" \"TRAF4\" #> [301] \"TRAPPC3\" \"TRIB1\" #> [303] \"TUBA4A\" \"TUBB2B\" #> [305] \"TYROBP\" \"UCP1\" #> [307] \"USP53\" \"VOPP1\" #> [309] \"VWF\" \"WARS\" #> [311] \"ZEB1\" \"ZEB2\" #> [313] \"ZNF562\" \"NegControlProbe_00042\" #> [315] \"NegControlProbe_00041\" \"NegControlProbe_00039\" #> [317] \"NegControlProbe_00035\" \"NegControlProbe_00034\" #> [319] \"NegControlProbe_00033\" \"NegControlProbe_00031\" #> [321] \"NegControlProbe_00025\" \"NegControlProbe_00024\" #> [323] \"NegControlProbe_00022\" \"NegControlProbe_00019\" #> [325] \"NegControlProbe_00017\" \"NegControlProbe_00016\" #> [327] \"NegControlProbe_00014\" \"NegControlProbe_00013\" #> [329] \"NegControlProbe_00012\" \"NegControlProbe_00009\" #> [331] \"NegControlProbe_00004\" \"NegControlProbe_00003\" #> [333] \"NegControlProbe_00002\" \"antisense_PROKR2\" #> [335] \"antisense_ULK3\" \"antisense_SCRIB\" #> [337] \"antisense_TRMU\" \"antisense_MYLIP\" #> [339] \"antisense_LGI3\" \"antisense_BCL2L15\" #> [341] \"antisense_ADCY4\" \"NegControlCodeword_0500\" #> [343] \"NegControlCodeword_0501\" \"NegControlCodeword_0502\" #> [345] \"NegControlCodeword_0503\" \"NegControlCodeword_0504\" #> [347] \"NegControlCodeword_0505\" \"NegControlCodeword_0506\" #> [349] \"NegControlCodeword_0507\" \"NegControlCodeword_0508\" #> [351] \"NegControlCodeword_0509\" \"NegControlCodeword_0510\" #> [353] \"NegControlCodeword_0511\" \"NegControlCodeword_0512\" #> [355] \"NegControlCodeword_0513\" \"NegControlCodeword_0514\" #> [357] \"NegControlCodeword_0515\" \"NegControlCodeword_0516\" #> [359] \"NegControlCodeword_0517\" \"NegControlCodeword_0518\" #> [361] 
\"NegControlCodeword_0519\" \"NegControlCodeword_0520\" #> [363] \"NegControlCodeword_0521\" \"NegControlCodeword_0522\" #> [365] \"NegControlCodeword_0523\" \"NegControlCodeword_0524\" #> [367] \"NegControlCodeword_0525\" \"NegControlCodeword_0526\" #> [369] \"NegControlCodeword_0527\" \"NegControlCodeword_0528\" #> [371] \"NegControlCodeword_0529\" \"NegControlCodeword_0530\" #> [373] \"NegControlCodeword_0531\" \"NegControlCodeword_0532\" #> [375] \"NegControlCodeword_0533\" \"NegControlCodeword_0534\" #> [377] \"NegControlCodeword_0535\" \"NegControlCodeword_0536\" #> [379] \"NegControlCodeword_0537\" \"NegControlCodeword_0538\" #> [381] \"NegControlCodeword_0539\" \"NegControlCodeword_0540\" #> [383] \"BLANK_0006\" \"BLANK_0013\" #> [385] \"BLANK_0037\" \"BLANK_0069\" #> [387] \"BLANK_0072\" \"BLANK_0087\" #> [389] \"BLANK_0110\" \"BLANK_0114\" #> [391] \"BLANK_0120\" \"BLANK_0147\" #> [393] \"BLANK_0180\" \"BLANK_0186\" #> [395] \"BLANK_0272\" \"BLANK_0278\" #> [397] \"BLANK_0319\" \"BLANK_0321\" #> [399] \"BLANK_0337\" \"BLANK_0350\" #> [401] \"BLANK_0351\" \"BLANK_0352\" #> [403] \"BLANK_0353\" \"BLANK_0354\" #> [405] \"BLANK_0355\" \"BLANK_0356\" #> [407] \"BLANK_0357\" \"BLANK_0358\" #> [409] \"BLANK_0359\" \"BLANK_0360\" #> [411] \"BLANK_0361\" \"BLANK_0362\" #> [413] \"BLANK_0363\" \"BLANK_0364\" #> [415] \"BLANK_0365\" \"BLANK_0366\" #> [417] \"BLANK_0367\" \"BLANK_0368\" #> [419] \"BLANK_0369\" \"BLANK_0370\" #> [421] \"BLANK_0371\" \"BLANK_0372\" #> [423] \"BLANK_0373\" \"BLANK_0374\" #> [425] \"BLANK_0375\" \"BLANK_0376\" #> [427] \"BLANK_0377\" \"BLANK_0378\" #> [429] \"BLANK_0379\" \"BLANK_0380\" #> [431] \"BLANK_0381\" \"BLANK_0382\" #> [433] \"BLANK_0383\" \"BLANK_0384\" #> [435] \"BLANK_0385\" \"BLANK_0386\" #> [437] \"BLANK_0387\" \"BLANK_0388\" #> [439] \"BLANK_0389\" \"BLANK_0390\" #> [441] \"BLANK_0391\" \"BLANK_0392\" #> [443] \"BLANK_0393\" \"BLANK_0394\" #> [445] \"BLANK_0395\" \"BLANK_0396\" #> [447] \"BLANK_0397\" \"BLANK_0398\" #> 
[449] \"BLANK_0399\" \"BLANK_0400\" #> [451] \"BLANK_0401\" \"BLANK_0402\" #> [453] \"BLANK_0403\" \"BLANK_0404\" #> [455] \"BLANK_0405\" \"BLANK_0406\" #> [457] \"BLANK_0407\" \"BLANK_0408\" #> [459] \"BLANK_0409\" \"BLANK_0410\" #> [461] \"BLANK_0411\" \"BLANK_0412\" #> [463] \"BLANK_0413\" \"BLANK_0414\" #> [465] \"BLANK_0415\" \"BLANK_0416\" #> [467] \"BLANK_0417\" \"BLANK_0418\" #> [469] \"BLANK_0419\" \"BLANK_0420\" #> [471] \"BLANK_0421\" \"BLANK_0422\" #> [473] \"BLANK_0423\" \"BLANK_0424\" #> [475] \"BLANK_0425\" \"BLANK_0426\" #> [477] \"BLANK_0427\" \"BLANK_0428\" #> [479] \"BLANK_0429\" \"BLANK_0430\" #> [481] \"BLANK_0431\" \"BLANK_0432\" #> [483] \"BLANK_0433\" \"BLANK_0434\" #> [485] \"BLANK_0435\" \"BLANK_0436\" #> [487] \"BLANK_0437\" \"BLANK_0438\" #> [489] \"BLANK_0439\" \"BLANK_0440\" #> [491] \"BLANK_0441\" \"BLANK_0442\" #> [493] \"BLANK_0443\" \"BLANK_0444\" #> [495] \"BLANK_0445\" \"BLANK_0446\" #> [497] \"BLANK_0447\" \"BLANK_0448\" #> [499] \"BLANK_0449\" \"BLANK_0450\" #> [501] \"BLANK_0451\" \"BLANK_0452\" #> [503] \"BLANK_0453\" \"BLANK_0454\" #> [505] \"BLANK_0455\" \"BLANK_0456\" #> [507] \"BLANK_0457\" \"BLANK_0458\" #> [509] \"BLANK_0459\" \"BLANK_0460\" #> [511] \"BLANK_0461\" \"BLANK_0462\" #> [513] \"BLANK_0463\" \"BLANK_0464\" #> [515] \"BLANK_0465\" \"BLANK_0466\" #> [517] \"BLANK_0467\" \"BLANK_0468\" #> [519] \"BLANK_0469\" \"BLANK_0470\" #> [521] \"BLANK_0471\" \"BLANK_0472\" #> [523] \"BLANK_0473\" \"BLANK_0474\" #> [525] \"BLANK_0475\" \"BLANK_0476\" #> [527] \"BLANK_0477\" \"BLANK_0478\" #> [529] \"BLANK_0479\" \"BLANK_0480\" #> [531] \"BLANK_0481\" \"BLANK_0482\" #> [533] \"BLANK_0483\" \"BLANK_0484\" #> [535] \"BLANK_0485\" \"BLANK_0486\" #> [537] \"BLANK_0487\" \"BLANK_0488\" #> [539] \"BLANK_0489\" \"BLANK_0497\" #> [541] \"BLANK_0499\" is_blank <- str_detect(rownames(sfe), \"^BLANK_\") sum(is_blank) #> [1] 159 is_neg <- str_detect(rownames(sfe), \"^NegControlProbe\") sum(is_neg) #> [1] 20 is_neg2 <- 
str_detect(rownames(sfe), \"^NegControlCodeword\") sum(is_neg2) #> [1] 41 is_anti <- str_detect(rownames(sfe), \"^antisense\") sum(is_anti) #> [1] 8 is_any_neg <- is_blank | is_neg | is_neg2 | is_anti sfe <- addPerCellQCMetrics(sfe, subsets = list(blank = is_blank, negProbe = is_neg, negCodeword = is_neg2, anti = is_anti, any_neg = is_any_neg)) names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" #> [11] \"nCounts_normed\" \"nGenes_normed\" #> [13] \"prop_nuc\" \"sum\" #> [15] \"detected\" \"subsets_blank_sum\" #> [17] \"subsets_blank_detected\" \"subsets_blank_percent\" #> [19] \"subsets_negProbe_sum\" \"subsets_negProbe_detected\" #> [21] \"subsets_negProbe_percent\" \"subsets_negCodeword_sum\" #> [23] \"subsets_negCodeword_detected\" \"subsets_negCodeword_percent\" #> [25] \"subsets_anti_sum\" \"subsets_anti_detected\" #> [27] \"subsets_anti_percent\" \"subsets_any_neg_sum\" #> [29] \"subsets_any_neg_detected\" \"subsets_any_neg_percent\" #> [31] \"total\" cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_percent$\")] plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) #> Warning: Removed 285 rows containing non-finite outside the scale range #> (`stat_bin()`). plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) + scale_x_log10() + annotation_logticks(sides = \"b\") #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 565577 rows containing non-finite outside the scale range #> (`stat_bin()`). 
cols_use2 <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_detected$\")] plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) + # Avoid decimal breaks on x axis unless there're too few breaks scale_x_continuous(breaks = scales::breaks_extended(Q = c(1,2,5))) get_neg_ctrl_outliers <- function(col, sfe) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\") outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_percent$\")] for (n in cols_use) { sfe <- get_neg_ctrl_outliers(n, sfe) } names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" #> [11] \"nCounts_normed\" \"nGenes_normed\" #> [13] \"prop_nuc\" \"sum\" #> [15] \"detected\" \"subsets_blank_sum\" #> [17] \"subsets_blank_detected\" \"subsets_blank_percent\" #> [19] \"subsets_negProbe_sum\" \"subsets_negProbe_detected\" #> [21] \"subsets_negProbe_percent\" \"subsets_negCodeword_sum\" #> [23] \"subsets_negCodeword_detected\" \"subsets_negCodeword_percent\" #> [25] \"subsets_anti_sum\" \"subsets_anti_detected\" #> [27] \"subsets_anti_percent\" \"subsets_any_neg_sum\" #> [29] \"subsets_any_neg_detected\" \"subsets_any_neg_percent\" #> [31] \"total\" \"is_blank_outlier\" #> [33] \"is_negProbe_outlier\" \"is_negCodeword_outlier\" #> [35] \"is_anti_outlier\" \"is_any_neg_outlier\" plotSpatialFeature(sfe, \"is_blank_outlier\", colGeometryName = \"cellSeg\") plotColData(sfe, y = \"is_blank_outlier\", x = \"cell_area\", point_fun = function(...) 
list()) plotSpatialFeature(sfe, \"is_anti_outlier\", colGeometryName = \"cellSeg\") inds_keep <- sfe$nCounts > 0 & sfe$nucleus_area < 400 & !sfe$is_anti_outlier & !sfe$is_blank_outlier & !sfe$is_negCodeword_outlier & !sfe$is_negProbe_outlier (sfe <- sfe[,inds_keep]) #> class: SpatialFeatureExperiment #> dim: 541 117503 #> metadata(1): Samples #> assays(1): counts #> rownames(541): ABCC11 ACTA2 ... BLANK_0497 BLANK_0499 #> rowData names(6): ID Symbol ... vars cv2 #> colnames(117503): 1 2 ... 118707 118708 #> colData names(36): Sample Barcode ... is_anti_outlier #> is_any_neg_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : x_centroid y_centroid #> imgData names(1): sample_id #> #> unit: #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON), nucSeg (GEOMETRY) #> #> Graphs: #> sample01: plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) + # Avoid decimal breaks on x axis unless there're too few breaks scale_x_continuous(breaks = scales::breaks_extended(3, Q = c(1,2,5)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"Xenium breast cancer dataset","text":"look mean variance gene Real genes generally higher mean expression across cells negative controls. real genes negative controls plotted different colors red line \\(y = x\\) expected data follows Poisson distribution. Negative controls real genes form mostly separate clusters. Negative controls stick close line, real genes overdispersed. 
Unlike CosMX dataset, negative controls don’t seem overdispersed.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_neg <- is_any_neg plotRowData(sfe, x = \"means\", y = \"is_neg\") + scale_y_log10() + annotation_logticks(sides = \"b\") plotRowData(sfe, x=\"means\", y=\"vars\", color_by = \"is_neg\") + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() + labs(color = \"Negative control\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"Xenium breast cancer dataset","text":"’s sparse dense region. poses question type neighborhood graph use, e.g. conceivable cells sparse region just singletons. Furthermore, unclear length scale influence might . might depend cell type contact secreted signals used cell type, length scale influence. k nearest neighbors used, neighbors dense region much closer together sparse region. distance based neighbors used, cells dense region neighbors cells sparse region, sparse region can break multiple compartments distance cutoff long enough. purpose demonstration, use k nearest neighbors \\(k = 5\\), inverse distance weighting. Note using neighbors leads longer computation time spatial autocorrelation metrics. Global Moran’s indicates positive spatial autocorrelation. strength spatial autocorrelation can vary spatially, also run local Moran’s . pointsize argument adjusts point size scattermore. default 0, meaning single pixels, since cells sparse region hard see way, increase pointsize. still plot polygons larger single panel plots, use scattermore multi-panel plots polygons panel invisible anyway due small size save time. Interestingly, nCounts homogeneous interior dense region, nGenes homogeneous edge dense region. 
expected, cell area homogeneous sparse region. However, nucleus area homogeneous interior dense region. Moran plot nCounts obvious clusters . lower panel, 2D histogram influential points plotted red.","code":"system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 7.258 0.033 7.297 sfe <- colDataMoransI(sfe, c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGraphName = \"knn5\") colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"),] #> DataFrame with 4 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.422387 5.55509 #> nGenes 0.401395 3.13694 #> cell_area 0.628837 7.57098 #> nucleus_area 0.377248 6.88331 sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0, pointsize = 1) sfe <- colDataUnivariate(sfe, \"moran.plot\", \"nCounts\", colGraphName = \"knn5\") p1 <- moranPlot(sfe, \"nCounts\", binned = TRUE, plot_influential = FALSE) p2 <- moranPlot(sfe, \"nCounts\", binned = TRUE) p1 / p2 + plot_layout(guides = \"collect\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"Xenium breast cancer dataset","text":"default, gene expression, log normalized counts used spatial autocorrelation metrics, running Moran’s , normalize data. Use cores available speed . expected, generally negative controls tightly clustered around 0, real genes positive Moran’s , means generally technical artifact spatial trend. significantly negative Moran’s observed. 
negative spatial autocorrelation rare gene expression? two negative controls sizable Moran’s ? somewhat spatial trend antisense probe, detected upper left. However, might significantly affect results since 2 counts 1% counts cell. negative control codeword 1 count per cell cells negative control detected seem far . detected negative controls, detected one also one highest Moran’s among negative controls. However, negative control higher Moran’s among detected. genes highest Moran’s ? highlight histological regions, CosMX vignette. Moran’s relate gene expression level? highly expressed genes higher Moran’s , less expressed genes higher Moran’s well.","code":"sfe <- logNormCounts(sfe) system.time( sfe <- runMoransI(sfe, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 35.226 4.670 21.965 rowData(sfe)$is_neg <- is_any_neg plotRowData(sfe, x = \"moran_sample01\", y = \"is_neg\") ord <- order(rowData(sfe)$moran_sample01[is_any_neg], decreasing = TRUE)[1:2] top_neg <- rownames(sfe)[is_any_neg][ord] plotSpatialFeature(sfe, top_neg, colGeometryName = \"centroids\", scattermore = TRUE, pointsize = 1) head(sort(rowData(sfe)$means[is_any_neg], decreasing = TRUE), 15) #> antisense_PROKR2 antisense_SCRIB antisense_BCL2L15 #> 0.0192761036 0.0131741317 0.0066806805 #> antisense_TRMU antisense_MYLIP antisense_ULK3 #> 0.0042041480 0.0030807724 0.0028169494 #> BLANK_0485 antisense_ADCY4 antisense_LGI3 #> 0.0023403658 0.0019829281 0.0017871884 #> BLANK_0430 NegControlProbe_00035 NegControlProbe_00012 #> 0.0015063445 0.0010978443 0.0010382714 #> NegControlProbe_00033 NegControlProbe_00014 BLANK_0120 #> 0.0009446567 0.0009361463 0.0009276359 top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2, pointsize = 0.5) plotRowData(sfe, x = \"means\", y = 
\"moran_sample01\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"non-spatial-dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Non-spatial dimension reduction and clustering","title":"Xenium breast cancer dataset","text":"run non-spatial PCA scRNA-seq data spatial region explicitly used, PC’s highlight spatial regions due spatial autocorrelation gene expression histological regions different cell types. Non-spatial clustering locating clusters space Now scater can also rasterize plots lots points rasterise argument, different mechanism scattermore requires system dependencies. Plot location clusters space","code":"set.seed(29) sfe <- runPCA(sfe, ncomponents = 30, scale = TRUE, BSPARAM = IrlbaParam()) ElbowPlot(sfe, ndims = 30) plotDimLoadings(sfe, dims = 1:6) spatialReducedDim(sfe, \"PCA\", 6, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2, scattermore = TRUE, pointsize = 0.5) colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\")[,1:15], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' plotPCA(sfe, ncomponents = 4, colour_by = \"cluster\", rasterise = FALSE) plotSpatialFeature(sfe, \"cluster\", colGeometryName = \"cellSeg\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential expression","title":"Xenium breast cancer dataset","text":"Cluster marker genes found Wilcoxon rank sum test commonly done scRNA-seq. 
’s already sorted p-values: code extracts significant markers cluster: allows plotting top marker genes heatmap:","code":"markers <- findMarkers(sfe, groups = colData(sfe)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[6]] #> DataFrame with 541 rows and 16 columns #> p.value FDR summary.AUC AUC.1 AUC.2 AUC.3 #> #> TENT5C 1.63326e-296 8.83591e-294 0.967126 0.925332 0.970773 0.973994 #> SEC11C 7.21967e-230 1.95292e-227 0.910578 0.899842 0.926434 0.943356 #> MZB1 1.53225e-208 2.76316e-206 0.890901 0.954750 0.952539 0.953965 #> PRDM1 1.85488e-171 2.50872e-169 0.851328 0.902618 0.844973 0.845018 #> SLAMF7 1.36571e-121 1.47770e-119 0.797638 0.973023 0.928639 0.964871 #> ... ... ... ... ... ... ... #> TRAF4 1 1 0.190188 0.190188 0.498645 0.463686 #> USP53 1 1 0.237375 0.237375 0.515006 0.439300 #> VWF 1 1 0.148324 0.521143 0.492783 0.472879 #> ZEB2 1 1 0.227203 0.764394 0.227203 0.317134 #> ZNF562 1 1 0.286342 0.286342 0.426846 0.479020 #> AUC.4 AUC.5 AUC.7 AUC.8 AUC.9 AUC.10 AUC.11 #> #> TENT5C 0.978959 0.957859 0.975948 0.951552 0.970808 0.957569 0.967970 #> SEC11C 0.955726 0.912545 0.930013 0.929966 0.935488 0.925262 0.924159 #> MZB1 0.957403 0.947673 0.955833 0.909592 0.954841 0.954570 0.946052 #> PRDM1 0.886670 0.720187 0.838240 0.903536 0.876730 0.880933 0.863019 #> SLAMF7 0.970315 0.931636 0.966127 0.973165 0.967341 0.927913 0.951534 #> ... ... ... ... ... ... ... ... 
#> TRAF4 0.498581 0.510160 0.476258 0.267629 0.424212 0.485015 0.482499 #> USP53 0.367034 0.532702 0.506453 0.317515 0.461329 0.516505 0.492849 #> VWF 0.436092 0.501213 0.148324 0.548782 0.541546 0.545200 0.438237 #> ZEB2 0.296810 0.491759 0.278638 0.784944 0.645498 0.568365 0.301803 #> ZNF562 0.468102 0.525932 0.449265 0.297023 0.407262 0.512553 0.480637 #> AUC.12 AUC.13 AUC.14 #> #> TENT5C 0.953956 0.967126 0.979535 #> SEC11C 0.902004 0.910578 0.926748 #> MZB1 0.939414 0.890901 0.952151 #> PRDM1 0.834607 0.851328 0.899233 #> SLAMF7 0.944967 0.797638 0.971807 #> ... ... ... ... #> TRAF4 0.475006 0.352237 0.389822 #> USP53 0.552792 0.545264 0.333889 #> VWF 0.507755 0.508495 0.557926 #> ZEB2 0.368603 0.251563 0.771073 #> ZNF562 0.525134 0.543883 0.375170 genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe, genes_use, x = \"cluster\", point_fun = function(...) list()) genes_use2 <- unique(unlist(lapply(markers, function(x) rownames(x)[1:5]))) plotGroupedHeatmap(sfe, genes_use2, group = \"cluster\", colour = scales::viridis_pal()(100))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"local-spatial-statistics-of-marker-genes","dir":"Articles","previous_headings":"","what":"Local spatial statistics of marker genes","title":"Xenium breast cancer dataset","text":"First plot genes space reference Global Moran’s marker genes shown : marker genes positive spatial autocorrelation, stronger others. Local Moran’s marker genes shown : seems histological regions tend spatially homogeneous gene expression others. epithelial region tends homogeneous. genes, regions higher expression also higher local Moran’s , FOXA1 GATA3, genes, case, FGL2 LUM. Finally, assess local spatial heteroscedasticity (LOSH) marker genes find local heterogeneity: , just like CosMX dataset, LOSH higher gene highly expressed (e.g. CD3E, LUM, TENT5C) cases (e.g. FOXA1, GATA3). 
may due spatial distribution different cell types.","code":"plotSpatialFeature(sfe, genes_use, colGeometryName = \"centroids\", ncol = 3, pointsize = 0.3, scattermore = TRUE) setNames(rowData(sfe)[genes_use, \"moran_sample01\"], genes_use) #> FOXA1 FGL2 LUM ADIPOQ CD3E TENT5C CD93 GATA3 #> 0.7421765 0.2604219 0.6812312 0.5493112 0.4015241 0.2806543 0.3250982 0.6558350 #> MYLK APOC1 CPA3 MS4A1 LILRA4 KRT15 #> 0.4653625 0.2696177 0.1904912 0.2144728 0.1092981 0.5425155 sfe <- runUnivariate(sfe, \"localmoran\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = genes_use, colGeometryName = \"centroids\", ncol = 3, divergent = TRUE, diverge_center = 0, scattermore = TRUE, pointsize = 0.3) sfe <- runUnivariate(sfe, \"LOSH\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"LOSH\", features = genes_use, colGeometryName = \"centroids\", ncol = 3, scattermore = TRUE, pointsize = 0.3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Xenium breast cancer dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] patchwork_1.2.0 scran_1.30.2 #> [3] bluster_1.12.0 BiocSingular_1.18.0 #> [5] BiocParallel_1.36.0 scater_1.30.1 #> [7] 
scuttle_1.12.0 stringr_1.5.1 #> [9] ggplot2_3.5.1 SpatialFeatureExperiment_1.3.0 #> [11] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [13] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [15] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [17] IRanges_2.36.0 S4Vectors_0.40.2 #> [19] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [21] matrixStats_1.3.0 SFEData_1.4.0 #> [23] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] sf_1.0-16 edgeR_4.0.16 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] cowplot_1.1.3 DBI_1.2.2 #> [21] RColorBrewer_1.1-3 abind_1.4-5 #> [23] zlibbioc_1.48.2 purrr_1.0.2 #> [25] RCurl_1.98-1.14 rappdirs_0.3.3 #> [27] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [29] irlba_2.3.5.1 terra_1.7-71 #> [31] pheatmap_1.0.12 units_0.8-5 #> [33] RSpectra_0.16-1 dqrng_0.3.2 #> [35] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [37] codetools_0.2-20 DelayedArray_0.28.0 #> [39] tidyselect_1.2.1 farver_2.1.1 #> [41] ScaledMatrix_1.10.0 viridis_0.6.5 #> [43] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [45] BiocNeighbors_1.20.2 e1071_1.7-14 #> [47] systemfonts_1.0.6 tools_4.3.3 #> [49] ggnewscale_0.4.10 ragg_1.3.0 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] mgcv_1.9-1 xfun_0.43 #> [57] dplyr_1.1.4 HDF5Array_1.30.1 #> [59] withr_3.0.0 BiocManager_1.30.22 #> [61] fastmap_1.1.1 boot_1.3-30 #> [63] rhdf5filters_1.14.1 fansi_1.0.6 #> [65] spData_2.3.0 digest_0.6.35 #> [67] rsvd_1.0.5 R6_2.5.1 #> [69] mime_0.12 textshaping_0.3.7 #> [71] colorspace_2.1-0 wk_0.9.1 #> [73] scattermore_1.2 RSQLite_2.3.6 #> [75] hexbin_1.28.3 utf8_1.2.4 #> [77] generics_0.1.3 class_7.3-22 #> [79] httr_1.4.7 htmlwidgets_1.6.4 #> [81] S4Arrays_1.2.1 spdep_1.3-3 #> [83] pkgconfig_2.0.3 scico_1.5.0 #> [85] 
gtable_0.3.5 blob_1.2.4 #> [87] XVector_0.42.0 htmltools_0.5.8.1 #> [89] scales_1.3.0 png_0.1-8 #> [91] knitr_1.45 rjson_0.2.21 #> [93] nlme_3.1-164 curl_5.2.1 #> [95] proxy_0.4-27 cachem_1.0.8 #> [97] rhdf5_2.46.1 BiocVersion_3.18.1 #> [99] KernSmooth_2.23-22 parallel_4.3.3 #> [101] vipor_0.4.7 AnnotationDbi_1.64.1 #> [103] desc_1.4.3 s2_1.1.6 #> [105] pillar_1.9.0 grid_4.3.3 #> [107] vctrs_0.6.5 promises_1.3.0 #> [109] dbplyr_2.5.0 beachmat_2.18.1 #> [111] xtable_1.8-4 cluster_2.1.6 #> [113] beeswarm_0.4.0 evaluate_0.23 #> [115] magick_2.8.3 cli_3.6.2 #> [117] locfit_1.5-9.9 compiler_4.3.3 #> [119] rlang_1.1.3 crayon_1.5.2 #> [121] labeling_0.4.3 classInt_0.4-10 #> [123] fs_1.6.4 ggbeeswarm_0.7.2 #> [125] stringi_1.8.3 viridisLite_0.4.2 #> [127] deldir_2.0-4 munsell_0.5.1 #> [129] Biostrings_2.70.3 Matrix_1.6-5 #> [131] ExperimentHub_2.10.0 sparseMatrixStats_1.14.0 #> [133] bit64_4.0.5 Rhdf5lib_1.24.2 #> [135] KEGGREST_1.42.0 statmod_1.5.0 #> [137] shiny_1.8.1.1 interactiveDisplayBase_1.40.0 #> [139] highr_0.10 AnnotationHub_3.10.1 #> [141] igraph_2.0.3 memoise_2.0.1 #> [143] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"MERFISH mouse liver dataset and considerations of large data","text":"SpatialFeatureExperiment (SFE) Voyager packages originally developed around relatively small Visium dataset proof concept, hence originally optimized large datasets. However, larger smFISH datasets hundreds thousands, sometimes million cells already produced soon produced routinely. Among studies using smFISH-based spatial transcriptomics technologies reported number cells per dataset, number cells per dataset increased past years (Moses Pachter 2022). anticipation large datasets, vignette produced using limited GitHub Actions resources (MacOS), 14 GB RAM 3 CPU cores 14 GB disk space, comparable laptops. 
therefore expect analyses vignette scale reasonably sized datasets. dataset use vignette MERFISH mouse liver dataset downloaded Vizgen website. use discuss issues large datasets upcoming features next release Voyager. gene count matrix cell metadata (including centroid coordinates) downloaded CSV files read R. 7 z-planes imaged, cell segmentation available one z-plane. cell polygons HDF5 files, one HDF5 file per field view (FOV), 1000 FOVs dataset. Converting HDF5 files sf data frame trivial. See vignette creating SpatialFeatureExperiment (SFE) object code used conversion, polygons included SFE object. cell metadata already cell volume. polygons used analyses, polygons can’t seen static plot hundreds thousands cells anyway, conversion optional. transcript spot locations available, yet work large point dataset. load packages used. 395,215 cells dataset. Plotting polygons takes , isn’t bad. However, wish save plot PDF. avoid problem, can either use scattermore = TRUE argument plotSpatialFeature() plot centroids since polygons hard see anyway. Cell density can vaguely seen plot . count number cells bins better visualize cell density. Cell density part homogeneous shows structure denser regions seem relate blood vessels.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(ggplot2) library(patchwork) library(stringr) library(spdep) library(BiocParallel) library(BiocSingular) library(gstat) library(BiocNeighbors) library(sf) library(automap) theme_set(theme_bw()) (sfe <- VizgenLiverData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 385 395215 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(395215): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... 
#> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(9): fov volume ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: plotGeometry(sfe, \"cellSeg\") plotCellBin2D(sfe, bins = 300, hex = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"MERFISH mouse liver dataset and considerations of large data","text":"Plotting almost 400,000 polygons kind slow doable. nCounts kind looks like salt pepper. Using scattermore package can speed plotting large number points. non-interactive plot, cell polygons small see anyway, plotting cell centroid points fine. run server, plotting almost 400,000 polygons took around 23 seconds, using geom_scattermore() (scattermore = TRUE) took 2 seconds. Since geom_scattermore() rasterizes plot, plot pixelated zoomed . interactive data visualization useful ESDA, need static figures publications. Voyager 1.2.0 (Bioconductor 3.17), bounding box can specified zoom data. Much time making plot spent subsetting sf data frame bounding box. , spatial autocorrelation evident upper right region smaller cells, less rest patch. nCounts seems related cell size; larger cells seem total counts. Interactive data visualization currently beyond scope Voyager vignette. existing tools interactive visualization highly multiplexed imaging data, MERmaid (G. Wang et al. 2020) MERFISH data, TissUUmaps (Behanova et al. 2023), Visinity (Warchol et al. 2023), samui browser (Sriworarat et al. 2023). Since aren’t many genes, genes negative control probes can displayed: number real genes 347. 
Next, plot distribution nCounts divided number genes panel, distribution comparable across datasets different numbers genes. Xenium dataset, mysterious regular notches histogram number genes detected. also plot number genes detected per cell, geom_scattermore(). Similarly nCounts, points look intermingled. Distribution cell volume space: Next, explore nCounts relates nGenes: two branches plot. cell size relate nCounts? lower branch larger cells don’t tend total counts, upper branch larger cells tend total counts. also examine cell size relates number genes detected: seem clusters possibly related cell type.","code":"names(colData(sfe)) #> [1] \"fov\" \"volume\" \"min_x\" \"max_x\" \"min_y\" \"max_y\" #> [7] \"sample_id\" \"nCounts\" \"nGenes\" system.time( print(plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\")) ) #> user system elapsed #> 25.775 2.265 28.125 system.time({ print(plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"centroids\", scattermore = TRUE)) }) #> user system elapsed #> 1.468 0.369 1.838 bbox_use <- c(xmin = 3000, xmax = 3500, ymin = 2500, ymax = 3000) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\", bbox = bbox_use) rownames(sfe) #> [1] \"Comt\" \"Ldha\" \"Pck1\" \"Akr1a1\" #> [5] \"Ugt2b1\" \"Acsl5\" \"Ugt2a3\" \"Igf1\" #> [9] \"Errfi1\" \"Serping1\" \"Adh4\" \"Hsd17b2\" #> [13] \"Tpi1\" \"Cyp1a2\" \"Acsl1\" \"Akr1d1\" #> [17] \"Alas1\" \"Aldh7a1\" \"G6pc\" \"Hsd17b12\" #> [21] \"Pdhb\" \"Gpd1\" \"Cyp7b1\" \"Pgam1\" #> [25] \"Hc\" \"Dld\" \"Cyp2c23\" \"Proz\" #> [29] \"Acss2\" \"Psap\" \"Cald1\" \"Hsd3b3\" #> [33] \"Galm\" \"Cxcl12\" \"Sardh\" \"Cebpa\" #> [37] \"Aldh3a2\" \"Gck\" \"Sdc1\" \"Pdha1\" #> [41] \"Npc2\" \"Hsd17b6\" \"Aqp1\" \"Adh7\" #> [45] \"Smpdl3a\" \"Egfr\" \"Pgm1\" \"Fasn\" #> [49] \"Ctsc\" \"Abcb4\" \"Fyb\" \"Alas2\" #> [53] \"Gpi1\" \"Fech\" \"Lsr\" \"Psmd3\" #> [57] \"Gm2a\" \"Pabpc1\" \"Cbr4\" \"Tkt\" #> [61] \"Tmem56\" \"Eif3f\" \"Cxadr\" \"Srd5a1\" #> [65] \"Cyp2c55\" 
\"Gnai2\" \"Gimap6\" \"Hsd3b2\" #> [69] \"Grn\" \"Rpp14\" \"Csnk1a1\" \"Egr1\" #> [73] \"Mpeg1\" \"Acsl4\" \"Hmgb1\" \"Mpp1\" #> [77] \"Lcp1\" \"Plvap\" \"Aldh1b1\" \"Oxsm\" #> [81] \"Dlat\" \"Csk\" \"Mcat\" \"Hsd17b7\" #> [85] \"Epas1\" \"Eif3a\" \"Nrp1\" \"Dek\" #> [89] \"H2afy\" \"Bpgm\" \"Hsd3b6\" \"Dnase1l3\" #> [93] \"Serpinh1\" \"Tinagl1\" \"Aldoc\" \"Cyp2c38\" #> [97] \"Dpt\" \"Mrc1\" \"Minpp1\" \"Fgf1\" #> [101] \"Alcam\" \"Gimap4\" \"Cav2\" \"Eng\" #> [105] \"Adgre1\" \"Shisa5\" \"Csf1r\" \"Esam\" #> [109] \"Unc93b1\" \"Cnp\" \"Clec14a\" \"Kdr\" #> [113] \"Adpgk\" \"Gca\" \"Pkm\" \"Mkrn1\" #> [117] \"Sdc3\" \"Acaca\" \"Gpr182\" \"Bmp2\" #> [121] \"Tfrc\" \"Timp3\" \"Calcrl\" \"Pfkl\" #> [125] \"Wnt2\" \"Cybb\" \"Icam1\" \"Cdh5\" #> [129] \"Sgms2\" \"Cd48\" \"Stk17b\" \"Tubb6\" #> [133] \"Vcam1\" \"Hgf\" \"Ramp1\" \"Arsb\" #> [137] \"Pld4\" \"Smarca4\" \"Fstl1\" \"Pfkm\" #> [141] \"Lhfp\" \"Lmna\" \"Cd300lg\" \"Laptm5\" #> [145] \"Timp2\" \"Slc25a37\" \"Fzd7\" \"Lyve1\" #> [149] \"Acacb\" \"Cyp1a1\" \"Eno3\" \"Cd83\" #> [153] \"Epcam\" \"Ltbp4\" \"Pgm2\" \"Mertk\" #> [157] \"Pth1r\" \"Itga2b\" \"Kctd12\" \"Srd5a3\" #> [161] \"Bmp5\" \"Pecam1\" \"G6pc3\" \"Cyp17a1\" #> [165] \"Stab2\" \"Cygb\" \"Col1a2\" \"Nid1\" #> [169] \"Cd44\" \"Ctnnal1\" \"Ephb4\" \"Elk3\" #> [173] \"Foxq1\" \"Cxcl14\" \"Fzd4\" \"Itgb2\" #> [177] \"Tcf7\" \"Srd5a2\" \"Aldh3b1\" \"Flt4\" #> [181] \"Selp\" \"Rbpj\" \"Ep300\" \"Rhoj\" #> [185] \"Fzd1\" \"Tcf7l2\" \"Ssh2\" \"Col6a1\" #> [189] \"Notch2\" \"Tcf4\" \"Tek\" \"Trim47\" #> [193] \"Tent5c\" \"Ncf1\" \"Lepr\" \"Pck2\" #> [197] \"Lmnb1\" \"Selplg\" \"Myh10\" \"Aldoart1\" #> [201] \"Podxl\" \"Kitl\" \"Tcf3\" \"Tspan13\" #> [205] \"Dll4\" \"Fzd8\" \"Lad1\" \"Procr\" #> [209] \"Ccr2\" \"Akr1c18\" \"Maml1\" \"Ms4a1\" #> [213] \"Hk3\" \"Bcam\" \"Fzd5\" \"Dkk3\" #> [217] \"Bank1\" \"Itgal\" \"Pgam2\" \"Axin2\" #> [221] \"Pfkp\" \"Meis2\" \"Jag1\" \"Gimap3\" #> [225] \"Rassf4\" \"Notch1\" \"Cd93\" \"Tet2\" #> [229] \"Tcf7l1\" \"Cd34\" 
\"Hvcn1\" \"Mal\" #> [233] \"Itgb7\" \"Wnt4\" \"Kit\" \"Gapdhs\" #> [237] \"Kcnj16\" \"Tnfrsf13c\" \"Hk1\" \"Pdgfra\" #> [241] \"Apobec3\" \"Slc34a2\" \"Vav1\" \"Lamp3\" #> [245] \"Meis1\" \"Lck\" \"Efnb2\" \"Notch4\" #> [249] \"Klrb1c\" \"Angpt2\" \"Vwf\" \"E2f2\" #> [253] \"Ccr1\" \"Angpt1\" \"B4galt6\" \"Cyp21a1\" #> [257] \"Pdpn\" \"Dll1\" \"Ammecr1\" \"Csf3r\" #> [261] \"Ndn\" \"Fgf2\" \"Runx1\" \"Mpl\" #> [265] \"Mecom\" \"Itgam\" \"Hoxb4\" \"Tox\" #> [269] \"Prickle2\" \"Acss1\" \"Cyp2b9\" \"Aldh3a1\" #> [273] \"Bmp7\" \"Gata2\" \"Il7r\" \"Satb1\" #> [277] \"Sfrp1\" \"Eno2\" \"Mrvi1\" \"Mki67\" #> [281] \"Nes\" \"Tmod1\" \"Ace\" \"Gfap\" #> [285] \"Tgfb2\" \"Tomt\" \"Flt3\" \"Sult2b1\" #> [289] \"Hkdc1\" \"Notch3\" \"Cdh11\" \"Il6\" #> [293] \"Hk2\" \"Mmrn1\" \"Vangl2\" \"Pou2af1\" #> [297] \"Hoxb5\" \"Jag2\" \"Aldh3b2\" \"Gypa\" #> [301] \"Lrp2\" \"Lef1\" \"Olr1\" \"Lox\" #> [305] \"Txlnb\" \"Slc12a1\" \"Aldh3b3\" \"Cxcr2\" #> [309] \"Nkd2\" \"Sult1e1\" \"Acsl6\" \"Ddx4\" #> [313] \"Ldhc\" \"Kcnj1\" \"Acsbg1\" \"Fzd3\" #> [317] \"F13a1\" \"Hsd11b2\" \"Dkk2\" \"Hsd17b1\" #> [321] \"Fzd2\" \"Cyp2b23\" \"Eno4\" \"Celsr2\" #> [325] \"Obscn\" \"Slamf1\" \"Akap14\" \"Gnaz\" #> [329] \"Cd177\" \"Tet1\" \"Cspg4\" \"Aldoart2\" #> [333] \"Cyp2b19\" \"Ryr2\" \"Ldhal6b\" \"Acsf3\" #> [337] \"Chodl\" \"Ivl\" \"Cyp11b1\" \"Sfrp2\" #> [341] \"Dkk1\" \"Cyp11a1\" \"1700061G19Rik\" \"Acsbg2\" #> [345] \"Olah\" \"Pdha2\" \"Hsd17b3\" \"Blank-0\" #> [349] \"Blank-1\" \"Blank-2\" \"Blank-3\" \"Blank-4\" #> [353] \"Blank-5\" \"Blank-6\" \"Blank-7\" \"Blank-8\" #> [357] \"Blank-9\" \"Blank-10\" \"Blank-11\" \"Blank-12\" #> [361] \"Blank-13\" \"Blank-14\" \"Blank-15\" \"Blank-16\" #> [365] \"Blank-17\" \"Blank-18\" \"Blank-19\" \"Blank-20\" #> [369] \"Blank-21\" \"Blank-22\" \"Blank-23\" \"Blank-24\" #> [373] \"Blank-25\" \"Blank-26\" \"Blank-27\" \"Blank-28\" #> [377] \"Blank-29\" \"Blank-30\" \"Blank-31\" \"Blank-32\" #> [381] \"Blank-33\" \"Blank-34\" \"Blank-35\" \"Blank-36\" 
#> [385] \"Blank-37\" n_panel <- 347 colData(sfe)$nCounts_normed <- sfe$nCounts / n_panel colData(sfe)$nGenes_normed <- sfe$nGenes / n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"volume\", colGeometryName = \"centroids\", scattermore = TRUE) plotColData(sfe, x=\"nCounts\", y=\"nGenes\", bins = 100) plotColData(sfe, x=\"volume\", y=\"nCounts\", bins = 100) plotColData(sfe, x=\"volume\", y=\"nGenes\", bins = 100)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"negative-controls","dir":"Articles","previous_headings":"Quality control","what":"Negative controls","title":"MERFISH mouse liver dataset and considerations of large data","text":"Blank probes used negative controls. Total transcript counts blank probes: Number blank features detected per cell: Percentage blank features per cell: percentage interesting: within tissue, cells high percentage blank counts scattered like salt pepper, cells left edge tissue, edges FOVs, tissue doesn’t end. Also plot histograms: NA’s cells without transcript detected. Unlike Xenium dataset, cells least one blank count. log transforming, zeroes removed plot. small percentage blank counts acceptable. remove outlier based distribution percentage ’s greater zero. blank percentage relate total counts? outliers percentage blank counts low total counts. seemingly real cells sizable nCounts relatively high percentage blank counts. Since distribution percentage long tail, log transform finding outliers. proportion cells outliers? ’s cutoff outlier? 
Remove outliers empty cells: still 390,000 cells left removing outliers.","code":"is_blank <- str_detect(rownames(sfe), \"^Blank-\") sfe <- addPerCellQCMetrics(sfe, subset = list(blank = is_blank)) names(colData(sfe)) #> [1] \"fov\" \"volume\" \"min_x\" #> [4] \"max_x\" \"min_y\" \"max_y\" #> [7] \"sample_id\" \"nCounts\" \"nGenes\" #> [10] \"nCounts_normed\" \"nGenes_normed\" \"sum\" #> [13] \"detected\" \"subsets_blank_sum\" \"subsets_blank_detected\" #> [16] \"subsets_blank_percent\" \"total\" plotSpatialFeature(sfe, \"subsets_blank_sum\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"subsets_blank_detected\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"subsets_blank_percent\", colGeometryName = \"centroids\", scattermore = TRUE) plotColDataHistogram(sfe, paste0(\"subsets_blank_\", c(\"sum\", \"detected\", \"percent\"))) #> Warning: Removed 1332 rows containing non-finite outside the scale range #> (`stat_bin()`). mean(sfe$subsets_blank_sum > 0) #> [1] 0.7648799 plotColDataHistogram(sfe, \"subsets_blank_percent\") + scale_x_log10() + annotation_logticks() #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 92923 rows containing non-finite outside the scale range #> (`stat_bin()`). plotColData(sfe, x=\"nCounts\", y=\"subsets_blank_percent\", bins = 100) #> Warning: Removed 1332 rows containing non-finite outside the scale range #> (`stat_bin2d()`). 
get_neg_ctrl_outliers <- function(col, sfe, nmads = 3, log = FALSE) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\", nmads = nmads, log = log) outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } sfe <- get_neg_ctrl_outliers(\"subsets_blank_percent\", sfe, log = TRUE) mean(sfe$is_blank_outlier) #> [1] 0.008944499 min(sfe$subsets_blank_percent[sfe$is_blank_outlier]) #> [1] 2.303523 (sfe <- sfe[, !sfe$is_blank_outlier & sfe$nCounts > 0]) #> class: SpatialFeatureExperiment #> dim: 385 390348 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(390348): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(18): fov volume ... total is_blank_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"MERFISH mouse liver dataset and considerations of large data","text":"look mean variance gene: genes display higher mean expression blanks, considerable overlap distribution, probably genes expressed lower levels fewer cells included. “real” genes negative controls plotted different colors: red line \\(y = x\\) expected data follows Poisson distribution. 
zoomed , blanks somewhat overdispersed.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_blank <- is_blank plotRowData(sfe, x = \"means\", y = \"is_blank\") + scale_y_log10() + annotation_logticks(sides = \"b\") plotRowData(sfe, x = \"means\", y = \"vars\", colour_by = \"is_blank\") + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() as.data.frame(rowData(sfe)[is_blank,]) |> ggplot(aes(means, vars)) + geom_point() + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"MERFISH mouse liver dataset and considerations of large data","text":", plot zoomed patch visually inspect cell-cell contiguity: quite cells contiguous cell, cell segmentation imperfect, purely using poly2nb() problematic. next release, might implement way blend polygon contiguity graph graph case singletons. now, use k nearest neighbors \\(k = 5\\), seems like reasonable approximation contiguity based visual inspection. Voyager 1.2.0 (Bioconductor 3.17), findSpatialNeighbors() default uses BiocNeighbors k nearest neighbors distance neighbors saving distances neighbors. bypasses time consuming step spdep calculating distance based edge weights, compute distance, hence greatly speeding computation. spatial neighborhood graph, can compute Moran’s QC metrics. Unlike smFISH-based datasets website, nCounts nGenes sizable negative Moran’s ’s, closer 0 volume. interesting compare metrics across different tissues, add datasets SFEData future releases. Also check local Moran’s , since little patch examined , regions may positive spatial autocorrelation. 
niches around smaller blood vessels positive local Moran’s nCounts nGenes. likely due homogeneous endothelial cells compared hepatocytes.","code":"plotGeometry(sfe, \"cellSeg\", bbox = bbox_use) system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 56.014 0.818 57.663 sfe <- colDataMoransI(sfe, c(\"nCounts\", \"nGenes\", \"volume\"), colGraphName = \"knn5\") colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"volume\"),] #> DataFrame with 3 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts -0.1084532 4.22513 #> nGenes -0.0922130 2.25923 #> volume -0.0195237 3.89406 sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"volume\"), colGraphName = \"knn5\") plotLocalResult(sfe, \"localmoran\", c(\"nCounts\", \"nGenes\", \"volume\"), colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"MERFISH mouse liver dataset and considerations of large data","text":"’s actually slow thought almost 400,000 cells. Moran’s ’s distributed real genes blank probes? blanks clustered tightly around 0. vast majority real genes positive spatial autocorrelation, quite strong. genes negative spatial autocorrelation, although may may statistically significant. Plot top genes positive spatial autocorrelation: Unlike smFISH-based cancer datasets dataset, genes highest Moran’s highlight different histological regions. probably zones hepatic lobule, blood vessels. interesting compare spatial autocorrelation marker genes among different tissues cell types. Negative Moran’s means nearby cells tend dissimilar . hard see plotting whole tissue section, use bounding box . gene negative Moran’s compared one Moran’s closest 0. 
expected, feature Moran’s closest 0 blank.","code":"sfe <- logNormCounts(sfe) system.time( sfe <- runMoransI(sfe, BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 132.531 36.719 88.560 plotRowData(sfe, x = \"moran_sample01\", y = \"is_blank\") + geom_hline(yintercept = 0, linetype = 2) top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2) bottom_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01)[1]] bottom_abs_moran <- rownames(sfe)[order(abs(rowData(sfe)$moran_sample01))[1]] plotSpatialFeature(sfe, c(bottom_moran, bottom_abs_moran), colGeometryName = \"cellSeg\", bbox = bbox_use)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"spatial-autocorrelation-at-larger-length-scales","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation at larger length scales","title":"MERFISH mouse liver dataset and considerations of large data","text":"k nearest neighbor graph used concerns 5 cells around cell, small neighborhood, small length scale. current release Voyager, correlogram can computed get sense length scale spatial autocorrelation. However, since finding lag values higher higher orders neighborhoods slow large number cells higher orders, correlogram helpful . section, use methods involving binning explore spatial autocorrelation larger length scales.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"binning","dir":"Articles","previous_headings":"Spatial autocorrelation at larger length scales","what":"Binning","title":"MERFISH mouse liver dataset and considerations of large data","text":"sf package can create polygons grid, can bin cells attributes gene expressions. make 100 100 hexagonal grid bounding box cell centroids. use grid bin QC metrics averaging values cells. 
Since bins completely covered tissue fewer cells, mean may less susceptible edge effect sum, bins near edge lower sums, may spuriously increase Moran’s . Plot binned values: ’s outlier bin evident plotting single cells. ’s still edge effect around blood vessels. might truly edge effect, endothelial cells tend lower values 3 variables . compute Moran’s binned data, contiguity neighborhoods. zero.policy = TRUE bins neighbors. larger length scale, Moran’s becomes positive. Comparing Moran’s across different sized bins can give sense length scale spatial autocorrelation. However, problems binning watch : Edge effect, especially using sum binning function use aggregate values binning using rectangular grid, whether use rook queen neighbors. Rook means two cells neighbors share edge, queen means neighbors even merely share vertex. binning can greatly speed computation spatial autocorrelation metrics larger datasets, can used smaller datasets find length scales spatial autocorrelation. hand, seen , Moran’s can flip signs different length scales, larger datasets, exploring spatial autocorrelation cell level still interesting.","code":"(bins <- st_make_grid(colGeometry(sfe, \"centroids\"), n = 100, square = FALSE)) #> Geometry set for 11165 features #> Geometry type: POLYGON #> Dimension: XY #> Bounding box: xmin: -137.2225 ymin: -158.8407 xmax: 10396.09 ymax: 9708.547 #> CRS: NA #> First 5 geometries: #> POLYGON ((-85.58866 -69.40817, -137.2225 -39.59... #> POLYGON ((-85.58866 109.4569, -137.2225 139.267... #> POLYGON ((-85.58866 288.3219, -137.2225 318.132... #> POLYGON ((-85.58866 467.1869, -137.2225 496.997... #> POLYGON ((-85.58866 646.052, -137.2225 675.8628... 
df <- cbind(colGeometry(sfe, \"centroids\"), colData(sfe)[,c(\"nCounts\", \"nGenes\", \"volume\")]) df_binned <- aggregate(df, bins, FUN = mean) # Remove bins not containing cells df_binned <- df_binned[!is.na(df_binned$nCounts),] # Not using facet_wrap to give each panel its own color scale plts <- lapply(c(\"nCounts\", \"nGenes\", \"volume\"), function(f) { ggplot(df_binned[,f]) + geom_sf(aes(fill = .data[[f]]), linewidth = 0) + scale_fill_distiller(palette = \"Blues\", direction = 1) + theme_void() }) wrap_plots(plts, nrow = 2) nb <- poly2nb(df_binned) listw <- nb2listw(nb, zero.policy = TRUE) calculateMoransI(t(as.matrix(st_drop_geometry(df_binned[,c(\"nCounts\", \"nGenes\", \"volume\")]))), listw = listw, zero.policy = TRUE) #> DataFrame with 3 rows and 2 columns #> moran K #> #> nCounts 0.490837 5.21703 #> nGenes 0.422267 16.10323 #> volume 0.352221 4.78100"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"semivariogram","dir":"Articles","previous_headings":"Spatial autocorrelation at larger length scales","what":"Semivariogram","title":"MERFISH mouse liver dataset and considerations of large data","text":"geostatistical data, underlying spatial process sampled known locations. Kriging uses Gaussian process interpolate values sample locations, semivariogram used model spatial dependency locations covariance Gaussian process. kriging, semivariogram can used exploratory data analysis tool find length scale anisotropy spatial autocorrelation. One classic R packages geostatistical tradition gstat, use find semivariograms, defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Var}(X_t - X_0), \\] \\(X\\) value gene expression, \\(t\\) spatial vector. \\(X_0\\) value location interest, \\(X_t\\) value lagged \\(t\\). positive spatial autocorrelation, variance smaller among nearby values, variogram increase distance, eventually leveling distance beyond length scale spatial autocorrelation. “semi” comes 1/2. 
variogram Voyager v1.2.0 higher (Bioconductor 3.17 later) can computed runUnivariate() function. See vignette variograms variogram maps. First find empirical variogram assuming ’s directions. data binned distance intervals, much faster correlogram cell level. width argument controls bin size. cutoff argument maximum distance consider. use defaults. first argument formula; covariates can specified, done . different widths cutoffs, variogram can estimated different length scales. gstat package can also fit model empirical variogram. See vgm() different types models. automap package can choose model user, used Voyager. Unfortunately, gstat doesn’t scale 400,000 cells, although worked 100,000 cells smFISH-based datasets website. since variogram used explore larger length scales anyway, use binned data , problems binning apply. numbers plot number pairs distance bin. variogram 0 0 distance; variance within bin size, called nugget. variogram levels greater distance, value variogram levels sill. Range variogram leveling , indicating length scale spatial autocorrelation; range visual inspection appears closer 1000 model somehow indicates 423. variogram map can made see spatial autocorrelation may differ different directions, .e. anisotropy apparently ’s anisotropy shorter length scales, may artifact hexagonal bins. Going beyond 2000 (whatever unit), variance drops northwest southeast direction directions, perhaps related repetitiveness hepatic lobules general NE/SW direction blood vessels seen previous plots. variogram can also calculated specified angles, selected sides hexagon: variogram rises going beyond 2000 30 90 degrees drops 150 degrees. consistent variogram map. differences averaged omni-directional variogram. gstat fit anisotropy parameters, fitted curve omni-directional. fits pretty well 2000. nCounts, may differ QC metrics genes. anisotropy varies space? 
problem variogram ’s global, giving one result entire dataset, albeit nuanced just number Moran’s , kriging assumes data intrinsically stationary, meaning variogram model applies everywhere, spatial dependence depends lag two observations. Voyager 1.2.0 implements ggplot2 based plotting functions make better looking customizable plots variograms SFE objects. However, binned data SFE object. considering writing method spatially bin cell level data SFE object Bioconductor 3.18. gstat using lattice, predecessor ggplot2 make facetted plots superseded ggplot2. gstat one oldest R packages still CRAN, dating back days S (prequel R), although oldest archive CRAN 2003. spdep also really old; oldest archive CRAN 2002, ’s still active development. using time honored packages methods (Moran’s Geary’s C date back 1950s modern form date back 1969 (Cliff Ord 1969; Bivand 2013)) cool new spatial transcriptomics dataset, participating glorious tradition, develop spatial analysis tradition forms around spatial -omics data analysis.","code":"# as_Spatial since automap uses old fashioned sp, the predecessor of sf v <- autofitVariogram(nCounts ~ 1, as_Spatial(df_binned)) plot(v) v2 <- variogram(nCounts ~ 1, data = df_binned, width = 300, cutoff = 4500, map = TRUE) plot(v2) v3 <- variogram(nCounts ~ 1, df_binned, alpha = c(30, 90, 150)) v3_model <- fit.variogram(v3, vgm(\"Ste\")) plot(v3, v3_model)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"pca-for-larger-datasets","dir":"Articles","previous_headings":"","what":"PCA for larger datasets","title":"MERFISH mouse liver dataset and considerations of large data","text":"many ways PCA R, BiocSingular package makes number different methods available consistent user interface, supports memory data DelayedArray. According benchmark, stats::prcomp() shipped R rather slow larger datasets. fastest methods irlba::irlba() RSpectra::svds(), former supported BiocSingular. use IRLBA see long takes. 
Many PCA algorithms involve repeated matrix multiplications. R come optimized BLAS LAPACK, portability reasons. However, BLAS LAPACK used R can changed optimized one (’s ), speed matrix multiplication. ’s pretty quick almost 400,000 cells, aren’t many genes . Use elbow plot see variance explained PC: Plot top gene loadings PC Many genes seem related endothelium. Plot first 4 PCs space PC1 PC4 highlight major blood vessels, PC2 PC3 less spatial structure. CosMX Xenium datasets website, top PCs clear spatial structures despite absence spatial information non-spatial PCA clear spatial compartments cell types, seem case dataset except blood vessels. seen genes strong spatial structures. methods spatially informed PCA, MULTISPATI PCA (Dray, Saı̈d, Débias 2008) adespatial package, seeks maximize variance (non-spatial PCA) Moran’s PC. Unlike traditional PCs, eigenvalues, signifying variance explained, positive, MULTISPATI PCA can negative eigenvalues, signify negative spatial autocorrelation. PCs MULTISPATI PCA positive eigenvalues also spatially coherent non-spatial PCA. CosMX Xenium datasets website, spatial coherence MULTISPATI might make difference, might make difference dataset non-spatial PCs don’t show much spatial structure, least larger scale entire tissue section. Voyager 1.2.0 (Bioconductor 3.17) faster implementation MULTISPATI PCA adespatial, demonstrated dataset another vignette. PC2 PC3 don’t seem large scale spatial structure, may local spatial structure obvious plotting entire section, zoom bounding box: ’s spatial structure PC2 PC3 smaller scale, perhaps negative spatial autocorrelation. Like global Moran’s , PCA MULTISPATI PCA return one result entire dataset. contrast, geographically weighted PCA (GWPCA) (Harris et al. 2015) can account spatial heterogeneity. GWPCA runs PCA spatial location using nearby locations weighed kernel. different locations can different PCs, results can visualized “winning variables” PC, .e. 
plotting feature highest loading PC space. likely doesn’t scale 400,000 cells, still interesting performed spatially binned data. GWPCA might added Bioconductor 3.18 require changes user interface, GWPCA features rather cell embeddings.","code":"set.seed(29) system.time( sfe <- runPCA(sfe, ncomponents = 20, subset_row = !is_blank, exprs_values = \"logcounts\", scale = TRUE, BSPARAM = IrlbaParam()) ) #> user system elapsed #> 22.968 1.253 24.254 gc() #> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb) #> Ncells 16650376 889.3 32047497 1711.6 NA 32047497 1711.6 #> Vcells 250335884 1910.0 490341574 3741.1 16384 490268204 3740.5 ElbowPlot(sfe) plotDimLoadings(sfe) spatialReducedDim(sfe, \"PCA\", 4, colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0) spatialReducedDim(sfe, \"PCA\", ncomponents = 2:3, colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"more-challenges-from-large-datasets","dir":"Articles","previous_headings":"","what":"More challenges from large datasets","title":"MERFISH mouse liver dataset and considerations of large data","text":", despite numerous cells, data loaded memory. data doesn’t fit memory? might write new vignette DelayedArray demonstrating memory data analysis Bioconductor 3.18. already supported SingleCellExperiment, SFE inherits . However, geometries, graphs, local results can take lot memory well. can possibly stored SQL databases operated SQLDataFrame. geometric operations can handled sedona, although options limited compared GEOS, performs geometric operations behind scene sf. Another question can raised large spatial transcriptomics data: still good idea analyze entire dataset ? must many interesting unique neighborhoods might get attention deserve whole dataset analyzed . 
, geographical space, national level data usually analyzed block resolution, although reason privacy subjects. County resolution often used, aren’t hundreds thousands counties. Many analyses done cities counties neighborhood resolution; using largest geographical unit isn’t always relevant. Back histological space: aggregate cells larger spatial units? decide scale spatial units (analogous nation vs state vs county etc) relevant? traditional anatomical ontologies, Allen Brain Atlas, isn’t available tissues. Also, single cell -omics data, traditional ontologies can improved. Furthermore, 3D thick section single cell resolution spatial transcriptomics data, STARmap (X. Wang et al. 2018) EASI-FISH (Y. Wang et al. 2021), although vast majority spatial -omics data thin sections pretty much de facto 2D. mostly live surface Earth, many 2D geospatial resources 3D. However, methods can principle applied 3D existing software primarily made 2D data might work. example, GEOS supports 3D data, principle 3D geometries sf work, although ’s little documentation . Also, k nearest neighbor, Moran’s , variograms, etc. principle work 3D, gstat officially supports 3D. challenges related 3D data: Even multiple z-planes imaged, resolution much lower z direction x y directions. z-plane treated attribute coordinate? make static plots 3D data publications? complicated plotting z-planes separately, since 3D block can sectioned direction. Also interactive visualization, need somehow see tissue. Finally, geospatial tradition one tradition relevant large spatial data, present Voyager works vector data. uncertain whether raster added later version, existing tools large raster data well, TileDB. 
traditions can relevant, astronomy image processing, beyond scope package.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"MERFISH mouse liver dataset and considerations of large data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 automap_1.1-9 #> [3] BiocNeighbors_1.20.2 gstat_2.1-1 #> [5] BiocSingular_1.18.0 BiocParallel_1.36.0 #> [7] spdep_1.3-3 sf_1.0-16 #> [9] spData_2.3.0 stringr_1.5.1 #> [11] patchwork_1.2.0 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [17] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [19] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [21] IRanges_2.36.0 S4Vectors_0.40.2 #> [23] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [25] matrixStats_1.3.0 SFEData_1.4.0 #> [27] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] xts_0.13.2 lifecycle_1.0.4 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] cowplot_1.1.3 RColorBrewer_1.1-3 #> [19] DBI_1.2.2 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] 
RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] e1071_1.7-14 systemfonts_1.0.6 #> [43] tools_4.3.3 ggnewscale_0.4.10 #> [45] ragg_1.3.0 Rcpp_1.0.12 #> [47] glue_1.7.0 gridExtra_2.3 #> [49] SparseArray_1.2.4 xfun_0.43 #> [51] dplyr_1.1.4 HDF5Array_1.30.1 #> [53] withr_3.0.0 BiocManager_1.30.22 #> [55] fastmap_1.1.1 boot_1.3-30 #> [57] rhdf5filters_1.14.1 bluster_1.12.0 #> [59] fansi_1.0.6 digest_0.6.35 #> [61] rsvd_1.0.5 R6_2.5.1 #> [63] mime_0.12 textshaping_0.3.7 #> [65] colorspace_2.1-0 wk_0.9.1 #> [67] scattermore_1.2 RSQLite_2.3.6 #> [69] hexbin_1.28.3 utf8_1.2.4 #> [71] generics_0.1.3 intervals_0.15.4 #> [73] FNN_1.1.4 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [79] scico_1.5.0 gtable_0.3.5 #> [81] blob_1.2.4 XVector_0.42.0 #> [83] htmltools_0.5.8.1 scales_1.3.0 #> [85] png_0.1-8 knitr_1.45 #> [87] rjson_0.2.21 spacetime_1.3-1 #> [89] curl_5.2.1 proxy_0.4-27 #> [91] cachem_1.0.8 zoo_1.8-12 #> [93] rhdf5_2.46.1 BiocVersion_3.18.1 #> [95] KernSmooth_2.23-22 parallel_4.3.3 #> [97] vipor_0.4.7 AnnotationDbi_1.64.1 #> [99] desc_1.4.3 s2_1.1.6 #> [101] reshape_0.8.9 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 plyr_1.8.9 #> [121] fs_1.6.4 ggbeeswarm_0.7.2 #> [123] stringi_1.8.3 stars_0.6-5 #> [125] viridisLite_0.4.2 deldir_2.0-4 #> [127] munsell_0.5.1 Biostrings_2.70.3 #> [129] Matrix_1.6-5 ExperimentHub_2.10.0 #> [131] 
sparseMatrixStats_1.14.0 bit64_4.0.5 #> [133] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [135] statmod_1.5.0 shiny_1.8.1.1 #> [137] highr_0.10 interactiveDisplayBase_1.40.0 #> [139] AnnotationHub_3.10.1 igraph_2.0.3 #> [141] memoise_2.0.1 bslib_0.7.0 #> [143] bit_4.0.5"},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"seqFISH exploratory data analysis","text":"data used vignette described Integration spatial single-cell transcriptomic data elucidates mouse organogenesis. Briefly, seqFISH use profile 351 genes several mouse embryos 8-12 somite stage (ss). focus single biological replicate, embryo 3. raw processed counts corresponding metadata available download Marioni lab. Expression matrices, segmentation data, segmented cell vertices provided R objects can readily imported R environment. data relevant vignette converted SFE object available download Box. data added SFEData package Bioconductor available 3.17 release. begin downloading data loading R. rows count matrix correspond 351 barcoded genes measured seqFISH. Additionally, authors provide metadata, including field view z-slice cell. 
filter count matrix metadata include cells single z-slice.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(batchelor) library(scater) library(scran) library(bluster) library(purrr) library(tidyr) library(dplyr) library(fossil) library(ggplot2) library(patchwork) library(spdep) library(BiocParallel) theme_set(theme_bw()) # Only Bioc 3.17 and above sfe <- LohoffGastrulationData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache names(colData(sfe)) #> [1] \"uniqueID\" \"embryo\" #> [3] \"pos\" \"z\" #> [5] \"x_global\" \"y_global\" #> [7] \"x_global_affine\" \"y_global_affine\" #> [9] \"embryo_pos\" \"embryo_pos_z\" #> [11] \"Area\" \"UMAP1\" #> [13] \"UMAP2\" \"celltype_mapped_refined\" #> [15] \"sample_id\" mask <- colData(sfe)$z == 2 sfe <- sfe[,mask]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"seqFISH exploratory data analysis","text":"begin quality control (QC) cells computing metrics common single-cell analysis store colData field SFE object. , compute number counts per cell. also compute average display violin plot. Notably, cells dataset fewer counts expected single-cell sequencing experiment cells higher counts seem dispersed throughout tissue. Fewer counts expected seqFISH experiments probing highly expressed genes may lead optical crowding multiple imaging rounds. Since counts collected several fields view, visualize number cells total counts field separately. variability total number counts field view. completely apparent accounts low number counts FOVs. example, FOV 22 fewest number cells, comparably counts detected regions cells (e.g. FOV 18). Next, compute number genes detected per cell, defined number genes non-zero counts. 
plot metric FOV done . Many cells fewer 100 detected genes. part reflects panel 351 probed genes chosen distinguish cell types developmental stages distinct cell types likely express small subset 351 genes. authors also note gene panel consists lowly expressed moderately expressed genes. Taken together, technical details can explain relatively low number counts genes per cell. , plot number genes detected per cell FOV. plot mirrors plot total counts. single FOV stands obvious outlier. authors provided cell type assignments metadata. can assess whether low quality cells tend located particular FOV. appears FOV 26 31 largest fraction low quality cells. Interestingly, correspond FOVs largest number cells overall. plot nCounts vs. nGenes FOV. scRNA-seq, gene expression variance seqFISH measurements overdispersed compared variance counts Poisson distributed. understand mean-variance relationship, compute mean variance gene among cells tissue. , perform calculation separately FOV red line represents line \\(y = x\\), mean-variance relationship expected Poisson distributed data. data deviate expectation FOV. 
case, variance greater expected.","code":"colData(sfe)$nCounts <- colSums(counts(sfe)) avg <- mean(colData(sfe)$nCounts) violin <- plotColData(sfe, \"nCounts\") + geom_hline(yintercept = avg, color='red') + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"seg_coords\") violin + spatial pos <- colData(sfe)$pos counts_spl <- split.data.frame(t(counts(sfe)), pos) # nCounts per FOV df <- map_dfr(counts_spl, rowSums, .id='pos') |> pivot_longer(cols=contains('embryo'), values_to = 'nCounts') |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> dplyr::filter(!is.na(nCounts)) cells_fov <- colData(sfe) |> as.data.frame() |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> ggplot(aes(pos,)) + geom_bar() + theme_minimal() + labs( x = \"\", y = \"Number of cells\") + theme(axis.text.x = element_text(angle = 90)) counts_fov <- ggplot(df, aes(pos, nCounts)) + geom_boxplot(outlier.size = 0.5) + theme_minimal() + labs(x = \"\", y = 'nCounts') + theme(axis.text.x = element_text(angle = 90)) cells_fov / counts_fov colData(sfe)$nGenes <- colSums(counts(sfe) > 0) avg <- mean(colData(sfe)$nGenes) violin <- plotColData(sfe, \"nGenes\") + geom_hline(yintercept = avg, color='red') + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"seg_coords\") violin + spatial df <- map_dfr(counts_spl, ~ rowSums(.x > 0), .id='pos') |> pivot_longer(cols = contains('embryo'), values_to = 'nGenes') |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> filter(!is.na(nGenes)) |> merge(df) genes_fov <- ggplot(df, aes(pos, nGenes)) + geom_boxplot(outlier.size = 0.5) + theme_bw() + labs(x = \"\") + theme(axis.text.x = element_text(angle = 90)) genes_fov meta <- data.frame(colData(sfe)) meta <- meta |> group_by(pos) |> add_tally(name = \"nCells_FOV\") |> filter(celltype_mapped_refined %in% \"Low 
quality\") |> add_tally(name = \"nLQ_FOV\") |> mutate(prop_lq = nLQ_FOV/nCells_FOV) |> distinct(pos, prop_lq) |> ungroup() |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) prop_lq <- ggplot(meta, aes(pos, prop_lq)) + geom_bar(stat = 'identity' ) + theme(axis.text.x = element_text(angle = 90)) prop_lq count_vs_genes_p <- ggplot(df, aes(nCounts, nGenes)) + geom_point( alpha = 0.5, size = 1, fill = \"white\" ) + facet_wrap(~ pos) count_vs_genes_p gene_meta <- map_dfr(counts_spl, colMeans, .id = 'pos') |> pivot_longer(cols = -pos, names_to = 'gene', values_to = 'mean') gene_meta <- map_dfr(counts_spl, ~colVars(.x, useNames = TRUE), .id = 'pos') |> pivot_longer(-pos, names_to = 'gene', values_to='variance') |> full_join(gene_meta) #> Joining with `by = join_by(pos, gene)` ggplot(gene_meta, aes(mean, variance)) + geom_point( alpha = 0.5, size = 1, fill = \"white\" ) + facet_wrap(~ pos) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"data-normalization-and-dimension-reduction","dir":"Articles","previous_headings":"","what":"Data normalization and dimension reduction","title":"seqFISH exploratory data analysis","text":"exploratory analysis indicates presence batch effects corresponding FOV. use normalization scheme batch aware. SFE object inherits SpatialExperiment SingleCellExperiment, classes, can take advantage normalization methods implemented scran batchelor R packages. first use multiBatchNorm() function scale data within batch. noted documentation, function uses median-based normalization ratio average counts batches. Batch correction dimension reduction accomplished using fastMNN() performs multi-sample PCA across multiple gene expression matrices project cells common low-dimensional space. 
function fastMNN returns batch-corrected matrix reducedDims slot SingleCellExperiment object. extract relevant data store SFE object. Now visualize first two PCs space. notice PCs may show spatial structure correlates biological niches cells. Unfortunately, FOV artifacts can still seen.","code":"sfe <- multiBatchNorm(sfe, batch = pos) sfe_red <- fastMNN(sfe, batch = pos, cos.norm = FALSE, d = 20) reducedDim(sfe, \"PCA\") <- reducedDim(sfe_red, \"corrected\") assay(sfe, \"reconstructed\") <- assay(sfe_red, \"reconstructed\") spatialReducedDim(sfe, \"PCA\", ncomponents = 2, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"clustering","dir":"Articles","previous_headings":"","what":"Clustering","title":"seqFISH exploratory data analysis","text":"Much like single cell analysis, can use batch-corrected data cluster cells. implement graph-based clustering algorithm plot resulting clusters space. plot colored cluster ID cell types provided author. authors assigned cells types identified clustering step. case, clustering results seem recapitulate major cell niches previous annotations. can compute Rand index using function fossil package assess similarity two clustering results. value 1 suggest clustering results identical, value 0 suggest results agree . 
relatively large Rand index suggests cells often found cluster cases.","code":"colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\"), BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\") ) ) plotSpatialFeature(sfe, c(\"cluster\", \"celltype_mapped_refined\"), colGeometryName = \"seg_coords\") g1 <- as.numeric(colData(sfe)$cluster) g2 <- as.numeric(colData(sfe)$celltype_mapped_refined) rand.index(g1, g2) #> [1] 0.8486922"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"univariate-spatial-statistics","dir":"Articles","previous_headings":"","what":"Univariate Spatial Statistics","title":"seqFISH exploratory data analysis","text":"point, may interested identifying genes exhibit spatial variability, whose expression depends spatial location within tissue. Measures spatial autocorrelation can useful identifying genes display spatial variability. Among common measures Moran’s Geary’s C. latter case, less 1 indicates positive spatial autocorrelation, value larger 1 points negative spatial autocorrelation. former case, positive negative values Moran’s indicate positive negative spatial autocorrelation, respectively. tests require spatial neighborhood graph computation statistic. several ways define spatial neighbors findSpatialNeighbors() function wraps methods implemented spdep package. , compute k-nearest neighborhood graph. dist_type = \"idw\" weights edges graph inverse distance neighbors. also save variable genes use computations . use runUnivariate() function compute spatial autocorrelation metrics save results SFE object. mc type test implements permutation test statistic relies nsim argument computing p-value statistic. can plot results Monte Carlo simulations: vertical line represents observed value Moran’s density represents Moran’s computed permuted data. simulations suggest spatial autocorrelation feature significant. 
function can also used plot geary.mc results. Now, might ask: genes display spatial autocorrelation? appears genes highest spatial autocorrelation seem obvious expression patterns tissue. interesting see genes also differentially expressed clusters . Non-spatial differential gene expression can interrogated using findMarkers() function implemented scran package complex methods identifying spatially variable genes actively developed. analyses bring interesting considerations. one, unclear whether normalization scheme employed effectively removes FOV batch effects. said, may times FOV differences expected represent biological differences, example context tumor sample. remains seen normalization methods perform best cases, represents area research.","code":"colGraph(sfe, \"knn5\") <- findSpatialNeighbors( sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") dec <- modelGeneVar(sfe) hvgs <- getTopHVGs(dec, n = 100) sfe <- runUnivariate( sfe, type = \"geary.mc\", features = hvgs, colGraphName = \"knn5\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- runUnivariate( sfe, type = \"moran.mc\", features = hvgs, colGraphName = \"knn5\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- colDataUnivariate( sfe, type = \"moran.mc\", features = c(\"nCounts\", \"nGenes\"), colGraphName = \"knn5\", nsim = 100) plotMoranMC(sfe, \"Meox1\") top_moran <- rownames(sfe)[order(-rowData(sfe)$moran.mc_statistic_sample01)[1:4]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"seg_coords\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"seqFISH exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocParallel_1.36.0 spdep_1.3-3 #> [3] sf_1.0-16 spData_2.3.0 #> [5] patchwork_1.2.0 fossil_0.4.0 #> [7] shapefiles_0.7.2 foreign_0.8-86 #> [9] maps_3.4.2 sp_2.1-4 #> [11] dplyr_1.1.4 tidyr_1.3.1 #> [13] purrr_1.0.2 bluster_1.12.0 #> [15] scran_1.30.2 scater_1.30.1 #> [17] ggplot2_3.5.1 scuttle_1.12.0 #> [19] batchelor_1.18.1 SpatialFeatureExperiment_1.3.0 #> [21] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [23] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [25] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [27] IRanges_2.36.0 S4Vectors_0.40.2 #> [29] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [31] matrixStats_1.3.0 SFEData_1.4.0 #> [33] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 RColorBrewer_1.1-3 #> [17] cowplot_1.1.3 DBI_1.2.2 #> [19] ResidualMatrix_1.12.0 abind_1.4-5 #> [21] zlibbioc_1.48.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 tools_4.3.3 #> [45] ggnewscale_0.4.10 ragg_1.3.0 #> 
[47] Rcpp_1.0.12 glue_1.7.0 #> [49] gridExtra_2.3 SparseArray_1.2.4 #> [51] xfun_0.43 HDF5Array_1.30.1 #> [53] withr_3.0.0 BiocManager_1.30.22 #> [55] fastmap_1.1.1 boot_1.3-30 #> [57] rhdf5filters_1.14.1 fansi_1.0.6 #> [59] digest_0.6.35 rsvd_1.0.5 #> [61] R6_2.5.1 mime_0.12 #> [63] textshaping_0.3.7 colorspace_2.1-0 #> [65] wk_0.9.1 RSQLite_2.3.6 #> [67] utf8_1.2.4 generics_0.1.3 #> [69] class_7.3-22 httr_1.4.7 #> [71] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [73] pkgconfig_2.0.3 scico_1.5.0 #> [75] gtable_0.3.5 blob_1.2.4 #> [77] XVector_0.42.0 htmltools_0.5.8.1 #> [79] scales_1.3.0 png_0.1-8 #> [81] knitr_1.45 rjson_0.2.21 #> [83] curl_5.2.1 proxy_0.4-27 #> [85] cachem_1.0.8 rhdf5_2.46.1 #> [87] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [89] parallel_4.3.3 vipor_0.4.7 #> [91] AnnotationDbi_1.64.1 desc_1.4.3 #> [93] s2_1.1.6 pillar_1.9.0 #> [95] grid_4.3.3 vctrs_0.6.5 #> [97] promises_1.3.0 BiocSingular_1.18.0 #> [99] dbplyr_2.5.0 beachmat_2.18.1 #> [101] xtable_1.8-4 cluster_2.1.6 #> [103] beeswarm_0.4.0 evaluate_0.23 #> [105] magick_2.8.3 cli_3.6.2 #> [107] locfit_1.5-9.9 compiler_4.3.3 #> [109] rlang_1.1.3 crayon_1.5.2 #> [111] labeling_0.4.3 classInt_0.4-10 #> [113] fs_1.6.4 ggbeeswarm_0.7.2 #> [115] viridisLite_0.4.2 deldir_2.0-4 #> [117] munsell_0.5.1 Biostrings_2.70.3 #> [119] Matrix_1.6-5 ExperimentHub_2.10.0 #> [121] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [123] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [125] statmod_1.5.0 shiny_1.8.1.1 #> [127] highr_0.10 interactiveDisplayBase_1.40.0 #> [129] AnnotationHub_3.10.1 igraph_2.0.3 #> [131] memoise_2.0.1 bslib_0.7.0 #> [133] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"CODEX exploratory data analysis","text":"","code":"library(Voyager) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(batchelor) library(scater) library(scran) 
library(bluster) library(glue) library(purrr) library(tidyr) library(dplyr) library(ggplot2) library(gghighlight) library(patchwork) library(spdep) library(spatialDE) library(BiocParallel) theme_set(theme_bw())"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"CODEX exploratory data analysis","text":"dataset used vignette paper Strategies Accurate Cell Type Identification CODEX Multiplexed Imaging Data (Hickey et al. 2021). data collected part HuBMAP consortium seeks characterize healthy human tissues make data broadly available. specifically, dataset characterizes 4 regions large intestine (colon) single donor. vignette focus data sigmoid colon. intestinal sections interrogated using multiplexed imaging method CO-Detection indEXing (CODEX). CODEX involves cyclical staining tissue DNA-barcoded antibodies. round experimentation, fluorescently labeled probes hybridize tissue bound DNA-conjugated antibodies subsequently imaged stripped tissue. present, technology quantifies 60 markers single experiment. Raw images generated process subjected image stitching, drift compensation, deconvolution, cycle concatenation using publicly available software. result pre-processing matrix contains location individual cells quantified markers cell. Cell types assigned described manuscript linked . Briefly, authors used hand-gating strategy define cell types create standard compare effect normalization methods clustering cell annotation. raw intensity data available download HuBMAP identifier HBM575.THQMM.284 cell type annotations provided supplementary data manuscript. data relevant vignette converted SFE object available download Box. data submitted SFEData package Bioconductor available future release. begin downloading data loading R. rows count matrix correspond 47 barcoded genes measured CODEX. Additionally, authors provide metadata cells, including cell type. 
turns column names unique cause errors downstream analysis. update column names ","code":"download.file(\"https://caltech.box.com/public/static/zfr8l20450n2z28lnp0ugdj471ph9eyx\",'./codex.Rds', mode='wb', method = 'wget', quiet = TRUE) sfe <- readRDS(\"./codex.Rds\") sfe #> class: SpatialFeatureExperiment #> dim: 47 19724 #> metadata(0): #> assays(1): protein #> rownames(47): MUC2 SOX9 ... CD49a CD163 #> rowData names(0): #> colnames(19724): 1 2 ... 182 184 #> colData names(9): cell_id cell_type ... fn sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : X Y #> imgData names(0): #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: cellids <- glue(\"{colData(sfe)$fn}_{colData(sfe)$cell_id}\") colnames(sfe) <- cellids"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"exploratory-data-analysis","dir":"Articles","previous_headings":"Dataset","what":"Exploratory Data Analysis","title":"CODEX exploratory data analysis","text":"can see figure colonic epithelium enriched cells loose connective tissue muscle layers beneath epithelial layer sparsely populated. line known colon histology. epithelium enriched goblet cells invaginations project inwards towards connective tissue. Smooth muscle cells also prominent colon, bands muscle contract move colonic contents towards rectum. can visualize cell types space using plotSpatialFeature() function. highlight Goblet smooth muscle cells display relative distribution tissue. Since CODEX image processing relies segmentation, dot plot represents single cell. , cell represented centroid, can also visualized cell polygons cases segmentation mask available. goblet cells clearly define epithelial border tissue thick bands smooth muscle cells prominent mucosa. Next, compute gene level metrics 47 barcoded genes. contrast RNA-based methods, fields matrix represent intensities rather counts. 
appears sigmoid relationship mean variance protein expression. pattern reminiscent might expected intensity values derived Gamma distribution, continuous analog Negative Binomial distribution typically used describe count data scRNA-seq experiments. may implications CODEX data variance stabilized future. CODEX data subject noise several sources including segmentation artifacts, nonspecific staining, imperfect tissue processing. factors can limit accurate quantification signal intensity impede accurate cell annotation. authors dataset tested effects several normalization methods cell type annotation clustering found Z-score normalization marker resulted accurate identification rare common cell types. cell , demonstrate accomplish using standard matrix operations. normalized count matrix typically stored logcounts slot scRNA-seq data, instead store normalized matrix slot called normalizedIntensity.","code":"celldensity <- plotCellBin2D(sfe) celldensity spatial <- plotSpatialFeature(sfe, features='cell_type', colGeometryName = \"centroids\") + gghighlight(cell_type %in% c(\"Goblet\", \"SmoothMuscleME\")) #> Warning: Tried to calculate with group_by(), but the calculation failed. #> Falling back to ungrouped filter operation... spatial rowData(sfe)$mean <- rowMeans(assay(sfe)) rowData(sfe)$var <- rowVars(assay(sfe)) data.frame(rowData(sfe)) |> ggplot(aes(mean, var)) + geom_point() mtx <- assay(sfe, 'protein') assay(sfe, 'normalizedIntensity') <- (mtx - rowMeans(mtx))/rowSds(mtx) assays(sfe) #> List of length 2 #> names(2): protein normalizedIntensity"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"spatial-eda","dir":"Articles","previous_headings":"","what":"Spatial EDA","title":"CODEX exploratory data analysis","text":"Neighbor definition critical step computation metrics spatial dependency like Moran’s Geary’s C. definition neighbors complex, even cell polygons available. 
latter case, poly2nb method might appropriate assign two cells neighbors physically touch share border. may tenable cases cells sparse cells represented centroids, dataset. compute spatial neighborhood graph using knearneigh function implemented spdep. brief, Euclidean distances computed pair cells k nearest cells considered neighbors. following code cell, consider k=10 speed purposes, may ideal general. weights neighborhood matrix inverse-distance weighted, weight regions listed neighbors increases distance pairs points decreases. Setting style = \"W\" ensures weights row standardized. plotColGraph() function plots graph space along corresponding colGeometry, since many cells dataset, plotting neighborhood graph may useful many connections obscure overlapping lines. case, demonstrate use function . Next, explore univariate metrics global spatial autocorrelation. Since genes quantified study, compute metrics genes. larger datasets, may useful restrict analysis variable genes. use runUnivariate() function compute spatial autocorrelation metrics save results SFE object. results computations accessible rowData attribute SFE object. Next, plot results genes highest Moran’s statistic. vertical line plot represents observed Moran’s density represents Moran’s statistic random permutations data. plots suggest Moran’s statistic significant. can plot normalized intensity genes space. genes appear spatial distribution, also seems may overlap cell type. cells appear express genes interest seem spatially restricted known boundaries tissue. moranPlot() function plots spatial data spatially lagged values enables users assess similar observed values neighbors. variable centered, plot divided four quadrants defined horizontal line y = 0 vertical line x = 0. 
Points upper right (high-high) lower left (low-low) quadrants indicate positive spatial association, points lower right (high-low) upper left (low-high) quadrants include observations exhibit negative spatial association.","code":"colGraph(sfe, \"knn10\") <- findSpatialNeighbors( sfe, method = \"knearneigh\", dist_type = \"idw\", k = 10, style = \"W\") #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' plotColGraph(sfe, colGraphName = \"knn10\", colGeometryName = 'centroids') sfe <- runUnivariate( sfe, type = \"moran.mc\", features = rownames(sfe), exprs_values = \"normalizedIntensity\", colGraphName = \"knn10\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- runUnivariate( sfe, type = \"moran.plot\", features = rownames(sfe), exprs_values = \"normalizedIntensity\", colGraphName = \"knn10\") colnames(rowData(sfe)) #> [1] \"mean\" \"var\" #> [3] \"moran.mc_statistic_sample01\" \"moran.mc_parameter_sample01\" #> [5] \"moran.mc_p.value_sample01\" \"moran.mc_alternative_sample01\" #> [7] \"moran.mc_method_sample01\" \"moran.mc_res_sample01\" top_moran <- data.frame(rowData(sfe)) |> arrange(desc(moran.mc_statistic_sample01)) |> head(6) |> rownames() moran <- plotMoranMC(sfe, features = top_moran, facet_by = 'features') moran plotSpatialFeature( sfe, features=top_moran, colGeometryName = \"centroids\", exprs_values = \"normalizedIntensity\", scattermore = TRUE, pointsize = 1) moranPlot(sfe, top_moran[1])"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential Expression","title":"CODEX exploratory data analysis","text":"Moran’s global spatial autocorrelation metrics provide insight spatial patterns gene expression, necessarily limited structure imposed spatial weights matrix. complementary task might identify spatially variable (SV) genes. 
One method described SpatialDE: identification spatially variable genes. method described manuscript relies Gaussian process regression decomposes variability expression spatial non-spatial components. contrast Moran’s , covariance pair cells modeled function distance . Notably, require explicit specification neighborhood graph, rather parameter controls decay covariance distance increases. spatialDE package implemented R requires normalized matrix input. spatialDE() function package performs normalization steps running algorithm. data already normalized, use run() function directly run spatialDE. first convert centroid coordinates data frame required function. can plot normalized expression top 5 genes space. Perhaps unsurprisingly, expression top DE genes seems highlight spatial distribution known cell types tissue rather identify spatially restricted gene expression. related experimental design targeted genes chosen differentiate cell types. Perhaps genome-wide technologies, potential discovery new gene expression patterns plausible. open question whether results offer new information compared inferred typical DE expression methods. analyses represent minority types inferences can made protein expression data. interested investigate protein expression results compare inform data spatial scRNA-sequencing experiments. Already, work done obtain multimodal spatial measurements sample. Importantly however, considerations made types biases individual technology adds measurements. 
active areas research ripe future exploration.","code":"# Store coordinates in a data frame object coords <- centroids(sfe)$geometry |> purrr::map_dfr(\\(x) c(x = x[1], y = x[2])) # de_res <- spatialDE::run(assay(sfe,\"normalizedIntensity\"), coords, verbose=TRUE) # top_genes <- de_res |> # arrange(pval) |> # slice_head(n=6) |> # pull(g) # # plotSpatialFeature(sfe, top_genes, colGeometryName=\"centroids\", # exprs_values = \"normalizedIntensity\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"CODEX exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocParallel_1.36.0 spatialDE_1.8.1 #> [3] spdep_1.3-3 sf_1.0-16 #> [5] spData_2.3.0 patchwork_1.2.0 #> [7] gghighlight_0.4.1 dplyr_1.1.4 #> [9] tidyr_1.3.1 purrr_1.0.2 #> [11] glue_1.7.0 bluster_1.12.0 #> [13] scran_1.30.2 scater_1.30.1 #> [15] ggplot2_3.5.1 scuttle_1.12.0 #> [17] batchelor_1.18.1 SpatialFeatureExperiment_1.3.0 #> [19] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [21] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [23] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [25] IRanges_2.36.0 S4Vectors_0.40.2 #> [27] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [29] matrixStats_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] 
RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] memoise_2.0.1 DelayedMatrixStats_1.24.0 #> [15] RCurl_1.98-1.14 terra_1.7-71 #> [17] htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] BiocNeighbors_1.20.2 Rhdf5lib_1.24.2 #> [21] s2_1.1.6 SparseArray_1.2.4 #> [23] rhdf5_2.46.1 sass_0.4.9 #> [25] KernSmooth_2.23-22 bslib_0.7.0 #> [27] basilisk_1.14.3 htmlwidgets_1.6.4 #> [29] desc_1.4.3 cachem_1.0.8 #> [31] ResidualMatrix_1.12.0 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 Matrix_1.6-5 #> [37] R6_2.5.1 fastmap_1.1.1 #> [39] GenomeInfoDbData_1.2.11 digest_0.6.35 #> [41] colorspace_2.1-0 ggnewscale_0.4.10 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] filelock_1.0.3 fansi_1.0.6 #> [51] mgcv_1.9-1 abind_1.4-5 #> [53] compiler_4.3.3 proxy_0.4-27 #> [55] withr_3.0.0 backports_1.4.1 #> [57] viridis_0.6.5 DBI_1.2.2 #> [59] highr_0.10 HDF5Array_1.30.1 #> [61] MASS_7.3-60.0.1 DelayedArray_0.28.0 #> [63] rjson_0.2.21 classInt_0.4-10 #> [65] tools_4.3.3 units_0.8-5 #> [67] vipor_0.4.7 beeswarm_0.4.0 #> [69] nlme_3.1-164 rhdf5filters_1.14.1 #> [71] grid_4.3.3 checkmate_2.3.1 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] isoband_0.2.7 gtable_0.3.5 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 metapod_1.10.1 #> [81] sp_2.1-4 utf8_1.2.4 #> [83] XVector_0.42.0 ggrepel_0.9.5 #> [85] pillar_1.9.0 limma_3.58.1 #> [87] splines_4.3.3 lattice_0.22-6 #> [89] deldir_2.0-4 tidyselect_1.2.1 #> [91] locfit_1.5-9.9 knitr_1.45 #> [93] gridExtra_2.3 edgeR_4.0.16 #> [95] scattermore_1.2 xfun_0.43 #> [97] statmod_1.5.0 yaml_2.3.8 #> [99] boot_1.3-30 evaluate_0.23 #> [101] codetools_0.2-20 tibble_3.2.1 #> [103] cli_3.6.2 reticulate_1.36.1 #> [105] systemfonts_1.0.6 munsell_0.5.1 #> [107] jquerylib_0.1.4 Rcpp_1.0.12 #> [109] 
dir.expiry_1.10.0 png_0.1-8 #> [111] parallel_4.3.3 pkgdown_2.0.9 #> [113] basilisk.utils_1.14.1 sparseMatrixStats_1.14.0 #> [115] bitops_1.0-7 viridisLite_0.4.2 #> [117] scales_1.3.0 e1071_1.7-14 #> [119] crayon_1.5.2 scico_1.5.0 #> [121] rlang_1.1.3"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig9_splitseq.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"SPLiT-seq basic quality control","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. begin loading object converting SpatialFeatureExperiment object.","code":"library(stringr) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) if (!file.exists(\"splitseq.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/splitseq.rds\", destfile = \"splitseq.rds\") sce <- readRDS(\"splitseq.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 37 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") #> Warning: Removed 7213 rows containing non-finite outside the scale range #> (`stat_ydensity()`). #> Warning: Removed 7213 rows containing missing values or values outside the scale range #> (`position_quasirandom()`). plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$subsets_mito_percent < 20)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 18272 102057 #> metadata(0): #> assays(1): counts #> rownames(18272): ENSMUSG00000086053.2 ENSMUSG00000051285.18 ... #> ENSMUSG00000079808.4 ENSMUSG00000095041.8 #> rowData names(1): gene_name #> colnames(102057): AAACATCGAAACATCGACTTCATC AAACATCGAAACATCGAGTCTTGG ... #> TTCACGCATTCACGCATCATATTC TTCACGCATTCACGCATTCATCGC #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] Matrix_1.6-5 stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] 
ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] abind_1.4-5 compiler_4.3.3 #> [51] proxy_0.4-27 withr_3.0.0 #> [53] BiocParallel_1.36.0 viridis_0.6.5 #> [55] DBI_1.2.2 highr_0.10 #> [57] HDF5Array_1.30.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] bluster_1.12.0 tools_4.3.3 #> [63] units_0.8-5 vipor_0.4.7 #> [65] beeswarm_0.4.0 glue_1.7.0 #> [67] rhdf5filters_1.14.1 grid_4.3.3 #> [69] sf_1.0-16 cluster_2.1.6 #> [71] generics_0.1.3 gtable_0.3.5 #> [73] class_7.3-22 BiocSingular_1.18.0 #> [75] ScaledMatrix_1.10.0 sp_2.1-4 #> [77] utf8_1.2.4 XVector_0.42.0 #> [79] ggrepel_0.9.5 pillar_1.9.0 #> [81] limma_3.58.1 dplyr_1.1.4 #> [83] lattice_0.22-6 deldir_2.0-4 #> [85] tidyselect_1.2.1 locfit_1.5-9.9 #> [87] knitr_1.45 gridExtra_2.3 #> [89] edgeR_4.0.16 xfun_0.43 #> [91] statmod_1.5.0 stringi_1.8.3 #> [93] yaml_2.3.8 boot_1.3-30 #> [95] evaluate_0.23 codetools_0.2-20 #> [97] tibble_3.2.1 cli_3.6.2 #> [99] reticulate_1.36.1 systemfonts_1.0.6 #> [101] munsell_0.5.1 jquerylib_0.1.4 #> [103] Rcpp_1.0.12 png_0.1-8 #> [105] parallel_4.3.3 pkgdown_2.0.9 #> [107] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [109] viridisLite_0.4.2 scales_1.3.0 #> [111] e1071_1.7-14 purrr_1.0.2 #> [113] crayon_1.5.2 
scico_1.5.0 #> [115] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic analysis of 10X example Visium dataset","text":"introductory vignette SpatialFeatureExperiment data representation Voyager analysis package, demonstrate basic exploratory data analysis (EDA) spatial transcriptomics data. Basic knowledge R SingleCellExperiment assumed. vignette showcases packages Visium spatial gene expression system dataset, downloaded 10X website, Space Ranger output format. technology chosen due popularity, therefore availability numerous publicly available datasets analysis (Moses Pachter 2022). Voyager developed goal facilitating use geospatial methods spatial genomics, introductory vignette restricted non-spatial scRNA-seq EDA Visium dataset. another Visium introductory vignette using dataset SFEData package 10X website. load packages used vignette. download data 10X website. unfiltered gene count matrix: spatial information: Decompress downloaded content: outs directory Space Ranger output looks like: gene count matrix directory: spatial directory: outputs spatial directory explained 10X website. tissue_hires_image.png relatively high resolution image tissue, full resolution. tissue_lowres_image.png file low resolution image tissue, suitable quick plotting, shown : array dots framing tissue seen image fiducials, used align tissue image positions Visium spots, gene expression can matched spatial locations. alignment fiducials shown aligned_fiducials.jpg. Space Ranger can automatically detect spots tissue, spots highlighted detected_tissue_image.jpg. Inside scalefactors_json.json file: spot_diameter_fullres diameter Visium spot full resolution H&E image pixels. tissue_hires_scalef tissue_lowres_scalef ratio size high resolution (full resolution) low resolution H&E image full resolution image. 
fiducial_diameter_fullres diameter fiducial spot used align spots H&E image pixels full resolution image. tissue_positions_list.csv file contains information coordinates spots full resolution image whether spot tissue (in_tissue, 1 means yes 0 means ) automatically detected Space Ranger manually annotated Loupe browser. spatial_enrichment.csv file Moran’s (presumably spots tissue) p-value gene detected least 10 spots least 20 UMIs. read Space Ranger output R SFE object:","code":"library(Voyager) library(SpatialExperiment) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(ggplot2) library(scater) library(scuttle) library(scran) library(stringr) library(patchwork) library(bluster) library(rjson) theme_set(theme_bw()) if (!file.exists(\"visium_ob.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_raw_feature_bc_matrix.tar.gz\", destfile = \"visium_ob.tar.gz\") if (!file.exists(\"visium_ob_spatial.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_spatial.tar.gz\", destfile = \"visium_ob_spatial.tar.gz\") if (!dir.exists(\"outs\")) { dir.create(\"outs\") system(\"tar -xvf visium_ob.tar.gz -C outs\") system(\"tar -xvf visium_ob_spatial.tar.gz -C outs\") } list.dirs(\"outs\") #> [1] \"outs\" \"outs/raw_feature_bc_matrix\" #> [3] \"outs/spatial\" list.files(\"outs/raw_feature_bc_matrix\") #> [1] \"barcodes.tsv.gz\" \"features.tsv.gz\" \"matrix.mtx.gz\" list.files(\"outs/spatial\") #> [1] \"aligned_fiducials.jpg\" \"detected_tissue_image.jpg\" #> [3] \"scalefactors_json.json\" \"spatial_enrichment.csv\" #> [5] \"tissue_hires_image.png\" \"tissue_lowres_image.png\" #> [7] \"tissue_positions.csv\" fromJSON(file = \"outs/spatial/scalefactors_json.json\") #> $tissue_hires_scalef #> [1] 0.2 #> #> $tissue_lowres_scalef #> [1] 0.06 #> #> $fiducial_diameter_fullres #> [1] 118.9155 #> #> 
$spot_diameter_fullres #> [1] 73.61433 head(read.csv(\"outs/spatial/tissue_positions.csv\")) #> barcode in_tissue array_row array_col pxl_row_in_fullres #> 1 ACGCCTGACACGCGCT-1 0 0 0 8668 #> 2 TACCGATCCAACACTT-1 0 1 1 8611 #> 3 ATTAAAGCGGACGAGC-1 0 0 2 8554 #> 4 GATAAGGGACGATTAG-1 0 1 3 8498 #> 5 GTGCAAATCACCAATA-1 0 0 4 8441 #> 6 TGTTGGCTGGCGGAAG-1 0 1 5 8384 #> pxl_col_in_fullres #> 1 1102 #> 2 1200 #> 3 1102 #> 4 1200 #> 5 1102 #> 6 1200 head(read.csv(\"outs/spatial/spatial_enrichment.csv\")) #> Feature.ID Feature.Name Feature.Type I P.value #> 1 ENSMUSG00000001023 S100a5 Gene Expression 0.7709048 0 #> 2 ENSMUSG00000019874 Fabp7 Gene Expression 0.6987346 0 #> 3 ENSMUSG00000002985 Apoe Gene Expression 0.6945210 0 #> 4 ENSMUSG00000025739 Gng13 Gene Expression 0.6585750 0 #> 5 ENSMUSG00000090223 Pcp4 Gene Expression 0.6317032 0 #> 6 ENSMUSG00000053310 Nrgn Gene Expression 0.6033600 0 #> Adjusted.p.value Feature.Counts.in.Spots.Under.Tissue #> 1 0 9019 #> 2 0 13462 #> 3 0 67509 #> 4 0 5260 #> 5 0 45118 #> 6 0 10723 #> Median.Normalized.Average.Counts Barcodes.Detected.per.Feature #> 1 15.848669 1021 #> 2 20.679932 1170 #> 3 76.635169 1184 #> 4 8.803694 1050 #> 5 25.811125 1133 #> 6 6.075966 898 (sfe <- read10xVisiumSFE(samples = \".\", type = \"sparse\", data = \"raw\")) #> class: SpatialFeatureExperiment #> dim: 32285 4992 #> metadata(0): #> assays(1): counts #> rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... #> ENSMUSG00000095019 ENSMUSG00000095041 #> rowData names(8): symbol Feature.Type ... #> Median.Normalized.Average.Counts_sample01 #> Barcodes.Detected.per.Feature_sample01 #> colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... 
#> TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Basic analysis of 10X example Visium dataset","text":"mouse olfactory bulb conventionally plotted horizontally. entire SFE object can transposed histological space make olfactory bulb horizontal. Percentage mitochondrial counts spots outside tissue higher near tissue, especially left. 3 peaks, apparently histologically relevant. Also obvious outliers. unlike scRNA-seq data. Spots tissue wide range mitochondrial percentage. 
Spots tissue fall 3 clusters plot, seemingly related histological regions.","code":"is_mt <- str_detect(rowData(sfe)$symbol, \"^mt-\") sfe <- addPerCellQCMetrics(sfe, subsets = list(mito = is_mt)) names(colData(sfe)) #> [1] \"in_tissue\" \"array_row\" \"array_col\" #> [4] \"sample_id\" \"sum\" \"detected\" #> [7] \"subsets_mito_sum\" \"subsets_mito_detected\" \"subsets_mito_percent\" #> [10] \"total\" sfe <- SpatialFeatureExperiment::transpose(sfe) plotSpatialFeature(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\"), image_id = \"lowres\", maxcell = 5e4, ncol = 2) plotColData(sfe, \"sum\", x = \"in_tissue\", color_by = \"in_tissue\") + plotColData(sfe, \"detected\", x = \"in_tissue\", color_by = \"in_tissue\") + plotColData(sfe, \"subsets_mito_percent\", x = \"in_tissue\", color_by = \"in_tissue\") + plot_layout(guides = \"collect\") plotColData(sfe, x = \"sum\", y = \"subsets_mito_percent\", color_by = \"in_tissue\") + geom_density_2d() sfe_tissue <- sfe[,sfe$in_tissue] plotColData(sfe_tissue, x = \"sum\", y = \"detected\", bins = 75) #clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue, lowess = FALSE) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension reduction and clustering","title":"Basic analysis of 10X example Visium dataset","text":"clustering show dimension reduction plots Significant markers cluster can obtained follows: genes interesting view spatial context: spatial analyses dataset performed “advanced” version vignette.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat ElbowPlot(sfe_tissue, ndims = 30) names(rowData(sfe_tissue)) #> [1] \"symbol\" #> 
[2] \"Feature.Type\" #> [3] \"I_sample01\" #> [4] \"P.value_sample01\" #> [5] \"Adjusted.p.value_sample01\" #> [6] \"Feature.Counts.in.Spots.Under.Tissue_sample01\" #> [7] \"Median.Normalized.Average.Counts_sample01\" #> [8] \"Barcodes.Detected.per.Feature_sample01\" plotDimLoadings(sfe_tissue, dims = 1:5, swap_rownames = \"symbol\", ncol = 3) set.seed(29) colData(sfe_tissue)$cluster <- clusterRows(reducedDim(sfe_tissue, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotPCA(sfe_tissue, ncomponents = 5, colour_by = \"cluster\") plotSpatialFeature(sfe_tissue, features = \"cluster\", colGeometryName = \"spotPoly\", image_id = \"lowres\") spatialReducedDim(sfe_tissue, \"PCA\", ncomponents = 5, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, ncol = 2, image_id = \"lowres\", maxcell = 5e4) markers <- findMarkers(sfe_tissue, groups = colData(sfe_tissue)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe_tissue, rowData(sfe_tissue)[genes_use, \"symbol\"], x = \"cluster\", colour_by = \"cluster\", swap_rownames = \"symbol\") plotSpatialFeature(sfe_tissue, genes_use, colGeometryName = \"spotPoly\", ncol = 2, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Basic analysis of 10X example Visium dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: 
/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] rjson_0.2.21 bluster_1.12.0 #> [3] patchwork_1.2.0 stringr_1.5.1 #> [5] scran_1.30.2 scater_1.30.1 #> [7] scuttle_1.12.0 ggplot2_3.5.1 #> [9] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [11] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [13] Biobase_2.62.0 GenomicRanges_1.54.1 #> [15] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [17] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [19] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [21] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 Matrix_1.6-5 #> [37] R6_2.5.1 fastmap_1.1.1 #> [39] GenomeInfoDbData_1.2.11 digest_0.6.35 #> [41] colorspace_2.1-0 ggnewscale_0.4.10 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 R.utils_2.12.3 #> [59] HDF5Array_1.30.1 
MASS_7.3-60.0.1 #> [61] DelayedArray_0.28.0 classInt_0.4-10 #> [63] tools_4.3.3 units_0.8-5 #> [65] vipor_0.4.7 beeswarm_0.4.0 #> [67] R.oo_1.26.0 glue_1.7.0 #> [69] rhdf5filters_1.14.1 grid_4.3.3 #> [71] sf_1.0-16 cluster_2.1.6 #> [73] generics_0.1.3 isoband_0.2.7 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 metapod_1.10.1 #> [81] sp_2.1-4 utf8_1.2.4 #> [83] XVector_0.42.0 ggrepel_0.9.5 #> [85] pillar_1.9.0 limma_3.58.1 #> [87] dplyr_1.1.4 lattice_0.22-6 #> [89] deldir_2.0-4 tidyselect_1.2.1 #> [91] locfit_1.5-9.9 knitr_1.45 #> [93] gridExtra_2.3 edgeR_4.0.16 #> [95] xfun_0.43 statmod_1.5.0 #> [97] DropletUtils_1.22.0 stringi_1.8.3 #> [99] yaml_2.3.8 boot_1.3-30 #> [101] evaluate_0.23 codetools_0.2-20 #> [103] tibble_3.2.1 cli_3.6.2 #> [105] systemfonts_1.0.6 munsell_0.5.1 #> [107] jquerylib_0.1.4 Rcpp_1.0.12 #> [109] parallel_4.3.3 pkgdown_2.0.9 #> [111] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [113] viridisLite_0.4.2 scales_1.3.0 #> [115] e1071_1.7-14 purrr_1.0.2 #> [117] crayon_1.5.2 scico_1.5.0 #> [119] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Spatial analysis of 10X example Visium dataset","text":"introductory vignette, performed basic non-spatial analyses mouse olfactory bulb Visium dataset 10X website. vignette, perform spatial analyses histological space well gene expression space. load packages used vignette: download data 10X website. unfiltered gene count matrix: spatial information: Decompress downloaded content: Contents outs directory Space Ranger explained introductory vignette. read data R SFE object. 
add QC metrics, already plotted introductory vignette.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(ggplot2) library(scater) library(scuttle) library(scran) library(stringr) library(patchwork) library(bluster) library(rjson) library(EBImage) library(terra) library(rlang) library(sf) library(rmapshaper) library(dplyr) library(BiocParallel) library(BiocNeighbors) library(reticulate) theme_set(theme_bw()) # Specify Python version to use gget PY_PATH <- system(\"which python\", intern = TRUE) use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function gget <- import(\"gget\") if (!file.exists(\"visium_ob.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_raw_feature_bc_matrix.tar.gz\", destfile = \"visium_ob.tar.gz\") if (!file.exists(\"visium_ob_spatial.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_spatial.tar.gz\", destfile = \"visium_ob_spatial.tar.gz\") if (!dir.exists(\"outs\")) { dir.create(\"outs\") system(\"tar -xvf visium_ob.tar.gz -C outs\") system(\"tar -xvf visium_ob_spatial.tar.gz -C outs\") } (sfe <- read10xVisiumSFE(samples = \".\", type = \"sparse\", data = \"raw\")) #> class: SpatialFeatureExperiment #> dim: 32285 4992 #> metadata(0): #> assays(1): counts #> rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... 
#> ENSMUSG00000095019 ENSMUSG00000095041 #> rowData names(8): symbol Feature.Type ... #> Median.Normalized.Average.Counts_sample01 #> Barcodes.Detected.per.Feature_sample01 #> colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... #> TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> sample01: is_mt <- str_detect(rowData(sfe)$symbol, \"^mt-\") sfe <- addPerCellQCMetrics(sfe, subsets = list(mito = is_mt))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"tissue-segmentation","dir":"Articles","previous_headings":"","what":"Tissue segmentation","title":"Spatial analysis of 10X example Visium dataset","text":"Space Ranger can automatically detect spots tissue Loupe browser can used manually annotate spots tissue, may interesting get tissue outline polygon, know much spot overlaps tissue plot outline. tissue boundary polygon can manually annotated QuPath, saves polygon GeoJSON can directly read R st_read(). can segment tissue computationally. R generally isn’t great image processing, packages can perform segmentation, EBImage, based house C C++ code, imager, based CImg. don’t full resolution image. perform tissue segmentation high resolution downsampled image scale make coordinates tissue boundary match spots. EBImage package used . Compared OpenCV, EBImage slow full resolution image, fine downsized image. rendered static webpage, image static, run interactively, image shown interactive widget can zoom pan. show RGB channels separately tissue can discerned thresholding. tall peak right background. much lower peaks around 0.6 0.85 must tissue. 
capture faint bluish region, blue channel used thresholding. threshold chosen based histogram experimenting nearby values. use opening operation (erosion followed dilation) denoise small holes tissue, can removed closing operation (dilation followed erosion): larger holes tissue mask, may real holes faint regions nuclei missed thresholding. might large enough affect Visium spots intersect tissue. Now main piece tissue clear. must object largest area. However, two small pieces belong tissue top left. debris fiducials can removed setting pixels mask outside bounding box main piece 0. assign different value contiguous object bwlabel(), use computeFeatures.shape() find area among shape features (e.g. perimeter) object. remove small pieces debris. Object number 797 piece debris bottom left. pieces area 100 pixels tissue. Since debris really small bits tissue, boundary debris tissue can blurry. two distinguished morphology H&E image proximity main tissue. remove debris mask Since holes mask faint regions tissue missed thresholding, holes filled segmentation process took lot manual oversight, choosing threshold, choosing kernel size shape opening closing operations, deciding whether fill holes, deciding debris tissue.","code":"img <- readImage(\"outs/spatial/tissue_hires_image.png\") display(img) img2 <- img colorMode(img2) <- Grayscale display(img2, all = TRUE) hist(img) mask <- img2[,,3] < 0.87 display(mask) kern <- makeBrush(3, shape='disc') mask_open <- opening(mask, kern) display(mask_open) mask_close <- closing(mask_open, kern) display(mask_close) mask_label <- bwlabel(mask_close) fts <- computeFeatures.shape(mask_label) head(fts) #> s.area s.perimeter s.radius.mean s.radius.sd s.radius.min s.radius.max #> 1 39 25 3.428773 1.3542219 1.4176036 5.762777 #> 2 20 14 2.032665 0.3439068 1.5000000 2.500000 #> 3 8 8 1.144123 0.4370160 0.7071068 1.581139 #> 4 14 10 1.689175 0.2160726 1.5811388 2.121320 #> 5 15 12 1.716761 0.4684015 1.0000000 2.236068 #> 6 9 8 1.207107 
0.2071068 1.0000000 1.414214 summary(fts[,\"s.area\"]) #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 8.0 14.0 51.0 595.9 345.0 496326.0 max_ind <- which.max(fts[,\"s.area\"]) inds <- which(as.array(mask_label) == max_ind, arr.ind = TRUE) head(inds) #> row col #> [1,] 1168 562 #> [2,] 1169 562 #> [3,] 1170 562 #> [4,] 1158 563 #> [5,] 1159 563 #> [6,] 1160 563 row_inds <- c(seq_len(min(inds[,1])-1), seq(max(inds[,1])+1, nrow(mask_label), by = 1)) col_inds <- c(seq_len(min(inds[,2])-1), seq(max(inds[,2])+1, nrow(mask_label), by = 1)) mask_label[row_inds, ] <- 0 mask_label[,col_inds] <- 0 display(mask_label) unique(as.vector(mask_label)) #> [1] 0 421 425 429 430 438 450 458 461 469 473 483 487 505 523 633 640 642 651 #> [20] 678 739 741 757 762 775 778 789 791 797 805 810 813 820 821 822 826 831 838 #> [39] 839 840 843 845 848 849 861 862 863 fts2 <- fts[unique(as.vector(mask_label))[-1],] fts2 <- fts2[order(fts2[,\"s.area\"], decreasing = TRUE),] plot(fts2[,1][-1], type = \"l\", ylab = \"Area\") head(fts2, 10) #> s.area s.perimeter s.radius.mean s.radius.sd s.radius.min s.radius.max #> 421 496326 3151 395.118732 68.6493949 234.1605637 485.715835 #> 450 217 55 7.840627 1.2883202 5.0010247 10.458197 #> 849 211 63 7.961248 2.0228566 3.8569142 12.189753 #> 797 182 56 7.547362 2.3368359 3.0772370 11.839919 #> 461 136 54 7.186020 3.2751628 0.9255555 12.479805 #> 741 92 56 6.365661 2.8382829 1.3273276 11.653219 #> 840 69 33 4.503526 1.6417370 1.6026264 7.076974 #> 862 63 37 4.854424 2.4445530 0.6361407 8.837838 #> 839 45 25 3.305562 0.7074306 1.9320455 4.526897 #> 775 32 22 2.887407 1.1755375 0.5300865 4.543636 #display(mask_label == 797) mask_label[mask_label %in% c(797, as.numeric(rownames(fts2)[fts2[,1] < 100]))] <- 0 mask_label <- fillHull(mask_label) display(paintObjects(mask_label, img, col=c(\"red\", \"yellow\"), opac=c(1, 
0.3)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"convert-tissue-mask-to-polygon","dir":"Articles","previous_headings":"","what":"Convert tissue mask to polygon","title":"Spatial analysis of 10X example Visium dataset","text":"Now tissue mask, convert polygon. OpenCV can directly perform conversion, isn’t comprehensive R wrapper OpenCV, conversion convoluted R. first convert Image object raster implemented terra, core R package geospatial raster data. terra can convert raster polygon. image downsized, polygon look quite pixelated. mitigate pixelation save memory, ms_simplify() function used simplify polygon, keeping small proportion vertices. st_simplify() function sf can also simplify polygons, can’t specify proportion vertices keep. adding geometry SFE object, needs scaled match coordinates spots mouse olfactory bulb conventionally plotted horizontally. entire SFE object can transposed histological space make olfactory bulb horizontal. can use geometric operations find spots intersect tissue, spots covered tissue, much spot intersects tissue. Discrepancies Space Ranger’s annotation annotation based tissue segmentation : Spots margin can intersect tissue without covered . can also get geometries intersections tissue Visium spots, calculate percentage spot tissue. However, percentage may useful tissue segmentation subject error. percentage may useful pathologist annotated histological regions objects nuclei myofibers. spots intersect tissue, total counts relate percentage spot tissue? 
Spots not fully covered tissue lower total UMI counts, can due not fully tissue cell types lower total counts histological region near edge, spots fully covered tissue also low UMI counts.","code":"raster2polygon <- function(seg, keep = 0.2) { r <- rast(as.array(seg), extent = ext(0, nrow(seg), 0, ncol(seg))) |> trans() |> flip() r[r < 1] <- NA contours <- st_as_sf(as.polygons(r, dissolve = TRUE)) simplified <- ms_simplify(contours, keep = keep) list(full = contours, simplified = simplified) } tb <- raster2polygon(mask_label) scale_factors <- fromJSON(file = \"outs/spatial/scalefactors_json.json\") tb$simplified$geometry <- tb$simplified$geometry / scale_factors$tissue_hires_scalef tissueBoundary(sfe) <- tb$simplified plotSpatialFeature(sfe, \"sum\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, color = \"black\"), image_id = \"lowres\") + theme_void() sfe <- SpatialFeatureExperiment::transpose(sfe) plotSpatialFeature(sfe, \"sum\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, color = \"black\"), image_id = \"lowres\") # Which spots intersect tissue sfe$int_tissue <- annotPred(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", pred = st_intersects) sfe$cov_tissue <- annotPred(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", pred = st_covered_by) sfe$diff_sr <- case_when(sfe$in_tissue == sfe$int_tissue ~ \"same\", sfe$in_tissue & !sfe$int_tissue ~ \"Space Ranger\", sfe$int_tissue & !sfe$in_tissue ~ \"segmentation\") |> factor(levels = c(\"Space Ranger\", \"same\", \"segmentation\")) plotSpatialFeature(sfe, \"diff_sr\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, size = 0.5, color = \"black\")) + scale_fill_brewer(type = \"div\", palette = 4) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
sfe$diff_int_cov <- sfe$int_tissue != sfe$cov_tissue plotSpatialFeature(sfe, \"diff_int_cov\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, size = 0.5, color = \"black\")) spot_ints <- annotOp(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", op = st_intersection) sfe$pct_tissue <- st_area(spot_ints) / st_area(spotPoly(sfe)) * 100 sfe_tissue <- sfe[,sfe$int_tissue] plotColData(sfe_tissue, x = \"pct_tissue\", y = \"sum\", color_by = \"diff_int_cov\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"Spatial analysis of 10X example Visium dataset","text":"","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) qc_features <- c(\"sum\", \"detected\", \"subsets_mito_percent\") sfe_tissue <- colDataUnivariate(sfe_tissue, \"moran.mc\", qc_features, nsim = 200) plotMoranMC(sfe_tissue, qc_features) sfe_tissue <- colDataUnivariate(sfe_tissue, \"sp.correlogram\", qc_features, order = 8) plotCorrelogram(sfe_tissue, qc_features) sfe_tissue <- colDataUnivariate(sfe_tissue, \"localmoran\", qc_features) plotLocalResult(sfe_tissue, \"localmoran\", qc_features, ncol = 2, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- colDataUnivariate(sfe_tissue, \"LOSH\", qc_features) plotLocalResult(sfe_tissue, \"LOSH\", qc_features, ncol = 2, colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- colDataUnivariate(sfe_tissue, \"moran.plot\", qc_features) moranPlot(sfe_tissue, \"subsets_mito_percent\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"spatial-autocorrelation-of-gene-expression","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of gene expression","title":"Spatial analysis 
of 10X example Visium dataset","text":"Normalize data scran method, find highly variable genes Find Moran’s highly variable genes: vast majority genes positive Moran’s . ’ll find genes highest Moran’s : can use gget info module gget package get additional information genes, descriptions, synonyms, transcripts collection reference databases including Ensembl, UniProt NCBI , showing gene descriptions NCBI: Plot genes highest Moran’s : global Moran’s seems tissue structure. genes negative Moran’s might statistically significant: 2000 highly variable genes 2000 tests, longer significant correcting multiple testing. global Moran’s relate gene expression level? Genes highly expressed overall tend higher Moran’s .","code":"#clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 2000) sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, BPPARAM = MulticoreParam(2)) plotRowDataHistogram(sfe_tissue, \"moran_sample01\") #> Warning: Removed 30285 rows containing non-finite outside the scale range #> (`stat_bin()`). top_moran <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_sample01, decreasing = TRUE)[1:9]] gget_info <- gget$info(top_moran) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> S100a5 Predicted to enable calcium-dependent protein binding activity; metal ion binding activity; and protein homodimerization activity. Located in neuronal cell body. Orthologous to human S100A5 (S100 calcium binding protein A5). [provided by Alliance of Genome Resources, Apr 2022] #> Fabp7 Predicted to enable fatty acid binding activity. Acts upstream of or within cell proliferation in forebrain; neurogenesis; and prepulse inhibition. 
Located in several cellular components, including cell projection; cell-cell junction; and neuronal cell body. Is expressed in several structures, including central nervous system; cranial nerve; gut; peripheral nervous system; and retina. Orthologous to human FABP7 (fatty acid binding protein 7). [provided by Alliance of Genome Resources, Apr 2022] #> Apoe This gene encodes a member of the apolipoprotein A1/A4/E family of proteins. This protein is involved in the transport of lipoproteins in the blood. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. Homozygous knockout mice for this gene accumulate high levels of cholesterol in the blood and develop atherosclerosis. Different alleles of this gene have been associated with either increased risk or a protective effect for Alzheimer's disease in human patients. This gene maps to chromosome 7 in a cluster with the related apolipoprotein C1, C2 and C4 genes. [provided by RefSeq, Apr 2015] #> Gng13 Predicted to enable G-protein beta-subunit binding activity. Acts upstream of or within phospholipase C-activating G protein-coupled receptor signaling pathway and sensory perception of taste. Located in dendrite. Part of heterotrimeric G-protein complex. Is expressed in brain; gonad; gut; and liver. Orthologous to human GNG13 (G protein subunit gamma 13). [provided by Alliance of Genome Resources, Apr 2022] #> Pcp4 Enables calmodulin binding activity. Predicted to be involved in several processes, including calmodulin dependent kinase signaling pathway; negative regulation of protein kinase activity; and positive regulation of dopamine secretion. Predicted to be located in axon and neurofilament. Predicted to be part of protein-containing complex. Predicted to be active in cytoplasm. 
Is expressed in several structures, including alimentary system; central nervous system; genitourinary system; peripheral nervous system; and sensory organ. Orthologous to human PCP4 (Purkinje cell protein 4). [provided by Alliance of Genome Resources, Apr 2022] #> Mtco2 Predicted to enable copper ion binding activity and oxidoreductase activity. Predicted to contribute to cytochrome-c oxidase activity. Predicted to be involved in ATP synthesis coupled electron transport; positive regulation of cellular biosynthetic process; and positive regulation of necrotic cell death. Located in mitochondrion. Is expressed in embryo; epiblast; heart; liver; and metanephros. Orthologous to several human genes including MTCO2P12 (MT-CO2 pseudogene 12). [provided by Alliance of Genome Resources, Apr 2022] #> Ptgds Enables prostaglandin-D synthase activity and retinoid binding activity. Involved in prostaglandin biosynthetic process and regulation of circadian sleep/wake cycle, sleep. Acts upstream of or within negative regulation of male germ cell proliferation. Located in extracellular region. Is expressed in several structures, including alimentary system; genitourinary system; integumental system; nervous system; and sensory organ. Human ortholog(s) of this gene implicated in carotid artery disease. Orthologous to human PTGDS (prostaglandin D2 synthase). [provided by Alliance of Genome Resources, Apr 2022] #> Mtnd4 Predicted to enable NADH dehydrogenase (ubiquinone) activity and ubiquinone binding activity. Predicted to contribute to NADH dehydrogenase activity. Predicted to be involved in several processes, including electron transport coupled proton transport; mitochondrial electron transport, NADH to ubiquinone; and mitochondrial respiratory chain complex I assembly. Located in mitochondrion. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; Parkinson's disease; macular degeneration; and schizophrenia. 
Orthologous to human MT-ND4 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 4). [provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, top_moran, ncol = 3, image_id = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\") neg_moran <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_sample01, decreasing = FALSE)[1:9]] # Display NCBI descriptions for these genes gget_info_neg <- gget$info(neg_moran) rownames(gget_info_neg) <- gget_info_neg$primary_gene_name select(gget_info_neg, ncbi_description) #> ncbi_description #> Hibch Predicted to enable 3-hydroxyisobutyryl-CoA hydrolase activity. Predicted to be involved in valine catabolic process. Predicted to act upstream of or within branched-chain amino acid catabolic process. Located in mitochondrion. Orthologous to human HIBCH (3-hydroxyisobutyryl-CoA hydrolase). [provided by Alliance of Genome Resources, Apr 2022] #> Syngr2 Predicted to be involved in regulated exocytosis and synaptic vesicle membrane organization. Predicted to act upstream of or within protein targeting. Located in synaptic vesicle. Is expressed in several structures, including genitourinary system; heart; liver; lung; and spleen. Orthologous to human SYNGR2 (synaptogyrin 2). [provided by Alliance of Genome Resources, Apr 2022] #> Entpd5 Enables guanosine-diphosphatase activity and uridine-diphosphatase activity. Involved in several processes, including positive regulation of glycolytic process; protein N-linked glycosylation; and regulation of phosphatidylinositol 3-kinase signaling. Located in endoplasmic reticulum. Is expressed in several structures, including genitourinary system; gut; hemolymphoid system gland; liver; and nose. Orthologous to human ENTPD5 (ectonucleoside triphosphate diphosphohydrolase 5 (inactive)). [provided by Alliance of Genome Resources, Apr 2022] #> Fyco1 Predicted to enable metal ion binding activity. 
Predicted to be involved in plus-end-directed vesicle transport along microtubule and positive regulation of autophagosome maturation. Predicted to be located in Golgi apparatus. Predicted to be active in autophagosome; late endosome; and lysosome. Is expressed in central nervous system; early conceptus; and retina. Human ortholog(s) of this gene implicated in cataract 18. Orthologous to human FYCO1 (FYVE and coiled-coil domain autophagy adaptor 1). [provided by Alliance of Genome Resources, Apr 2022] #> Ptpn18 Enables non-membrane spanning protein tyrosine phosphatase activity. Acts upstream of or within blastocyst formation. Located in cytoplasm and nucleus. Is expressed in several structures, including alimentary system; brain; genitourinary system; immune system; and liver and biliary system. Orthologous to human PTPN18 (protein tyrosine phosphatase non-receptor type 18). [provided by Alliance of Genome Resources, Apr 2022] #> Cbl Enables SH3 domain binding activity and ephrin receptor binding activity. Involved in regulation of platelet-derived growth factor receptor-alpha signaling pathway. Acts upstream of or within regulation of Rap protein signal transduction. Located in Golgi apparatus and cilium. Part of flotillin complex. Is expressed in male reproductive system and urinary system. Human ortholog(s) of this gene implicated in acute myeloid leukemia; juvenile myelomonocytic leukemia; lung non-small cell carcinoma; and myeloid neoplasm. Orthologous to human CBL (Cbl proto-oncogene). [provided by Alliance of Genome Resources, Apr 2022] #> Dusp18 Predicted to enable protein tyrosine phosphatase activity and protein tyrosine/serine/threonine phosphatase activity. Predicted to be involved in peptidyl-threonine dephosphorylation and peptidyl-tyrosine dephosphorylation. Predicted to act upstream of or within protein targeting to membrane; protein targeting to mitochondrion; and response to antibiotic. 
Predicted to be located in mitochondrial inner membrane; mitochondrial intermembrane space; and nucleoplasm. Predicted to be extrinsic component of mitochondrial inner membrane and intrinsic component of mitochondrial inner membrane. Is expressed in central nervous system; dorsal root ganglion; olfactory epithelium; and retina. Orthologous to human DUSP18 (dual specificity phosphatase 18). [provided by Alliance of Genome Resources, Apr 2022] #> Tsc22d3 Enables MRF binding activity. Acts upstream of or within several processes, including negative regulation of activation-induced cell death of T cells; negative regulation of skeletal muscle tissue development; and negative regulation of transcription by RNA polymerase II. Located in cytoplasm and nucleus. Is expressed in several structures, including early conceptus; genitourinary system; nervous system; sensory organ; and viscerocranium. Orthologous to human TSC22D3 (TSC22 domain family member 3). [provided by Alliance of Genome Resources, Apr 2022] #> Plekhg2 Predicted to enable guanyl-nucleotide exchange factor activity. Predicted to be involved in regulation of actin filament polymerization. Orthologous to human PLEKHG2 (pleckstrin homology and RhoGEF domain containing G2). 
[provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, neg_moran, ncol = 3, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- runUnivariate(sfe_tissue, \"moran.mc\", neg_moran, colGraphName = \"visium\", nsim = 200, alternative = \"less\") plotMoranMC(sfe_tissue, neg_moran, swap_rownames = \"symbol\") rowData(sfe_tissue)[neg_moran, c(\"moran_sample01\", \"moran.mc_p.value_sample01\")] #> DataFrame with 9 rows and 2 columns #> moran_sample01 moran.mc_p.value_sample01 #> #> ENSMUSG00000041426 -0.0531915 0.00497512 #> ENSMUSG00000048277 -0.0451179 0.00497512 #> ENSMUSG00000021236 -0.0445148 0.00497512 #> ENSMUSG00000025241 -0.0419121 0.00995025 #> ENSMUSG00000026126 -0.0399917 0.01990050 #> ENSMUSG00000034342 -0.0393964 0.00497512 #> ENSMUSG00000047205 -0.0381599 0.02487562 #> ENSMUSG00000031431 -0.0369456 0.02985075 #> ENSMUSG00000037552 -0.0368969 0.00995025 sfe_tissue <- addPerFeatureQCMetrics(sfe_tissue) names(rowData(sfe_tissue)) #> [1] \"symbol\" #> [2] \"Feature.Type\" #> [3] \"I_sample01\" #> [4] \"P.value_sample01\" #> [5] \"Adjusted.p.value_sample01\" #> [6] \"Feature.Counts.in.Spots.Under.Tissue_sample01\" #> [7] \"Median.Normalized.Average.Counts_sample01\" #> [8] \"Barcodes.Detected.per.Feature_sample01\" #> [9] \"moran_sample01\" #> [10] \"K_sample01\" #> [11] \"moran.mc_statistic_sample01\" #> [12] \"moran.mc_parameter_sample01\" #> [13] \"moran.mc_p.value_sample01\" #> [14] \"moran.mc_alternative_sample01\" #> [15] \"moran.mc_method_sample01\" #> [16] \"moran.mc_res_sample01\" #> [17] \"mean\" #> [18] \"detected\" plotRowData(sfe_tissue, x = \"mean\", y = \"moran_sample01\") + scale_x_log10() + annotation_logticks(sides = \"b\") + geom_density2d() #> Warning in scale_x_log10(): log-10 transformation introduced infinite values. #> log-10 transformation introduced infinite values. #> Warning: Removed 30285 rows containing non-finite outside the scale range #> (`stat_density2d()`). 
#> Warning: Removed 30285 rows containing missing values or values outside the scale range #> (`geom_point()`)."},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"apply-spatial-analysis-methods-to-gene-expression-space","dir":"Articles","previous_headings":"","what":"Apply spatial analysis methods to gene expression space","title":"Spatial analysis of 10X example Visium dataset","text":"Spatial statistics require spatial neighborhood graph can also applied k nearest neighbor graph histological space gene expression space. done depth vignette. store results “moran_ns”, confused spatial Moran’s results. genes tend similar neighbors 10 nearest neighbor graph PCA space gene expression rather histological space: Although Moran’s computed histological space, genes highest Moran’s PCA space also show spatial structure, different cell types reside different spatial regions.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat foo <- findKNN(reducedDim(sfe_tissue, \"PCA\")[,1:10], k=10, BNPARAM=AnnoyParam()) # Split by row foo_nb <- asplit(foo$index, 1) dmat <- 1/foo$distance # Row normalize the weights dmat <- sweep(dmat, 1, rowSums(dmat), FUN = \"/\") glist <- asplit(dmat, 1) # Sort based on index ord <- lapply(foo_nb, order) foo_nb <- lapply(seq_along(foo_nb), function(i) foo_nb[[i]][ord[[i]]]) class(foo_nb) <- \"nb\" glist <- lapply(seq_along(glist), function(i) glist[[i]][ord[[i]]]) listw <- list(style = \"W\", neighbours = foo_nb, weights = glist) class(listw) <- \"listw\" attr(listw, \"region.id\") <- colnames(sfe_tissue) colGraph(sfe_tissue, \"knn10\") <- listw sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, BPPARAM = MulticoreParam(2), colGraphName = \"knn10\", name = \"moran_ns\") top_moran2 <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_ns_sample01, decreasing = TRUE)[1:9]] # Display NCBI descriptions for these genes gget_info2 <- 
gget$info(top_moran2) rownames(gget_info2) <- gget_info2$primary_gene_name select(gget_info2, ncbi_description) #> ncbi_description #> Mtco1 Enables cytochrome-c oxidase activity. Predicted to be involved in electron transport coupled proton transport; mitochondrial electron transport, cytochrome c to oxygen; and response to oxidative stress. Located in mitochondrial inner membrane. Part of mitochondrial respiratory chain complex IV. Is expressed in several structures, including brown fat; heart; liver; metanephros; and skeletal muscle. Orthologous to human MT-CO1 (mitochondrially encoded cytochrome c oxidase I). [provided by Alliance of Genome Resources, Apr 2022] #> Mtco2 Predicted to enable copper ion binding activity and oxidoreductase activity. Predicted to contribute to cytochrome-c oxidase activity. Predicted to be involved in ATP synthesis coupled electron transport; positive regulation of cellular biosynthetic process; and positive regulation of necrotic cell death. Located in mitochondrion. Is expressed in embryo; epiblast; heart; liver; and metanephros. Orthologous to several human genes including MTCO2P12 (MT-CO2 pseudogene 12). [provided by Alliance of Genome Resources, Apr 2022] #> Apoe This gene encodes a member of the apolipoprotein A1/A4/E family of proteins. This protein is involved in the transport of lipoproteins in the blood. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. Homozygous knockout mice for this gene accumulate high levels of cholesterol in the blood and develop atherosclerosis. Different alleles of this gene have been associated with either increased risk or a protective effect for Alzheimer's disease in human patients. This gene maps to chromosome 7 in a cluster with the related apolipoprotein C1, C2 and C4 genes. 
[provided by RefSeq, Apr 2015] #> Mtnd4 Predicted to enable NADH dehydrogenase (ubiquinone) activity and ubiquinone binding activity. Predicted to contribute to NADH dehydrogenase activity. Predicted to be involved in several processes, including electron transport coupled proton transport; mitochondrial electron transport, NADH to ubiquinone; and mitochondrial respiratory chain complex I assembly. Located in mitochondrion. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; Parkinson's disease; macular degeneration; and schizophrenia. Orthologous to human MT-ND4 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 4). [provided by Alliance of Genome Resources, Apr 2022] #> Fabp7 Predicted to enable fatty acid binding activity. Acts upstream of or within cell proliferation in forebrain; neurogenesis; and prepulse inhibition. Located in several cellular components, including cell projection; cell-cell junction; and neuronal cell body. Is expressed in several structures, including central nervous system; cranial nerve; gut; peripheral nervous system; and retina. Orthologous to human FABP7 (fatty acid binding protein 7). [provided by Alliance of Genome Resources, Apr 2022] #> mt-Nd2 Predicted to enable NADH dehydrogenase (ubiquinone) activity; ionotropic glutamate receptor binding activity; and protein kinase binding activity. Acts upstream of or within reactive oxygen species metabolic process. Located in mitochondrion. Is expressed in early conceptus and secondary oocyte. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; multiple sclerosis; myocardial infarction; neurodegenerative disease (multiple); and urinary bladder cancer. Orthologous to human MT-ND2 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2). 
[provided by Alliance of Genome Resources, Apr 2022] #> Ptn Predicted to enable several functions, including glycosaminoglycan binding activity; signaling receptor binding activity; and syndecan binding activity. Involved in several processes, including decidualization; learning or memory; and regulation of nervous system development. Acts upstream of or within bone mineralization. Located in extracellular region. Is expressed in several structures, including alimentary system; genitourinary system; nervous system; respiratory system; and sensory organ. Human ortholog(s) of this gene implicated in adrenal carcinoma. Orthologous to human PTN (pleiotrophin). [provided by Alliance of Genome Resources, Apr 2022] #> Apod The protein encoded by this gene is a component of high-density lipoprotein (HDL), but is unique in that it shares greater structural similarity to lipocalin than to other members of the apolipoprotein family, and has a wider tissue expression pattern. The encoded protein is involved in lipid metabolism, and ablation of this gene results in defects in triglyceride metabolism. Elevated levels of this gene product have been observed in multiple tissues of Niemann-Pick disease mouse models, as well as in some tumors. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Aug 2014] #> Atp1a2 Predicted to enable several functions, including ATP binding activity; ATP hydrolysis activity; and alkali metal ion binding activity. Involved in several processes, including cellular response to steroid hormone stimulus; locomotory exploration behavior; and response to auditory stimulus. Acts upstream of or within several processes, including forebrain development; regulation of blood circulation; and regulation of muscle contraction. Located in T-tubule; cell projection; and neuronal cell body. Is expressed in several structures, including genitourinary system; heart; musculature; nervous system; and sensory organ. 
Used to study familial hemiplegic migraine 2. Human ortholog(s) of this gene implicated in alternating hemiplegia of childhood; benign neonatal seizures; familial hemiplegic migraine 2; hypertension; and migraine with aura. Orthologous to human ATP1A2 (ATPase Na+/K+ transporting subunit alpha 2). [provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, top_moran2, ncol = 3, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Spatial analysis of 10X example Visium dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 BiocNeighbors_1.20.2 #> [3] BiocParallel_1.36.0 dplyr_1.1.4 #> [5] rmapshaper_0.5.0 sf_1.0-16 #> [7] rlang_1.1.3 terra_1.7-71 #> [9] EBImage_4.44.0 rjson_0.2.21 #> [11] bluster_1.12.0 patchwork_1.2.0 #> [13] stringr_1.5.1 scran_1.30.2 #> [15] scater_1.30.1 scuttle_1.12.0 #> [17] ggplot2_3.5.1 SingleCellExperiment_1.24.0 #> [19] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [21] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [23] IRanges_2.36.0 S4Vectors_0.40.2 #> [25] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [27] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [29] Voyager_1.4.0 #> #> loaded via a namespace (and 
not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] curl_5.2.1 Rhdf5lib_1.24.2 #> [21] s2_1.1.6 SparseArray_1.2.4 #> [23] rhdf5_2.46.1 sass_0.4.9 #> [25] spData_2.3.0 KernSmooth_2.23-22 #> [27] bslib_0.7.0 htmlwidgets_1.6.4 #> [29] desc_1.4.3 cachem_1.0.8 #> [31] igraph_2.0.3 lifecycle_1.0.4 #> [33] pkgconfig_2.0.3 rsvd_1.0.5 #> [35] Matrix_1.6-5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 dqrng_0.3.2 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] mgcv_1.9-1 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 tiff_0.1-12 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 R.utils_2.12.3 #> [59] HDF5Array_1.30.1 MASS_7.3-60.0.1 #> [61] DelayedArray_0.28.0 classInt_0.4-10 #> [63] tools_4.3.3 units_0.8-5 #> [65] vipor_0.4.7 beeswarm_0.4.0 #> [67] R.oo_1.26.0 glue_1.7.0 #> [69] dbscan_1.1-12 nlme_3.1-164 #> [71] rhdf5filters_1.14.1 grid_4.3.3 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] isoband_0.2.7 gtable_0.3.5 #> [77] R.methodsS3_1.8.2 class_7.3-22 #> [79] BiocSingular_1.18.0 ScaledMatrix_1.10.0 #> [81] metapod_1.10.1 sp_2.1-4 #> [83] utf8_1.2.4 XVector_0.42.0 #> [85] ggrepel_0.9.5 pillar_1.9.0 #> [87] limma_3.58.1 splines_4.3.3 #> [89] lattice_0.22-6 deldir_2.0-4 #> [91] tidyselect_1.2.1 locfit_1.5-9.9 #> [93] knitr_1.45 gridExtra_2.3 #> [95] V8_4.4.2 edgeR_4.0.16 #> [97] xfun_0.43 statmod_1.5.0 #> [99] DropletUtils_1.22.0 fftwtools_0.9-11 #> [101] stringi_1.8.3 geojsonsf_2.0.3 #> [103] yaml_2.3.8 boot_1.3-30 #> [105] evaluate_0.23 codetools_0.2-20 #> [107] tibble_3.2.1 cli_3.6.2 #> [109] 
systemfonts_1.0.6 munsell_0.5.1 #> [111] jquerylib_0.1.4 Rcpp_1.0.12 #> [113] png_0.1-8 parallel_4.3.3 #> [115] pkgdown_2.0.9 jpeg_0.1-10 #> [117] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [119] SpatialExperiment_1.12.0 viridisLite_0.4.2 #> [121] scales_1.3.0 e1071_1.7-14 #> [123] purrr_1.0.2 crayon_1.5.2 #> [125] scico_1.5.0 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Visium Processing Workflows with Voyager","text":"Pros: Commercial kit Provided many core facilities widely available spatial transcriptomics technologies Transcriptome wide Formalin fixed, paraffin embedded (FFPE) tissue compatible Can panel proteins addition RNA Accompanied H&E fluorescent images tissue morphology lower resolution, data size manageable larger tissue areas larger number samples Cons: Lower resolution – 55 \\(\\mu\\)m spot diameter 100 \\(\\mu\\)m center center Relatively low detection efficiency transcripts full length, protocol adapted long read sequencing","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"dowload-data-and-create-a-spatialfeatureexperiment-object","dir":"Articles","previous_headings":"Getting Started","what":"Download Data and Create a SpatialFeatureExperiment object","title":"Visium Processing Workflows with Voyager","text":"Several publicly available Visium datasets available 10X Genomics website. 
vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix demonstrate read output typical Visium experiment SpatialFeatureExperiment object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Visium Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using variety Visium datasets. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Xenium Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible H&E immunofluorescence Cons: curated panel usually hundred genes required. However, 10X provides curated gene panels common applications oncology, neuroscience, development, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"Xenium Processing Workflows with Voyager","text":"10x Genomics publicly released Xenium human breast cancer dataset website. tutorial processing output various spatial transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager Getting Started page. 
output files format Xenium data may change technology developed released.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Xenium Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using variety Xenium datasets. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Lambda Moses. Author, maintainer. Kayla Jackson. Author. Laura Luebbert. Author. Sina Booeshaghi. Author. Lior Pachter. Author, reviewer.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Moses L, Einarsson PH, Jackson K, Luebbert L, Booeshaghi S, Antonsson S, Melsted P, Pachter L (2023). “Voyager: exploratory single-cell genomics data analysis geospatial statistics.” bioRxiv. 
doi:10.1101/2023.07.20.549945.","code":"@Article{, title = {Voyager: exploratory single-cell genomics data analysis with geospatial statistics}, author = {Lambda Moses and Pétur Helgi Einarsson and Kayla Jackson and Laura Luebbert and Sina Booeshaghi and Sindri Antonsson and Páll Melsted and Lior Pachter}, journal = {bioRxiv}, year = {2023}, doi = {10.1101/2023.07.20.549945}, }"},{"path":"https://pachterlab.github.io/voyager/dev/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"From geospatial to spatial omics","text":"SpatialFeatureExperiment Voyager can installed Bioconductor version 3.16 higher:","code":"if (!requireNamespace(\"BiocManager\")) install.packages(\"BiocManager\") BiocManager::install(version = \"3.17\") # Or a higher version in the future BiocManager::install(\"Voyager\")"},{"path":"https://pachterlab.github.io/voyager/dev/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"From geospatial to spatial omics","text":"Voyager: exploratory single-cell genomics data analysis geospatial statistics Lambda Moses, Pétur Helgi Einarsson, Kayla Jackson, Laura Luebbert, . Sina Booeshaghi, Sindri Antonsson, Páll Melsted, Lior Pachter bioRxiv 2023.07.20.549945; doi: https://doi.org/10.1101/2023.07.20.549945","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot the elbow plot or scree plot for PCA — ElbowPlot","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"Apparently, apparent way plot PC elbow plot extracting variance explained attribute dimred slot, even OSCA book makes elbow plot way, find kind cumbersome compared Seurat. 
writing function make elbow plot SCE less cumbersome.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"","code":"ElbowPlot( sce, ndims = 20, nfnega = 0, reduction = \"PCA\", sample_id = \"all\", facet = FALSE, ncol = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"sce SingleCellExperiment object, anything inherits SingleCellExperiment. ndims Number components positive eigenvalues, PCs non-spatial PCA. nfnega Number nega eigenvalues eigenvectors compute. indicate negative spatial autocorrelation. reduction Name dimension reduction use. must attribute called either \"percentVar\" \"eig\" eigenvalues. Defaults \"PCA\". sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. facet Logical, whether facet samples multiple samples present. relevant spatial PCA run separately sample, gives different results running jointly samples. ncol Number columns facets facetting.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"ggplot object. 
y axis eigenvalues percentage variance explained relevant.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"","code":"library(SFEData) library(scater) #> Loading required package: SingleCellExperiment #> Loading required package: SummarizedExperiment #> Loading required package: MatrixGenerics #> Loading required package: matrixStats #> #> Attaching package: ‘MatrixGenerics’ #> The following objects are masked from ‘package:matrixStats’: #> #> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, #> colCounts, colCummaxs, colCummins, colCumprods, colCumsums, #> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, #> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats, #> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds, #> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, #> colWeightedMeans, colWeightedMedians, colWeightedSds, #> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet, #> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, #> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps, #> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, #> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks, #> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars, #> rowWeightedMads, rowWeightedMeans, rowWeightedMedians, #> rowWeightedSds, rowWeightedVars #> Loading required package: GenomicRanges #> Loading required package: stats4 #> Loading required package: BiocGenerics #> #> Attaching package: ‘BiocGenerics’ #> The following objects are masked from ‘package:stats’: #> #> IQR, mad, sd, var, xtabs #> The following objects are masked from ‘package:base’: #> #> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append, #> as.data.frame, basename, cbind, colnames, 
dirname, do.call, #> duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted, #> lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin, #> pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table, #> tapply, union, unique, unsplit, which.max, which.min #> Loading required package: S4Vectors #> #> Attaching package: ‘S4Vectors’ #> The following object is masked from ‘package:utils’: #> #> findMatches #> The following objects are masked from ‘package:base’: #> #> I, expand.grid, unname #> Loading required package: IRanges #> Loading required package: GenomeInfoDb #> Loading required package: Biobase #> Welcome to Bioconductor #> #> Vignettes contain introductory material; view with #> 'browseVignettes()'. To cite Bioconductor, see #> 'citation(\"Biobase\")', and for packages 'citation(\"pkgname\")'. #> #> Attaching package: ‘Biobase’ #> The following object is masked from ‘package:MatrixGenerics’: #> #> rowMedians #> The following objects are masked from ‘package:matrixStats’: #> #> anyMissing, rowMedians #> Loading required package: scuttle #> Loading required package: ggplot2 sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> require(“SpatialFeatureExperiment”) sfe <- runPCA(sfe, ncomponents = 10, exprs_values = \"counts\") ElbowPlot(sfe, ndims = 10)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":null,"dir":"Reference","previous_headings":"","what":"SFEMethod class — SFEMethod","title":"SFEMethod class — SFEMethod","text":"S4 class used wrap spatial analysis methods, taking inspiration caret tidymodels packages.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SFEMethod class — SFEMethod","text":"","code":"SFEMethod( name, fun, reorganize_fun, package, variate 
= c(\"uni\", \"bi\", \"multi\"), scope = c(\"global\", \"local\"), title = NULL, default_attr = NA, args_not_check = NA, joint = FALSE, use_graph = TRUE, use_matrix = FALSE, dest = c(\"reducedDim\", \"colData\") ) # S4 method for SFEMethod info(x, type) # S4 method for SFEMethod is_local(x) # S4 method for SFEMethod fun(x) # S4 method for SFEMethod reorganize_fun(x) # S4 method for SFEMethod args_not_check(x) # S4 method for SFEMethod is_joint(x) # S4 method for SFEMethod use_graph(x) # S4 method for SFEMethod use_matrix(x)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SFEMethod class — SFEMethod","text":"name Name method, used user-facing functions specify method use, \"moran\" Moran's . fun Function run method. See Details. reorganize_fun Function reorganize results add SFE object. See Details. package Name package whose implementation method used , used check package installed. variate many variables method works , must one \"uni\" univariate, \"bi\" bivariate, \"multi\" multivariate. scope Either \"global\", returning one result entire dataset, \"local\", returning one result spatial location. multivariate methods, irrelevant. title Descriptive title show plotting results. default_attr local methods return multiple fields, local Moran values p-values, default field use plotting. args_not_check character vector indicating argument checked comparing parameters previous run. joint Logical, whether makes sense run method multiple samples jointly. TRUE, fun must able handle adjacency matrix listw argument straightforward way concatenate listw objects multiple samples. use_graph Logical, indicate whether method uses spatial neighborhood graph unifying user facing functions argument asking graph though methods require graph. use_matrix Logical, whether function slot fun takes matrix input. argument used bivariate methods. 
dest Whether results appropriate reducedDim colData. used multivariate methods. overrides \"local\" field info. x SFEMethod object type One names info slot, see slot documentation.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"SFEMethod class — SFEMethod","text":"constructor returns SFEMethod object. getters return content corresponding slots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SFEMethod class — SFEMethod","text":"fun slot specified : methods, must arguments x vector, listw listw object specifying spatial neighborhood graph, zero.policy specifying cells without neighbors (default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA), optionally method specific arguments ... pass underlying imported function. original function implementing method package different argument names orders, write thin wrapper rearrange /rename arguments. univariate methods use spatial neighborhood graph, first two arguments must x listw. univariate methods use spatial neighborhood graph, variogram, first two arguments must x numeric vector coords_df sf data frame cell locations optionally regressors. formula argument optional can defaults specifying regressors use. bivariate methods, first three arguments must x, y, listw. multivariate methods, argument x mandatory, matrix input. arguments must present can optional defaults: listw ncomponents set number dimensions output. reorganize_fun slot specified : Univariate methods meant run separately gene, input reorganize_fun argument list outputs; element list corresponds output gene. univariate global methods, different fields result columns data frame one row results multiple features data frame. 
arguments , name rename primary field informative name needed, ... arguments specific methods. output reorganize_fun DataFrame whose rows correspond genes columns correspond fields output. univariate local methods, arguments , nb neighborhood list used multiple testing correction, p.adjust.method method correct multiple testing p.adjust, .... output reorganize_fun list reorganized output. element list corresponds gene, reorganized content element can vector, matrix, data frame, must dimensions genes. element vector, row matrix data frame corresponds cell. multivariate methods whose results go reducedDim, reorganize_fun one argument raw output. output reorganize_fun cell embedding matrix ready added reducedDim. relevant information gene loadings eigenvalues added attributes cell embedding matrix. multivariate methods whose results can go colData, arguments , nb, p.adjust.method. Unlike univariate local counterpart, takes raw output instead list outputs. output reorganize_fun vector data frame ready added colData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"slots","dir":"Reference","previous_headings":"","what":"Slots","title":"SFEMethod class — SFEMethod","text":"info named character vector specifying information method. fun function implementing method. See Details. reorganize_fun Function convert output fun format store SFE object. See Details. misc Miscellaneous information method interacts rest package. 
named list.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"SFEMethod class — SFEMethod","text":"","code":"moran <- SFEMethod( name = \"moran\", title = \"Moran's I\", package = \"spdep\", variate = \"uni\", scope = \"global\", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), reorganize_fun = Voyager:::.moran2df )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Bivariate spatial statistics — calculateBivariate","title":"Bivariate spatial statistics — calculateBivariate","text":"functions perform bivariate spatial analysis. version, bivariate global method supported lee, lee.mc, lee.test spdep, cross variograms gstat (use cross_variogram cross_variogram_map type argument, see variogram-internal). Global Lee statistic computed implementation much faster spdep. Bivariate local methods supported lee (use locallee type argument) localmoran_bv bivariate version Local Moran spdep.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Bivariate spatial statistics — calculateBivariate","text":"","code":"# S4 method for ANY calculateBivariate( x, y = NULL, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... ) # S4 method for SpatialFeatureExperiment calculateBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... 
) runBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, p.adjust.method = \"BH\", name = NULL, overwrite = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Bivariate spatial statistics — calculateBivariate","text":"x numeric matrix whose rows features/genes, numeric vector (y must specified), SpatialFeatureExperiment (SFE) object matrix assay. y numeric matrix whose rows features/genes, numeric vector. Bivariate statistics computed pairwise combinations row names x row names y, except cross variogram combinations within x y also computed. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. coords_df sf data frame specifying location cell. used method specified type uses spatial neighborhood graph. Must specified otherwise. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA returnDF Logical, results added SFE object, whether results formatted DataFrame. p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. ... 
arguments passed S4 method (convenience wrappers like calculateMoransI) method used compute metrics specified argument type (general functions like calculateUnivariate). See documentation functions name specified type spdep package method specific arguments. variograms, see .variogram. feature1 ID symbol first genes SFE object, argument x. feature2 ID symbol second genes SFE object, argument x. Mandatory length feature1 1. colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. SFE method calculateUnivariate, specify location cells methods take spatial neighborhood graph variogram. geometry type POINT, spatialCoords(x) used instead. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. exprs_values Integer scalar string indicating assay x contains expression values. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. overwrite Logical, whether overwrite existing results name. Defaults FALSE.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Bivariate spatial statistics — calculateBivariate","text":"calculateBivariate function returns correlation matrix global Lee, results pair genes methods. Global results stored SFE object. methods return one result pair genes, return pairwise results 2 genes jointly. 
Local results stored localResults field SFE object, name concatenation two gene names separated two underscores (__).","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Bivariate spatial statistics — calculateBivariate","text":"","code":"library(SFEData) library(scater) library(scran) library(SpatialFeatureExperiment) library(SpatialExperiment) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) gs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gs, fdr.threshold = 0.01) g <- colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Matrix method mat <- logcounts(sfe)[hvgs[1:5],] df <- df2sf(spatialCoords(sfe), spatialCoordsNames(sfe)) out <- calculateBivariate(mat, type = \"lee\", listw = g) out <- calculateBivariate(mat, type = \"cross_variogram\", coords_df = df) # SFE method out <- calculateBivariate(sfe, type = \"lee\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\") out2 <- calculateBivariate(sfe, type = \"lee.test\", feature1 = \"Myh1\", feature2 = \"Myh2\", swap_rownames = \"symbol\") sfe <- runBivariate(sfe, type = \"locallee\", feature1 = \"Myh1\", feature2 = \"Myh2\", swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Multivariate spatial data analysis — calculateMultivariate","title":"Multivariate spatial data analysis — calculateMultivariate","text":"functions perform multivariate spatial data analysis, usually spatially informed dimension 
reduction.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Multivariate spatial data analysis — calculateMultivariate","text":"","code":"# S4 method for ANY,SFEMethod calculateMultivariate( x, type, listw = NULL, transposed = FALSE, zero.policy = TRUE, p.adjust.method = \"BH\", ... ) # S4 method for ANY,character calculateMultivariate(x, type, listw = NULL, transposed = FALSE, ...) # S4 method for SpatialFeatureExperiment,ANY calculateMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = \"logcounts\", sample_action = c(\"joint\", \"separate\"), BPPARAM = SerialParam(), ... ) runMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = \"logcounts\", sample_action = c(\"joint\", \"separate\"), BPPARAM = SerialParam(), name = NULL, dest = c(\"reducedDim\", \"colData\"), ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Multivariate spatial data analysis — calculateMultivariate","text":"x numeric matrix whose rows features/genes, SpatialFeatureExperiment (SFE) object matrix assay. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. transposed Logical, whether matrix genes columns cells rows. 
zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. ... Extra arguments passed specific multivariate method. example, see multispati_rsp arguments MULTISPATI PCA. See localC arguments \"localC_multi\" \"localC_perm_multi\". colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. subset_row Vector specifying subset features use dimensionality reduction. can character vector row names, integer vector row indices logical vector. exprs_values Integer scalar string indicating assay x contains expression values. sample_action Character, either \"joint\" \"separate\". Spatial methods depend spatial coordinates /spatial neighborhood graph, SpatialExperiment uses sample_id keep coordinates different samples separate. spatial methods can sensibly run jointly multiple samples. case, \"joint\" run method jointly samples, \"separate\" run method separately sample concatenate results. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. parallelize computation across multiple samples large number samples. cautious using optimized BLAS matrix operations supports multithreading. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. dest Character, either \"reducedDim\" \"colData\". output multivariate method matrix array, spatially informed dimension reduction, option \"reducedDim\", results stored reducedDim SFE object. output vector, multivariate version localC, stored colData. 
Data frame output, localC_perm, can stored either reducedDim colData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Multivariate spatial data analysis — calculateMultivariate","text":"calculateMultivariate, matrix cell embeddings whose attributes include loadings eigenvalues relevant, ready added SFE object reducedDim setter. run*, SpatialFeatureExperiment object results added. See Details results stored.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Multivariate spatial data analysis — calculateMultivariate","text":"argument type, package supports \"multispati\" MULTISPATI PCA, \"localC_multi\" multivariate generalization Geary's C, \"localC_perm_multi\" multivariate Geary's C permutation testing, \"gwpca\" geographically weighted PCA.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Multivariate spatial data analysis — calculateMultivariate","text":"Dray, S., Said, S. Debias, F. (2008) Spatial ordination vegetation data using generalization Wartenberg's multivariate spatial correlation. Journal vegetation science, 19, 45-56. Anselin, L. (2019), Local Indicator Multivariate Spatial Association: Extending Geary's c. Geogr Anal, 51: 133-150. 
doi:10.1111/gean.12164","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Multivariate spatial data analysis — calculateMultivariate","text":"","code":"# example code library(SFEData) library(scater) library(scran) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) gvs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gvs, fdr.threshold = 0.05) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- runMultivariate(sfe, \"multispati\", subset_row = hvgs)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Univariate spatial statistics — calculateUnivariate","title":"Univariate spatial statistics — calculateUnivariate","text":"functions compute univariate spatial statistics, global local, matrices, data frames, SFE objects. SFE objects, statistics can computed numeric columns colData, colGeometries, annotGeometries, results stored within SFE object. calculateMoransI runMoransI convenience wrappers calculateUnivariate runUnivariate respectively.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Univariate spatial statistics — calculateUnivariate","text":"","code":"# S4 method for ANY,SFEMethod calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... ) # S4 method for ANY,character calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... 
) # S4 method for SpatialFeatureExperiment,ANY calculateUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... ) # S4 method for ANY calculateMoransI( x, ..., BPPARAM = SerialParam(), zero.policy = NULL, name = \"moran\" ) # S4 method for SpatialFeatureExperiment calculateMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... ) colDataUnivariate( x, type, features, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colDataMoransI( x, features, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colGeometryUnivariate( x, type, features, colGeometryName = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colGeometryMoransI( x, features, colGeometryName = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) annotGeometryUnivariate( x, type, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) annotGeometryMoransI( x, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... 
) runUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, overwrite = FALSE, ... ) runMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) reducedDimUnivariate( x, type, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) reducedDimMoransI( x, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Univariate spatial statistics — calculateUnivariate","text":"x numeric matrix whose rows features/genes, SpatialFeatureExperiment (SFE) object matrix assay. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. coords_df sf data frame specifying location cell. used method specified type uses spatial neighborhood graph. Must specified otherwise. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. 
zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA returnDF Logical, results added SFE object, whether results formatted DataFrame. p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. ... arguments passed S4 method (convenience wrappers like calculateMoransI) method used compute metrics specified argument type (general functions like calculateUnivariate). See documentation functions name specified type spdep package method specific arguments. variograms, see .variogram. features Genes (calculate* SFE method run*) numeric columns colData(x) (colData*) colGeometry (colGeometry*) annotGeometry (annotGeometry*) univariate metric computed. Default NULL. NULL, metric computed genes values assay specified argument exprs_values. can parallelized argument BPPARAM. genes, row names SFE object Ensembl IDs, gene symbol can used converted IDs behind scene column rowData can specified swap_rownames. However, one symbol matches multiple IDs, warning given first match used. Internally, results always stored Ensembl ID rather symbol. colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. SFE method calculateUnivariate, specify location cells methods take spatial neighborhood graph variogram. geometry type POINT, spatialCoords(x) used instead. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. 
exprs_values Integer scalar string indicating assay x contains expression values. include_self Logical, whether spatial neighborhood graph include edges location . Getis-Ord Gi* localG localG_perm, used method. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. annotGeometryName Name annotGeometry sf data frame whose numeric columns interest used compute metric. Use annotGeometryNames look names sf data frames associated annotations. annotGraphName Name listw graph SFE object corresponds annotGeometry interest. Use annotGraphNames look names available annotation graphs. overwrite Logical, whether overwrite existing results name. Defaults FALSE. dimred Name dimension reduction, can seen reducedDimNames. components Numeric vector components dimension reduction compute spatial statistics .","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Univariate spatial statistics — calculateUnivariate","text":"calculateUnivariate, returnDF = TRUE, DataFrame, otherwise list element results feature. run*, SpatialFeatureExperiment object results added. See Details results stored.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Univariate spatial statistics — calculateUnivariate","text":"univariate methods package spdep supported . methods global, meaning returning one result spatial locations dataset: moran, geary, moran.mc, geary.mc, moran.test, geary.test, globalG.test, sp.correlogram. variogram variogram map gstat package also supported. following methods local, meaning location results: moran.plot, localmoran, localmoran_perm, localC, localC_perm, localG, localG_perm, LOSH, LOSH.mc, LOSH.cs. 
GWmodel::gwss method supported soon, supported yet. Global results genes stored rowData. colGeometry annotGeometry, results added attribute data frame called featureData, DataFrame analogous rowData gene count matrix, can accessed geometryFeatureData function. New column names featureData follow rules rowData. colData, results can accessed colFeatureData function. Local results stored field localResults field SFE object, can accessed localResults localResult. results p-values, -log10 p adjusted -log10 p added. Note multiple testing correction, p.adjustSP used. results stored SFE object, parameters used compute results well construct spatial neighborhood graph also added. localResults, parameters added metadata field params localResults sorted name, defaults name SFEMethod object specified type argument. global methods, parameters results genes metadata rowData(x), organized name (metadata(rowData(x))$params[[name]]). colData, global method parameters stored metadata colData field params (metadata(colData(x))$params[[name]]). geometries, global method parameters attribute named \"params\" corresponding sf data frame (attr(df, \"params\")[[name]]).","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Univariate spatial statistics — calculateUnivariate","text":"Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion, p. 17. Anselin, L. (1995), Local Indicators Spatial Association-LISA. Geographical Analysis, 27: 93-115. doi:10.1111/j.1538-4632.1995.tb00338.x Ord, J. K., & Getis, A. 2012. Local spatial heteroscedasticity (LOSH), Annals Regional Science, 48 (2), 529-539. Ord, J. K. Getis, A. 1995 Local spatial autocorrelation statistics: distributional issues application. 
Geographical Analysis, 27, 286-306","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Univariate spatial statistics — calculateUnivariate","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) features_use <- rownames(sfe)[1:5] # Moran's I moran_results <- calculateMoransI(sfe, features = features_use, colGraphName = \"visium\", exprs_values = \"counts\" ) # This does not advocate for computing Moran's I on raw counts. # Just an example for function usage. sfe <- runMoransI(sfe, features = features_use, colGraphName = \"visium\", exprs_values = \"counts\" ) # Look at the results head(rowData(sfe)) #> DataFrame with 6 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000025902 ENSMUSG00000025902 Sox17 Gene Expression 0.007612179 #> ENSMUSG00000096126 ENSMUSG00000096126 Gm22307 Gene Expression 0.000200321 #> ENSMUSG00000033845 ENSMUSG00000033845 Mrpl15 Gene Expression 0.075921474 #> ENSMUSG00000025903 ENSMUSG00000025903 Lypla1 Gene Expression 0.057491987 #> ENSMUSG00000033813 ENSMUSG00000033813 Tcea1 Gene Expression 0.052283654 #> ENSMUSG00000002459 ENSMUSG00000002459 Rgs20 Gene Expression 0.000200321 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000025902 0.008757912 151.1411 -0.0424335 13.32749 #> ENSMUSG00000096126 0.000200321 4992.0000 NaN NaN #> ENSMUSG00000033845 0.114250804 19.8212 0.2485804 5.41594 #> ENSMUSG00000025903 0.080645121 24.3985 0.0070062 9.46309 #> ENSMUSG00000033813 0.073603279 26.9256 0.1592157 8.51384 #> ENSMUSG00000002459 0.000200321 4992.0000 NA NA # Local Moran's I sfe <- runUnivariate(sfe, type = \"localmoran\", features = features_use, colGraphName = \"visium\", exprs_values = 
\"counts\" ) head(localResult(sfe, \"localmoran\", features_use[1])) #> Ii E.Ii Var.Ii Z.Ii Pr(z != E(Ii)) #> AAATTACCTATCGATG -0.02897069 -0.001345388 0.01609308 -0.2177647 0.82761246 #> AACATATCAACTGGTG -0.29141104 -0.001345388 0.01609308 -2.2865292 0.02222332 #> AAGATTGGCGGAACGT 0.10224949 -0.001345388 0.01958757 0.7401981 0.45917982 #> AAGGGACAGATTCTGT -0.02897069 -0.001345388 0.01609308 -0.2177647 0.82761246 #> AATATCGAGGGTTCTC 0.10224949 -0.001345388 0.01609308 0.8166176 0.41414701 #> AATGATGATACGCTAT 0.10224949 -0.001345388 0.01609308 0.8166176 0.41414701 #> mean median pysal -log10p -log10p_adj #> AAATTACCTATCGATG Low-High Low-High Low-High 0.08217298 0.0000000 #> AACATATCAACTGGTG Low-High Low-High Low-High 1.65319110 0.8080931 #> AAGATTGGCGGAACGT Low-Low Low-Low Low-Low 0.33801720 0.0000000 #> AAGGGACAGATTCTGT Low-High Low-High Low-High 0.08217298 0.0000000 #> AATATCGAGGGTTCTC Low-Low Low-Low Low-Low 0.38284547 0.0000000 #> AATGATGATACGCTAT Low-Low Low-Low Low-Low 0.38284547 0.0000000 # For colData sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = \"nCounts\", colGraphName = \"visium\" ) head(localResult(sfe, \"localmoran\", \"nCounts\")) #> Ii E.Ii Var.Ii Z.Ii #> AAATTACCTATCGATG 0.53682603 -0.0073375879 0.087243111 1.8423152 #> AACATATCAACTGGTG 0.20017125 -0.0008174853 0.009783652 2.0319883 #> AAGATTGGCGGAACGT 0.13533683 -0.0002992400 0.004361215 2.0538630 #> AAGGGACAGATTCTGT 0.67946203 -0.0182482408 0.214584793 1.5061757 #> AATATCGAGGGTTCTC -0.01287299 -0.0009633914 0.011528171 -0.1109218 #> AATGATGATACGCTAT 0.15331553 -0.0306802864 0.356207210 0.3082880 #> Pr(z != E(Ii)) mean median pysal -log10p #> AAATTACCTATCGATG 0.06542906 High-High High-High High-High 1.18422931 #> AACATATCAACTGGTG 0.04215484 High-High High-High High-High 1.37515260 #> AAGATTGGCGGAACGT 0.03998896 High-High Low-High High-High 1.39805992 #> AAGGGACAGATTCTGT 0.13202207 High-High High-High High-High 0.87935347 #> AATATCGAGGGTTCTC 0.91167838 High-Low High-Low High-Low 
0.04015835 #> AATGATGATACGCTAT 0.75786321 High-High High-Low High-High 0.12040917 #> -log10p_adj #> AAATTACCTATCGATG 0.33913127 #> AACATATCAACTGGTG 0.53005456 #> AAGATTGGCGGAACGT 0.61990867 #> AAGGGACAGATTCTGT 0.03425543 #> AATATCGAGGGTTCTC 0.00000000 #> AATGATGATACGCTAT 0.00000000 # For annotGeometries annotGraph(sfe, \"myofiber_tri2nb\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3L, method = \"tri2nb\", dist_type = \"idw\", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, type = \"localG\", features = \"area\", annotGraphName = \"myofiber_tri2nb\", annotGeometryName = \"myofiber_simplified\", zero.policy = TRUE ) head(localResult(sfe, \"localG\", \"area\", annotGeometryName = \"myofiber_simplified\" )) #> localG Gi E(Gi) V(Gi) Z(Gi) #> 1018 -2.3083710 0.0001426229 0.0002238002 1.236681e-09 -2.3083710 #> 1021 -0.8140180 0.0002393084 0.0002665443 1.119477e-09 -0.8140180 #> 1024 0.0508039 0.0002301134 0.0002280492 1.650888e-09 0.0508039 #> 1041 -0.1700897 0.0002715145 0.0002773569 1.179830e-09 -0.1700897 #> 1052 0.1547597 0.0002185310 0.0002133753 1.109810e-09 0.1547597 #> 1058 -0.3688569 0.0002047116 0.0002174315 1.189189e-09 -0.3688569 #> Pr(z != E(Gi)) -log10p -log10p_adj cluster #> 1018 0.02097851 1.67822538 0.9000741 High #> 1021 0.41563466 0.38128824 0.0000000 High #> 1024 0.95948178 0.01796327 0.0000000 High #> 1041 0.86493956 0.06301424 0.0000000 High #> 1052 0.87701073 0.05699509 0.0000000 Low #> 1058 0.71223439 0.14737706 0.0000000 Low"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":null,"dir":"Reference","previous_headings":"","what":"Find clusters of correlogram patterns — clusterCorrelograms","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"Cluster correlograms find patterns length scales spatial autocorrelation. correlograms clustered must computed method number lags. 
Correlograms clustered jointly across samples.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"","code":"clusterCorrelograms( sfe, features, BLUSPARAM, sample_id = \"all\", method = \"I\", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"sp.correlogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"sfe SpatialFeatureExperiment object correlograms computed features interest. features Features whose correlograms cluster. BLUSPARAM BlusterParam object specifying algorithm use. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. method \"corr\" correlation, \"I\" Moran's I, \"C\" Geary's C colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. 
name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"data frame 3 columns: feature features, cluster factor cluster membership features within sample, sample_id sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) sfe <- runUnivariate(sfe, type = \"sp.correlogram\", features = rownames(sfe)[inds], exprs_values = \"counts\", order = 5 ) clust <- clusterCorrelograms(sfe, features = rownames(sfe)[inds], BLUSPARAM = KmeansParam(2) )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Find clusters on the Moran plot — clusterMoranPlot","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"Moran plot plots value location x axis, average neighbors locations y axis. 
Sometimes clusters can seen Moran plot, indicating different types neighborhoods.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"","code":"clusterMoranPlot( sfe, features, BLUSPARAM, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, swap_rownames = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"sfe SpatialFeatureExperiment object Moran plot computed feature interest. Moran plot feature computed feature sample_id, calculated stored rowData. See calculateUnivariate. features Features whose Moran plot cluster. Features whose Moran plots computed skipped, warning. BLUSPARAM BlusterParam object specifying algorithm use. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"data frame column factor cluster membership feature. 
column names features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Compute moran plot sfe <- runUnivariate(sfe, type = \"moran.plot\", features = rownames(sfe)[1], exprs_values = \"counts\" ) clusts <- clusterMoranPlot(sfe, rownames(sfe)[1], BLUSPARAM = KmeansParam(2) )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":null,"dir":"Reference","previous_headings":"","what":"Cluster variograms of multiple features — clusterVariograms","title":"Cluster variograms of multiple features — clusterVariograms","text":"function clusters variograms features across samples find patterns decays spatial autocorrelation. fitted variograms clustered different samples can different distance bins.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cluster variograms of multiple features — clusterVariograms","text":"","code":"clusterVariograms( sfe, features, BLUSPARAM, n = 20, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"variogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cluster variograms of multiple features — clusterVariograms","text":"sfe SpatialFeatureExperiment object correlograms computed features interest. 
features Features whose variograms cluster. BLUSPARAM BlusterParam object specifying algorithm use. n Number points fitted variogram line. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name variogram results stored, default \"variogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cluster variograms of multiple features — clusterVariograms","text":"data frame 3 columns: feature features, cluster factor cluster membership features within sample, sample_id sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cluster variograms of multiple features — clusterVariograms","text":"","code":"library(SFEData) library(scater) library(bluster) library(Matrix) #> #> Attaching package: ‘Matrix’ #> The following object is masked from ‘package:S4Vectors’: #> #> expand sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) # Just the highly expressed genes gs <- order(Matrix::rowSums(counts(sfe)), decreasing = TRUE)[1:10] genes <- rownames(sfe)[gs] sfe <- runUnivariate(sfe, \"variogram\", features = genes) clusts <- clusterVariograms(sfe, genes, BLUSPARAM = HclustParam(), swap_rownames = \"symbol\") # Plot the clustering plotVariogram(sfe, genes, color_by = 
clusts, group = \"feature\", use_lty = FALSE, swap_rownames = \"symbol\", show_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":null,"dir":"Reference","previous_headings":"","what":"Get metadata of colData, rowData, and geometries — colFeatureData","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"Results spatial analyses columns colData, rowData, geometries stored metadata, can accessed metadata function. colFeatureData function allows users directly access results.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"","code":"colFeatureData(sfe) rowFeatureData(sfe) geometryFeatureData(sfe, type, MARGIN = 2L) reducedDimFeatureData(sfe, dimred)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"sfe SFE object. type geometry, can name (character) index (integer) MARGIN Integer, 1 means rowGeometry, 2 means colGeometry, 3 means annotGeometry. Defaults 2, colGeometry. 
dimred Name dimension reduction, can seen reducedDimNames.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"DataFrame.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Moran's I for colData sfe <- colDataMoransI(sfe, \"nCounts\") colFeatureData(sfe) #> DataFrame with 12 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> barcode NA NA #> col NA NA #> row NA NA #> x NA NA #> y NA NA #> ... ... ... 
#> sample_id NA NA #> nCounts 0.675416 1.67027 #> nGenes NA NA #> prop_mito NA NA #> in_tissue NA NA"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":null,"dir":"Reference","previous_headings":"","what":"Colorblind friendly palette from dittoSeq — ditto_colors","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"Just get palette without install dependencies dittoSeq.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"","code":"ditto_colors"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"character vector hex colors palette. 40 colors.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"dittoSeq package.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":null,"dir":"Reference","previous_headings":"","what":"Get beginning and end of palette to center a divergent palette — getDivergeRange","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"function longer used internally unnecessary scico divergent palettes. 
can useful using divergent palettes outside scico one must specify beginning end midpoint, override default palette.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"","code":"getDivergeRange(values, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"values Numeric vector colored. diverge_center Value center , defaults 0.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"numeric vector length 2, first element beginning, second end. 
values 0 1.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"","code":"v <- rnorm(10) getDivergeRange(v, diverge_center = 0) #> [1] 0.1643015 1.0000000"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":null,"dir":"Reference","previous_headings":"","what":"Get parameters used in spatial methods — getParams","title":"Get parameters used in spatial methods — getParams","text":"getParams function allows users access parameters used compute results may stored colFeatureData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get parameters used in spatial methods — getParams","text":"","code":"getParams( sfe, name, local = FALSE, colData = FALSE, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get parameters used in spatial methods — getParams","text":"sfe SpatialFeatureExperiment object. name Name used store results. local Logical, whether results interest come local spatial method. colData Logical, whether results computed column colData(sfe). colGeometryName get results colGeometry. annotGeometryName get results annotGeometry; colGeometry precedence argument ignored colGeometryName specified. reducedDimName Name dimension reduction, can seen reducedDimNames. 
colGeometryName annotGeometryName precedence reducedDimName.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get parameters used in spatial methods — getParams","text":"named list showing parameters","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get parameters used in spatial methods — getParams","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- colDataMoransI(sfe, \"nCounts\") getParams(sfe, \"moran\", colData = TRUE) #> $name #> [1] \"moran\" #> #> $package #> [1] \"spdep\" #> #> $version #> [1] ‘1.3.3’ #> #> $zero.policy #> NULL #> #> $include_self #> [1] FALSE #> #> $graph_params #> $graph_params$FUN #> [1] \"findVisiumGraph\" #> #> $graph_params$package #> $graph_params$package[[1]] #> [1] \"SpatialFeatureExperiment\" #> #> $graph_params$package[[2]] #> [1] ‘1.3.0’ #> #> #> $graph_params$args #> $graph_params$args$style #> [1] \"W\" #> #> $graph_params$args$zero.policy #> NULL #> #> $graph_params$args$sample_id #> [1] \"Vis5A\" #> #> #>"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":null,"dir":"Reference","previous_headings":"","what":"List all spatial methods in Voyager package — listSFEMethods","title":"List all spatial methods in Voyager package — listSFEMethods","text":"package ships many spatial statistics methods SFEMethod objects. user can adapt uniform user interface package spatial methods creating new SFEMethod objects. 
function lists names methods within Voyager, use type argument calculateUnivariate, calculateBivariate, calculateMultivariate.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"List all spatial methods in Voyager package — listSFEMethods","text":"","code":"listSFEMethods(variate = c(\"uni\", \"bi\", \"multi\"), scope = c(\"global\", \"local\"))"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"List all spatial methods in Voyager package — listSFEMethods","text":"variate Uni-, bi-, multi-variate. scope whether local global.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"List all spatial methods in Voyager package — listSFEMethods","text":"data frame column name another brief description.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"List all spatial methods in Voyager package — listSFEMethods","text":"","code":"listSFEMethods(\"uni\", \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert listw 
into sparse adjacency matrix — listw2sparse","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"Edge weights used adjacency matrix. elements matrix 0, using sparse matrix greatly reduces memory use.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"","code":"listw2sparse(listw)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"listw listw object spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"sparse dgCMatrix, whose row represents cell spot whose columns represent neighbors. matrix symmetric. region.id present listw object, row column names output matrix.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"","code":"library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache g <- findVisiumGraph(sfe) mat <- listw2sparse(g)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"Values Moran's can take depends spatial neighborhood graph. 
bounds Moran's I given graph, C, given minimum maximum eigenvalues double centered -- i.e. subtracting column means row means -- adjacency matrix \\((I - \\mathbb{11}^T/n)C(I - \\mathbb{11}^T/n)\\), \\(\\mathbb 1\\) vector 1's. implementation follows implementation adespatial uses RSpectra package quickly find minimum maximum eigenvalues without performing unnecessary work find full spectrum done base R's eigen.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"","code":"moranBounds(listw)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"listw listw object spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"numeric vector minimum maximum Moran's I given spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"double centering, adjacency matrix longer sparse, function can take lot memory larger datasets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"de Jong, P., Sprenger, C., & van Veen, F. (1984). 
extreme values Moran's Geary's C. Geographical Analysis, 16(1), 17-24.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"","code":"# example code library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache g <- findVisiumGraph(sfe) moranBounds(g) #> Imin Imax #> -0.5825787 0.9725069"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Use ggplot to plot the moran.plot results — moranPlot","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"function uses ggplot2 plot Moran plot. plot aesthetically pleasing base R version implemented spdep. addition, contours plotted show point density plot, points can colored variable, clusters. contours may also filled influential points plotted. filled, viridis E option used.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"","code":"moranPlot( sfe, feature, graphName = 1L, sample_id = \"all\", contour_color = \"cyan\", color_by = NULL, colGeometryName = NULL, annotGeometryName = NULL, plot_singletons = TRUE, binned = FALSE, filled = FALSE, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, bins = 100, binwidth = NULL, hex = FALSE, plot_influential = TRUE, bins_contour = NULL, name = \"moran.plot\", ... 
)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"sfe SpatialFeatureExperiment object. feature Name one variable show plot. converted sentence case x axis lower case y axis appended \"Spatially lagged\". One feature time since colors color_by may specific feature (e.g. clusterMoranPlot). graphName Name colGraph annotGraph, spatial neighborhood graph used compute Moran plot. determine points singletons plot differently plot. sample_id One sample_id sample whose graph plot. contour_color Color point density contours, can changed contours stand points. color_by Variable color points . can name column colData, gene, name column colGeometry specified colGeometryName. can vector length number cells/spots sample_id interest. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. plot_singletons Logical, whether plot items spatial neighbors. binned Logical, whether plot 2D histograms. argument precedence filled. filled Logical, whether plot filled contours non-influential points plot influential points points. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. binwidth Width bins, passed geom_bin2d geom_hex. hex Logical, whether use geom_hex. 
Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. plot_influential Logical, whether plot influential points different palette binned = TRUE. bins_contour Number bins point density contour. Use smaller number make sparser contours. name Name Moran plot results stored. default \"moran.plot\". ... arguments pass geom_density2d.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData(\"full\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[, colData(sfe)$in_tissue] sfe <- logNormCounts(sfe) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- runUnivariate(sfe, type = \"moran.plot\", features = \"Myh1\", swap_rownames = \"symbol\") clust <- clusterMoranPlot(sfe, \"Myh1\", BLUSPARAM = KmeansParam(2), swap_rownames = \"symbol\") moranPlot(sfe, \"Myh1\", graphName = \"visium\", color_by = clust[, 1], swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"sample SFE object separate spatial neighborhood graph. 
Spatial analyses performed jointly multiple samples require combined spatial neighborhood graph different samples, different samples disconnected components graph. combined adjacency matrix can used MULTISPATI PCA.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"","code":"multi_listw2sparse(listws)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"listws list listw objects.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"sparse dgCMatrix combined spatial neighborhood graph, original spatial neighborhood graphs samples diagonal. input SFE object, rows columns match column names SFE object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"","code":"# example code"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":null,"dir":"Reference","previous_headings":"","what":"A faster implementation of MULTISPATI PCA — multispati_rsp","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"implementation uses RSpectra package efficiently compute small subset eigenvalues eigenvectors, small subset typically used. 
Hence much faster memory efficient original implementation adespatial. However, implementation support row column weighting standard ones PCA, adespatial implementation general.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"","code":"multispati_rsp(x, listw, nfposi = 30L, nfnega = 30L, scale = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"x matrix whose columns features rows cells. listw listw object, spatial neighborhood graph cells x. length must equal number rows x. nfposi Number positive eigenvalues eigenvectors compute. nfnega Number negative eigenvalues eigenvectors compute. indicate negative spatial autocorrelation. scale Logical, whether scale data.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"matrix cell embeddings spatial PC, attribute loading eigenvectors gene loadings, attribute eig eigenvalues.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"Eigen decomposition fail feature variance zero leading NaN scaled matrix.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"Dray, S., Said, S. Debias, F. 
(2008) Spatial ordination vegetation data using generalization Wartenberg's multivariate spatial correlation. Journal vegetation science, 19, 45-56.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) inds <- order(rowSums(logcounts(sfe)), decreasing = TRUE)[1:50] mat <- logcounts(sfe)[inds,] g <- findVisiumGraph(sfe) out <- multispati_rsp(t(mat), listw = g, nfposi = 10, nfnega = 10)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cell density as 2D histogram — plotCellBin2D","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"function plots cell density histological space 2D histograms, especially helpful larger smFISH-based datasets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"","code":"plotCellBin2D( sfe, sample_id = \"all\", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"sfe SpatialFeatureExperiment object. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. bins Number bins. Can vector length 2 specify x y axes separately. 
binwidth Width bins, passed geom_bin2d geom_hex. hex Logical, whether use hexagonal bins. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"","code":"library(SFEData) sfe <- HeNSCLCData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache plotCellBin2D(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"function recommended instead plotColDataHistogram coloring multiple categories log transforming y axis, causes problems stacked 
histograms.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"","code":"plotColDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = \"free\", ncol = 1, position = \"identity\" ) plotRowDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = \"free\", ncol = 1, position = \"identity\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"sce SingleCellExperiment object. feature Names columns colData rowData plot. multiple features specified, plotted separate facets. color_by Name categorical column colData rowData color polygons. subset Name logical column plot subset data. bins Number bins. Overridden binwidth. Defaults 100. binwidth width bins. Can specified numeric value function calculates width unscaled x. , \"unscaled x\" refers original x values data, application scale transformation. specifying function along grouping structure, function called per group. default use number bins bins, covering range data. always override value, exploring multiple widths find best illustrate stories data. bin width date variable number days time; bin width time variable number seconds. linewidth Line width polygons, defaults thicker 1.2. scales scales fixed (\"fixed\", default), free (\"free\"), free one dimension (\"free_x\", \"free_y\")? ncol Number columns facetting. position Position adjustment, either string naming adjustment (e.g. \"jitter\" use position_jitter), result call position adjustment function. 
Use latter need change settings adjustment.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache plotColDataFreqpoly(sfe, c(\"nCounts\", \"nGenes\"), color_by = \"in_tissue\", bins = 50) plotColDataFreqpoly(sfe, \"nCounts\", subset = \"in_tissue\") sfe2 <- sfe[, sfe$in_tissue] plotColDataFreqpoly(sfe2, c(\"nCounts\", \"nGenes\"), bins = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot histograms for colData and rowData columns — plotColDataHistogram","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"Plot histograms colData rowData columns","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"","code":"plotColDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = \"free\", ncol = 1, position = \"stack\", ... ) plotRowDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = \"free\", ncol = 1, position = \"stack\", ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"sce SingleCellExperiment object. 
feature Names columns colData rowData plot. multiple features specified, plotted separate facets. fill_by Name categorical column colData rowData fill histogram. facet_by Column colData rowData facet . multiple features plotted, features different facets. case, setting facet_by call facet_grid features rows categories facet_by columns. subset Name logical column plot subset data. bins Numeric vector giving number bins vertical horizontal directions. Set 100 default. binwidth width bins. Can specified numeric value function calculates width unscaled x. , \"unscaled x\" refers original x values data, application scale transformation. specifying function along grouping structure, function called per group. default use number bins bins, covering range data. always override value, exploring multiple widths find best illustrate stories data. bin width date variable number days time; bin width time variable number seconds. scales scales fixed (\"fixed\", default), free (\"free\"), free one dimension (\"free_x\", \"free_y\")? ncol Number columns facetting. position Position adjustment, either string naming adjustment (e.g. \"jitter\" use position_jitter), result call position adjustment function. Use latter need change settings adjustment. ... arguments passed layer(). often aesthetics, used set aesthetic fixed value, like colour = \"red\" size = 3. 
may also parameters paired geom/stat.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"ggplot object","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache plotColDataHistogram(sfe, c(\"nCounts\", \"nGenes\"), fill_by = \"in_tissue\", bins = 50, position = \"stack\") plotColDataHistogram(sfe, \"nCounts\", subset = \"in_tissue\") sfe2 <- sfe[, sfe$in_tissue] plotColDataHistogram(sfe2, c(\"nCounts\", \"nGenes\"), bins = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot spatial graphs — plotColGraph","title":"Plot spatial graphs — plotColGraph","text":"ggplot version spdep::plot.nb, reducing boilerplate SFE objects.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot spatial graphs — plotColGraph","text":"","code":"plotColGraph( sfe, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL ) plotAnnotGraph( sfe, annotGraphName = 1L, annotGeometryName = 1L, sample_id = \"all\", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL 
)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot spatial graphs — plotColGraph","text":"sfe SpatialFeatureExperiment object. colGraphName Name graph associated columns gene count matrix plotted. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. weights Whether plot weights. TRUE, transparency (alpha) segments represent edge weights. segment_size Thickness segments represent graph edges. geometry_size Point size (POINT geometries) line thickness (LINESTRING POLYGON) plot geometry background. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. annotGraphName Name annotation graph plot. 
annotGeometryName Name annotGeometry, associated graph specified annotGraphName, spatial coordinates graph nodes context.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot spatial graphs — plotColGraph","text":"ggplot2 object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot spatial graphs — plotColGraph","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(sf) #> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) plotColGraph(sfe, colGraphName = \"visium\", colGeometryName = \"spotPoly\") # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, \"myofiber_simplified\") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, \"myofiber_simplified\") <- ag annotGraph(sfe, \"myofibers\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3, method = \"tri2nb\", dist_type = \"idw\" ) plotAnnotGraph(sfe, annotGraphName = \"myofibers\", annotGeometryName = \"myofiber_simplified\", weights = TRUE )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot correlogram — plotCorrelogram","title":"Plot correlogram — plotCorrelogram","text":"Use ggplot2 plot correlograms computed runUnivariate, pulling results rowData. Correlograms multiple genes error bars can plotted, can colored numeric categorical column rowData vector length nrow SFE object. coloring useful correlograms clustered show types length scales patterns decay spatial autocorrelation. 
method = \"\", error bars twice standard deviation estimated Moran's value.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot correlogram — plotCorrelogram","text":"","code":"plotCorrelogram( sfe, features, sample_id = \"all\", method = \"I\", color_by = NULL, facet_by = c(\"sample_id\", \"features\"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, plot_signif = TRUE, p_adj_method = \"BH\", divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = \"sp.correlogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot correlogram — plotCorrelogram","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. method \"corr\" correlation, \"\" Moran's , \"C\" Geary's C color_by Name column rowData(sfe) featureData colData (see colFeatureData), colGeometry, annotGeometry color correlogram feature. Alternatively, vector length features, data frame clusterCorrelograms. facet_by Whether facet sample_id (default) features. facetting sample_id, different features plotted facet comparison. facetting features, different samples compared feature. Ignored one sample specified. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. 
reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. plot_signif Logical, whether plot significance symbols: p < 0.001: ***, p < 0.01: **, p < 0.05 *, p < 0.1: ., otherwise symbol. p-values two sided, based assumption estimated Moran's normally distributed mean randomized version data. mean variance come moran.test Moran's geary.test Geary's C. Take results grain salt data normally distributed. p_adj_method Multiple testing correction method p.adjust, correct multiple testing (number lags times number features) Moran's estimates plot_signif = TRUE. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot correlogram — plotCorrelogram","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot correlogram — plotCorrelogram","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) features <- rownames(sfe)[inds] sfe <- runUnivariate(sfe, type = \"sp.correlogram\", features = features, exprs_values = \"counts\", order = 5 ) clust <- clusterCorrelograms(sfe, features = features, BLUSPARAM = KmeansParam(2) ) # Color by 
features plotCorrelogram(sfe, features) # Color by something else plotCorrelogram(sfe, features, color_by = clust$cluster) # Facet by features plotCorrelogram(sfe, features, facet_by = \"features\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cross variogram — plotCrossVariogram","title":"Plot cross variogram — plotCrossVariogram","text":"Equivalent gstat::plot.gstatVariogram, using ggplot2 customizable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cross variogram — plotCrossVariogram","text":"","code":"plotCrossVariogram(res, show_np = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cross variogram — plotCrossVariogram","text":"res Cross variogram results one sample, calculateBivariate. Global bivariate results stored SFE object. show_np Logical, whether show number pairs cells distance bin.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cross variogram — plotCrossVariogram","text":"ggplot object. 
Unfortunately figured way collect facet labels top entire plot.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cross variogram — plotCrossVariogram","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = \"cross_variogram\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\") plotCrossVariogram(res)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cross variogram map — plotCrossVariogramMap","title":"Plot cross variogram map — plotCrossVariogramMap","text":"Equivalent gstat::plot.gstatVariogram, using ggplot2 customizable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cross variogram map — plotCrossVariogramMap","text":"","code":"plotCrossVariogramMap(res, plot_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cross variogram map — plotCrossVariogramMap","text":"res Cross variogram results one sample, calculateBivariate. Global bivariate results stored SFE object. 
plot_np Logical, whether plot number pairs distance bin instead variance.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cross variogram map — plotCrossVariogramMap","text":"ggplot object.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cross variogram map — plotCrossVariogramMap","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = \"cross_variogram_map\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\", width = 500, cutoff = 2000) plotCrossVariogramMap(res)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot top PC loadings of genes — plotDimLoadings","title":"Plot top PC loadings of genes — plotDimLoadings","text":"Just like Seurat's VizDimLoadings function. found equivalent SCE find useful. trying reproduce Seurat function exactly. instance, like Seurat imposes ggplot theme, like cowplot theme. 
Maybe rewrite base R now using Tidyverse.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot top PC loadings of genes — plotDimLoadings","text":"","code":"plotDimLoadings( sce, dims = 1:4, nfeatures = 10, swap_rownames = NULL, reduction = \"PCA\", balanced = TRUE, ncol = 2, sample_id = \"all\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot top PC loadings of genes — plotDimLoadings","text":"sce SingleCellExperiment object, anything inherits SingleCellExperiment. dims Numeric vector specifying PCs plot. MULTISPATI, PCs negative eigenvalues right columns embedding loading matrices. See ElbowPlot. nfeatures Number genes plot. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. reduction Name dimension reduction use. must attribute called either \"percentVar\" \"eig\" eigenvalues. Defaults \"PCA\". balanced Return equal number genes + - scores. FALSE, returns top genes ranked scores absolute values. ncol Number columns facetted plot. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot top PC loadings of genes — plotDimLoadings","text":"ggplot object. 
Loadings different PCs plotted different facets one ggplot object returned.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot top PC loadings of genes — plotDimLoadings","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- runPCA(sfe, ncomponents = 10, exprs_values = \"counts\") plotDimLoadings(sfe, dims = 1:2)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot geometries without coloring — plotGeometry","title":"Plot geometries without coloring — plotGeometry","text":"Different samples plotted separate facets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot geometries without coloring — plotGeometry","text":"","code":"plotGeometry( sfe, type, MARGIN = 2L, sample_id = \"all\", ncol = NULL, bbox = NULL, image_id = NULL, maxcell = 5e+05 )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot geometries without coloring — plotGeometry","text":"sfe SpatialFeatureExperiment object. type Name geometry associated MARGIN interest compute graph. MARGIN Just like apply, 1 stands row, 2 stands column. , addition, 3 stands annotation, query annotGeometries, nuclei segmentation Visium data sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. 
bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot geometries without coloring — plotGeometry","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot geometries without coloring — plotGeometry","text":"","code":"library(SFEData) sfe1 <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe2 <- McKellarMuscleData(\"small2\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache sfe <- cbind(sfe1, sfe2) sfe <- removeEmptySpace(sfe) plotGeometry(sfe, \"spotPoly\") plotGeometry(sfe, \"myofiber_simplified\", MARGIN = 3)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot local results — plotLocalResult","title":"Plot local results — plotLocalResult","text":"Plot results local spatial analyses space, local Getis-Ord Gi* 
values.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot local results — plotLocalResult","text":"","code":"plotLocalResult( sfe, name, features, attribute = NULL, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, ncol = NULL, ncol_sample = NULL, annot_aes = list(), annot_fixed = list(), bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = \"black\", fill = \"gray80\", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, type = name, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot local results — plotLocalResult","text":"sfe SpatialFeatureExperiment object. name local spatial results. Use localResultNames see types results already calculated. features Character vector vectors. see features results given type, see localResultFeatures. attribute field local results type features. result feature vector, argument ignored. result data frame matrix, column name result, \"Ii\" local Moran's . local spatial analysis method, default attribute. See Details. Use localResultAttrs. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. ncol Number columns plotting multiple features. 
Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. 
linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. type SFEMethod object string corresponding name one objects environment. localResult interest manually added outside runUnivariate runBivariate, method recorded, type argument can used specify method properly get title labels. default, argument set argument name. method parameters recorded, type argument ignored. ... 
arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot local results — plotLocalResult","text":"ggplot2 object plotting one feature. patchwork object plotting multiple features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Plot local results — plotLocalResult","text":"Many local spatial analyses return data frame matrix results, whose columns can statistic interest location, variance, expected value permutation, p-value, etc. attribute argument specifies column use multiple columns. defaults local method supported package mean: localmoran localmoran_perm Ii, local Moran's statistic location. localC_perm localC, local Geary C statistic location. localG localG_perm localG, local Getis-Ord Gi Gi* statistic. include_self = TRUE calculateUnivariate runUnivariate called, Gi*. Otherwise Gi. LOSH LOSH.mc Hi, local spatial heteroscedasticity moran.plot wx, average value neighbor location. Moran plot best plotted scatter plot wx vs x. See moranPlot. local methods listed return vectors results. instance, localC returns vector default, local Geary's C statistic.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Plot local results — plotLocalResult","text":"function shares internals plotSpatialFeature, important differences. plotSpatialFeature, annotGeometry indeed used annotation protagonist colGeometry, since easy directly use ggplot2 plot data annotGeometry sf data frames overlaying annotGeometry colGeometry involves complicated code. contrast, function, local results annotGeometry can plotted separately without anything related colGeometry. 
Note annotGeometry local results plotted without colGeometry, annot_* arguments ignored. Use arguments aesthetics colGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot local results — plotLocalResult","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) feature_use <- rownames(sfe)[1] sfe <- logNormCounts(sfe) sfe <- runUnivariate(sfe, \"localmoran\", feature_use) # Which types of results are available? localResultNames(sfe) #> [1] \"localmoran\" # Which features for localmoran? localResultFeatures(sfe, \"localmoran\") #> [1] \"ENSMUSG00000025902\" # Which columns does the localmoran results have? localResultAttrs(sfe, \"localmoran\", feature_use) #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe, \"localmoran\", feature_use, \"Ii\", colGeometryName = \"spotPoly\" ) # For annotGeometry # Make sure it's type POLYGON annotGeometry(sfe, \"myofiber_simplified\") <- sf::st_buffer(annotGeometry(sfe, \"myofiber_simplified\"), 0) annotGraph(sfe, \"poly2nb_myo\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, \"localmoran\", features = \"area\", annotGraphName = \"poly2nb_myo\", annotGeometryName = \"myofiber_simplified\", zero.policy = TRUE ) plotLocalResult(sfe, \"localmoran\", \"area\", \"Ii\", annotGeometryName = \"myofiber_simplified\", size = 0.3, color = \"cyan\" ) plotLocalResult(sfe, \"localmoran\", \"area\", \"Z.Ii\", annotGeometryName = \"myofiber_simplified\" ) # don't use annot_* 
arguments when annotGeometry is plotted without colGeometry"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot Moran/Geary Monte Carlo results — plotMoranMC","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"Plot simulations density plot histogram compared observed Moran's Geary's C, ggplot2 looks nicer. Unlike plotting function spdep, function can also plot feature different samples facets plot different features samples together comparison.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"","code":"plotMoranMC( sfe, features, sample_id = \"all\", facet_by = c(\"sample_id\", \"features\"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, ptype = c(\"density\", \"histogram\", \"freqpoly\"), swap_rownames = NULL, name = \"moran.mc\", ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. facet_by Whether facet sample_id (default) features. facetting sample_id, different features plotted facet comparison. facetting features, different samples compared feature. Ignored one sample specified. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. 
Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. ptype Plot type, one \"density\", \"histogram\", \"freqpoly\". swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name Monte Carlo results stored, defaults \"moran.mc\". Geary's C Monte Carlo, default \"geary.mc\". ... arguments passed geom_density, geom_histogram, geom_freqpoly, depending ptype.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"ggplot2 object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- colDataUnivariate(sfe, type = \"moran.mc\", \"nCounts\", nsim = 100) plotMoranMC(sfe, \"nCounts\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot gene expression in space — plotSpatialFeature","title":"Plot gene expression in space — plotSpatialFeature","text":"Unlike Seurat ggspavis, plotting functions package uses geom_sf whenever 
applicable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot gene expression in space — plotSpatialFeature","text":"","code":"plotSpatialFeature( sfe, features, colGeometryName = 1L, sample_id = \"all\", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, annot_aes = list(), annot_fixed = list(), exprs_values = \"logcounts\", bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NA, annot_divergent = FALSE, annot_diverge_center = NA, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = \"black\", fill = \"gray80\", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot gene expression in space — plotSpatialFeature","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. 
annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. exprs_values Integer scalar string indicating assay x contains expression values. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. 
color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. ... arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot gene expression in space — plotSpatialFeature","text":"ggplot2 object plotting one feature. 
patchwork object plotting multiple features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Plot gene expression in space — plotSpatialFeature","text":"documentation function, \"feature\" can gene (whatever entity corresponds rows gene count matrix), column colData, column colGeometry sf data frame specified colGeometryName argument. light theme, continuous variables, Blues palette colorbrewer used divergent = FALSE, roma palette scico package divergent = TRUE. dark theme, nuuk palette scico used divergent = FALSE, berlin palette scico used divergent = TRUE. discrete variables, dittoSeq palette used. annotation, YlOrRd colorbrewer palette used continuous variables light theme. dark theme, acton palette scico used divergent = FALSE vanimo palette scico used divergent = TRUE. end dittoSeq palette used discrete variables. individual palette colorblind friendly, plotting continuous variables coloring colGeometry annotGeometry simultaneously, combination two palettes not guaranteed colorblind friendly. addition, plotting image behind geometries, colors image may distort color perception values geometries. theme_void used spatial plots package, units spatial coordinates often arbitrary. 
can overridden show axes using different theme normally done ggplot2.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot gene expression in space — plotSpatialFeature","text":"","code":"library(SFEData) library(sf) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache # features can be genes or colData or colGeometry columns plotSpatialFeature(sfe, c(\"nCounts\", rownames(sfe)[1]), exprs_values = \"counts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\" ) # Change fixed aesthetics plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(color = \"blue\", size = 0.3, fill = NA), alpha = 0.7 ) # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, \"myofiber_simplified\") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, \"myofiber_simplified\") <- ag # Also plot an annotGeometry variable plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\") ) # Use a bounding box to zoom in bbox <- c(xmin = 5500, ymin = 13500, xmax = 6000, ymax = 14000) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", bbox = bbox, annot_fixed = list(linewidth = 0.3))"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot variogram — plotVariogram","title":"Plot variogram — plotVariogram","text":"function plots variogram feature fitted variogram models, showing nugget, range, sill model. 
Unlike plotting functions package automap uses lattice, function uses ggplot2 make prettier customizable plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot variogram — plotVariogram","text":"","code":"plotVariogram( sfe, features, sample_id = \"all\", color_by = NULL, group = c(\"none\", \"sample_id\", \"features\", \"angles\"), use_lty = TRUE, show_np = TRUE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = \"variogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot variogram — plotVariogram","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. color_by Name column rowData(sfe) featureData colData (see colFeatureData), colGeometry, annotGeometry color correlogram feature. Alternatively, vector length features, data frame clusterCorrelograms. group samples, features, angles show facet comparison multiple. Default \"none\", meaning facet contain one variogram. grouping multiple variograms facet, text model, nugget, sill, range variograms shown. use_lty Logical, whether use linetype point shape distinguish different features samples facet. FALSE, different features samples distinguished patterns shown . show_np Logical, whether show number pairs cells distance bin. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. 
Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name variogram results stored, default \"variogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot variogram — plotVariogram","text":"ggplot object. empirical variogram distance bin plotted points, fitted variogram model plotted line feature. number next point number pairs cells distance bin.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot variogram — plotVariogram","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- colDataUnivariate(sfe, \"variogram\", features = \"nCounts\", model = \"Sph\") plotVariogram(sfe, \"nCounts\") # Anisotropy, will get a message sfe <- colDataUnivariate(sfe, \"variogram\", features = \"nCounts\", model = \"Sph\", alpha = c(30, 90, 150), name = \"variogram_anis\") #> gstat does not fit anisotropic variograms. Variogram model is fitted to the whole dataset. 
# Facet by angles by default plotVariogram(sfe, \"nCounts\", name = \"variogram_anis\") # Plot angles with different colors plotVariogram(sfe, \"nCounts\", group = \"angles\", name = \"variogram_anis\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot variogram maps — plotVariogramMap","title":"Plot variogram maps — plotVariogramMap","text":"Plot variogram maps show variogram directions grid distances x y coordinates.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot variogram maps — plotVariogramMap","text":"","code":"plotVariogramMap( sfe, features, sample_id = \"all\", plot_np = FALSE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"variogram_map\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot variogram maps — plotVariogramMap","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. plot_np Logical, whether plot number pairs distance bin instead variance. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. 
swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot variogram maps — plotVariogramMap","text":"ggplot object.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot variogram maps — plotVariogramMap","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- colDataUnivariate(sfe, \"variogram_map\", features = \"nCounts\", width = 500, cutoff = 5000) plotVariogramMap(sfe, \"nCounts\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot dimension reduction components in space — spatialReducedDim","title":"Plot dimension reduction components in space — spatialReducedDim","text":"plotting value projection gene expression cell principal component space. 
present, function work 3D array geographically weighted PCA (GWPCA), future version deal GWPCA results.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot dimension reduction components in space — spatialReducedDim","text":"","code":"spatialReducedDim( sfe, dimred, ncomponents = NULL, components = ncomponents, colGeometryName = 1L, sample_id = \"all\", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, annot_aes = list(), annot_fixed = list(), exprs_values = \"logcounts\", bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = NA, fill = \"gray80\", scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot dimension reduction components in space — spatialReducedDim","text":"sfe SpatialFeatureExperiment object. dimred string integer scalar indicating reduced dimension result reducedDims(sfe) plot. ncomponents numeric scalar indicating number dimensions plot, starting first dimension. Alternatively, numeric vector specifying dimensions plotted. components numeric scalar vector specifying dimensions plotted. Use instead ncomponents plotting one dimension. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. 
Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. exprs_values Integer scalar string indicating assay x contains expression values. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. 
annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. ... arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot dimension reduction components in space — spatialReducedDim","text":"plotSpatialFeature. ggplot2 object plotting one component. 
patchwork object plotting multiple components.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot dimension reduction components in space — spatialReducedDim","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) sfe <- runPCA(sfe, ncomponents = 2) spatialReducedDim(sfe, \"PCA\", ncomponents = 2, \"spotPoly\", annotGeometryName = \"tissueBoundary\", divergent = TRUE, diverge_center = 0 ) # Basically PC1 separates spots not on tissue from those on tissue."},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute variograms — variogram-internal","title":"Compute variograms — variogram-internal","text":"Wrapper automap::autofitVariogram facilitate computing variograms multiple genes SFE objects EDA tool. functions written conform uniform format univariate methods called internally. functions exported, documentation written show users extra arguments use calling calculateUnivariate runUnivariate.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute variograms — variogram-internal","text":"","code":".variogram(x, coords_df, formula = x ~ 1, scale = TRUE, ...) .variogram_bv(x, y, coords_df, scale = TRUE, map = FALSE, ...) .cross_variogram(x, y, coords_df, scale = TRUE, ...) .cross_variogram_map(x, y, coords_df, width, cutoff, scale = TRUE, ...) 
.variogram_map(x, coords_df, formula = x ~ 1, width, cutoff, scale = TRUE, ...)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute variograms — variogram-internal","text":"x numeric vector whose variogram computed. coords_df sf data frame geometry regressors variogram modeling. formula formula defining response vector (possible) regressors, case absence regressors, use x ~ 1. scale Logical, whether scale x. Defaults TRUE variogram easier interpret comparable features different magnitudes length scale spatial autocorrelation interest. ... arguments passed automap::autofitVariogram model variogram alpha anisotropy. Note gstat fit anisotropic models get warning specify alpha. Nevertheless, plotting empirical anisotropic variograms comparing variogram fitted entire dataset can useful EDA tool. y bivariate, another numeric vector whose variogram computed. map logical; TRUE, cutoff width given, variogram map returned. requires package sp. 
Alternatively, map can passed, class SpatialDataFrameGrid (see sp docs) width width subsequent distance intervals data point pairs grouped semivariance estimates cutoff spatial separation distance point pairs included semivariance estimates; default, length diagonal box spanning data divided three.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute variograms — variogram-internal","text":"autofitVariogram object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-131-05152023","dir":"Changelog","previous_headings":"","what":"Version 1.3.1 (05/15/2023)","title":"Version 1.3.1 (05/15/2023)","text":"Removed functions arguments deprecated 1.2.0","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-127-09192023","dir":"Changelog","previous_headings":"","what":"Version 1.2.7 (09/19/2023)","title":"Version 1.2.7 (09/19/2023)","text":"Polygon boundaries show despite linewidth = 0 Windows users. Set color = NA polygons linewidth = 0 default work Windows.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-126-09192023","dir":"Changelog","previous_headings":"","what":"Version 1.2.6 (09/19/2023)","title":"Version 1.2.6 (09/19/2023)","text":"Fixed bug plotColGraph one multiple samples plotted. Allow 16 bit images spatial plotting functions. Removed adespatial Suggests ’s used reference unit tests got removed CRAN.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-125-08182023","dir":"Changelog","previous_headings":"","what":"Version 1.2.5 (08/18/2023)","title":"Version 1.2.5 (08/18/2023)","text":"Use imgRaster getter rather S4 -@image get images plot, latter longer work SFE 1.2.3 wraps SpatRaster images saving RDS. 
Reading RDS won’t unwrap images need unwrapped ’re needed.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-124-07042023","dir":"Changelog","previous_headings":"","what":"Version 1.2.4 (07/04/2023)","title":"Version 1.2.4 (07/04/2023)","text":"Remove useNames = NA warning calling MULTISPATI; warning comes generic colVars. Use algebraic eigenvalues MULTISPATI either nfposi nfnega 0 Added bins_contour argument moranPlot change number bins cell density contours","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-123-05042023","dir":"Changelog","previous_headings":"","what":"Version 1.2.3 (05/04/2023)","title":"Version 1.2.3 (05/04/2023)","text":"Fix bug plotting feature illegal name alongside another feature legal name Make sure runBivariate calculateBivariate use gene symbols results even Ensembl IDs specified swap_rownames set Change secondary sequential palette light theme YlOrRd ’s distinguishable Blues primary palette low values","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-122-04262023","dir":"Changelog","previous_headings":"","what":"Version 1.2.2 (04/26/2023)","title":"Version 1.2.2 (04/26/2023)","text":"minor bugs: runBivariate gets correct feature names feature1 specified swap_rownames used show gene symbol Correct output cross variogram maps one pair genes Added default_attr localmoran_bv’s SFEMethod Don’t plot attribute localResult vector ’s default attr plotting multiple features, panels follow order features specified Allow illegal characters names colData reducedDims plots Plot one component spatialReducedDim components argument Deprecate plotColDataBin2D plotRowDataBin2D","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1112-04222023","dir":"Changelog","previous_headings":"","what":"Version 1.1.12 (04/22/2023)","title":"Version 1.1.12 (04/22/2023)","text":"Plot image behind 
geometries functions plot geometries Added dark theme support functions plot geometries","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1111-04052023","dir":"Changelog","previous_headings":"","what":"Version 1.1.11 (04/05/2023)","title":"Version 1.1.11 (04/05/2023)","text":"Added MULTISPATI PCA Added multivariate local Geary’s C Anselin 2019 Added calculateMultivariate unified user interface multivariate spatial analyses Variogram variogram map gstat related plotting functions Allow non-standard names local results plotLocalResult","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1110-03072023","dir":"Changelog","previous_headings":"","what":"Version 1.1.10 (03/07/2023)","title":"Version 1.1.10 (03/07/2023)","text":"Record parameters used get spatial results Force users use new name running method different parameters","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-119-02122023","dir":"Changelog","previous_headings":"","what":"Version 1.1.9 (02/12/2023)","title":"Version 1.1.9 (02/12/2023)","text":"Deprecated show_symbol argument, replacing swap_rownames consistent scater","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-117","dir":"Changelog","previous_headings":"","what":"Version 1.1.7","title":"Version 1.1.7","text":"Added bbox argument spatial plotting functions zoom bounding box","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1010-02232023","dir":"Changelog","previous_headings":"","what":"Version 1.0.10 (02/23/2023)","title":"Version 1.0.10 (02/23/2023)","text":"Added plotColDataFreqpoly y axis needs log transformed. 
doesn’t work stacked histograms using position = “identity” causes bars covered.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-109-02032023","dir":"Changelog","previous_headings":"","what":"Version 1.0.9 (02/03/2023)","title":"Version 1.0.9 (02/03/2023)","text":"Fixed bug hardcoded ncol plotDimLoadings.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-108-01262023","dir":"Changelog","previous_headings":"","what":"Version 1.0.8 (01/26/2023)","title":"Version 1.0.8 (01/26/2023)","text":"Flipped divergent palettes warm color means high value.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-107-01112023","dir":"Changelog","previous_headings":"","what":"Version 1.0.7 (01/11/2023)","title":"Version 1.0.7 (01/11/2023)","text":"Fixed bug assigning local results sample colData, colGeometry, annotGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-105-12022022","dir":"Changelog","previous_headings":"","what":"Version 1.0.5 (12/02/2022)","title":"Version 1.0.5 (12/02/2022)","text":"Removed aes_string(), deprecated. Fixed bug show_symbol = TRUE “symbol” column absent rowData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-100-11022022","dir":"Changelog","previous_headings":"","what":"Version 1.0.0 (11/02/2022)","title":"Version 1.0.0 (11/02/2022)","text":"First version Bioconductor Univariate local global spatial statistics based spdep Plotting functions: gene expression metadata space, results local spatial analyses, plot dimension reductions space, plot correlograms Monte Carlo simulation results","code":""}] +[{"path":"https://pachterlab.github.io/voyager/dev/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"Artistic License 2.0","title":"Artistic License 2.0","text":"Copyright (c) 2000-2006, Perl Foundation. 
Everyone permitted copy distribute verbatim copies license document, changing allowed. Preamble ******** license establishes terms given free software Package may copied, modified, distributed, /redistributed. intent Copyright Holder maintains artistic control development Package still keeping Package available open source free software. always permitted make arrangements wholly outside license directly Copyright Holder given Package. terms license permit full use propose make Package, contact Copyright Holder seek different licensing arrangement. Definitions *********** “Copyright Holder” means individual(s) organization(s) named copyright notice entire Package. “Contributor” means party contributed code material Package, accordance Copyright Holder’s procedures. “” “” means person like copy, distribute, modify Package. “Package” means collection files distributed Copyright Holder, derivatives collection /files. given Package may consist either Standard Version, Modified Version. “Distribute” means providing copy Package making accessible anyone else, case company organization, others outside company organization. “Distributor Fee” means fee charge Distributing Package providing support Package another party. mean licensing fees. “Standard Version” refers Package modified, modified ways explicitly requested Copyright Holder. “Modified Version” means Package, changed, changes explicitly requested Copyright Holder. “Original License” means Artistic License Distributed Standard Version Package, current version may modified Perl Foundation future. “Source” form means source code, documentation source, configuration files Package. “Compiled” form means compiled bytecode, object code, binary, form resulting mechanical transformation translation Source form. 
Permission Use Modification Without Distribution ******************************************************** permitted use Standard Version create use Modified Versions purpose without restriction, provided Distribute Modified Version. Permissions Redistribution Standard Version ****************************************************** may Distribute verbatim copies Source form Standard Version Package medium without restriction, either gratis Distributor Fee, provided duplicate original copyright notices associated disclaimers. discretion, verbatim copies may may include Compiled form Package. may apply bug fixes, portability changes, modifications made available Copyright Holder. resulting Package still considered Standard Version, subject Original License. Distribution Modified Versions Package Source ********************************************************** may Distribute Modified Version Source (either gratis Distributor Fee, without Compiled form Modified Version) provided clearly document differs Standard Version, including, limited , documenting non-standard features, executables, modules, provided least ONE following: make Modified Version available Copyright Holder Standard Version, Original License, Copyright Holder may include modifications Standard Version. ensure installation Modified Version prevent user installing running Standard Version. addition, Modified Version must bear name different name Standard Version. allow anyone receives copy Modified Version make Source form Modified Version available others Original License license permits licensee freely copy, modify redistribute Modified Version using licensing terms apply copy licensee received, requires Source form Modified Version, works derived , made freely available license fees prohibited Distributor Fees allowed. 
Distribution Compiled Forms Standard Version Modified ****************************************************************** Versions without Source *************************** may Distribute Compiled forms Standard Version without Source, provided include complete instructions get Source Standard Version. instructions must valid time distribution. instructions, time carrying distribution, become invalid, must provide new instructions demand cease distribution. provide valid instructions cease distribution within thirty days become aware instructions invalid, forfeit rights license. may Distribute Modified Version Compiled form without Source, provided comply Section 4 respect Source Modified Version. Aggregating Linking Package ********************************** may aggregate Package (either Standard Version Modified Version) packages Distribute resulting aggregation provided charge licensing fee Package. Distributor Fees permitted, licensing fees components aggregation permitted. terms license apply use Distribution Standard Modified Versions included aggregation. permitted link Modified Standard Versions works, embed Package larger work , build stand-alone binary bytecode versions applications include Package, Distribute result without restriction, provided result expose direct interface Package. Items Considered Part Modified Version ******************************************************** Works (including, limited , modules scripts) merely extend make use Package, , , cause Package Modified Version. addition, works considered parts Package , subject terms license. General Provisions ****************** use, modification, distribution Standard Modified Versions governed Artistic License. using, modifying distributing Package, accept license. use, modify, distribute Package, accept license. Modified Version derived Modified Version made someone , nevertheless required ensure Modified Version complies requirements license. 
license grant right use trademark, service mark, tradename, logo Copyright Holder. license includes non-exclusive, worldwide, free--charge patent license make, made, use, offer sell, sell, import otherwise transfer Package respect patent claims licensable Copyright Holder necessarily infringed Package. institute patent litigation (including cross-claim counterclaim) party alleging Package constitutes direct contributory patent infringement, Artistic License shall terminate date litigation filed. Disclaimer Warranty: PACKAGE PROVIDED COPYRIGHT HOLDER CONTRIBUTORS “’ WITHOUT EXPRESS IMPLIED WARRANTIES. IMPLIED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE, NON-INFRINGEMENT DISCLAIMED EXTENT PERMITTED LOCAL LAW. UNLESS REQUIRED LAW, COPYRIGHT HOLDER CONTRIBUTOR LIABLE DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL DAMAGES ARISING WAY USE PACKAGE, EVEN ADVISED POSSIBILITY DAMAGE.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xatac_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell ATAC Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xatac_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell ATAC Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. 
Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xcrispr_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell CRISPR Screening Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xcrispr_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell CRISPR Screening Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xmultiome_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell Multiome Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xmultiome_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell Multiome Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. 
Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"Pros: Applicable frozen samples Capture nascent transcripts Sensible tissues multiple nuclei one cell Cons: Lose spatial information expensive less flexible open source technologies","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/10xnuclei_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Nuclei Isolation Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Bivariate spatial statistics","text":"Consider two variables correlated, say Pearson correlation 0.8. observations spatially referenced. locations observations can permuted without affecting Pearson correlation. purpose bivariate spatial statistics indicate correlation value (Pearson correlation), spatial autocorrelation co-patterning. 
One bivariate methods implemented Voyager cross variogram, shown variogram vignette. vignette demonstrates bivariate spatial statistics, use spatial neighborhood graph, mouse skeletal muscle Visium dataset. load packages used: list bivariate global methods can seen : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods(). QC performed another vignette, vignette plot QC metrics. image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(pheatmap) library(scico) theme_set(theme_bw()) listSFEMethods(variate = \"bi\", scope = \"global\") #> name description #> 1 lee Lee's bivariate statistic #> 2 lee.mc Lee's bivariate static with permutation testing #> 3 lee.test Lee's L test #> 4 cross_variogram Cross variogram #> 5 cross_variogram_map Cross variogram map (sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\") sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] sfe_tissue <- logNormCounts(sfe_tissue) colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"lees-l","dir":"Articles","previous_headings":"","what":"Lee’s L","title":"Bivariate spatial statistics","text":"Lee’s L (Lee 2001) developed relating Moran’s I Pearson correlation, defined \\[ L_{X,Y} = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\left[ \\sum_{j=1}^n w_{ij} (x_j - \\bar{x}) \\right] \\left[ \\sum_{j=1}^n w_{ij} (y_j - \\bar{y}) \\right]}{\\sqrt{\\sum_{i=1}^n (x_i - \\bar{x})^2}\\sqrt{\\sum_{i=1}^n (y_i - \\bar{y})^2} }, \\] \\(n\\) number spots locations, \\(i\\) \\(j\\) different locations, spots Visium context, \\(x\\) \\(y\\) variables values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood. 
compute Lee’s L top highly variable genes (HVGs) dataset: bivariate global results can different formats (matrix Lee’s L lists many methods), results stored SFE object. gives spatially informed correlation matrix among genes, can plotted heatmap: coexpression blocks can seen. Note unlike Pearson correlation, diagonal 1, \\[ L_{X,X} = \\frac{\\sum_i (\\tilde x_i - \\bar x)^2}{\\sum_i (x_i - \\bar x)^2} = \\mathrm{SSS}_X, \\] approximated ratio variance spatially lagged \\(x\\) variance \\(x\\). spatial lag introduces smoothing, spatial lag reduced variance, making diagonal less 1. spatial smoothing scalar (SSS), Moran’s I approximately Pearson correlation \\(X\\) spatially lagged \\(X\\) (\\(\\tilde X\\)) multiplied SSS: \\[ I \\approx \\mathrm{SSS}_X \\cdot \\rho_{X, \\tilde X} \\] Similarly Lee’s L, shown (Lee 2001), \\[ L_{X, Y} = \\sqrt{\\mathrm{SSS}_X}\\sqrt{\\mathrm{SSS}_Y} \\cdot \\rho_{\\tilde X, \\tilde Y} \\] spatial clustering, variance less reduced spatial lag, leading larger SSS. Hence \\(X\\) \\(Y\\) spatially distributed like salt pepper strongly correlated, Lee’s L low lack spatial autocorrelation leads small SSS. Weighted correlation network analysis (WGCNA) (Langfelder Horvath 2008) time honored method find gene co-expression modules, can take correlation matrix. 
interesting apply WGCNA Lee’s L matrix identify spatially informed gene co-expression modules.","code":"hvgs <- getTopHVGs(sfe_tissue, fdr.threshold = 0.01) res <- calculateBivariate(sfe_tissue, type = \"lee\", feature1 = hvgs) pal_rng <- getDivergeRange(res) pal <- scico(256, begin = pal_rng[1], end = pal_rng[2], palette = \"vik\") pheatmap(res, color = pal, show_rownames = FALSE, show_colnames = FALSE, cellwidth = 1, cellheight = 1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"local-lee","dir":"Articles","previous_headings":"","what":"Local Lee","title":"Bivariate spatial statistics","text":"Local Lee’s L (Lee 2001) defined \\[ L_i = \\frac{n\\left[ \\sum_{j=1}^n w_{ij} (x_j - \\bar{x}) \\right] \\left[ \\sum_{j=1}^n w_{ij} (y_j - \\bar{y}) \\right]}{\\sqrt{\\sum_{i=1}^n (x_i - \\bar{x})^2}\\sqrt{\\sum_{i=1}^n (y_i - \\bar{y})^2} } \\] Compare global L previous section. Local L sum locations \\(i\\). contribution location global L can show spatial heterogeneity relationship two variables. bivariate local methods Voyager listed : compute local L two myofiber marker genes one gene highly expressed injury site: Bivariate local results stored localResults field feature names pairwise combinations features supplied. feature1 specified, bivariate method applied pairwise combinations feature1. Lee’s L, \\(L_{X,Y}\\) \\(L_{Y,X}\\) computed although . However, bivariate methods symmetric (see next section). next release (Bioconductor 3.18), may introduce another argument indicate whether method symmetric compute \\(L_{X,Y}\\) \\(L_{Y,X}\\). First plot three genes individually: plot local L’s: see regions Myh1 Myh2 co-expressed, myosins Ftl1 negatively correlated. 
\\(L_{X,X}\\) also computed, can plot local SSS three genes: See local SSS compares local Moran’s I: patterns qualitatively , local Moran’s I negative heterogeneous regions, SSS can’t negative.","code":"listSFEMethods(\"bi\", \"local\") #> name description #> 1 locallee Local Lee's bivariate statistic #> 2 localmoran_bv Local bivariate Moran's I sfe_tissue <- runBivariate(sfe_tissue, \"locallee\", swap_rownames = \"symbol\", feature1 = c(\"Myh2\", \"Myh1\", \"Ftl1\")) localResultFeatures(sfe_tissue, \"locallee\") #> [1] \"Myh2__Myh2\" \"Myh1__Myh2\" \"Ftl1__Myh2\" \"Myh2__Myh1\" \"Myh1__Myh1\" #> [6] \"Ftl1__Myh1\" \"Myh2__Ftl1\" \"Myh1__Ftl1\" \"Ftl1__Ftl1\" plotSpatialFeature(sfe_tissue, c(\"Myh2\", \"Myh1\", \"Ftl1\"), swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4) plotLocalResult(sfe_tissue, \"locallee\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\"), colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"locallee\", c(\"Myh1__Myh1\", \"Myh2__Myh2\", \"Ftl1__Ftl1\"), colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- runUnivariate(sfe_tissue, \"localmoran\", c(\"Myh2\", \"Myh1\", \"Ftl1\"), swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"localmoran\", c(\"Myh1\", \"Myh2\", \"Ftl1\"), colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"bivariate-local-moran","dir":"Articles","previous_headings":"","what":"Bivariate local Moran","title":"Bivariate spatial statistics","text":"spdep package implements bivariate version local Moran, basically \\[ I_{X_i,Y_i} = (n-1)\\frac{(x_i - \\bar{x})\\sum_{j=1}^n w_{ij} (y_j - \\bar{y})}{\\sqrt{\\sum_{i=1}^n (x_i - \\bar{x})^2} \\sqrt{\\sum_{i=1}^n (y_i - \\bar{y})^2}}. \\] Note symmetric, i.e. 
\\(I_{X_i,Y_i} \\neq I_{Y_i,X_i}\\). Permutation testing performed get pseudo p-value First plot bivariate local Moran’s I values first row plots XY second row plots YX; note similar, . bivariate local Moran mean? ’s kind like contribution location correlation \\(x\\) spatially lagged \\(y\\), \\(x\\) smoothed. contrast, Lee’s L scaled Pearson correlation spatially lagged \\(x\\) spatially lagged \\(y\\). permutation testing performed, can plot pseudo-p-value, correcting multiple testing based spatial neighborhood graph: Note p-values asymmetric, according source code localmoran_bv(), \\(y\\) permuted, \\(x\\). ’s also related Wartenberg’s spatial PCA (Wartenberg 1985), Moran’s I expressed matrix form: \\[ \\mathbf{I} = \\frac{\\mathbf{Z}^T\\mathbf{WZ}}{\\mathbf 1^T \\mathbf{W1}}, \\] \\(\\mathbf Z\\) data matrix scaled centered variables columns, \\(\\mathbf W\\) spatial weights matrix, \\(\\mathbf 1\\) vector 1’s, denominator effect \\(\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}\\). diagonal entries Moran’s I’s variables, diagonal entries global versions computed sum bivariate local Moran’s I’s divide sum spatial weights. \\(\\mathbf W\\) doesn’t symmetric, matrix may symmetric. Wartenberg diagonalized matrix place covariance matrix spatial PCA. using scaled centered data row normalized spatial weights matrix, MULTISPATI PCA equivalent Wartenberg’s approach (Dray, Saı̈d, Débias 2008). Lee considered asymmetry inadequacy Wartenberg’s approach bivariate association measure (Lee 2001). 
’m sure bivariate local Moran’s helps data analysis, interesting piece history.","code":"sfe_tissue <- runBivariate(sfe_tissue, \"localmoran_bv\", c(\"Myh1\", \"Myh2\", \"Ftl1\"), swap_rownames = \"symbol\", nsim = 1000) localResultFeatures(sfe_tissue, \"localmoran_bv\") #> [1] \"Myh1__Myh1\" \"Myh2__Myh1\" \"Ftl1__Myh1\" \"Myh1__Myh2\" \"Myh2__Myh2\" #> [6] \"Ftl1__Myh2\" \"Myh1__Ftl1\" \"Myh2__Ftl1\" \"Ftl1__Ftl1\" localResultAttrs(sfe_tissue, \"localmoran_bv\", \"Myh1__Myh2\") #> [1] \"Ibvi\" \"E.Ibvi\" \"Var.Ibvi\" #> [4] \"Z.Ibvi\" \"Pr(z != E(Ibvi))\" \"Pr(z != E(Ibvi)) Sim\" #> [7] \"Pr(folded) Sim\" \"-log10p Sim\" \"-log10p_adj Sim\" plotLocalResult(sfe_tissue, \"localmoran_bv\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\", \"Myh2__Myh1\", \"Ftl1__Myh2\", \"Ftl1__Myh1\"), colGeometryName = \"spotPoly\", attribute = \"Ibvi\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"localmoran_bv\", c(\"Myh1__Myh2\", \"Myh2__Ftl1\", \"Myh1__Ftl1\", \"Myh2__Myh1\", \"Ftl1__Myh2\", \"Ftl1__Myh1\"), colGeometryName = \"spotPoly\", attribute = \"-log10p_adj Sim\", image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/bivariate.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Bivariate spatial statistics","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics 
grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] scico_1.5.0 pheatmap_1.0.12 #> [3] scran_1.30.2 scater_1.30.1 #> [5] ggplot2_3.5.1 scuttle_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] SpatialFeatureExperiment_1.3.0 SFEData_1.4.0 #> [19] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 sf_1.0-16 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 httpuv_1.6.15 #> [17] sp_2.1-4 DBI_1.2.2 #> [19] RColorBrewer_1.1-3 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] dqrng_0.3.2 pkgdown_2.0.9 #> [33] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [35] DelayedArray_0.28.0 tidyselect_1.2.1 #> [37] farver_2.1.1 ScaledMatrix_1.10.0 #> [39] viridis_0.6.5 BiocFileCache_2.10.2 #> [41] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [43] e1071_1.7-14 systemfonts_1.0.6 #> [45] dbscan_1.1-12 tools_4.3.3 #> [47] ggnewscale_0.4.10 ragg_1.3.0 #> [49] Rcpp_1.0.12 glue_1.7.0 #> [51] gridExtra_2.3 SparseArray_1.2.4 #> [53] xfun_0.43 dplyr_1.1.4 #> [55] HDF5Array_1.30.1 withr_3.0.0 #> [57] BiocManager_1.30.22 fastmap_1.1.1 #> [59] boot_1.3-30 rhdf5filters_1.14.1 #> [61] bluster_1.12.0 fansi_1.0.6 #> [63] spData_2.3.0 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] RSQLite_2.3.6 utf8_1.2.4 #> [73] generics_0.1.3 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 spdep_1.3-3 #> [79] 
pkgconfig_2.0.3 gtable_0.3.5 #> [81] blob_1.2.4 XVector_0.42.0 #> [83] htmltools_0.5.8.1 scales_1.3.0 #> [85] png_0.1-8 SpatialExperiment_1.12.0 #> [87] knitr_1.45 rjson_0.2.21 #> [89] curl_5.2.1 proxy_0.4-27 #> [91] cachem_1.0.8 rhdf5_2.46.1 #> [93] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [95] parallel_4.3.3 vipor_0.4.7 #> [97] AnnotationDbi_1.64.1 desc_1.4.3 #> [99] s2_1.1.6 pillar_1.9.0 #> [101] grid_4.3.3 vctrs_0.6.5 #> [103] promises_1.3.0 BiocSingular_1.18.0 #> [105] dbplyr_2.5.0 beachmat_2.18.1 #> [107] xtable_1.8-4 cluster_2.1.6 #> [109] beeswarm_0.4.0 evaluate_0.23 #> [111] magick_2.8.3 cli_3.6.2 #> [113] locfit_1.5-9.9 compiler_4.3.3 #> [115] rlang_1.1.3 crayon_1.5.2 #> [117] labeling_0.4.3 classInt_0.4-10 #> [119] fs_1.6.4 ggbeeswarm_0.7.2 #> [121] viridisLite_0.4.2 deldir_2.0-4 #> [123] BiocParallel_1.36.0 munsell_0.5.1 #> [125] Biostrings_2.70.3 Matrix_1.6-5 #> [127] ExperimentHub_2.10.0 patchwork_1.2.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] interactiveDisplayBase_1.40.0 highr_0.10 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with Voyager","text":"Pros: Widely used, tested many cell types tissues Many existing datasets, including many 10X website High throughput, applied atlases millions cells Cons: Lose spatial information expensive open source technologies Less flexible tissues skeletal muscles challenging dissociate","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with 
Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. process output various transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/chromium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"10X Chromium Single Cell 3’ v3 Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"PhenoCycler Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution Formalin fixed, paraffin embedded (FFPE) tissue compatible Cons: Requires panels proteins usually dozens antibodies, standard highly multiplexed immunofluorescence. Akoya sells curated panels.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"PhenoCycler Processing Workflows with Voyager","text":"Several CODEX datasets generated HuBMAP Consortium available download data portal. Raw processed data typically available several fields view can readily combined single SpatialFeatureExperiment(SFE) object. 
tutorial processing output various spatial transcriptomics technologies SFE object use Voyager available vignette linked .","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/codex_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"PhenoCycler Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated CODEX technology.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"CosMX Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible histological staining including DAPI 100 proteins can quantified CosMX along side RNAs Cons: curated panel usually hundred genes required. However, Nanostring provides curated gene panels common applications oncology, neuroscience, immunology, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"CosMX Processing Workflows with Voyager","text":"Nanostring released CosMX FFPE dataset website. tutorial processing output various spatial transcriptomics technologies, including CosMX, SpatialFeatureExperiment(SFE) object use Voyager available . vignette provides technology specific notes data downloaded Nanostring. 
Nanostring provides cell segmentation data images, cell centroid coordinates provided metadata.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/cosmx_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"CosMX Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated CosMX SMI. publicly available CosMX dataset profiles 960 genes across 8 non-small-cell lung cancer (NSCLC) samples.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"visium-space-ranger-output","dir":"Articles","previous_headings":"","what":"Visium Space Ranger output","title":"Create a SpatialFeatureExperiment object","text":"10x Genomics Space Ranger output Visium experiment can read similar manner SpatialExperiment; SpatialFeatureExperiment SFE object spotPoly column geometry spot polygons. filtered matrix (i.e. spots tissue) read , column graph called visium also present spatial neighborhood graph Visium spots tissue. graph computed spots read regardless whether tissue. results tissue capture outs directory. Inside outs directory two directories: raw_feature_bc_matrix unfiltered gene count matrix, spatial spatial information. DropletUtils package function read10xCounts() reads gene count matrix. SPE reads spatial information, SFE uses spatial information construct Visium spot polygons spatial neighborhood graphs. Inside spatial directory: tissue_lowres_image.png low resolution image tissue. Inside scalefactors_json.json file: spot_diameter_fullres diameter Visium spot full resolution H&E image pixels. tissue_hires_scalef tissue_lowres_scalef ratio size high resolution (full resolution) low resolution H&E image full resolution image. fiducial_diameter_fullres diameter fiducial spot used align spots H&E image pixels full resolution image. 
tissue_positions_list.csv file contains information spatial coordinates spots whether spot tissue automatically detected Space Ranger manually annotated Loupe browser. polygon tissue boundary available, whether image processing manual annotation, geometric operations supported SFE package, based sf package, can used find spots intersect tissue spots contained tissue. Geometric operations can also find polygons intersections spots tissue, results can get messy since intersections can polygons also points lines. Now read toy data Space Ranger output format. Since Bioconductor version 3.17 (Voyager version 1.2.0), image read SpatRaster object terra package, loaded memory unless necessary. plotting large image, downsampled thus fully loaded memory. unit can set unit argument, can either pixels full resolution image microns. latter calculated former based spacing spots, known 100 microns. Space Ranger output includes gene count matrix, spot coordinates, spot diameter. Space Ranger output include nuclei segmentation pathologist annotation histological regions. 
Extra image processing, ImageJ QuPath, required geometries.","code":"# Example from SpatialExperiment dir <- system.file( file.path(\"extdata\", \"10xVisium\"), package = \"SpatialExperiment\") sample_ids <- c(\"section1\", \"section2\") (samples <- file.path(dir, sample_ids, \"outs\")) #> [1] \"/Users/runner/work/_temp/Library/SpatialExperiment/extdata/10xVisium/section1/outs\" #> [2] \"/Users/runner/work/_temp/Library/SpatialExperiment/extdata/10xVisium/section2/outs\" list.files(samples[1]) #> [1] \"raw_feature_bc_matrix\" \"spatial\" list.files(file.path(samples[1], \"spatial\")) #> [1] \"scalefactors_json.json\" \"tissue_lowres_image.png\" #> [3] \"tissue_positions_list.csv\" fromJSON(file = file.path(samples[1], \"spatial\", \"scalefactors_json.json\")) #> $spot_diameter_fullres #> [1] 89.44476 #> #> $tissue_hires_scalef #> [1] 0.1701114 #> #> $fiducial_diameter_fullres #> [1] 144.4877 #> #> $tissue_lowres_scalef #> [1] 0.05103343 (sfe3 <- read10xVisiumSFE(samples, dirs = samples, sample_id = sample_ids, type = \"sparse\", data = \"raw\", images = \"lowres\", unit = \"full_res_image_pixel\")) #> class: SpatialFeatureExperiment #> dim: 50 99 #> metadata(0): #> assays(1): counts #> rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ... #> ENSMUSG00000005886 ENSMUSG00000101476 #> rowData names(1): symbol #> colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... 
#> AAAGTCGACCCTCAGT-1-1 AAAGTGCCATCAATTA-1-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> section1: #> section2:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"vizgen-merfish-output","dir":"Articles","previous_headings":"Visium Space Ranger output","what":"Vizgen MERFISH output","title":"Create a SpatialFeatureExperiment object","text":"commercialized MERFISH Vizgen standard output format, can read SFE readVizgen(). cell segmentation field view (FOV) separate HDF5 file MERFISH dataset can hundreds FOVs, strongly recommend reading MERFISH output server large number CPU cores. Alternatively, MERFISH datasets store cell segmentation parquet file, can easily read R. read toy dataset first FOV real dataset: unit always microns.","code":"dir_use <- system.file(file.path(\"extdata\", \"vizgen\"), package = \"SpatialFeatureExperiment\") (sfe_mer <- readVizgen(dir_use, z = 0L, image = \"PolyT\", use_cellpose = FALSE)) #> class: SpatialFeatureExperiment #> dim: 385 100 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(0): #> colnames(100): 103327291694389284070574461648020091166 #> 105028411815552368766949841604861213395 ... #> 99103300832376657987379734140330816574 #> 99471994882184799235845481075474519252 #> colData names(7): fov volume ... 
max_y sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: micron #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"create-sfe-object-from-scratch","dir":"Articles","previous_headings":"","what":"Create SFE object from scratch","title":"Create a SpatialFeatureExperiment object","text":"SFE object can constructed scratch assay matrices metadata. toy example, dgCMatrix used, since SFE inherits SingleCellExperiment (SCE), types arrays supported SCE delayed arrays also work. sufficient create SPE object, SFE object, even though sf data frame constructed geometries. constructor behaves similarly SPE constructor. centroid coordinates Visium spots example can converted spot polygons spotDiameter argument, can also relevant technologies round spots beads, Slide-seq. Spot diameter pixels full resolution images can found scalefactors_json.json file Space Ranger output. geometries spatial graphs can added calling constructor. 
Geometries can also supplied constructor.","code":"# Visium barcode location from Space Ranger data(\"visium_row_col\") coords1 <- visium_row_col[visium_row_col$col < 6 & visium_row_col$row < 6,] coords1$row <- coords1$row * sqrt(3) # Random toy sparse matrix set.seed(29) col_inds <- sample(1:13, 13) row_inds <- sample(1:5, 13, replace = TRUE) values <- sample(1:5, 13, replace = TRUE) mat <- sparseMatrix(i = row_inds, j = col_inds, x = values) colnames(mat) <- coords1$barcode rownames(mat) <- sample(LETTERS, 5) sfe3 <- SpatialFeatureExperiment(list(counts = mat), colData = coords1, spatialCoordsNames = c(\"col\", \"row\"), spotDiameter = 0.7) # Convert regular data frame with coordinates to sf data frame cg <- df2sf(coords1[,c(\"col\", \"row\")], c(\"col\", \"row\"), spotDiameter = 0.7) rownames(cg) <- colnames(mat) sfe3 <- SpatialFeatureExperiment(list(counts = mat), colGeometries = list(foo = cg))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"technology-specific-notes","dir":"Articles","previous_headings":"Create SFE object from scratch","what":"Technology specific notes","title":"Create a SpatialFeatureExperiment object","text":"commercial technologies function directly read outputs. may implement functions next version SpatialFeatureExperiment. now show example code read output CosMX Xenium.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"gene-count-matrix-and-cell-metadata","dir":"Articles","previous_headings":"Create SFE object from scratch > Technology specific notes","what":"Gene count matrix and cell metadata","title":"Create a SpatialFeatureExperiment object","text":"gene count matrix cell metadata (including cell centroid coordinates) example datasets technologies CosMX Vizgen CSV files. recommend vroom package quickly read large CSV files. CSV files read data frames. gene count matrix, can converted matrix sparse dgCMatrix. 
matrix may need transposed genes rows cells columns. smFISH based data tend less sparse scRNA-seq data, using sparse matrix worthwhile since matrix still 50% zero. 10x Genomics’ new single cell resolution technology Xenium, gene count matrix h5 file, can read R SCE object DropletUtils::read10xCounts(). can converted SpatialExperiment, SpatialFeatureExperiment. gene count matrix DelayedArray, data loaded memory operations matrix performed chunks. DelayedArray converted dgCMatrix memory. cell metadata available CSV format, ’s also parquet format compact disk, can read R data frame arrow package. Example code:","code":"# Download data from https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast system(\"curl -O https://cf.10xgenomics.com/samples/xenium/1.0.1/Xenium_FFPE_Human_Breast_Cancer_Rep1/Xenium_FFPE_Human_Breast_Cancer_Rep1_outs.zip\") system(\"unzip Xenium_FFPE_Human_Breast_Cancer_Rep1_outs.zip\") system(\"mv outs outs_R1\") system(\"curl -O https://cf.10xgenomics.com/samples/xenium/1.0.1/Xenium_FFPE_Human_Breast_Cancer_Rep2/Xenium_FFPE_Human_Breast_Cancer_Rep2_outs.zip\") system(\"unzip Xenium_FFPE_Human_Breast_Cancer_Rep2_outs.zip\") system(\"mv outs outs_R2\") library(SpatialExperiment) library(DropletUtils) #library(arrow) sce <- read10xCounts(\"outs_R1/cell_feature_matrix.h5\") cell_info <- read_parquet(\"outs_R1/cells.parquet\") # Add the centroid coordinates to colData colData(sce) <- cbind(colData(sce), cell_info[,-1]) spe <- toSpatialExperiment(sce, spatialCoordsNames = c(\"x_centroid\", \"y_centroid\")) sfe <- toSpatialFeatureExperiment(spe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"cell-polygons","dir":"Articles","previous_headings":"Create SFE object from scratch > Technology specific notes","what":"Cell polygons","title":"Create a SpatialFeatureExperiment object","text":"File format cell polygons (available) different formats different technology. 
cell polygons sf data frames put colGeometries() SFE object. section explains number smFISH-based technologies. Xenium, cell polygons come CSV parquet files can directly read R data frame, 2 columns x y coordinates, one indicating cell coordinates belong . Change name cell ID column “ID”, use SpatialFeatureExperiment::df2sf() convert data frame sf data frame POLYGON geometry. Example code: CosMX, cell polygons CSV files. Besides two coordinates columns, ’s column field view (FOV) another cell ID. However, unlike Xenium, cell IDs unique FOV, concatenated FOV make unique. df2sf() can also used convert regular data frame sf. Example code: See code used construct example datasets SFEData examples. Use sf::st_is_valid() check polygons valid. Polygons self-intersection valid, throw error geometric operations. common reason polygons invalid protruding line, can eliminated sf::st_buffer(cell_sf, dist = 0). Use sf::st_is_valid(cell_sf, reason = TRUE), plot invalid polygons, find polygons valid.","code":"#library(arrow) cell_poly <- read_parquet(\"outs_R2/cell_boundaries.parquet\") # Here the first column is cell ID names(cell_poly)[1] <- \"ID\" # \"vertex_x\" and \"vertex_y\" are the column names for coordinates here cell_sf <- df2sf(cell_poly, c(\"vertex_x\", \"vertex_y\"), geometryType = \"POLYGON\") # Download data from https://nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/ # stored here: https://www.dropbox.com/s/hl3peavrx92bluy/Lung5_Rep1-polygons.csv?dl=0 system(\"wget https://www.dropbox.com/s/hl3peavrx92bluy/Lung5_Rep1-polygons.csv?dl=1\") system(\"mv Lung5_Rep1-polygons.csv?dl=1 Lung5_Rep1-polygons.csv\") library(vroom) library(tidyr) cell_poly <- vroom(\"Lung5_Rep1-polygons.csv\") cell_poly <- cell_poly |> unite(\"ID\", fov:cellID) cell_sf <- df2sf(cell_poly, spatialCoordsNames = c(\"x_global_px\", \"y_global_px\"), geometryType = 
\"POLYGON\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Create a SpatialFeatureExperiment object","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages: #> [1] Matrix_1.6-5 rjson_0.2.21 #> [3] SpatialFeatureExperiment_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] DBI_1.2.2 bitops_1.0-7 #> [3] deldir_2.0-4 s2_1.1.6 #> [5] rlang_1.1.3 magrittr_2.0.3 #> [7] matrixStats_1.3.0 e1071_1.7-14 #> [9] compiler_4.3.3 DelayedMatrixStats_1.24.0 #> [11] systemfonts_1.0.6 vctrs_0.6.5 #> [13] pkgconfig_2.0.3 SpatialExperiment_1.12.0 #> [15] wk_0.9.1 crayon_1.5.2 #> [17] fastmap_1.1.1 magick_2.8.3 #> [19] XVector_0.42.0 scuttle_1.12.0 #> [21] utf8_1.2.4 rmarkdown_2.26 #> [23] tzdb_0.4.0 ragg_1.3.0 #> [25] bit_4.0.5 purrr_1.0.2 #> [27] xfun_0.43 bluster_1.12.0 #> [29] beachmat_2.18.1 zlibbioc_1.48.2 #> [31] cachem_1.0.8 GenomeInfoDb_1.38.8 #> [33] jsonlite_1.8.8 rhdf5filters_1.14.1 #> [35] DelayedArray_0.28.0 scico_1.5.0 #> [37] Rhdf5lib_1.24.2 BiocParallel_1.36.0 #> [39] terra_1.7-71 parallel_4.3.3 #> [41] cluster_2.1.6 R6_2.5.1 #> [43] bslib_0.7.0 limma_3.58.1 #> [45] boot_1.3-30 GenomicRanges_1.54.1 #> [47] jquerylib_0.1.4 Rcpp_1.0.12 #> [49] SummarizedExperiment_1.32.0 knitr_1.45 #> [51] R.utils_2.12.3 IRanges_2.36.0 #> [53] igraph_2.0.3 
tidyselect_1.2.1 #> [55] abind_1.4-5 yaml_2.3.8 #> [57] codetools_0.2-20 lattice_0.22-6 #> [59] tibble_3.2.1 Biobase_2.62.0 #> [61] evaluate_0.23 desc_1.4.3 #> [63] sf_1.0-16 units_0.8-5 #> [65] spData_2.3.0 proxy_0.4-27 #> [67] pillar_1.9.0 MatrixGenerics_1.14.0 #> [69] KernSmooth_2.23-22 stats4_4.3.3 #> [71] generics_0.1.3 vroom_1.6.5 #> [73] sp_2.1-4 RCurl_1.98-1.14 #> [75] S4Vectors_0.40.2 ggplot2_3.5.1 #> [77] sparseMatrixStats_1.14.0 munsell_0.5.1 #> [79] scales_1.3.0 class_7.3-22 #> [81] glue_1.7.0 tools_4.3.3 #> [83] ggnewscale_0.4.10 BiocNeighbors_1.20.2 #> [85] RSpectra_0.16-1 locfit_1.5-9.9 #> [87] fs_1.6.4 rhdf5_2.46.1 #> [89] grid_4.3.3 spdep_1.3-3 #> [91] DropletUtils_1.22.0 edgeR_4.0.16 #> [93] colorspace_2.1-0 SingleCellExperiment_1.24.0 #> [95] patchwork_1.2.0 GenomeInfoDbData_1.2.11 #> [97] HDF5Array_1.30.1 cli_3.6.2 #> [99] textshaping_0.3.7 fansi_1.0.6 #> [101] S4Arrays_1.2.1 dplyr_1.1.4 #> [103] gtable_0.3.5 R.methodsS3_1.8.2 #> [105] sass_0.4.9 digest_0.6.35 #> [107] BiocGenerics_0.48.1 classInt_0.4-10 #> [109] dqrng_0.3.2 SparseArray_1.2.4 #> [111] htmlwidgets_1.6.4 R.oo_1.26.0 #> [113] memoise_2.0.1 htmltools_0.5.8.1 #> [115] pkgdown_2.0.9 lifecycle_1.0.4 #> [117] statmod_1.5.0 bit64_4.0.5"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"downloading-the-data","dir":"Articles","previous_headings":"","what":"Downloading the data","title":"How to create a SpatialFeatureExperiment object","text":"data used recent publication, High Resolution Slide-seqV2 Spatial Transcriptomics Enables Discovery Disease-Specific Cell Neighborhoods Pathways available download GEO (Accession Number: GSE190094). demonstrate use ffq access FTP links downloading relevant data. download data single WT sample. commented line shows install ffq R terminal. output command metadata GSM5713341. can use curl wget download files FTP links one-by-one. Files beginning ftp:// can read directly R package vroom. Files uncompressed reading. 
files automatically downloaded uncompressed. use method , commented lines show download files using curl.","code":"# system(\"pip install ffq\") system(\"ffq -l1 GSM5713341\") # system(\"curl -O ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_MappedDGEForR.csv.gz\") # system(\"curl -O ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_BeadLocationsForR.csv.gz\") # list.files(pattern = \"*.gz\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"reading-in-the-data","dir":"Articles","previous_headings":"","what":"Reading in the data","title":"How to create a SpatialFeatureExperiment object","text":"","code":"mtx <- vroom(\"ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_MappedDGEForR.csv.gz\") centroids <- vroom(\"ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM5713nnn/GSM5713341/suppl/GSM5713341_Puck_191112_04_BeadLocationsForR.csv.gz\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"construct-a-sfe-object","dir":"Articles","previous_headings":"","what":"Construct a SFE object","title":"How to create a SpatialFeatureExperiment object","text":"count matrix bead locations provided authors. pass constructor SpatialFeatureExperiment object. files read data frames. convert gene count matrix matrix sparse dgCMatrix. , spot locations provided CSV file. two columns particular interest, namely xcoord ycoord. barcode column corresponds barcodes count matrix. calling SpatialFeatureExperiment constructor, spatial coordinates must converted sf data frame using df2sf(). coordinates centroid positions, indicate geometryType=\"POINT\". Now ingredients create SFE object. 
values assays colGeometries arguments must passed list shown .","code":"# Note: if using Google Colab, this step might run out of RAM # If this happens, please upgrade to Colab Pro rn <- mtx$Row mtx <- as.matrix(mtx[,-1]) rownames(mtx) <- rn mtx <- as(mtx, \"dgCMatrix\") colnames(centroids)[1] <- \"ID\" centroids <- df2sf( centroids, geometryType = \"POINT\", spatialCoordsNames=c(\"xcoord\",\"ycoord\")) sfe <- SpatialFeatureExperiment( assays = list(counts = mtx), colGeometries = list(centroids = centroids) ) sfe"},{"path":"https://pachterlab.github.io/voyager/dev/articles/create_sfe_v2.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"How to create a SpatialFeatureExperiment object","text":"","code":"sessionInfo()"},{"path":[]},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Multivariate local Geary's C","text":"Local Geary’s C (Anselin 1995) defined : \\[ c_i = \\sum_jw_{ij}(x_i - x_j)^2, \\] \\(w_{ij}\\)s spatial weights location \\(\\) location \\(j\\) \\(x\\) variable spatial location. generalized multiple variables (Anselin 2019): \\[ c_{k,} = \\sum_{v=1}^k c_{v,}, \\] \\(k\\) variables. essentially spatially weighted sum squared distances locations feature space. vignette demonstrates usage multivariate local Geary’s C. load packages used: QC performed another vignette, vignette plot QC metrics. 
image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(spdep) theme_set(theme_bw()) (sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\") sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] sfe_tissue <- logNormCounts(sfe_tissue) colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"gene-expression","dir":"Articles","previous_headings":"","what":"Gene expression","title":"Multivariate local Geary's 
C","text":"compute multivariate local C top highly variable genes (HVGs) dataset: results stored reducedDim although ’s really dimension reduction. can also go colData dest = \"colData\". test two sided, alternative argument can set “greater” test positive spatial autocorrelation “less” negative spatial autocorrelation. Geary’s C, value 1 indicates positive spatial autocorrelation value 1 indicates negative spatial autocorrelation. Local Geary’s C scaled, square difference expression, low value means homogeneous neighborhood high value means heterogeneous neighborhood. considering 341 top HVGs, muscle tendon junction injury site heterogeneous, detected negative cluster. Permutation testing performed, although Anselin noted pseudo-p-values taken indicative interesting regions interpreted strict sense. Warm colors indicate adjusted p < 0.05. interpreted along clusters. dataset, interestingly homogeneous regions myofibers, interestingly heterogeneous region injury site. significant regions positive cluster, center injury site significant negative cluster.","code":"hvgs <- getTopHVGs(sfe_tissue, fdr.threshold = 0.01) sfe_tissue <- runMultivariate(sfe_tissue, \"localC_perm_multi\", subset_row = hvgs) names(reducedDim(sfe_tissue, \"localC_perm_multi\")) #> [1] \"localC_perm_multi\" \"E.Ci\" \"Var.Ci\" #> [4] \"Z.Ci\" \"Pr(z != E(Ci))\" \"Pr(z != E(Ci)) Sim\" #> [7] \"Pr(folded) Sim\" \"Skewness\" \"Kurtosis\" #> [10] \"-log10p Sim\" \"-log10p_adj Sim\" \"cluster\" spatialReducedDim(sfe_tissue, \"localC_perm_multi\", c(1, 12), image_id = \"lowres\", maxcell = 5e4) spatialReducedDim(sfe_tissue, \"localC_perm_multi\", c(11, 12), image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"top-principal-components","dir":"Articles","previous_headings":"","what":"Top principal components","title":"Multivariate local Geary's C","text":"multivariate local Geary’s C 
spatially weighted sum squared distances locations feature space, ’s affected curse dimensionality used large number features, uniformly distributed data points higher dimensions become equidistant increasing number dimensions. However, real data uniformly distributed can much smaller effective dimension number features, many genes co-regulated. Anselin suggested using main principal components, issue curse dimensionality remains investigated. Furthermore, cosine Manhattan distances suggested mitigate curse dimensionality, wonder use instead Euclidean distance feature space multivariate local Geary’s C. perform multivariate local Geary’s C top PCs: percentage variance explained top 20 PCs? area seem significant permutation test larger HVGs, area considered negative clusters smaller. significant regions pretty much positive cluster. differences results anything curse dimensionality? Twenty dimensions can still exhibit curse dimensionality, 300 HVGs worse. lose lot information, including negative spatial autocorrelation, using 20 PCs?","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 20, scale = TRUE) ElbowPlot(sfe_tissue) sum(attr(reducedDim(sfe_tissue, \"PCA\"), \"percentVar\")) #> [1] 38.8627 out <- localC_perm(reducedDim(sfe_tissue, \"PCA\"), listw = colGraph(sfe_tissue, \"visium\")) out <- Voyager:::.localCpermmulti2df(out, nb = colGraph(sfe_tissue, \"visium\")$neighbours, p.adjust.method = \"BH\") reducedDim(sfe_tissue, \"localC_PCs\", withDimnames = FALSE) <- out spatialReducedDim(sfe_tissue, \"localC_PCs\", c(1, 12), image_id = \"lowres\", maxcell = 5e4) spatialReducedDim(sfe_tissue, \"localC_PCs\", c(11, 12), image_id = \"lowres\", maxcell = 5e4, divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/localc.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Multivariate local Geary's C","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> 
Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] spdep_1.3-3 sf_1.0-16 #> [3] spData_2.3.0 scran_1.30.2 #> [5] scater_1.30.1 ggplot2_3.5.1 #> [7] scuttle_1.12.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [19] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] DBI_1.2.2 RColorBrewer_1.1-3 #> [19] abind_1.4-5 zlibbioc_1.48.2 #> [21] purrr_1.0.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 dbscan_1.1-12 #> [45] tools_4.3.3 ggnewscale_0.4.10 #> [47] ragg_1.3.0 Rcpp_1.0.12 #> [49] glue_1.7.0 
gridExtra_2.3 #> [51] SparseArray_1.2.4 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 bluster_1.12.0 #> [61] fansi_1.0.6 digest_0.6.35 #> [63] rsvd_1.0.5 R6_2.5.1 #> [65] mime_0.12 textshaping_0.3.7 #> [67] colorspace_2.1-0 wk_0.9.1 #> [69] RSQLite_2.3.6 utf8_1.2.4 #> [71] generics_0.1.3 class_7.3-22 #> [73] httr_1.4.7 htmlwidgets_1.6.4 #> [75] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [77] scico_1.5.0 gtable_0.3.5 #> [79] blob_1.2.4 XVector_0.42.0 #> [81] htmltools_0.5.8.1 scales_1.3.0 #> [83] png_0.1-8 SpatialExperiment_1.12.0 #> [85] knitr_1.45 rjson_0.2.21 #> [87] curl_5.2.1 proxy_0.4-27 #> [89] cachem_1.0.8 rhdf5_2.46.1 #> [91] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [93] parallel_4.3.3 vipor_0.4.7 #> [95] AnnotationDbi_1.64.1 desc_1.4.3 #> [97] s2_1.1.6 pillar_1.9.0 #> [99] grid_4.3.3 vctrs_0.6.5 #> [101] promises_1.3.0 BiocSingular_1.18.0 #> [103] dbplyr_2.5.0 beachmat_2.18.1 #> [105] xtable_1.8-4 cluster_2.1.6 #> [107] beeswarm_0.4.0 evaluate_0.23 #> [109] magick_2.8.3 cli_3.6.2 #> [111] locfit_1.5-9.9 compiler_4.3.3 #> [113] rlang_1.1.3 crayon_1.5.2 #> [115] labeling_0.4.3 classInt_0.4-10 #> [117] fs_1.6.4 ggbeeswarm_0.7.2 #> [119] viridisLite_0.4.2 deldir_2.0-4 #> [121] BiocParallel_1.36.0 munsell_0.5.1 #> [123] Biostrings_2.70.3 Matrix_1.6-5 #> [125] ExperimentHub_2.10.0 patchwork_1.2.0 #> [127] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [129] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [131] statmod_1.5.0 shiny_1.8.1.1 #> [133] interactiveDisplayBase_1.40.0 highr_0.10 #> [135] AnnotationHub_3.10.1 igraph_2.0.3 #> [137] memoise_2.0.1 bslib_0.7.0 #> [139] bit_4.0.5"},{"path":[]},{"path":[]},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"MERFISH Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell 
resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible histological staining including DAPI Protein co-detection supported Cons: curated panel usually hundred genes required. However, Vizgen provides curated gene panels neuroscience oncology, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"MERFISH Processing Workflows with Voyager","text":"Several MERFISH datasets generated MERSCOPE Platform publicly available Vizgen website. provide examples available processing output various spatial transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager vignette . vignette provides technology specific notes data downloaded Vizgen. Briefly, Vizgen provides cell metadata gene count matrix CSV files can read quickly vroom package. Cell segmentation data provided HDF5 files delineated field view.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/merfish_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"MERFISH Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated MERFISH technology. publicly available MERFISH datasets profile hundreds genes hundreds thousands millions cells. 
Thus, vignettes linked can provide context capabilities Voyager moderate large datasets.","code":""},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"Due large number genes quantified single cell spatial transcriptomics, dimension reduction part standard workflow analyze data, visualize, help interpreting data, distill relevant information reduce noise, facilitate downstream analyses clustering pseudotime, project different samples shared latent space data integration, . first dimension reduction methods learn , good old principal component analysis (PCA), tSNE, UMAP, don’t use spatial information. rise spatial transcriptomics, dimension reduction methods take spatial dependence account written. , SpatialPCA (Shang Zhou 2022), NSF (Townes Engelhardt 2023), MEFISTO (Velten et al. 2022) use factor analysis probabilistic PCA related factor analysis, model factors Gaussian processes, spatial kernel covariance matrix, factors positive spatial autocorrelation can used downstream clustering clusters can spatially coherent. use graph convolution networks spatial neighborhood graph find spatially informed embeddings cells, conST (Zong et al. 2022) SpaceFlow (Ren et al. 2022). SpaSRL (Zhang et al. 2023) finds low dimension projection spatial neighborhood augmented data. Spatially informed dimension reduction actually new, dates back least 1985, Wartenberg’s crossover Moran’s PCA (Wartenberg 1985), generalized developed MULTISPATI PCA (Dray, Saı̈d, Débias 2008), implemented adespatial package CRAN. short, PCA tries maximize variance explained PC, MULTISPATI maximizes product Moran’s variance explained. 
Also, eigenvalues PCA non-negative, covariance matrix positive semidefinite, MULTISPATI can give negative eigenvalues, represent negative spatial autocorrelation, can present interesting common positive spatial autocorrelation often masked latter (Griffith 2019). single cell -omics conventions, let \\(X\\) denote gene count matrix whose columns cells Visium spots whose rows genes, \\(n\\) columns. Let \\(W\\) denote row normalized \\(n\\times n\\) adjacency matrix spatial neighborhood graph cells Visium spots, symmetric. MULTISPATI diagonalizes symmetric matrix \\[ H = \\frac 1 {2n} X(W^t+W)X^t \\] However, implementation adespatial general can used multivariate analyses duality diagram paradigm, correspondence analysis; equation simplified just PCA, without introduce duality diagram . Voyager 1.2.0 (Bioconductor 3.17) much faster implementation MULTISPATI PCA based RSpectra. See benchmark . vignette, perform MULTISPATI PCA MERFISH mouse liver dataset. See first vignette using dataset . load packages used: MULTISPATI PCA one multivariate methods introduced Voyager 1.2.0. multivariate methods Voyager listed : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods().","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(scuttle) library(ggplot2) library(stringr) library(tidyr) library(tibble) library(bluster) library(BiocSingular) library(BiocParallel) library(sf) library(patchwork) theme_set(theme_bw()) (sfe <- VizgenLiverData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 385 395215 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... 
Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(395215): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(9): fov volume ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: listSFEMethods(variate = \"multi\") #> name description #> 1 multispati MULTISPATI PCA #> 2 localC_multi Multivariate local Geary's C #> 3 localC_perm_multi Multivariate local Geary's C permutation testing"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"QC already performed first vignette. QC , see first vignette details. Remove outliers empty cells: still 390,000 cells left removing outliers. Next compute Moran’s QC metrics, requires spatial neighborhood graph: Moran’s little negative, permutation testing, significant, though can also large number cells. lower bound Moran’s given spatial neighborhood graph usually closer -0.5 -1, upper bound usually around 1. bounds given specific spatial neighborhood graph can found moranBounds(), double centers adjacency matrix, hence making dense, isn’t enough memory use entire dataset. can look Moran bounds small subset data, might generalizable whole dataset given tissue appears quite homogeneous space. considering bounds, Moran’s values QC metrics like whose magnitudes seem substantial nCounts nGenes ’s positive spatial autocorrelation. 
may mild moderate negative spatial autocorrelation.","code":"is_blank <- str_detect(rownames(sfe), \"^Blank-\") sfe <- addPerCellQCMetrics(sfe, subset = list(blank = is_blank)) get_neg_ctrl_outliers <- function(col, sfe, nmads = 3, log = FALSE) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\", nmads = nmads, log = log) outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } sfe <- get_neg_ctrl_outliers(\"subsets_blank_percent\", sfe, log = TRUE) inds <- !sfe$is_blank_outlier & sfe$nCounts > 0 (sfe <- sfe[, inds]) #> class: SpatialFeatureExperiment #> dim: 385 390348 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(390348): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(16): fov volume ... 
total is_blank_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 35.006 0.249 35.303 features_use <- c(\"nCounts\", \"nGenes\", \"volume\") sfe <- colDataUnivariate(sfe, \"moran.mc\", features_use, colGraphName = \"knn5\", nsim = 49, BPPARAM = MulticoreParam(2)) plotMoranMC(sfe, features_use) bbox_use <- c(xmin = 6000, xmax = 7000, ymin = 4000, ymax = 5000) inds2 <- st_intersects(cellSeg(sfe), st_as_sfc(st_bbox(bbox_use)), sparse = FALSE)[,1] sfe_sub <- sfe[, inds2] (mb <- moranBounds(colGraph(sfe_sub, \"knn5\"))) #> Imin Imax #> -0.6079436 1.0608389 setNames(colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"volume\"), \"moran.mc_statistic_sample01\"] / mb[\"Imin\"], features_use) #> nCounts nGenes volume #> 0.17839356 0.15168017 0.03211427 # Normalize data sfe <- logNormCounts(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"hepatic-zonation","dir":"Articles","previous_headings":"","what":"Hepatic zonation","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"dataset comes relatively large piece tissue need zoom smaller region better see local structures. specify bounding box. portal triad shown near top right bounding box. two large vessels left bottom right central veins. portal triad consists hepatic artery, portal vein brings blood intestine, bile duct, ’s oxygenated. regions around central vein deoxygenated. different oxygen nutrient contents mean hepatocytes play different metabolic roles zones portal triad central vein. plot zonation marker genes (Halpern et al. 2017). 3 marker genes present dataset. 
first two pericentral (near central vein), last one periportal (near portal triad). Besides hepatocytes, liver also many endothelial cells Kupffer cells (macrophages). Marker genes cells (Bonnardel et al. 2019) plotted visualize cell types space: one Kupffer cell markers available dataset. Expression gene seem spatially coherent. 3 endothelial cell marker genes available dataset. Wnt2 seems pericentral, Ltbp4 Efnb2 seem periportal. marker genes show top PC loadings non-spatial spatial PCA.","code":"bbox_use <- c(xmin = 6100, xmax = 7100, ymin = 7500, ymax = 8500) markers <- c(\"Axin2\", \"Cyp1a2\", \"Gstm3\", \"Psmd4\", # Pericentral \"Cyp2e1\", \"Asl\", \"Alb\", \"Ass1\", # Monotonic but has intermediate \"Hamp\", \"Igfbp2\", \"Cyp8b1\", \"Mup3\", # Non-monotonic \"Arg1\", \"Pck1\", \"C2\", \"Sdhd\") # Periportal (inds <- which(markers %in% rownames(sfe))) #> [1] 1 2 14 plotSpatialFeature(sfe, markers[inds], colGeometryName = \"cellSeg\", ncol = 3, bbox = bbox_use) # Kupffer cells kc_genes <- c(\"Timd4\", \"Vsig4\", \"Clec4f\", \"Clec1b\", \"Il18bp\", \"C6\", \"Irf7\", \"Slc40a1\", \"Cdh5\", \"Nr1h3\", \"Dmpk\", \"Paqr9\", \"Pcolce2\", \"Kcna2\", \"Gbp8\", \"Iigp1\", \"Helz2\", \"Cd207\", \"Icos\", \"Adcy4\", \"Slc1a2\", \"Rsad2\", \"Slc16a9\", \"Cd209f\", \"Oasl1\", \"Fam167a\") which(kc_genes %in% rownames(sfe)) #> [1] 9 plotSpatialFeature(sfe, kc_genes[9], colGeometryName = \"cellSeg\", bbox = bbox_use) # Endothelial cells lec_genes <- c(\"Rspo3\", \"Wnt2\", \"Wnt9b\", \"Pcdhgc5\", \"Ecm1\", \"Ltbp4\", \"Efnb2\") (inds_lec <- which(lec_genes %in% rownames(sfe))) #> [1] 2 6 7 plotSpatialFeature(sfe, lec_genes[inds_lec], colGeometryName = \"cellSeg\", bbox = bbox_use, ncol = 3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"non-spatial-pca","dir":"Articles","previous_headings":"","what":"Non-spatial PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"First run non-spatial PCA, compare MULTISPATI. 
’s pretty quick almost 400,000 cells, aren’t many genes . Use elbow plot see variance explained PC: Plot top gene loadings PC Many genes seem related endothelium. PC1 PC4 concern Kupffer cells well, Kupffer cell marker gene Cdh5 high loading. Plot first 4 PCs space PC1 PC4 highlight major blood vessels, PC2 PC3 less spatial structure. CosMX Xenium datasets website, top PCs clear spatial structures despite absence spatial information non-spatial PCA clear spatial compartments cell types, seem case dataset except blood vessels. seen genes strong spatial structures. PC2 PC3 don’t seem large scale spatial structure, may local spatial structure obvious plotting entire section, zoom bounding box shows hepatic zonation. ’s spatial structure smaller scale, perhaps negative spatial autocorrelation.","code":"set.seed(29) system.time( sfe <- runPCA(sfe, ncomponents = 20, subset_row = !is_blank, exprs_values = \"logcounts\", scale = TRUE, BSPARAM = IrlbaParam()) ) #> user system elapsed #> 18.992 1.225 21.017 gc() #> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb) #> Ncells 16055263 857.5 25943217 1385.6 NA 25943217 1385.6 #> Vcells 239248882 1825.4 502470759 3833.6 16384 497137484 3792.9 ElbowPlot(sfe) plotDimLoadings(sfe) spatialReducedDim(sfe, \"PCA\", 4, colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0) spatialReducedDim(sfe, \"PCA\", ncomponents = 4, colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"multispati-pca","dir":"Articles","previous_headings":"","what":"MULTISPATI PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"plot positive negative eigenvalues. Note eigenvalues variance explained. Instead, product variance explained Moran’s . positive eigenvalues correspond eigenvectors simultaneously explain variance large positive Moran’s . 
negative eigenvalues correspond eigenvectors simultaneously explain variance negative Moran’s . positive eigenvalues drop sharply PC1 PC4, one negative eigenvalue might interesting, unsurprising given moderately negative Moran’s nCounts nGenes. However, first MERFISH vignette, none genes negative Moran’s . Perhaps negative eigenvalue comes negative spatial autocorrelation gene program “eigengene” obvious individual genes. beauty multivariate analysis. components mean? component linear combination genes maximize product variance explained Moran’s . second component maximizes product provided ’s orthogonal first component, . loss variance explained usually huge, components can considered axes along spatially coherent groups spots separated much possible according expression highly variable genes, theory, clustering positive MULTISPATI components give spatially coherent clusters. spatial coherence, MULTISPATI might robust outliers. gene loadings, PC40 seems separate endothelial cells Kupffer cells hepatocytes. Plot PCs: first two PCs pick zoning. PC3 seems smaller scale spatial structure. PC”40” (really 300 something) example negative spatial autocorrelation biology. Kupffer cells endothelial cells scattered among hepatocytes may play functional role. mean non-spatial PCA bad. MULTISPATI tends lose much variance explained per PC positive eigenvalues, identifies co-expressed genes spatially structured expression patterns. MULTISPATI tells different story non-spatial PCA. PCA cell embeddings often used downstream analysis. 
Whether use MULTISPATI embeddings instead many PCs use depend questions asked downstream analyses.","code":"system.time({ sfe <- runMultivariate(sfe, \"multispati\", colGraphName = \"knn5\", nfposi = 20, nfnega = 20) }) #> Warning in asMethod(object): sparse->dense coercion: allocating vector of size #> 1.1 GiB #> user system elapsed #> 176.516 20.441 212.961 ElbowPlot(sfe, nfnega = 20, reduction = \"multispati\") plotDimLoadings(sfe, dims = c(1:3, 40), reduction = \"multispati\") spatialReducedDim(sfe, \"multispati\", components = c(1:3, 40), colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"morans-i","dir":"Articles","previous_headings":"Spatial autocorrelation of principal components","what":"Moran’s I","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"compare Moran’s cell embeddings non-spatial MULTISPATI PC: MULTISPATI, Moran’s high PC1 PC2, sharply drops. Moran’s PC negative eigenvalues negative, means large magnitude eigenvalue comes explaining variance. However, considering lower bound Moran’s around -0.6 instead -1, magnitude Moran’s PC negative eigenvalue trivial. Non-spatial PCs sorted Moran’s ; PC5 surprising large Moran’s . PC5 must zonation. 
Also show larger scale:","code":"# non-spatial sfe <- reducedDimMoransI(sfe, dimred = \"PCA\", components = 1:20, BPPARAM = MulticoreParam(2)) # spatial sfe <- reducedDimMoransI(sfe, dimred = \"multispati\", components = 1:40, BPPARAM = MulticoreParam(2)) df_moran <- tibble(PCA = reducedDimFeatureData(sfe, \"PCA\")$moran_sample01[1:20], MULTISPATI_pos = reducedDimFeatureData(sfe, \"multispati\")$moran_sample01[1:20], MULTISPATI_neg = reducedDimFeatureData(sfe,\"multispati\")$moran_sample01[21:40] |> rev(), index = 1:20) data(\"ditto_colors\") df_moran |> pivot_longer(cols = -index, values_to = \"value\", names_to = \"name\") |> ggplot(aes(index, value, color = name)) + geom_line() + scale_color_manual(values = ditto_colors) + geom_hline(yintercept = 0, color = \"gray\") + geom_hline(yintercept = mb, linetype = 2, color = \"gray\") + scale_y_continuous(breaks = scales::breaks_pretty()) + scale_x_continuous(breaks = scales::breaks_width(5)) + labs(y = \"Moran's I\", color = \"Type\", x = \"Component\") min(df_moran$MULTISPATI_neg) / mb[1] #> Imin #> 0.1374483 spatialReducedDim(sfe, \"PCA\", component = 5, colGeometryName = \"cellSeg\", divergent = TRUE, diverge_center = 0, bbox = bbox_use) spatialReducedDim(sfe, \"PCA\", components = 5, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"moran-scatter-plot","dir":"Articles","previous_headings":"Spatial autocorrelation of principal components","what":"Moran scatter plot","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"Local positive negative spatial autocorrelation can average global Moran’s . zoomed plots gene loadings , PCs endothelial cells. Moran scatter plot can help discovering local heterogeneity. PCs 1-3 fainter clusters outside main cluster, indicating heterogeneous spatial autocorrelation. 
Also make Moran scatter plots MULTISPATI interesting clusters.","code":"sfe <- reducedDimUnivariate(sfe, \"moran.plot\", dimred = \"PCA\", components = 1:6) plts <- lapply(seq_len(6), function(i) { moranPlot(sfe, paste0(\"PC\", i), binned = TRUE, hex = TRUE, plot_influential = FALSE) }) wrap_plots(plts, widths = 1, heights = 1) + plot_layout(ncol = 3) + plot_annotation(tag_levels = \"1\", title = \"Moran scatter plot for non-spatial PCs\") & theme(legend.position = \"none\") sfe <- reducedDimUnivariate(sfe, \"moran.plot\", dimred = \"multispati\", components = c(1:5, 40), # Not to overwrite non-spatial PCA moran plots name = \"moran.plot2\") plts2 <- lapply(c(1:5, 40), function(i) { moranPlot(sfe, paste0(\"PC\", i), binned = TRUE, hex = TRUE, plot_influential = FALSE, name = \"moran.plot2\") }) wrap_plots(plts2, widths = 1, heights = 1) + plot_layout(ncol = 3) + plot_annotation(tag_levels = \"1\", title = \"Moran scatter plot for MULTISPATI PCs\") & theme(legend.position = \"none\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"clustering-with-multispati-pca","dir":"Articles","previous_headings":"","what":"Clustering with MULTISPATI PCA","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"standard scRNA-seq data analysis workflow, k nearest neighbor graph found PCA space, used graph based clustering Louvain Leiden, used perform differential expression. Spatial dimension reductions can similarly used perform clustering, identify spatial regions tissue, done (Shang Zhou 2022; Ren et al. 2022; Zhang et al. 2023). type studies often use manual segmentation ground truth compare different methods identify spatial regions. problem spatial region methods meant help us identify novel spatial regions based new -omics data, might reveal ’s previously unknown manual annotations. output method doesn’t match manual annotations, might simply pointing previously unknown aspect tissue rather wrong. 
Depending questions asked, can simultaneously multiple spatial partitions. happens geographical space. instance, ’s land use neighborhood boundaries, equally valid watershed boundaries types rock formation. one relevant depends questions asked. perform Leiden clustering non-spatial MULTISPATI PCA compare results. k nearest neighbor graph, used default k = 10. See clustering positive MULTISPATI PCs give spatially coherent clusters Plot clusters space: MULTISPATI clusters look somewhat spatially structured clusters non-spatial PCA. Also zoom small area: clusters mean? Clusters supposed groups different spots similar within group, sharing characteristics. Non-spatial MULTISPATI PCA use different characteristics clustering. Non-spatial PCA finds genes good telling cell types apart, although genes may happen spatially structured. Non-spatial clustering aims find groups gene expression, cells similar gene expression can surrounded cells types histological space. just like mapping Art Deco buildings, often near Spanish revival Beaux Art buildings whose styles quite different perform different functions, thus necessarily forming coherent spatial region. contrast, MULTISPATI’s positive components find genes must characterize spatial regions addition distinguishing different cell types. genes involved MULTISPATI component may interesting clusters. interesting perform gene set enrichment analysis, interpret sort spatial patterns spatially variable genes. like mapping buildings built, Art Deco, Spanish revival, Beaux Art popular 1920s 1930s end cluster form spatially coherent region, can found DTLA Historical Core Jewelry District, Old Pasadena. Hence non-spatial clustering spatial data isn’t necessarily bad. 
Rather, tells different story reveals different aspects data spatial clustering.","code":"system.time({ set.seed(29) sfe$clusts_nonspatial <- clusterCells(sfe, use.dimred = \"PCA\", BLUSPARAM = NNGraphParam( cluster.fun = \"leiden\", cluster.args = list( objective_function = \"modularity\", resolution_parameter = 1 ) )) }) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 947.058 9.214 959.522 system.time({ set.seed(29) sfe$clusts_multispati <- clusterRows(reducedDim(sfe, \"multispati\")[,1:20], BLUSPARAM = NNGraphParam( cluster.fun = \"leiden\", cluster.args = list( objective_function = \"modularity\", resolution_parameter = 1 ) )) }) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 740.368 8.836 754.958 plotSpatialFeature(sfe, c(\"clusts_nonspatial\", \"clusts_multispati\"), colGeometryName = \"centroids\", scattermore = TRUE) & guides(colour = guide_legend(override.aes = list(size=2), ncol = 2)) plotSpatialFeature(sfe, c(\"clusts_nonspatial\", \"clusts_multispati\"), colGeometryName = \"cellSeg\", bbox = bbox_use) & guides(fill = guide_legend(ncol = 2))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/multispati.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"MULTISPATI PCA and negative spatial autocorrelation","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] 
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] patchwork_1.2.0 sf_1.0-16 #> [3] BiocParallel_1.36.0 BiocSingular_1.18.0 #> [5] bluster_1.12.0 tibble_3.2.1 #> [7] tidyr_1.3.1 stringr_1.5.1 #> [9] scran_1.30.2 scater_1.30.1 #> [11] ggplot2_3.5.1 scuttle_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SpatialFeatureExperiment_1.3.0 SFEData_1.4.0 #> [25] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] RColorBrewer_1.1-3 DBI_1.2.2 #> [19] abind_1.4-5 zlibbioc_1.48.2 #> [21] purrr_1.0.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 tools_4.3.3 #> [45] ggnewscale_0.4.10 ragg_1.3.0 #> [47] Rcpp_1.0.12 glue_1.7.0 #> [49] gridExtra_2.3 SparseArray_1.2.4 #> [51] mgcv_1.9-1 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 fansi_1.0.6 #> [61] spData_2.3.0 digest_0.6.35 #> [63] 
rsvd_1.0.5 R6_2.5.1 #> [65] mime_0.12 textshaping_0.3.7 #> [67] colorspace_2.1-0 wk_0.9.1 #> [69] scattermore_1.2 RSQLite_2.3.6 #> [71] hexbin_1.28.3 utf8_1.2.4 #> [73] generics_0.1.3 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 spdep_1.3-3 #> [79] pkgconfig_2.0.3 scico_1.5.0 #> [81] gtable_0.3.5 blob_1.2.4 #> [83] XVector_0.42.0 htmltools_0.5.8.1 #> [85] scales_1.3.0 png_0.1-8 #> [87] SpatialExperiment_1.12.0 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 fs_1.6.4 #> [121] ggbeeswarm_0.7.2 stringi_1.8.3 #> [123] viridisLite_0.4.2 deldir_2.0-4 #> [125] munsell_0.5.1 Biostrings_2.70.3 #> [127] Matrix_1.6-5 ExperimentHub_2.10.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] highr_0.10 interactiveDisplayBase_1.40.0 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"areal spatial data, spatial neighborhood graph used indicate proximity, required spatial analysis methods package spdep. 
One methods find spatial neighborhood graph k nearest neighbors, also commonly used gene expression PCA space graph-based clustering cells non-spatial scRNA-seq data. use k nearest neighbors graph PCA space rather histological space “spatial” analyses non-spatial scRNA-seq data? try analysis human peripheral blood mononuclear cells (PBMC) scRNA-seq dataset, doesn’t originally histological spatial organization. packages loaded analysis: download filtered Cell Ranger gene count matrix 10X website. empty droplets already removed. loaded R SingleCellExperiment (SCE) object.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SpatialExperiment) library(DropletUtils) library(BiocNeighbors) library(scater) library(scran) library(bluster) library(BiocParallel) library(scuttle) library(stringr) library(BiocSingular) library(spdep) library(patchwork) library(dplyr) library(reticulate) theme_set(theme_bw()) # Specify Python version to use gget PY_PATH <- Sys.which(\"python\") use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function # Load gget gget <- import(\"gget\") if (!dir.exists(\"filtered_feature_bc_matrix\")) { download.file(\"https://cf.10xgenomics.com/samples/cell-exp/3.0.2/5k_pbmc_v3_nextgem/5k_pbmc_v3_nextgem_filtered_feature_bc_matrix.tar.gz\", destfile = \"5kpbmc.tar.gz\", quiet = TRUE) system(\"tar -xzf 5kpbmc.tar.gz\") } (sce <- read10xCounts(\"filtered_feature_bc_matrix/\")) #> class: SingleCellExperiment #> dim: 33538 5155 #> metadata(1): 
Samples #> assays(1): counts #> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 #> ENSG00000268674 #> rowData names(3): ID Symbol Type #> colnames: NULL #> colData names(2): Sample Barcode #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): colnames(sce) <- sce$Barcode"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"perform basic QC, remove low quality cells high proportion mitochondrially encoded counts. addPerCellQCMetrics() function computes total UMI counts detected per cell (sum), number genes detected per cell (detected), sum detected mitochondrial counts, percentage mitochondrial counts per cell. 2D histogram plotted better show point density plot. Remove cells >20% mitochondrial counts","code":"is_mito <- str_detect(rowData(sce)$Symbol, \"^MT-\") sum(is_mito) #> [1] 13 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"Sample\" \"Barcode\" \"sum\" #> [4] \"detected\" \"subsets_mito_sum\" \"subsets_mito_detected\" #> [7] \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) plotColData(sce, x = \"sum\", y = \"subsets_mito_percent\", bins = 100) sce <- sce[, sce$subsets_mito_percent < 20] sce <- sce[rowSums(counts(sce)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"basic-non-spatial-analyses","dir":"Articles","previous_headings":"","what":"Basic non-spatial analyses","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"normalize data, perform PCA, cluster cells, find marker genes clusters. Use highly variable genes PCA: many PCs shall use analyses? 
Variance explained drops sharply PC1 PC4 levels . Plot genes largest loadings top 4 PCs: keep little information, use 10 PCs, variance explained levels even . plot cells first 4 PCs matrix plot. diagonals density plots number cells projected PC. x axis correspond columns matrix plot, y axis correspond rows, plot row 1 column 2 PC2 x axis PC1 y axis. cells colored clusters found previous code chunk. many cells cluster? use conventional Wilcoxon rank sum test find marker genes cluster. test compares cluster rest cells, genes highly expressed cluster compared cells considered. result list data frames, data frame corresponds one cluster. Areas receiver operator curve (AUC), distinguishing cluster vs. cluster, also included. closer 1 better, 0.5 means better random guessing. false discovery rate (FDR) column contains Benjamini-Hochberg corrected p-values. Genes data frames already sorted p-values. See specific top markers cluster: can use gget info module gget package get additional information marker genes. 
example, NCBI description:","code":"#clusts <- quickCluster(sce) #sce <- computeSumFactors(sce, cluster = clusts) #sce <- sce[, sizeFactors(sce) > 0] sce <- logNormCounts(sce) dec <- modelGeneVar(sce, lowess = FALSE) hvgs <- getTopHVGs(dec, n = 2000) set.seed(29) sce <- runPCA(sce, ncomponents = 30, BSPARAM = IrlbaParam(), subset_row = hvgs, scale = TRUE) ElbowPlot(sce, ndims = 30) plotDimLoadings(sce, swap_rownames = \"Symbol\") sce$cluster <- clusterRows(reducedDim(sce, \"PCA\")[,1:10], BLUSPARAM = SNNGraphParam(cluster.fun = \"leiden\", k = 10, cluster.args = list( resolution=0.5, objective_function = \"modularity\" ))) plotPCA(sce, ncomponents = 4, color_by = \"cluster\") table(sce$cluster) #> #> 1 2 3 4 5 6 7 8 #> 1057 1029 1278 590 415 207 27 26 markers <- findMarkers(sce, groups = colData(sce)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[4]] #> DataFrame with 21932 rows and 10 columns #> p.value FDR summary.AUC AUC.1 AUC.2 #> #> ENSG00000105369 2.91320e-18 6.38923e-14 1.000000 0.999965 0.999489 #> ENSG00000007312 7.06943e-18 7.75234e-14 0.994068 0.990666 0.990071 #> ENSG00000104894 1.32243e-17 9.66782e-14 0.989896 0.994232 0.988447 #> ENSG00000196735 4.83219e-17 2.16009e-13 0.981063 0.998555 0.938579 #> ENSG00000156738 4.92451e-17 2.16009e-13 0.972316 0.999639 0.998674 #> ... ... ... ... ... ... #> ENSG00000184274 1 1 0.5 0.500000 0.5 #> ENSG00000273796 1 1 0.5 0.500000 0.5 #> ENSG00000274248 1 1 0.5 0.500000 0.5 #> ENSG00000160282 1 1 0.5 0.499527 0.5 #> ENSG00000228137 1 1 0.5 0.500000 0.5 #> AUC.3 AUC.5 AUC.6 AUC.7 AUC.8 #> #> ENSG00000105369 0.999977 0.999534 0.997208 0.999937 1.000000 #> ENSG00000007312 0.991763 0.987639 0.986535 0.989077 0.994068 #> ENSG00000104894 0.994081 0.992534 0.992201 0.998305 0.989896 #> ENSG00000196735 0.998990 0.996269 0.998657 0.996798 0.981063 #> ENSG00000156738 0.999930 0.995512 0.999664 0.972316 1.000000 #> ... ... ... ... ... ... 
#> ENSG00000184274 0.499609 0.5 0.500000 0.5 0.5 #> ENSG00000273796 0.499218 0.5 0.500000 0.5 0.5 #> ENSG00000274248 0.498826 0.5 0.500000 0.5 0.5 #> ENSG00000160282 0.499609 0.5 0.500000 0.5 0.5 #> ENSG00000228137 0.500000 0.5 0.497585 0.5 0.5 top_markers <- unlist(lapply(markers, function(x) head(rownames(x), 1))) top_markers_symbol <- rowData(sce)[top_markers, \"Symbol\"] plotExpression(sce, top_markers_symbol, x = \"cluster\", swap_rownames = \"Symbol\", point_fun = function(...) list()) gget_info <- gget$info(top_markers) rownames(gget_info) <- gget_info$ensembl_gene_name select(gget_info, ncbi_description) #> ncbi_description #> TRAC T cell receptors recognize foreign antigens which have been processed as small peptides and bound to major histocompatibility complex (MHC) molecules at the surface of antigen presenting cells (APC). Each T cell receptor is a dimer consisting of one alpha and one beta chain or one delta and one gamma chain. In a single cell, the T cell receptor loci are rearranged and expressed in the order delta, gamma, beta, and alpha. If both delta and gamma rearrangements produce functional chains, the cell expresses delta and gamma. If not, the cell proceeds to rearrange the beta and alpha loci. This region represents the germline organization of the T cell receptor alpha and delta loci. Both the alpha and delta loci include V (variable), J (joining), and C (constant) segments and the delta locus also includes diversity (D) segments. The delta locus is situated within the alpha locus, between the alpha V and J segments. During T cell development, the delta chain is synthesized by a recombination event at the DNA level joining a D segment with a J segment; a V segment is then joined to the D-J gene. The alpha chain is synthesized by recombination joining a single V segment with a J segment. For both chains, the C segment is later joined by splicing at the RNA level. 
Recombination of many different V segments with several J segments provides a wide range of antigen recognition. Additional diversity is attained by junctional diversity, resulting from the random additional of nucleotides by terminal deoxynucleotidyltransferase. Five variable segments can be used in either alpha or delta chains and are described by TRAV/DV symbols. Several V and J segments of the alpha locus are known to be incapable of encoding a protein and are considered pseudogenes. [provided by RefSeq, Aug 2016] #> MNDA The myeloid cell nuclear differentiation antigen (MNDA) is detected only in nuclei of cells of the granulocyte-monocyte lineage. A 200-amino acid region of human MNDA is strikingly similar to a region in the proteins encoded by a family of interferon-inducible mouse genes, designated Ifi-201, Ifi-202, and Ifi-203, that are not regulated in a cell- or tissue-specific fashion. The 1.8-kb MNDA mRNA, which contains an interferon-stimulated response element in the 5-prime untranslated region, was significantly upregulated in human monocytes exposed to interferon alpha. MNDA is located within 2,200 kb of FCER1A, APCS, CRP, and SPTA1. In its pattern of expression and/or regulation, MNDA resembles IFI16, suggesting that these genes participate in blood cell-specific responses to interferons. [provided by RefSeq, Jul 2008] #> RPL32 Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L32E family of ribosomal proteins. It is located in the cytoplasm. Although some studies have mapped this gene to 3q13.3-q21, it is believed to map to 3p25-p24. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. 
Alternatively spliced transcript variants encoding the same protein have been observed for this gene. [provided by RefSeq, Jul 2008] #> CD79A The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-alpha protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. [provided by RefSeq, Jul 2008] #> NKG7 Predicted to be integral component of plasma membrane. Predicted to be active in plasma membrane. [provided by Alliance of Genome Resources, Apr 2022] #> MALAT1 This gene produces a precursor transcript from which a long non-coding RNA is derived by RNase P cleavage of a tRNA-like small ncRNA (known as mascRNA) from its 3' end. The resultant mature transcript lacks a canonical poly(A) tail but is instead stabilized by a 3' triple helical structure. This transcript is retained in the nucleus where it is thought to form molecular scaffolds for ribonucleoprotein complexes. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer metastasis and cell migration, and it is involved in cell cycle regulation. Its upregulation in multiple cancerous tissues has been associated with the proliferation and metastasis of tumor cells. [provided by RefSeq, Mar 2015] #> CLU The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. 
Alternate splicing results in both coding and non-coding variants.[provided by RefSeq, May 2011] #> BCL11A This gene encodes a C2H2 type zinc-finger protein by its similarity to the mouse Bcl11a/Evi9 protein. The corresponding mouse gene is a common site of retroviral integration in myeloid leukemia, and may function as a leukemia disease gene, in part, through its interaction with BCL6. During hematopoietic cell differentiation, this gene is down-regulated. It is possibly involved in lymphoma pathogenesis since translocations associated with B-cell malignancies also deregulates its expression. Multiple transcript variants encoding several different isoforms have been found for this gene. [provided by RefSeq, Jul 2008]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"spatial-analyses-for-qc-metrics","dir":"Articles","previous_headings":"","what":"“Spatial” analyses for QC metrics","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"Find k nearest neighbor graph PCA space Moran’s : using spdep since nb2listwdist() function distance based edge weighting requires 2-3 dimensional spatial coordinates coordinates 10 dimensions. , inverse distance weighting used edge weights. histological space, convert SCE object SpatialFeatureExperiment (SFE) use spatial analysis plotting functions Voyager, pretend first 2 PCs histological space. 
Add k nearest neighbor graph SFE object:","code":"foo <- findKNN(reducedDim(sce, \"PCA\")[,1:10], k=10, BNPARAM=AnnoyParam()) # Split by row foo_nb <- asplit(foo$index, 1) dmat <- 1/foo$distance # Row normalize the weights dmat <- sweep(dmat, 1, rowSums(dmat), FUN = \"/\") glist <- asplit(dmat, 1) # Sort based on index ord <- lapply(foo_nb, order) foo_nb <- lapply(seq_along(foo_nb), function(i) foo_nb[[i]][ord[[i]]]) class(foo_nb) <- \"nb\" glist <- lapply(seq_along(glist), function(i) glist[[i]][ord[[i]]]) listw <- list(style = \"W\", neighbours = foo_nb, weights = glist) class(listw) <- \"listw\" attr(listw, \"region.id\") <- colnames(sce) (sfe <- toSpatialFeatureExperiment(sce, spatialCoords = reducedDim(sce, \"PCA\")[,1:2], spatialCoordsNames = NULL)) #> class: SpatialFeatureExperiment #> dim: 21932 4629 #> metadata(1): Samples #> assays(2): counts logcounts #> rownames(21932): ENSG00000238009 ENSG00000239945 ... ENSG00000275063 #> ENSG00000271254 #> rowData names(3): ID Symbol Type #> colnames(4629): AAACCCAAGACAGCTG-1 AAACCCAAGTTAACGA-1 ... #> TTTGTTGTCACGGACC-1 TTTGTTGTCCACACCT-1 #> colData names(11): Sample Barcode ... cluster sample_id #> reducedDimNames(1): PCA #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : PC1 PC2 #> imgData names(0): #> #> unit: #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: colGraph(sfe, \"knn10\") <- listw"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"morans-i","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"total UMI counts (sum) genes detected (detected), Moran’s quite strong, ’s positive weaker percentage mitochondrial counts. 
second column, K, kurtosis feature interest.","code":"sfe <- colDataMoransI(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\")) colFeatureData(sfe)[c(\"sum\", \"detected\", \"subsets_mito_percent\"),] #> DataFrame with 3 rows and 2 columns #> moran_sample01 K_sample01 #> #> sum 0.655173 16.44603 #> detected 0.750133 6.01002 #> subsets_mito_percent 0.438934 6.13555"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"moran-plot","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Moran plot","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"local variations k nearest neighbors graph? Moran plot, x axis value cell, y axis average value among neighboring cells graph weighted edge weights. slope fitted line Moran’s . Sometimes clusters plot, showing different kinds neighborhoods. dashed lines averages x y axes. cells cluster around average, cluster cells lower total counts whose neighbors also lower total counts. also cluster cells higher total counts whose neighbors also higher total counts. clusters seem somewhat related gene expression based clusters. one main cluster plot number genes detected percentage mitochondrial counts. However, cells somewhat separated gene expression clusters. surprising gene expression clusters also based k nearest neighbor graph. 
Cluster 4 cells higher percentage mitochondrial counts neighbors.","code":"sfe <- colDataUnivariate(sfe, \"moran.plot\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) moranPlot(sfe, \"sum\", color_by = \"cluster\") moranPlot(sfe, \"detected\", color_by = \"cluster\") moranPlot(sfe, \"subsets_mito_percent\", color_by = \"cluster\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-morans-i","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Local Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"Also see local Moran’s 3 QC metrics: , don’t histological space. can visualize local “spatial” statistics? UMAP bad, case PCA can somewhat separate clusters. can use first 2 PCs histological space. reference, plot metrics clusters first 2 PCs. Plot local Moran’s metrics first 2 PCs: However, good 2D representation data easy plotting? Remember k nearest neighbor graph computed first 10 PCs rather first 2 PCs. graph tied 2D representation. can still plot histograms show distribution scatter plots compare local metric different variables, can colored another variable cluster. may added next release Voyager. now, add results interest colData(sfe) use existing colData plotting functions scater Voyager. y axis log transformed (hence warning bins cells), color cells long tail can seen cells don’t strong local Moran’s . Cells cluster 7 high local Moran’s total UMI counts genes detected, means tend homogeneous QC metrics. local Moran’s QC metrics relate ? Cells locally homogeneous total UMI counts also homogeneous number genes detected, surprising given correlation two. local Moran’s , sum vs percentage mitochondrial counts shows interesting pattern, highlighting clusters 4 7 Moran plots. local Moran’s relate value ? case, generally cells higher total counts also tend higher local Moran’s total counts. 
However, another wing cells lower total counts slightly higher local Moran’s total counts ’s central value total counts near 0 local Moran’s . density contour shows cells concentrated central value.","code":"sfe <- colDataUnivariate(sfe, \"localmoran\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) plotSpatialFeature(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\", \"cluster\")) plotLocalResult(sfe, \"localmoran\", c(\"sum\", \"detected\", \"subsets_mito_percent\"), colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2) localResultAttrs(sfe, \"localmoran\", \"sum\") #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" sfe$sum_localmoran <- localResult(sfe, \"localmoran\", \"sum\")[,\"Ii\"] sfe$detected_localmoran <- localResult(sfe, \"localmoran\", \"detected\")[,\"Ii\"] sfe$pct_mito_localmoran <- localResult(sfe, \"localmoran\", \"subsets_mito_percent\")[,\"Ii\"] # Colorblind friendly palette data(\"ditto_colors\") plotColDataFreqpoly(sfe, c(\"sum_localmoran\", \"detected_localmoran\", \"pct_mito_localmoran\"), bins = 50, color_by = \"cluster\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. plotColData(sfe, x = \"sum_localmoran\", y = \"detected_localmoran\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. plotColData(sfe, x = \"sum_localmoran\", y = \"pct_mito_localmoran\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
plotColData(sfe, x = \"sum\", y = \"sum_localmoran\", color_by = \"cluster\") + geom_density2d(data = as.data.frame(colData(sfe)), mapping = aes(x = sum, y = sum_localmoran), color = \"blue\", linewidth = 0.3) + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-spatial-heteroscedasticity-losh","dir":"Articles","previous_headings":"“Spatial” analyses for QC metrics","what":"Local spatial heteroscedasticity (LOSH)","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"LOSH indicates heterogeneity around cell k nearest neighbor graph. make non-spatial plots LOSH local Moran’s . , clusters 2 6 tend locally heterogeneous. total counts genes detected relate LOSH? generally cells higher LOSH total counts also higher LOSH genes detected, outliers high , heterogeneous neighborhoods. Absolute distance neighbors taken account adjacency matrix row normalized. interesting see outliers tend away 10 nearest neighbors, region PCA space cells apart. total counts relate LOSH? 
seem clear relationship case.","code":"sfe <- colDataUnivariate(sfe, \"LOSH\", c(\"sum\", \"detected\", \"subsets_mito_percent\")) plotLocalResult(sfe, \"LOSH\", c(\"sum\", \"detected\", \"subsets_mito_percent\"), colGeometryName = \"centroids\", ncol = 2) localResultAttrs(sfe, \"LOSH\", \"sum\") #> [1] \"Hi\" \"E.Hi\" \"Var.Hi\" \"Z.Hi\" \"x_bar_i\" \"ei\" sfe$sum_losh <- localResult(sfe, \"LOSH\", \"sum\")[,\"Hi\"] sfe$detected_losh <- localResult(sfe, \"LOSH\", \"detected\")[,\"Hi\"] sfe$pct_mito_losh <- localResult(sfe, \"LOSH\", \"subsets_mito_percent\")[,\"Hi\"] plotColDataFreqpoly(sfe, c(\"sum_losh\", \"detected_losh\", \"pct_mito_losh\"), bins = 50, color_by = \"cluster\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. plotColData(sfe, x = \"sum_losh\", y = \"detected_losh\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. plotColData(sfe, x = \"sum\", y = \"sum_losh\", color_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. 
#> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"spatial-analyses-for-gene-expression","dir":"Articles","previous_headings":"","what":"“Spatial” analyses for gene expression","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"First, need reorganize differential expression results:","code":"top_markers_df <- lapply(seq_along(markers), function(i) { out <- markers[[i]][markers[[i]]$FDR < 0.05, c(\"FDR\", \"summary.AUC\")] if (nrow(out)) out$cluster <- i out }) top_markers_df <- do.call(rbind, top_markers_df) top_markers_df$symbol <- rowData(sce)[rownames(top_markers_df), \"Symbol\"]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"morans-i-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"results added rowData(sfe). NA’s non-highly variable genes, Moran’s computed highly variable genes . Moran’s ’s highly variable genes distributed? Also, top cluster marker genes distribution? top marker genes quite positive Moran’s k nearest neighbor graph. also interesting color histogram gene sets. Since k nearest neighbor graph found PCA space, based gene expression, expected, Moran’s graph mostly positive, although often strong. small number genes slightly negative Moran’s . top genes look like PCA? marker genes cluster, cluster 9. Perhaps genes high Moran’s specific cell type. Moran’s relate cluster AUC cluster differential expression p-value? differential expression p-value relate Moran’s ? Generally, significant marker genes tend higher Moran’s . surprising clusters Moran’s based k nearest neighbor graph. Similarly, genes higher AUC tend higher Moran’s . clusters, generally speaking, genes specific cluster tend higher Moran’s . 
Let’s use permutation testing see Moran’s statistically significant: seem significant. correlogram finds Moran’s higher order neighbors can proxy distance. see different patterns decay spatial autocorrelation different length scales spatial autocorrelation. CLU marker gene specific smallest cluster, higher order neighbors likely clusters. Marker genes larger clusters hundreds cells nevertheless display different patterns correlogram.","code":"sfe <- runMoransI(sfe, features = hvgs, BPPARAM = MulticoreParam(2)) rowData(sfe) #> DataFrame with 21932 rows and 5 columns #> ID Symbol Type moran_sample01 #> #> ENSG00000238009 ENSG00000238009 AL627309.1 Gene Expression NA #> ENSG00000239945 ENSG00000239945 AL627309.3 Gene Expression NA #> ENSG00000241599 ENSG00000241599 AL627309.4 Gene Expression NA #> ENSG00000229905 ENSG00000229905 AL669831.2 Gene Expression NA #> ENSG00000237491 ENSG00000237491 AL669831.5 Gene Expression NA #> ... ... ... ... ... #> ENSG00000278817 ENSG00000278817 AC007325.4 Gene Expression NA #> ENSG00000278384 ENSG00000278384 AL354822.1 Gene Expression NA #> ENSG00000277856 ENSG00000277856 AC233755.2 Gene Expression NA #> ENSG00000275063 ENSG00000275063 AC233755.1 Gene Expression NA #> ENSG00000271254 ENSG00000271254 AC240274.1 Gene Expression NA #> K_sample01 #> #> ENSG00000238009 NA #> ENSG00000239945 NA #> ENSG00000241599 NA #> ENSG00000229905 NA #> ENSG00000237491 NA #> ... ... #> ENSG00000278817 NA #> ENSG00000278384 NA #> ENSG00000277856 NA #> ENSG00000275063 NA #> ENSG00000271254 NA plotRowDataHistogram(sfe, \"moran_sample01\", bins = 50) + geom_vline(data = as.data.frame(rowData(sfe)[top_markers,]) |> mutate(index = seq_along(top_markers)), aes(xintercept = moran_sample01, color = index)) + scale_color_continuous(breaks = scales::breaks_width(2)) #> Warning: Removed 19932 rows containing non-finite outside the scale range #> (`stat_bin()`). 
top_moran <- head(rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)], 4) plotSpatialFeature(sfe, top_moran, ncol = 2) top_moran_symbol <- rowData(sfe)[top_moran, \"Symbol\"] plotExpression(sfe, top_moran_symbol, swap_rownames = \"Symbol\") # See if markers are unique to clusters anyDuplicated(rownames(top_markers_df)) #> [1] 0 top_markers_df$moran <- rowData(sfe)[rownames(top_markers_df), \"moran_sample01\"] top_markers_df$log_p_adj <- -log10(top_markers_df$FDR) top_markers_df$cluster <- factor(top_markers_df$cluster, levels = seq_len(length(unique(top_markers_df$cluster)))) as.data.frame(top_markers_df) |> ggplot(aes(log_p_adj, moran)) + geom_point(aes(color = cluster)) + geom_smooth(method = \"lm\") + scale_color_manual(values = ditto_colors) #> `geom_smooth()` using formula = 'y ~ x' #> Warning: Removed 574 rows containing non-finite outside the scale range #> (`stat_smooth()`). #> Warning: Removed 574 rows containing missing values or values outside the scale range #> (`geom_point()`). as.data.frame(top_markers_df) |> ggplot(aes(summary.AUC, moran)) + geom_point(aes(color = cluster)) + geom_smooth(method = \"lm\") + scale_color_manual(values = ditto_colors) #> `geom_smooth()` using formula = 'y ~ x' #> Warning: Removed 574 rows containing non-finite outside the scale range #> (`stat_smooth()`). #> Warning: Removed 574 rows containing missing values or values outside the scale range #> (`geom_point()`). 
sfe <- runUnivariate(sfe, \"moran.mc\", features = top_markers, nsim = 200) top_markers_symbol #> [1] \"TRAC\" \"MNDA\" \"RPL32\" \"CD79A\" \"NKG7\" \"MALAT1\" \"CLU\" \"BCL11A\" plotMoranMC(sfe, top_markers, swap_rownames = \"Symbol\") system.time({ sfe <- runUnivariate(sfe, \"sp.correlogram\", top_markers, order = 6, zero.policy = TRUE, BPPARAM = MulticoreParam(2)) }) #> user system elapsed #> 227.927 19.754 248.680 plotCorrelogram(sfe, top_markers, swap_rownames = \"Symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"local-morans-i-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Local Moran’s I","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"also plot histograms, now results need added colData first. , y axis log transformed make tail visible. clusters, top marker gene’s local Moran’s forms peak cells cluster higher local Moran’s cells. However, sometimes cells within cluster form long tail shared cells clusters. local Moran’s another method differential expression. since local Moran’s Leiden clustering use k nearest neighbor graph PCA space, local Moran’s marker genes perhaps eigengenes signifying gene programs cell type k nearest neighbor graph can validate criticize Leiden clusters. Furthermore, interestingly, genes, tallest peak histogram away 0. scatter plots shown “spatial” analyses QC metrics section can made see local Moran’s relates expression gene . gene, just like total UMI counts, two wings central value local Moran’s around 0. Generally, cells higher expression gene higher local Moran’s gene well. density contours show cells concentrate around 0 expression weaker positive local Moran. streak cells 0 expression means many cells don’t express gene, neighbors low slightly homogeneous expression gene. pattern may different different genes. Also, p-values cell local Moran’s available corrected multiple hypothesis testing, can plotted. 
p-values based z score local Moran statistic, although statistic distributed gene expression data warrants investigation. p-value can also computed permutation (see localmoran_perm()).","code":"sfe <- runUnivariate(sfe, \"localmoran\", features = top_markers) plotLocalResult(sfe, \"localmoran\", top_markers, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 3, swap_rownames = \"Symbol\") new_colname <- paste0(\"cluster\", seq_along(top_markers), \"_\", top_markers_symbol, \"_localmoran\") for (i in seq_along(top_markers)) { g <- top_markers[i] colData(sfe)[[new_colname[i]]] <- localResult(sfe, \"localmoran\", g)[,\"Ii\"] } plotColDataFreqpoly(sfe, new_colname, color_by = \"cluster\") + ggtitle(\"Local Moran's I\") + theme(legend.position = \"top\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. i <- 6 # Change if running this notebook plotExpression(sfe, top_markers_symbol[i], x = new_colname[i], color_by = \"cluster\", swap_rownames = \"Symbol\") + scale_color_manual(values = ditto_colors) + coord_flip() + # comment out in case of error after changing i geom_density2d(data = as.data.frame(colData(sfe)) |> mutate(gene = logcounts(sfe)[top_markers[i],]), mapping = aes(x = .data[[new_colname[i]]], y = gene), color = \"blue\", linewidth = 0.3) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
localResultAttrs(sfe, \"localmoran\", top_markers[1]) #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\""},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"losh","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"LOSH","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"two genes right, ’s interesting see higher LOSH middle cluster. two genes left outliers throwing dynamic range, seems high LOSH regions different. , plot histograms: relationship expression LOSH complicated. genes, top marker gene cluster 1 LYAR, cells cluster higher expression also higher LOSH - much like Poisson negative binomial distributions, higher mean also means higher variance. However, genes, top marker gene cluster 2 CTSS, lower LOSH among cells higher expression, means expression gene homogeneous within cluster, consistent local Moran. gene, density contour indicates many cells don’t express gene homogeneous neighborhoods also low expression. streak around 0 expression means neighbors cells don’t express gene different levels heterogeneity gene.","code":"sfe <- runUnivariate(sfe, \"LOSH\", top_markers) plotLocalResult(sfe, \"LOSH\", top_markers, colGeometryName = \"centroids\", ncol = 3, swap_rownames = \"Symbol\") new_colname2 <- paste0(\"cluster\", seq_along(top_markers), \"_\", top_markers_symbol, \"_losh\") for (i in seq_along(top_markers)) { g <- top_markers[i] colData(sfe)[[new_colname2[i]]] <- localResult(sfe, \"LOSH\", g)[,\"Hi\"] } plotColDataFreqpoly(sfe, new_colname2, color_by = \"cluster\") + ggtitle(\"Local heteroscedasticity\") + theme(legend.position = \"top\") + scale_y_log10() + annotation_logticks(sides = \"l\") #> Warning in scale_y_log10(): log-10 transformation introduced #> infinite values. 
i <- 6 # Change if running this notebook plotExpression(sfe, top_markers_symbol[i], x = new_colname2[i], color_by = \"cluster\", swap_rownames = \"Symbol\") + scale_color_manual(values = ditto_colors) + coord_flip() + # comment out in case of error after changing i geom_density2d(data = as.data.frame(colData(sfe)) |> mutate(gene = logcounts(sfe)[top_markers[i],]), mapping = aes(x = .data[[new_colname2[i]]], y = gene), color = \"blue\", linewidth = 0.3) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"moran-plot-1","dir":"Articles","previous_headings":"“Spatial” analyses for gene expression","what":"Moran plot","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"make Moran plots top marker genes. reference, show Moran’s top marker genes, slope line fitted Moran scatter plot. significant marker gene cluster 7. plots shown sequence genes, points concentrated around origin aren’t “enough” points elsewhere plot density contours. cells express genes, clusters plot. genes expressed many cells, cells neighbors express gene, hence vertical streak x = 0. tutorial, applied univariate spatial statistics k nearest neighbor graph gene expression PCA space rather histological space. 
Just like histological space, impractical examine statistics gene gene, multivariate analyses incorporate k nearest neighbor graph may interesting.","code":"sfe <- runUnivariate(sfe, \"moran.plot\", features = top_markers, colGraphName = \"knn10\") top_markers_df[top_markers,] #> DataFrame with 8 rows and 6 columns #> FDR summary.AUC cluster symbol moran #> #> ENSG00000277734 3.38982e-13 0.975227 1 TRAC 0.768167 #> ENSG00000163563 2.71016e-14 0.999028 2 MNDA 0.955553 #> ENSG00000144713 8.43108e-15 0.999609 3 RPL32 0.789326 #> ENSG00000105369 6.38923e-14 1.000000 4 CD79A 0.944921 #> ENSG00000105374 6.08330e-14 1.000000 5 NKG7 0.931310 #> ENSG00000251562 9.26308e-09 0.930695 6 MALAT1 0.811310 #> ENSG00000120885 6.88523e-08 1.000000 7 CLU 0.902698 #> ENSG00000119866 3.82513e-08 1.000000 8 BCL11A 0.648106 #> log_p_adj #> #> ENSG00000277734 12.46982 #> ENSG00000163563 13.56701 #> ENSG00000144713 14.07412 #> ENSG00000105369 13.19455 #> ENSG00000105374 13.21586 #> ENSG00000251562 8.03324 #> ENSG00000120885 7.16208 #> ENSG00000119866 7.41735 plts <- lapply(top_markers, moranPlot, sfe = sfe, color_by = \"cluster\", swap_rownames = \"Symbol\") #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. #> Warning in value[[3L]](cond): Too few points for stat_density2d, not plotting #> contours. 
wrap_plots(plts, widths = 1, heights = 1) + plot_layout(ncol = 3, guides = \"collect\") + plot_annotation(tag_levels = \"1\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/nonspatial.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Apply spatial analyses to non-spatial scRNA-seq data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 dplyr_1.1.4 #> [3] patchwork_1.2.0 spdep_1.3-3 #> [5] sf_1.0-16 spData_2.3.0 #> [7] BiocSingular_1.18.0 stringr_1.5.1 #> [9] BiocParallel_1.36.0 bluster_1.12.0 #> [11] scran_1.30.2 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] BiocNeighbors_1.20.2 DropletUtils_1.22.0 #> [17] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [19] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [21] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [23] IRanges_2.36.0 S4Vectors_0.40.2 #> [25] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [27] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [29] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] memoise_2.0.1 DelayedMatrixStats_1.24.0 #> [15] RCurl_1.98-1.14 terra_1.7-71 #> [17] 
htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] Rhdf5lib_1.24.2 s2_1.1.6 #> [21] SparseArray_1.2.4 rhdf5_2.46.1 #> [23] sass_0.4.9 KernSmooth_2.23-22 #> [25] bslib_0.7.0 htmlwidgets_1.6.4 #> [27] desc_1.4.3 cachem_1.0.8 #> [29] igraph_2.0.3 lifecycle_1.0.4 #> [31] pkgconfig_2.0.3 rsvd_1.0.5 #> [33] Matrix_1.6-5 R6_2.5.1 #> [35] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [37] digest_0.6.35 colorspace_2.1-0 #> [39] ggnewscale_0.4.10 dqrng_0.3.2 #> [41] RSpectra_0.16-1 irlba_2.3.5.1 #> [43] textshaping_0.3.7 beachmat_2.18.1 #> [45] labeling_0.4.3 fansi_1.0.6 #> [47] mgcv_1.9-1 abind_1.4-5 #> [49] compiler_4.3.3 proxy_0.4-27 #> [51] withr_3.0.0 viridis_0.6.5 #> [53] DBI_1.2.2 highr_0.10 #> [55] HDF5Array_1.30.1 R.utils_2.12.3 #> [57] MASS_7.3-60.0.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] tools_4.3.3 units_0.8-5 #> [63] vipor_0.4.7 beeswarm_0.4.0 #> [65] R.oo_1.26.0 glue_1.7.0 #> [67] nlme_3.1-164 rhdf5filters_1.14.1 #> [69] grid_4.3.3 cluster_2.1.6 #> [71] generics_0.1.3 isoband_0.2.7 #> [73] gtable_0.3.5 R.methodsS3_1.8.2 #> [75] class_7.3-22 metapod_1.10.1 #> [77] ScaledMatrix_1.10.0 sp_2.1-4 #> [79] utf8_1.2.4 XVector_0.42.0 #> [81] ggrepel_0.9.5 pillar_1.9.0 #> [83] limma_3.58.1 splines_4.3.3 #> [85] lattice_0.22-6 deldir_2.0-4 #> [87] tidyselect_1.2.1 locfit_1.5-9.9 #> [89] knitr_1.45 gridExtra_2.3 #> [91] edgeR_4.0.16 xfun_0.43 #> [93] statmod_1.5.0 stringi_1.8.3 #> [95] yaml_2.3.8 boot_1.3-30 #> [97] evaluate_0.23 codetools_0.2-20 #> [99] tibble_3.2.1 cli_3.6.2 #> [101] systemfonts_1.0.6 munsell_0.5.1 #> [103] jquerylib_0.1.4 Rcpp_1.0.12 #> [105] png_0.1-8 parallel_4.3.3 #> [107] pkgdown_2.0.9 sparseMatrixStats_1.14.0 #> [109] bitops_1.0-7 viridisLite_0.4.2 #> [111] scales_1.3.0 e1071_1.7-14 #> [113] purrr_1.0.2 crayon_1.5.2 #> [115] scico_1.5.0 rlang_1.1.3 #> [117] 
cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium v3 preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium v3 preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"preprocessing-for-chromium-v3-chemistry","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium V3 Chemistry","title":"10X Chromium v3 preprocessing with cellatlas","text":"data example located cellatlas/examples/rna-10xv3/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-10xv3/* .\") system(\"gunzip 3M-february-2018.txt.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Fetch the references","title":"10X Chromium v3 preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Build the pipeline","title":"10X Chromium v3 preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Run the pipeline","title":"10X Chromium v3 preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_10xv3.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium V3 Chemistry","what":"Inspect the output","title":"10X Chromium v3 preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"preprocessing-for-chromium-single-cell-atac-seq","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell ATAC-seq","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"data example located cellatlas/examples/atac-10xatac/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/atac-10xatac/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Fetch the references","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Build the pipeline","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m atac\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz fastqs/I2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Run the pipeline","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_trim(cmds) cmds <- str_remove_all(cmds, '\\\\\\\",$|\\\\\\\"$|^\\\\\\\"') cmds <- str_replace_all(cmds, 
fixed(\"\\\\\\\"\"), \"\\\"\") cmds <- str_replace_all(cmds, fixed(\"\\\\t\"), \"\\t\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_atac.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC-seq","what":"Inspect the output","title":"10X Chromium ATAC-seq preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"ClickTags preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"ClickTags preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"preprocessing-for-clicktags","dir":"Articles","previous_headings":"","what":"Preprocessing for ClickTags","title":"ClickTags preprocessing with cellatlas","text":"data example located cellatlas/examples/tag-clicktag/* directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/tag-clicktag/* .\") system(\"gunzip 737K-august-2016.txt.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Fetch the references","title":"ClickTags preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Build the pipeline","title":"ClickTags preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m tag\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Run the pipeline","title":"ClickTags preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_clicktag.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for ClickTags","what":"Inspect the output","title":"ClickTags preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"preprocessing-for-chromium-single-cell-crispr-screening","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell CRISPR Screening","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"data example located cellatlas/examples/crispr-10xcrispr/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/crispr-10xcrispr/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Fetch the references","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Build the pipeline","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m crispr\", \"-fa\", FA, \"-g\", GTF, \"-fb\", \"feature_barcodes.txt\", \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Run the pipeline","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds 
lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_crispr.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell CRISPR Screening","what":"Inspect the output","title":"10X Chromium CRISPR screening preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Multiome ATAC preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Multiome ATAC preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"preprocessing-for-chromium-single-cell-atac-multiome-atac","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","title":"10X Multiome ATAC preprocessing with cellatlas","text":"data example located cellatlas/examples/atac-10xmultiome/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/atac-10xmultiome/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Fetch the references","title":"10X Multiome ATAC preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Build the pipeline","title":"10X Multiome ATAC preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m atac\", \"-fa\", FA, \"-g\", GTF, \"fastqs/atac_R1.fastq.gz fastqs/atac_R2.fastq.gz fastqs/atac_I2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Run the pipeline","title":"10X Multiome ATAC preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_trim(cmds) cmds <- str_remove_all(cmds, 
'\\\\\\\",$|\\\\\\\"$|^\\\\\\\"') cmds <- str_replace_all(cmds, fixed(\"\\\\\\\"\"), \"\\\"\") cmds <- str_replace_all(cmds, fixed(\"\\\\t\"), \"\\t\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_multiome.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Single Cell ATAC Multiome ATAC","what":"Inspect the output","title":"10X Multiome ATAC preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"10X Chromium nuclei preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"10X Chromium nuclei preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"preprocessing-for-chromium-nuclei-isolation","dir":"Articles","previous_headings":"","what":"Preprocessing for Chromium Nuclei Isolation","title":"10X Chromium nuclei preprocessing with cellatlas","text":"data example located cellatlas/examples/rna-10xv3-nuclei/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-10xv3-nuclei/* .\") system(\"gunzip *.gz\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Fetch the references","title":"10X Chromium nuclei preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf homo_sapiens\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Build the pipeline","title":"10X Chromium nuclei preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.homo_sapiens.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fb feature_barcodes.txt\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Run the pipeline","title":"10X Chromium nuclei preprocessing with cellatlas","text":"run pipeline extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_nuclei.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Chromium Nuclei Isolation","what":"Inspect the output","title":"10X Chromium nuclei preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"Split-seq preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"Split-seq preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"preprocessing-for-split-seq","dir":"Articles","previous_headings":"","what":"Preprocessing for SPLiT-seq","title":"Split-seq preprocessing with cellatlas","text":"Note: move relevant data working directory gunzip barcode onlist. data example located cellatlas/examples/rna-splitseq/ directory. seqspec print command prints ordered tree representation sequenced elements contained FASTQ files. Note names nodes seqspec must match names FASTQ files. seqspec SPLiT-seq contains specification multiple split-pool rounds. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-splitseq/* .\") system(\"gunzip barcode*\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Fetch the references","title":"Split-seq preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Build the pipeline","title":"Split-seq preprocessing with cellatlas","text":"","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Run the pipeline","title":"Split-seq preprocessing with cellatlas","text":"run pipeline simply extract commands /cellatlas_info.json run command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") lapply(cmds, function(cmd) 
system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_splitseq.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for SPLiT-seq","what":"Inspect the output","title":"Split-seq preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"building-count-matrices-with-cellatlas","dir":"Articles","previous_headings":"","what":"Building Count Matrices with cellatlas","title":"Visium preprocessing with cellatlas","text":"major challenge uniformly preprocessing large amounts single-cell genomics data variety different assays identifying handling sequenced elements coherent consistent fashion. Cell barcodes reads RNAseq data 10x Multiome, example, must extracted error corrected manner cell barcodes reads ATACseq data 10x Multiome barcode-barcode registration can occur. Uniform processing way minimizes computational variability enables cross-assay comparisons. notebook demonstrate single-cell genomics data can preprocessed generate cell feature count matrix. requires: FASTQ files seqspec specification FASTQ files Genome Sequence FASTA Genome Annotation GTF (optional) Feature barcode list","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"install-packages","dir":"Articles","previous_headings":"","what":"Install Packages","title":"Visium preprocessing with cellatlas","text":"vignette makes use two non-standard command line tools, jq tree. code cell installs tools Linux operating system updated Mac Windows users. 
continue dependencies can installed operating system.","code":"# Install `jq`, a command-line tool for extracting key value pairs from JSON files system(\"wget --quiet --show-progress https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64\") system(\"chmod +x jq-linux64 && mv jq-linux64 /usr/local/bin/jq\") # Clone the cellatlas repo and install the package system(\"git clone https://github.com/cellatlas/cellatlas.git\") system(\"cd cellatlas && pip install .\") # Install dependencies system(\"yes | pip uninstall --quiet seqspec\") system(\"pip install --quiet git+https://github.com/IGVF/seqspec.git\") system(\"pip install --quiet gget kb-python\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"examine-the-spec","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Examine the spec","title":"Visium preprocessing with cellatlas","text":"Note: move relevant data working directory gunzip barcode onlist. first use seqspec print check read structure matches expect. command prints ordered tree representation sequenced elements contained FASTQ files. Note names nodes seqspec must match names FASTQ files. 
Note Google Colab, go Runtime -> View runtime logs see output system.","code":"system(\"mv cellatlas/examples/rna-visium-spatial/* .\") system(\"seqspec print spec.yaml\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"fetch-the-references","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Fetch the references","title":"Visium preprocessing with cellatlas","text":"step necessary modality processing uses transcriptome reference-based alignment.","code":"system(\"gget ref -o ref.json -w dna,gtf mus_musculus\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"build-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Build the pipeline","title":"Visium preprocessing with cellatlas","text":"now supply relevant objects cellatlas build produce appropriate commands run build pipeline. includes reference building step read counting quantification step performed kallisto bustools part kb-python package.","code":"FA <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.genome_dna.ftp'\", \"ref.json\"), stdout = TRUE) GTF <- system2(\"jq\", args = c(\"-r\", \"'.mus_musculus.annotation_gtf.ftp'\", \"ref.json\"), stdout = TRUE) args <- c( \"-o out\", \"-s spec.yaml\", \"-m rna\", \"-fa\", FA, \"-g\", GTF, \"fastqs/R1.fastq.gz fastqs/R2.fastq.gz\") system2(command = \"cellatlas\", args = c(\"build\", args))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"run-the-pipeline","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Run the pipeline","title":"Visium preprocessing with cellatlas","text":"can extract view commands pipeline using jq. 
Now can run commands /cellatlas_info.json command line.","code":"cmds <- system2(\"jq\", \"-r '.commands[] | values[]' out/cellatlas_info.json\", stdout=TRUE) cmds <- str_subset(cmds, \"[\\\\[\\\\]]\", negate=TRUE) cmds <- str_extract(cmds, \"kb.*(txt|gz)\") cmds lapply(cmds, function(cmd) system(cmd))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/preprocess_visium.html","id":"inspect-the-output","dir":"Articles","previous_headings":"Preprocessing for Visium","what":"Inspect the output","title":"Visium preprocessing with cellatlas","text":"inspect /run_info.json /kb_info.json simple QC pipeline.","code":"list.files(\"out\") rjson::fromJSON(file = \"out/run_info.json\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/seqfish_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"seqFISH Processing Workflows with Voyager","text":"Pros: Single cell resolution High detection efficiency Commercial kit coming Get subcellular transcript localization information Compatible histological features DAPI membrane staining Cons: Need pre-select panel usually hundred genes","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/seqfish_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"seqFISH Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Use your own spatial method in Voyager","text":"multiple different ways certain things. different ways pros cons, sometimes can tell somewhat different stories. 
Often different ways come different syntaxes, increasing learning curve users. Voyager took inspiration caret tidymodels (Kuhn Wickham 2020) machine learning, foreach, future, BiocParallel parallel processing different backends, bluster different clustering algorithms, BiocNeighbors different algorithms find nearest neighbors. packages provide uniform user interfaces different methods achieve given goal. caret tidymodels, users can make uniform user interface fit custom models included package eliminate lot duplicate code. Voyager, done SFEMethod S4 class. vignette shows use SFEMethod class use Voyager’s uniform user interface custom methods. load packages used: Voyager categorizes exploratory spatial data analysis (ESDA) methods number variables whether method gives one result entire dataset (global) gives results location (local). process create SFEMethod object mostly across categories, category specific arguments. Also, make SFEMethod object, see method interest already Voyager. methods can listed listSFEMethods() function. 
calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object string matches entry name column data frame returned listSFEMethods() Voyager search S4 object name matching string.","code":"library(Voyager) library(spdep) #> Loading required package: spData #> To access larger datasets in this package, install the spDataLarge #> package with: `install.packages('spDataLarge', #> repos='https://nowosad.github.io/drat/', type='source')` #> Loading required package: sf #> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"global","dir":"Articles","previous_headings":"Univariate","what":"Global","title":"Use your own spatial method in Voyager","text":"univariate global methods Voyager: code used create SFEMethod object run Moran’s , SFEMethod() constructor: package argument used check package installed method run. function run method fun argument. univariate methods use spatial neighborhood graph (use_graph = TRUE) must arguments: x vector input listw spatial neighborhood graph listw object, zero.policy cells spots don’t spatial neighbors. See spdep documentation (e.g. spdep::moran()) zero.policy argument behaves. case wrote think wrapper fill confusing arguments may confuse users. function running method another package different arguments, write thin wrapper make required arguments. Extra arguments can passed fun .... reorganize_fun argument takes function reorganize output fun form DataFrame results genes can added rowData(sfe). Moran’s , function univariate bivariate global methods, function must : argument take output fun multiple genes features name take name results stored case method run genes different parameters don’t want overwrite previous results. name name specified SFEMethod() constructor default, can set user calling calculate*variate() run*variate(), ... 
reorganize_fun univariate global methods Voyager, sp.correlogram, needs arguments. spatial methods use spatial distances rather graphs, variogram. code used create SFEMethod object variogram: function fun univariate methods don’t use spatial neighborhood graph must arguments x coords_df (sf data frame spatial coordinates) arguments allowed. .variogram function: rule reorganize_fun remains , .other2df function:","code":"listSFEMethods(\"uni\", \"global\") #> name description #> 1 moran Moran's I #> 2 geary Geary's C #> 3 moran.mc Moran's I with permutation testing #> 4 geary.mc Geary's C with permutation testing #> 5 sp.mantel.mc Mantel-Hubert spatial general cross product statistic #> 6 moran.test Moran's I test #> 7 geary.test Geary's C test #> 8 globalG.test Global G test #> 9 sp.correlogram Correlogram #> 10 variogram Variogram with model #> 11 variogram_map Variogram map moran <- SFEMethod( name = \"moran\", title = \"Moran's I\", package = \"spdep\", variate = \"uni\", scope = \"global\", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), use_graph = TRUE, reorganize_fun = .moran2df ) .moran2df <- function(out, name, ...) { rns <- names(out) out <- lapply(out, unlist, use.names = TRUE) out <- Reduce(rbind, out) if (!is.matrix(out)) out <- t(as.matrix(out)) rownames(out) <- rns out <- DataFrame(out) names(out)[1] <- name out } variogram <- SFEMethod(package = \"automap\", variate = \"uni\", scope = \"global\", default_attr = NA, name = \"variogram\", title = \"Variogram\", fun = .variogram, reorganize_fun = .other2df, use_graph = FALSE) .variogram <- function(x, coords_df, formula = x ~ 1, scale = TRUE, ...) { coords_df$x <- x if (scale) coords_df$x <- scale(coords_df$x) dots <- list(...) 
# Deal with alpha myself and fit a global variogram to avoid further gstat warnings have_alpha <- \"alpha\" %in% names(dots) if (have_alpha) { empirical <- gstat::variogram(formula, data = coords_df, alpha = dots$alpha) dots$alpha <- NULL } out <- do.call(automap::autofitVariogram, c(list(formula = formula, input_data = coords_df, map = FALSE, cloud = FALSE), dots)) if (have_alpha) { out$exp_var <- empirical } out } .other2df <- function(out, name, ...) { if (!is.atomic(out)) out <- I(out) out_df <- DataFrame(res = out) names(out_df) <- name rownames(out_df) <- names(out) out_df }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"local","dir":"Articles","previous_headings":"Univariate","what":"Local","title":"Use your own spatial method in Voyager","text":"univariate local methods Voyager: code used create SFEMethod object localmoran: spdep::localmoran already right arguments, including x, listw, zero.policy. local methods, title default_attr arguments important, used plotLocalResults() plot title. Many local methods return matrix data frame results gene, default_attr specifies column use default plotting, local Moran’s values (Ii) case. fields results can p-values adjusted p-values. reorganize_fun different univariate global methods local results organized differently. .localmoran2df function: function must arguments: results fun genes, list element results one gene. nb neighbor object class nb, part listw object spatial neighborhood graphs. used correct multiple hypothesis testing p.adjustSP() p.adjust.method specify method correct multiple testing. See p.adjust() available methods. 
output list organized results, element one gene, converted DataFrame added localResults(sfe).","code":"listSFEMethods(\"uni\", \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot localmoran <- SFEMethod( name = \"localmoran\", title = \"Local Moran's I\", package = \"spdep\", scope = \"local\", default_attr = \"Ii\", fun = spdep::localmoran, use_graph = TRUE, reorganize_fun = .localmoran2df ) .localmoran2df <- function(out, nb, p.adjust.method) { lapply(out, function(o) { o1 <- as.data.frame(o) quadr <- attr(o, \"quadr\") I(.add_log_p(cbind(o1, quadr), nb, p.adjust.method)) }) }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"bivariate","dir":"Articles","previous_headings":"","what":"Bivariate","title":"Use your own spatial method in Voyager","text":"bivariate global methods Voyager: bivariate local methods Voyager: SFEMethod construction bivariate methods similar univariate methods, except function fun must argument y x. code used create SFEMethod object lee, Lee’s L: Note use_matrix argument, specific bivariate methods. means whether method can take matrix argument compute statistic pairwise combinations matrix’s rows. way computation can expressed matrix operations much efficient R loops loops pushed underlying C Fortran code BLAS Matrix package sparse matrices. ’s .lee_mat function: Due matrix operation, listw can sparse dense adjacency matrix spatial neighborhood graph. conform scRNA-seq conventions, x y genes rows matrices. 
reorganize_fun bivariate global methods don’t return DataFrame, bivariate global results can’t stored SFE object. However, reorganize_fun bivariate local methods follow rules univariate local methods results also go localResults(sfe).","code":"listSFEMethods(\"bi\", \"global\") #> name description #> 1 lee Lee's bivariate statistic #> 2 lee.mc Lee's bivariate static with permutation testing #> 3 lee.test Lee's L test #> 4 cross_variogram Cross variogram #> 5 cross_variogram_map Cross variogram map listSFEMethods(\"bi\", \"local\") #> name description #> 1 locallee Local Lee's bivariate statistic #> 2 localmoran_bv Local bivariate Moran's I lee <- SFEMethod(name = \"lee\", fun = .lee_mat, title = \"Lee's bivariate statistic\", reorganize_fun = function(out, name, ...) out, package = \"Voyager\", variate = \"bi\", scope = \"global\", use_matrix = TRUE) .lee_mat <- function(x, y = NULL, listw, zero.policy = TRUE, ...) { # X has genes in rows if (is(listw, \"listw\")) W <- listw2sparse(listw) else W <- listw x <- .scale_n(x) if (!is.null(y)) { y <- .scale_n(y) } else y <- x n <- ncol(x) # dimension of y is checked in calculateBivariate out <- x %*% (t(W) %*% W) %*% t(y)/sum(rowSums(W)^2) * n if (all(dim(out) == 1L)) out <- out[1,1] out }"},{"path":"https://pachterlab.github.io/voyager/dev/articles/sfemethod.html","id":"multivariate","dir":"Articles","previous_headings":"","what":"Multivariate","title":"Use your own spatial method in Voyager","text":"multivariate methods Voyager: SFEMethod construction multivariate methods similar univariate methods, except two arguments: joint indicate whether makes sense run method multiple samples jointly just like non-spatial PCA, dest indicate whether results go reducedDims(sfe) colData(sfe). code multivariate generalization local Geary’s C (Anselin 2019) permutation testing: results, single vector, goes colData(sfe), doesn’t make sense run across multiple samples jointly since sample separate spatial neighborhood graph, run sample separately. 
function reorganize_fun return vector, matrix, data frame ready added reducedDims(sfe) colData(sfe). results can go colData, rules arguments univariate local methods, permutation testing multivariate local Geary’s C, multiple testing correction performed reorganize_fun. results go reducedDims, needs one argument output.","code":"listSFEMethods(\"multi\") #> name description #> 1 multispati MULTISPATI PCA #> 2 localC_multi Multivariate local Geary's C #> 3 localC_perm_multi Multivariate local Geary's C permutation testing .localC_multi_fun <- function(perm = FALSE) { function(x, listw, ..., zero.policy) { x <- as.matrix(x) fun <- if (perm) spdep::localC_perm else spdep::localC fun(x, listw = listw, zero.policy = zero.policy, ...) } } .localCpermmulti2df <- function(out, nb, p.adjust.method) { .attrmat2df(list(out), \"pseudo-p\", \"localC_perm_multi\", nb, p.adjust.method)[[1]] } localC_perm_multi <- SFEMethod( name = \"localC_perm_multi\", title = \"Multivariate local Geary's C permutation testing\", package = \"spdep\", variate = \"multi\", default_attr = \"localC\", fun = .localC_multi_fun(TRUE), reorganize_fun = .localCpermmulti2df, dest = \"colData\" )"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Slide-seqV2 Processing Workflows with Voyager","text":"Pros: Higher resolution Visium, beads 10 \\(\\mu\\)m diameter Transcriptome wide Recently commercialized Curio, commercial kit coming Cons: Still not single cell resolution, two cells can occupy bead Relatively low detection efficiency transcripts Existing datasets may not come with histology image","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"dowload-data-and-create-a-spatialfeatureexperiment-object","dir":"Articles","previous_headings":"Getting Started","what":"Download Data and Create a SpatialFeatureExperiment 
object","title":"Slide-seqV2 Processing Workflows with Voyager","text":"vignettes demonstrate convert sequencing data spatial transcriptomics experiment SpatialFeatureExperiment object R. Many technologies yet standardized output formats, vignettes provide examples generate SFE object various output file types.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/slideseqV2_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Slide-seqV2 Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using data generated Slide-seqV2 platform. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"SPLiT-seq Processing Workflows with Voyager","text":"Pros: Commercial kit Low cost Single well capture randomly primed polyT oligos library Cons: Fewer datasets available compared single cell technologies","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"SPLiT-seq Processing Workflows with Voyager","text":"vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix. 
process output various transcriptomics technologies SpatialFeatureExperiment (SFE) object use Voyager.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/splitseq_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"SPLiT-seq Processing Workflows with Voyager","text":"analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Google Colab notebooks linked.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Variogram","text":"geostatistical data, underlying spatial process sampled known locations. Kriging uses Gaussian process model interpolate values sample locations, semivariogram used model spatial dependency locations covariance Gaussian process. kriging, semivariogram can used exploratory data analysis tool find length scale anisotropy spatial autocorrelation. semivariogram defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Var}(X_t - X_0), \\] \\(X\\) value gene expression, \\(t\\) spatial vector. \\(X_0\\) value location interest, \\(X_t\\) value lagged \\(t\\). positive spatial autocorrelation, variance smaller among nearby values, variogram increase distance, eventually leveling distance beyond length scale spatial autocorrelation. “semi” comes 1/2, comes assumption Gaussian process weakly stationary, i.e. covariance two locations depends spatial lag : \\[\\begin{align} \\mathrm{Var}(X_{t_2} - X_{t_1}) &= \\mathrm{Var}(X_{t_2}) + \\mathrm{Var}(X_{t_1}) - 2\\mathrm{Cov}(X_{t_2}, X_{t_1}) \\\\ &= 2\\rho(0) - 2\\rho(t_2 - t_1), \\end{align}\\] \\(\\rho\\) covariance function \\(t_1\\) \\(t_2\\) spatial locations. model can fitted empirical semivariogram, model \\(\\rho\\). 
variance differences value across locations depends spatial lag means intrinsically stationary, even weaker generalizable weakly stationary. weaker assumption used kriging. vignette demonstrates variogram ESDA tool, including interpretation univariate variogram, anisotropic variograms (variograms different directions), variogram maps, bivariate cross variograms. load packages: Slide-seq melanoma metastasis data (Biermann et al. 2022) used demonstration. QC performed another vignette. Variograms demonstrated top highly variable genes (HVGs)","code":"library(Voyager) library(SFEData) library(SpatialFeatureExperiment) library(scater) library(scran) library(ggplot2) library(BiocParallel) library(bluster) library(dplyr) theme_set(theme_bw()) (sfe <- BiermannMelaMetasData(dataset = \"MBM05_rep1\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 27566 29536 #> metadata(0): #> assays(1): counts #> rownames(27566): A1BG A1BG-AS1 ... ZZZ3 snoZ196 #> rowData names(3): means vars cv2 #> colnames(29536): ACCACTCATTTCTC-1 GTTCANTCCACGTA-1 ... ACGCGCAATCGTAG-1 #> TTGTTCCGTTCATA-1 #> colData names(4): sample_id nCounts nGenes prop_mito #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : xcoord ycoord #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: sfe <- sfe[, colData(sfe)$prop_mito < 0.1] sfe <- sfe[rowSums(counts(sfe)) > 0,] sfe <- logNormCounts(sfe) dec <- modelGeneVar(sfe) hvgs <- getTopHVGs(dec, n = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"variogram","dir":"Articles","previous_headings":"","what":"Variogram","title":"Variogram","text":"user interface used run Moran’s can used compute variograms. 
However, since variogram uses spatial distances instead spatial neighborhood graph, colGraph need not be specified. Instead, colGeometry can specified, geometry POINT, spatialCoords(sfe) used compute distances. Behind scene, automap package used, fits number different variogram models empirical variogram chooses one fits best. automap package user friendly wrapper gstat, time honored package geostatistics. data binned distance spots variance computed bin. gstat’s plotting functions say “semivariance”, data scaled variance 1, think variance rather semivariance plotted. numbers points plot indicate number pairs spots bin. “Ste” means Matern model M. Stein’s parameterization fitted points. Nugget variance distance 0, variance within first distance bin. data scaled default prior variogram computation make variograms multiple genes comparable. Spatial autocorrelation makes variance smaller shorter distances. variogram levels off, means spatial autocorrelation no longer has effect distance. Sill variance variogram levels off. Range distance variogram levels off. first 4 genes, IGHG3 IGKC seem stronger spatial autocorrelation dissipate 100 200 units (whether it’s microns pixels unclear publication), whereas spatial autocorrelation B2M MT-RNR1 much weaker longer length scale. genes plotted space: length scales spatial autocorrelation genes quite obvious just plotting genes. What’s the point plotting variograms ESDA? can also compute variograms larger number genes cluster variograms patterns spatial autocorrelation length scales, compare variograms genes across different samples. cluster variograms top highly variable genes (HVGs): BLUSPARAM argument used specify methods clustering, implemented bluster package. use hierarchical clustering. plot clusters: seems many genes, like MT-RNR1, weak spatial autocorrelation longer length scales, genes stronger shorter range spatial autocorrelation (around 150 200 units) like IGKC, genes somewhat longer length scale spatial autocorrelation (around 400 units). 
Plot one gene cluster space: MT-RNR1 widely expressed. IGKC IGHG3 restricted smaller areas, IGHM restricted even smaller areas. Note genes variograms cluster don’t co-expressed; need similar length scales strengths spatial autocorrelation.","code":"sfe <- runUnivariate(sfe, \"variogram\", hvgs, BPPARAM = SnowParam(2), model = \"Ste\") #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogram(sfe, hvgs[1:4], name = \"variogram\") plotSpatialFeature(sfe, hvgs[1:4], size = 0.3) & theme_bw() # To show the length units clusts <- clusterVariograms(sfe, hvgs, BLUSPARAM = HclustParam()) plotVariogram(sfe, hvgs, color_by = clusts, group = \"feature\", use_lty = FALSE, show_np = FALSE) genes_clusts <- clusts |> group_by(cluster) |> slice_head(n = 1) |> pull(feature) plotSpatialFeature(sfe, genes_clusts, size = 0.3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"anisotropy","dir":"Articles","previous_headings":"","what":"Anisotropy","title":"Variogram","text":"Anisotropy means different different directions. example cerebral cortex, layered structure. variogram can computed different directions.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"anisotropic-variogram","dir":"Articles","previous_headings":"Anisotropy","what":"Anisotropic variogram","title":"Variogram","text":"directions compute variograms can explicitly specified, alpha argument. However, since gstat does not fit anisotropic variograms, model fitted directions empirical variograms angle plotted separately. compute anisotropic variograms 4 genes : line variogram model fitted directions text describes model. points show angles different colors. 
Zero degree points north (), angles go clockwise.","code":"sfe <- runUnivariate(sfe, \"variogram\", genes_clusts, alpha = c(0, 45, 90, 135), # To not to overwrite omnidirectional variogram results name = \"variogram_anis\", model = \"Ste\", BPPARAM = SnowParam(2)) #> gstat does not fit anisotropic variograms. Variogram model is fitted to the whole dataset. #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogram(sfe, genes_clusts, group = \"angle\", name = \"variogram_anis\", show_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"variogram-map","dir":"Articles","previous_headings":"Anisotropy","what":"Variogram map","title":"Variogram","text":"variogram map another way visualize spatial autocorrelation different directions. bins distances x distances y, grid distances variance computed. Just like variograms , origin usually low value, spatial autocorrelation reduces variance short distance, values increase increasing distance origin, can increase quickly directions others. compute variogram maps 4 genes : width argument width bins, cutoff maximum distance.","code":"sfe <- runUnivariate(sfe, \"variogram_map\", genes_clusts, width = 100, cutoff = 800, BPPARAM = SnowParam(2), name = \"variogram_map2\") #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' plotVariogramMap(sfe, genes_clusts, name = \"variogram_map2\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"cross-variogram","dir":"Articles","previous_headings":"","what":"Cross variogram","title":"Variogram","text":"cross variogram used cokriging, uses multiple variables spatial interpolation model. cross variogram defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Cov}(X_t - X_0, Y_t - Y_0), \\] \\(Y\\) another variable. cross variogram also nugget, sill, range. shows covariance two variables changes distance. Voyager supports multiple bivariate spatial methods, cross variogram one . 
Just like univariate spatial methods, Voyager provides uniform user interface bivariate methods. However, bivariate global methods can’t be stored SFE object present tend different formats outputs (e.g. correlation matrix Lee’s L list methods) may not be straightforward store SFE object. facets shown matrix, whose diagonal variogram gene, off-diagonal entries cross variograms. IGKC IGHG3, length scale covariance similar spatial autocorrelation. also cross variogram map show cross variogram different directions:","code":"cross_v <- calculateBivariate(sfe, \"cross_variogram\", feature1 = \"IGKC\", feature2 = \"IGHG3\") plotCrossVariogram(cross_v, show_np = FALSE) cross_v_map <- calculateBivariate(sfe, \"cross_variogram_map\", feature1 = \"IGKC\", feature2 = \"IGHG3\", width = 100, cutoff = 800) plotCrossVariogramMap(cross_v_map)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/variogram.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Variogram","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] dplyr_1.1.4 bluster_1.12.0 #> [3] BiocParallel_1.36.0 scran_1.30.2 #> [5] scater_1.30.1 ggplot2_3.5.1 #> [7] scuttle_1.12.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 
MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [19] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] xts_0.13.2 lifecycle_1.0.4 #> [7] sf_1.0-16 edgeR_4.0.16 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] RColorBrewer_1.1-3 DBI_1.2.2 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 units_0.8-5 #> [31] RSpectra_0.16-1 dqrng_0.3.2 #> [33] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [35] codetools_0.2-20 DelayedArray_0.28.0 #> [37] gstat_2.1-1 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 snow_0.4-4 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] xfun_0.43 HDF5Array_1.30.1 #> [57] withr_3.0.0 BiocManager_1.30.22 #> [59] fastmap_1.1.1 ggh4x_0.2.8 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 spData_2.3.0 #> [65] digest_0.6.35 rsvd_1.0.5 #> [67] R6_2.5.1 mime_0.12 #> [69] textshaping_0.3.7 colorspace_2.1-0 #> [71] wk_0.9.1 RSQLite_2.3.6 #> [73] intervals_0.15.4 utf8_1.2.4 #> [75] generics_0.1.3 FNN_1.1.4 #> [77] class_7.3-22 httr_1.4.7 #> [79] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [81] spdep_1.3-3 pkgconfig_2.0.3 #> [83] scico_1.5.0 gtable_0.3.5 #> [85] blob_1.2.4 XVector_0.42.0 #> [87] htmltools_0.5.8.1 automap_1.1-9 #> [89] scales_1.3.0 png_0.1-8 #> [91] SpatialExperiment_1.12.0 knitr_1.45 #> [93] rjson_0.2.21 spacetime_1.3-1 #> [95] curl_5.2.1 proxy_0.4-27 #> [97] cachem_1.0.8 zoo_1.8-12 #> [99] rhdf5_2.46.1 BiocVersion_3.18.1 #> [101] 
KernSmooth_2.23-22 parallel_4.3.3 #> [103] vipor_0.4.7 AnnotationDbi_1.64.1 #> [105] desc_1.4.3 s2_1.1.6 #> [107] reshape_0.8.9 pillar_1.9.0 #> [109] grid_4.3.3 vctrs_0.6.5 #> [111] promises_1.3.0 BiocSingular_1.18.0 #> [113] dbplyr_2.5.0 beachmat_2.18.1 #> [115] xtable_1.8-4 cluster_2.1.6 #> [117] beeswarm_0.4.0 evaluate_0.23 #> [119] magick_2.8.3 cli_3.6.2 #> [121] locfit_1.5-9.9 compiler_4.3.3 #> [123] rlang_1.1.3 crayon_1.5.2 #> [125] labeling_0.4.3 classInt_0.4-10 #> [127] plyr_1.8.9 fs_1.6.4 #> [129] ggbeeswarm_0.7.2 viridisLite_0.4.2 #> [131] deldir_2.0-4 stars_0.6-5 #> [133] munsell_0.5.1 Biostrings_2.70.3 #> [135] Matrix_1.6-5 ExperimentHub_2.10.0 #> [137] patchwork_1.2.0 sparseMatrixStats_1.14.0 #> [139] bit64_4.0.5 Rhdf5lib_1.24.2 #> [141] KEGGREST_1.42.0 statmod_1.5.0 #> [143] shiny_1.8.1.1 highr_0.10 #> [145] interactiveDisplayBase_1.40.0 AnnotationHub_3.10.1 #> [147] igraph_2.0.3 memoise_2.0.1 #> [149] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig10_10x_nuclei.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Chromium nuclei isolation basic quality control","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10x_nuclei.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10x_nuclei.rds\", destfile = \"10x_nuclei.rds\") sce <- readRDS(\"10x_nuclei.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 37 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") #> Warning: Removed 2931 rows containing non-finite outside the scale range #> (`stat_ydensity()`). #> Warning: Removed 2931 rows containing missing values or values outside the scale range #> (`position_quasirandom()`). plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$subsets_mito_percent < 20)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 5260 9091 #> metadata(0): #> assays(1): counts #> rownames(5260): ENSG00000142611.17 ENSG00000142655.13 ... #> ENSG00000225685.2 ENSG00000291031.1 #> rowData names(1): gene_name #> colnames(9091): AAACCCAAGACCATAA AAACCCAAGGTTTGAA ... 
TTTGTTGTCATCTGTT #> TTTGTTGTCCTCCACA #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] Matrix_1.6-5 stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] abind_1.4-5 compiler_4.3.3 #> [51] proxy_0.4-27 withr_3.0.0 #> [53] BiocParallel_1.36.0 viridis_0.6.5 #> [55] DBI_1.2.2 highr_0.10 #> [57] HDF5Array_1.30.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] bluster_1.12.0 tools_4.3.3 #> [63] units_0.8-5 vipor_0.4.7 #> [65] beeswarm_0.4.0 glue_1.7.0 #> [67] rhdf5filters_1.14.1 grid_4.3.3 #> [69] sf_1.0-16 cluster_2.1.6 #> [71] generics_0.1.3 gtable_0.3.5 #> [73] class_7.3-22 BiocSingular_1.18.0 #> [75] ScaledMatrix_1.10.0 sp_2.1-4 #> [77] utf8_1.2.4 XVector_0.42.0 #> [79] ggrepel_0.9.5 pillar_1.9.0 #> [81] limma_3.58.1 dplyr_1.1.4 #> [83] lattice_0.22-6 deldir_2.0-4 #> [85] tidyselect_1.2.1 locfit_1.5-9.9 #> [87] knitr_1.45 gridExtra_2.3 #> [89] edgeR_4.0.16 xfun_0.43 #> [91] statmod_1.5.0 stringi_1.8.3 #> [93] yaml_2.3.8 boot_1.3-30 #> [95] evaluate_0.23 codetools_0.2-20 #> [97] tibble_3.2.1 cli_3.6.2 #> [99] systemfonts_1.0.6 munsell_0.5.1 #> [101] jquerylib_0.1.4 Rcpp_1.0.12 #> [103] parallel_4.3.3 pkgdown_2.0.9 #> [105] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [107] viridisLite_0.4.2 scales_1.3.0 #> [109] e1071_1.7-14 purrr_1.0.2 #> [111] crayon_1.5.2 scico_1.5.0 #> [113] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig11_clicktags.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic quality control on scRNA-seq data with ClickTag barcodes","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"clicktags.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/clicktags.rds\", destfile = \"clicktags.rds\") sce <- readRDS(\"clicktags.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. bcrank <- barcodeRanks(counts(sce)) knee <- metadata(bcrank)$knee inflection <- metadata(bcrank)$inflection plot(bcrank$rank, bcrank$total, log=\"xy\", xlab=\"Rank\", ylab=\"Total ClickTags count\", cex.lab=1.2) #> Warning in xy.coords(x, y, xlabel, ylabel, log): 1 y value <= 0 omitted from #> logarithmic plot abline(h=inflection, col=\"darkgreen\", lty=2) abline(h=knee, col=\"dodgerblue\", lty=2) sce <- sce[, colSums(counts(sce)) > inflection] sce #> class: SingleCellExperiment #> dim: 20 3368 #> metadata(0): #> assays(1): counts #> rownames(20): ClickTag1 ClickTag2 ... ClickTag19 ClickTag20 #> rowData names(1): feature_name #> colnames(3368): AAACCTGCAAACTGCT AAACCTGGTAGCTTGT ... 
TTTGTCAGTCACCCAG #> TTTGTCATCTCTTATG #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig12_crispr.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Quality control on Chromium CRISPR Guide Capture libraries","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(Matrix) library(DropletUtils) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xcrispr.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xcrispr.rds\", destfile = \"10xcrispr.rds\") sce <- readRDS(\"10xcrispr.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 0 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. #> Warning: Computation failed in `stat_bin2d()`. #> Caused by error in `bin2d_breaks()`: #> ! `origin` must be a number, not `NaN`. 
bcrank <- barcodeRanks(counts(sce)) knee <- metadata(bcrank)$knee inflection <- metadata(bcrank)$inflection plot(bcrank$rank, bcrank$total, log=\"xy\", xlab=\"Rank\", ylab=\"Total ClickTags count\", cex.lab=1.2) #> Warning in xy.coords(x, y, xlabel, ylabel, log): 3 y values <= 0 omitted from #> logarithmic plot abline(h=inflection, col=\"darkgreen\", lty=2) abline(h=knee, col=\"dodgerblue\", lty=2) sce <- sce[, which(sce$total > inflection)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 89 293 #> metadata(0): #> assays(1): counts #> rownames(89): Non-Targeting-5 Non-Targeting-7 ... HDAC1-1 HDAC1-2 #> rowData names(1): feature_name #> colnames(293): AAAGAACAGAAACGAA AAAGAACGTTTGTCGA ... TTTGATCCAGGAGAAA #> TTTGATCGTGGTAGTG #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] DropletUtils_1.22.0 SingleCellExperiment_1.24.0 #> [9] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [11] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [13] IRanges_2.36.0 S4Vectors_0.40.2 #> [15] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [17] matrixStats_1.3.0 Matrix_1.6-5 #> [19] stringr_1.5.1 #> #> loaded via a namespace 
(and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> 
[109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig13_10xatac.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"10X ATAC-seq basic quality control ","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xatac.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xatac.rds\", destfile = \"10xatac.rds\") sce <- readRDS(\"10xatac.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$total > 0)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 209 166 #> metadata(0): #> assays(1): counts #> rownames(209): 1:9410718-9410885 1:14968574-14969617 ... #> X:119775524-119775794 X:154317937-154318131 #> rowData names(0): #> colnames(166): AAACTCGCATTCTCGC AAAGGGCGTTGGCTTA ... 
TTGTCTACAGGTCCTG #> TTTGTGTCATCGTACA #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig14_10xmultiome.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"10X Multiome ATAC-seq basic quality control ","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. 
begin loading object converting SingleCellExperiment object.","code":"library(stringr) library(DropletUtils) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) library(ggplot2) theme_set(theme_bw()) if (!file.exists(\"10xmultiome.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/10xmultiome.rds\", destfile = \"10xmultiome.rds\") sce <- readRDS(\"10xmultiome.rds\") sce <- addPerCellQCMetrics(sce) names(colData(sce)) #> [1] \"sum\" \"detected\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$total > 0)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 277 198 #> metadata(0): #> assays(1): counts #> rownames(277): 1:39574808-39575296 1:43131572-43131673 ... #> X:152281428-152281521 X:166010316-166010375 #> rowData names(0): #> colnames(198): AAACGGTTCATTAGCT AAACGGTTCCGAAACG ... 
TTTCAAGGTACTAACC #> TTTCCGGCATTAGCAG #> colData names(3): sum detected total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] Matrix_1.6-5 DropletUtils_1.22.0 #> [9] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [11] Biobase_2.62.0 GenomicRanges_1.54.1 #> [13] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [15] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [17] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [19] stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 
colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 HDF5Array_1.30.1 #> [59] R.utils_2.12.3 DelayedArray_0.28.0 #> [61] rjson_0.2.21 classInt_0.4-10 #> [63] bluster_1.12.0 tools_4.3.3 #> [65] units_0.8-5 vipor_0.4.7 #> [67] beeswarm_0.4.0 R.oo_1.26.0 #> [69] glue_1.7.0 rhdf5filters_1.14.1 #> [71] grid_4.3.3 sf_1.0-16 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 sp_2.1-4 #> [81] utf8_1.2.4 XVector_0.42.0 #> [83] ggrepel_0.9.5 pillar_1.9.0 #> [85] limma_3.58.1 dplyr_1.1.4 #> [87] lattice_0.22-6 deldir_2.0-4 #> [89] tidyselect_1.2.1 locfit_1.5-9.9 #> [91] knitr_1.45 gridExtra_2.3 #> [93] edgeR_4.0.16 xfun_0.43 #> [95] statmod_1.5.0 stringi_1.8.3 #> [97] yaml_2.3.8 boot_1.3-30 #> [99] evaluate_0.23 codetools_0.2-20 #> [101] tibble_3.2.1 cli_3.6.2 #> [103] systemfonts_1.0.6 munsell_0.5.1 #> [105] jquerylib_0.1.4 Rcpp_1.0.12 #> [107] parallel_4.3.3 pkgdown_2.0.9 #> [109] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [111] viridisLite_0.4.2 scales_1.3.0 #> [113] e1071_1.7-14 purrr_1.0.2 #> [115] crayon_1.5.2 scico_1.5.0 #> [117] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic Visium exploratory data analysis","text":"introductory vignette SpatialFeatureExperiment data representation Voyager analysis package, demonstrate basic exploratory data analysis (EDA) spatial transcriptomics data. Basic knowledge R SingleCellExperiment assumed. vignette showcases packages Visium spatial gene expression system dataset. 
technology chosen due popularity, therefore availability numerous publicly available datasets analysis (Moses Pachter 2022). Voyager developed goal facilitating use geospatial methods spatial genomics, introductory vignette restricted non-spatial scRNA-seq EDA Visium dataset. vignette illustrating univariate spatial analysis dataset, see advanced exploratory spatial data analysis vignette dataset. load packages used vignette.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(scran) library(patchwork) library(bluster) library(SFEData) library(BiocParallel) library(stringr) library(ggplot2) library(sparseMatrixStats) library(dplyr) library(reticulate) library(concordexR) library(BiocNeighbors) theme_set(theme_bw(10)) # Specify Python version to use gget PY_PATH <- Sys.which(\"python\") use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function gget <- import(\"gget\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"mouse-skeletal-muscle-dataset","dir":"Articles","previous_headings":"","what":"Mouse skeletal muscle dataset","title":"Basic Visium exploratory data analysis","text":"dataset used vignette paper Large-scale integration single-cell transcriptomic data captures transitional progenitor states mouse skeletal muscle regeneration (McKellar et al. 2021). 
Notexin injected tibialis anterior muscle mice induce injury, healing muscle collected 2, 5, 7 days post injury Visium analysis. dataset vignette timepoint day 2. vignette starts SpatialFeatureExperiment (SFE) object. gene count matrix directly downloaded GEO. 4992 spots, whether tissue , included. tissue boundary found thresholding H&E image OpenCV, small polygons removed likely debris. Spot polygons constructed spot centroid coordinates diameter Space Ranger output. in_tissue column colData indicates spot polygons intersect tissue polygons, based st_intersects(). Tissue boundary, nuclei, myofiber, Visium spot polygons stored sf data frames SFE object. Visium spot polygons called “spotPoly” SFE object. SpatialFeatureExperiment package convenience wrappers get set common types geometries, including spotPoly() Visium (technologies relevant) spot polygons, cellSeg() cell segmentation, nucSeg() nuclei segmentation, centroids() cell centroids. Behind scene specially named sf data frames. See vignette SpatialFeatureExperiment details structure SFE object. SFE object dataset provided SFEData package; begin downloading data loading R. authors provided full resolution hematoxylin eosin (H&E) image GEO, downsized facilitate display: image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"(sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"spots","dir":"Articles","previous_headings":"Quality control","what":"Spots","title":"Basic Visium exploratory data analysis","text":"begin quality control (QC) plotting various metrics violin plots space. QC metrics pre-computed stored colData (spots) rowData SFE object. plot total unique molecular identifier (UMI) counts per spot. commented line code shows compute total UMI counts. maxcell argument maximum number pixels plot image; image downsampled pixels maxcells. can speed plotting plotting image multiple facets. spots injury site leukocyte infiltration high total counts. Spatial autocorrelation total counts apparent, discussed later section vignette. Next find number genes detected per spot. commented line code shows find number genes detected. commonly done scRNA-seq data, plot nCounts vs. nGenes plot two branches spots tissue, turn related myofiber size. See exploratory spatial data analysis (ESDA) Visium vignette. commonly done scRNA-seq data, plot proportion mitochondrially encoded counts. 
commented code shows find proportion: expected, spots outside tissue higher proportion mitochondrial counts, tissue lysed, mitochondrial transcripts less likely degrade cytosolic transcripts protected double membrane. However, spots myofibers also high proportion mitochondrial counts, function myofibers. injury site leukocyte infiltration lower proportion mitochondrial counts. see relationship proportion mitochondrial counts total UMI counts, plot commonly done scRNA-seq analysis identify low quality cells, .e. cells UMI counts high proportion mitochondrial counts. two clusters spots tissue, also turn related myofiber size. See ESDA Visium vignette. far haven’t seen spots obvious outliers QC metrics. following analyses use spots tissue, selected follows:","code":"names(colData(sfe)) #> [1] \"barcode\" \"col\" \"row\" \"x\" \"y\" \"dia\" #> [7] \"tissue\" \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" \"in_tissue\" # colData(sfe)$nCounts <- colSums(counts(sfe)) violin <- plotColData(sfe, \"nCounts\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial # colData(sfe)$nGenes <- colSums(counts(sfe) > 0) violin <- plotColData(sfe, \"nGenes\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial plotColData(sfe, x = \"nCounts\", y = \"nGenes\", colour_by = \"in_tissue\") # mito_ind <- str_detect(rowData(sfe)$symbol, \"^Mt-\") # colData(sfe)$prop_mito <- colSums(counts(sfe)[mito_ind,]) / colData(sfe)$nCounts violin <- plotColData(sfe, 
\"prop_mito\", x = \"in_tissue\", colour_by = \"in_tissue\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"prop_mito\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", image = \"lowres\", maxcell = 5e4, annot_fixed = list(fill = NA, color = \"black\")) + theme_void() violin + spatial plotColData(sfe, x = \"nCounts\", y = \"prop_mito\", colour_by = \"in_tissue\") sfe_tissue <- sfe[, colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"Basic Visium exploratory data analysis","text":"scRNA-seq, gene expression variance Visium measurements overdispersed compared variance counts Poisson distributed. understand mean-variance relationship, compute mean, variance, coefficient variance (CV2) gene among spots tissue: avoid overplotting better show point density plot, use 2D histogram. color bin indicates number points bin. red line, \\(y = x\\) expected Poisson distributed data, find variance higher highly expressed genes expected Poisson distributed counts. coefficient variation shows .","code":"rowData(sfe_tissue)$means <- rowMeans(counts(sfe_tissue)) rowData(sfe_tissue)$vars <- rowVars(counts(sfe_tissue)) # Coefficient of variance rowData(sfe_tissue)$cv2 <- rowData(sfe_tissue)$vars/rowData(sfe_tissue)$means^2 plotRowData(sfe, x = \"means\", y = \"vars\", bins = 50) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + scale_fill_distiller(palette = \"Blues\", direction = 1) + annotation_logticks() + coord_equal() #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
plotRowData(sfe, x = \"means\", y = \"cv2\", bins = 50) + geom_abline(slope = -1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + scale_fill_distiller(palette = \"Blues\", direction = 1) + annotation_logticks() + coord_equal() #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale."},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"normalize-data","dir":"Articles","previous_headings":"","what":"Normalize data","title":"Basic Visium exploratory data analysis","text":"demonstrate use scater normalization , although note necessarily best approach normalizing spatial transcriptomics data. problem normalize spatial transcriptomics data non-trivial , nCounts plot space shows , spatial autocorrelation evident. Furthermore, Visium, reverse transcription occurs situ spots, PCR amplification occurs cDNA dissociated spots. Artifacts may subsequently introduced amplification step, associated spatial origin. Spatial artifacts may arise diffusion transcripts tissue permeabilization. However, given total counts seem correspond histological regions, total counts may biological component hence treated technical artifact normalized away scRNA-seq data normalization methods. words, issue normalization spatial transcriptomics data, Visium particular, complex currently unsolved. one way normalize non-spatial scRNA-seq data. commented code implements scran method (Lun, Bach, Marioni 2016). simplify matter, perform logNormCounts() introductory vignette. Note scater’s logNormCounts() quite different Seurat. Let \\(N\\) denote total UMI count one Visium spot, \\(\\bar N\\) average total UMI count spots dataset, \\(x\\) denote UMI count one gene Visium spot interest. Seurat performs log normalization \\(\\mathrm{log}\\left( \\frac{x}{N/10000} + 1 \\right)\\), natural log used. 
contrast, default parameters, scater uses \\(\\mathrm{log_2}\\left( \\frac{x}{N/\\bar N} + 1 \\right)\\). pseudocount (default 1), library size factors (default \\(N/\\bar N\\)), transform (default log2) can changed. Log 2 used differences values can interpreted log fold change. Next, identify highly variable genes (HVGs), used principal component analysis (PCA) dimensionality reduction. , different ways identify HVGs, scater differently Seurat. frameworks, log normalized data used default. summary, Seurat, default parameters, Loess curve fitted log transformed data (log normalized data log transformed fitting purposes), fitted values exponentiated expected variance gene. expected variance mean used standardize log normalized gene expression; standardized values used calculate standardized variance gene. top HVGs genes largest standardized variance. scater, default parameters, parametric non-linear curve variance vs. mean gene log normalized data. log ratio actual variance fitted variance curve calculated, Loess curve fitted log ratio vs. mean gene. “technical” component variance fitted values Loess curve. “biological” component difference actual variance Loess fitted variance. top HVGs genes largest biological component. See documentation modelGeneVar(), fitTrendVar(), getTopHVGs() details. differences can lead different downstream results. 
don’t comment way better vignette, ’s important aware differences.","code":"# clusters <- quickCluster(sfe_tissue) # sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) # sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension reduction and clustering","title":"Basic Visium exploratory data analysis","text":"clustering show dimension reduction plots principal components (PCs) can plotted space. Due spatial autocorrelation many genes spatial regions different histological characters, even though spatial information used PCA procedure, PCs may show spatial structure. PC1, explains far variance PC2, separates injury site leukocytes myofibers close site Visium myofibers. PC2 highlights center injury site myofibers near edge. PC3 highlights muscle tendon junctions. PC4 seem informative; might picked outlier. also possible run UMAP following PCA, done scRNA-seq. recommend producing UMAP since procedure distorts distances, respect either local global structure data (Chari, Banerjee, Pachter 2021). However, completeness, show compute UMAP : UMAP often used visualize clusters. alternative UMAP concordex, quantitatively shows proportion neighbors k nearest neighbor graph cluster label. consistent default igraph Leiden clustering, use k = 10. cluster labels permuted estimate null distribution, observed values can compared simulated values: observed value much higher simulated values, indicating good clustering. single number average clusters. Values different clusters can plotted heatmap: diagonal represents proportion neighbors cells cluster cluster. diagonal entries low indicate good clustering. 
interesting spatial transcriptomics, locate clusters space, can done follows: spatial information explicitly used clustering, due spatial autocorrelation gene expression histological regions, clusters spatially contiguous. many methods find spatially informed clusters, BayesSpace (E. Zhao et al. 2021), Bioconductor. Remark spatial regions: geographical space, usually one single way define spatial regions. example, influenced sociology geology, LA county can partitioned regions Eastside, Westside, South Central, San Fernando Valley, San Gabriel Valley, Pomona Valley, Gateway Cities, South Bay, etc., containing multiple smaller cities parts LA City, can divided many neighborhoods, Koreatown, Highland Park, Lincoln Heights, etc. Definitions regions subject dispute. Meanwhile, LA county can also partitioned watersheds LA River, San Gabriel River, Ballona Creek, etc., well different rock formations. kind spatial region resolution relevant depends question asked. also gray areas spatial regions. example, Whittier Narrows dam intercepts San Gabriel River Rio Hondo (large tributary LA River), whether dam area belongs watershed San Gabriel River LA River unclear. Similarly, spatial transcriptomics, methods identifying spatial regions currently generally aim give one result, multiple results different resolutions depending question asked may relevant. Furthermore, methods spatial region demarcation used spatial -omics ideally provide uncertainty assessments assignment cells Visium spots. existing geospatial method accounts uncertainty geocmeans (F. Zhao, Jiao, Liu 2013), CRAN. geographical histological space, conflicting views spatial variation. one hand, methods identify spatially variable genes SpatialDE often assume gene expression vary smoothly continuously space. hand, methods identifying spatial regions attempt identify discrete regions. continuous variation features might definitions geographical neighborhoods often subject dispute. 
existing methods attempt harmonize two views. example, spatially variable gene method belayer (Ma et al. 2022) takes discrete tissue layers account.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat ElbowPlot(sfe_tissue, ndims = 30) plotDimLoadings(sfe_tissue, dims = 1:4, swap_rownames = \"symbol\") colData(sfe_tissue)$cluster <- clusterRows(reducedDim(sfe_tissue, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotPCA(sfe_tissue, ncomponents = 3, colour_by = \"cluster\") spatialReducedDim(sfe_tissue, \"PCA\", ncomponents = 4, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image = \"lowres\", maxcell = 5e4) set.seed(29) sfe_tissue <- runUMAP(sfe_tissue, dimred = \"PCA\", n_dimred = 3) plotUMAP(sfe_tissue, colour_by = \"cluster\") g <- findKNN(reducedDim(sfe_tissue, \"PCA\")[,1:3], k = 10) res <- calculateConcordex(g$index, labels = sfe_tissue$cluster, k = 10, return.map = TRUE) plotConcordexSim(res) heatConcordex(res, angle_col = 0, cluster_rows = FALSE, cluster_cols = FALSE) plotSpatialFeature(sfe_tissue, \"cluster\", colGeometryName = \"spotPoly\", image = \"lowres\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"non-spatial-differential-expression","dir":"Articles","previous_headings":"","what":"Non-spatial differential expression","title":"Basic Visium exploratory data analysis","text":"Cluster marker genes can found using differential analysis methods commonly done scRNA-seq. example Wilcoxon rank sum test: result sorted p-values: can use gget enrichr module gget package perform gene enrichment analysis. can choose >200 enrichment databases listed Enrichr website. 
, analyzing top 20 genes cluster 1 using default ontology database GO_Biological_Process_2021: Significant markers cluster can obtained follows: ’ll use module gget info get additional information genes, descriptions, synonyms, transcripts collection reference databases including Ensembl, UniProt, NCBI. , showing gene descriptions NCBI: genes interesting view spatial context:","code":"markers <- findMarkers(sfe_tissue, groups = colData(sfe_tissue)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[1]] #> DataFrame with 15043 rows and 8 columns #> p.value FDR summary.AUC AUC.2 AUC.3 #> #> ENSMUSG00000051747 5.28233e-10 5.46992e-06 0.686982 0.686982 0.909644 #> ENSMUSG00000064360 7.27238e-10 5.46992e-06 0.685410 0.685410 0.978921 #> ENSMUSG00000019787 2.37596e-09 1.19139e-05 0.693069 0.706395 0.812064 #> ENSMUSG00000030730 4.06144e-09 1.52740e-05 0.676723 0.676723 0.898037 #> ENSMUSG00000064341 5.90896e-07 1.77777e-03 0.648920 0.648920 0.969773 #> ... ... ... ... ... ... #> ENSMUSG00000087095 1 1 0.5 0.497525 0.5 #> ENSMUSG00000043969 1 1 0.5 0.500000 0.5 #> ENSMUSG00000091378 1 1 0.5 0.500000 0.5 #> ENSMUSG00000072437 1 1 0.5 0.500000 0.5 #> ENSMUSG00000094649 1 1 0.5 0.497525 0.5 #> AUC.4 AUC.5 AUC.6 #> #> ENSMUSG00000051747 0.971508 0.808830 0.857509 #> ENSMUSG00000064360 0.993854 0.751129 0.889987 #> ENSMUSG00000019787 0.913713 0.693069 0.809057 #> ENSMUSG00000030730 0.976749 0.796210 0.810742 #> ENSMUSG00000064341 0.989804 0.785362 0.909585 #> ... ... ... ... 
#> ENSMUSG00000087095 0.496212 0.496644 0.5 #> ENSMUSG00000043969 0.496212 0.500000 0.5 #> ENSMUSG00000091378 0.496212 0.500000 0.5 #> ENSMUSG00000072437 0.492424 0.500000 0.5 #> ENSMUSG00000094649 0.500000 0.500000 0.5 enrichr_genes <- rownames(markers[[1]])[1:20] gget_e <- gget$enrichr(enrichr_genes, ensembl=TRUE, database = \"ontology\") # Plot results of gene enrichment analysis # Count number of overlapping genes gget_e$overlapping_genes_count <- lapply(gget_e$overlapping_genes, length) |> as.numeric() # Only keep the top 10 results gget_e <- gget_e[1:10,] gget_e |> ggplot() + geom_bar(aes( x = -log10(adj_p_val), y = reorder(path_name, -adj_p_val) ), stat = \"identity\", fill = \"lightgrey\", width = 0.5, color = \"black\") + geom_text( aes( y = path_name, x = (-log10(adj_p_val)), label = overlapping_genes_count ), nudge_x = 0.25, show.legend = NA, color = \"red\" ) + geom_text( aes( y = Inf, x = Inf, hjust = 1, vjust = 1, label = \"# of overlapping genes\" ), show.legend = NA, size=4, color = \"red\" ) + geom_vline(linetype = \"dashed\", linewidth = 0.5, xintercept = -log10(0.05)) + ylab(\"Pathway name\") + xlab(\"-log10(adjusted P value)\") #> Warning in geom_text(aes(y = Inf, x = Inf, hjust = 1, vjust = 1, label = \"# of overlapping genes\"), : All aesthetics have length 1, but the data has 10 rows. #> ℹ Please consider using `annotate()` or provide this layer with data containing #> a single row. genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe_tissue, rowData(sfe_tissue)[genes_use, \"symbol\"], x = \"cluster\", colour_by = \"cluster\", swap_rownames = \"symbol\") gget_info <- gget$info(genes_use) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> Ttn Enables ankyrin binding activity. Involved in regulation of relaxation of cardiac muscle. 
Acts upstream of or within several processes, including chordate embryonic development; forward locomotion; and heart development. Located in M band and Z disc. Is expressed in several structures, including diaphragm; embryo mesenchyme; heart; musculature; and tarsus. Used to study autosomal recessive limb-girdle muscular dystrophy type 2J; dilated cardiomyopathy 1G; and tibial muscular dystrophy. Human ortholog(s) of this gene implicated in intrinsic cardiomyopathy (multiple) and myopathy (multiple). Orthologous to human TTN (titin). [provided by Alliance of Genome Resources, Apr 2022] #> Gapdh This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The encoded protein was originally identified as a key glycolytic enzyme that converts D-glyceraldehyde 3-phosphate (G3P) into 3-phospho-D-glyceroyl phosphate. Subsequent studies have assigned a variety of additional functions to the protein including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Alternative splicing results in multiple transcript variants. Many pseudogenes similar to this locus are found throughout the mouse genome. [provided by RefSeq, Jan 2014] #> Hsp90ab1 Enables protein folding chaperone; protein kinase binding activity; and tau protein binding activity. Contributes to protein kinase regulator activity. Involved in several processes, including axonogenesis; positive regulation of protein kinase B signaling; and regulation of cellular protein metabolic process. Acts upstream of or within cellular response to interleukin-4; negative regulation of apoptotic process; and placenta development. Located in growth cone; neuronal cell body; and perinuclear region of cytoplasm. Part of HSP90-CDC37 chaperone complex. 
Is expressed in several structures, including branchial arch; central nervous system; eye; limb; and placenta. Human ortholog(s) of this gene implicated in multiple sclerosis. Orthologous to human HSP90AB1 (heat shock protein 90 alpha family class B member 1). [provided by Alliance of Genome Resources, Apr 2022] #> Tmsb4x Predicted to enable actin monomer binding activity and enzyme binding activity. Acts upstream of or within regulation of cell migration. Located in cytosol and nucleus. Is expressed in several structures, including alimentary system; central nervous system; eye; heart; and somite. Orthologous to several human genes including TMSB4X (thymosin beta 4 X-linked). [provided by Alliance of Genome Resources, Apr 2022] #> Des This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane and are essential for maintaining the strength and integrity of skeletal, cardiac and smooth muscle fibers. Mutations in this gene affect assembly of intermediate filaments. Mice lacking this gene are able to develop and reproduce but exhibit abnormal muscle fibers. Mutations in the human gene are associated with myofibrillar myopathy, dilated cardiomyopathy, neurogenic scapuloperoneal syndrome and autosomal recessive limb-girdle muscular dystrophy, type 2R. [provided by RefSeq, Jan 2014] plotSpatialFeature(sfe_tissue, genes_use, colGeometryName = \"spotPoly\", ncol = 3, image = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"Basic Visium exploratory data analysis","text":"Tobler’s first law geography (Tobler 1970) states Everything related everything else. near things related distant things. observation motivates examination spatial autocorrelation. 
Positive spatial autocorrelation evident nearby things tend similar, weather Pasadena downtown Los Angeles (opposed weather Pasadena San Francisco). Negative spatial autocorrelation evident nearby things tend dissimilar, like squares chessboard. Spatial autocorrelation can arise intrinsic process diffusion communication physical contact, result covariate intrinsic process, areal data, areal units observation smaller scale spatial process. commonly used measure spatial autocorrelation Moran’s I (Moran 1950), defined \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\] \\(n\\) number spots locations, \\(i\\) \\(j\\) different locations, spots Visium context, \\(x\\) variable values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood whether normalize number neighbors. spdep package uses neighborhood. Moran’s I similar Pearson correlation value location average value neighbors (identical, see (Lee 2001)). Just like Pearson correlation, Moran’s I generally bound -1 1, positive value indicates positive spatial autocorrelation negative value indicates negative spatial autocorrelation. Spatial dependence analysis spdep requires spatial neighborhood graph. graph adjacent Visium spot can found mentioned spatial autocorrelation apparent total UMI counts. ’s Moran’s I shows: K means kurtosis. 
positive values Moran’s I indicate positive spatial autocorrelation.","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) calculateMoransI(t(colData(sfe_tissue)[,c(\"nCounts\", \"nGenes\")]), listw = colGraph(sfe_tissue, \"visium\")) #> DataFrame with 2 rows and 2 columns #> moran K #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"spatially-variable-genes","dir":"Articles","previous_headings":"Moran’s I","what":"Spatially variable genes","title":"Basic Visium exploratory data analysis","text":"spatially variable gene gene whose expression depends spatial locations, rather spatially random, like salt grains spread soup. Spatially variable genes can identified spatial autocorrelation signatures, sometimes Moran’s I used compare assess spatially variable genes identified different methods. BPPARAM used parallelize computation Moran’s I 2000 highly variable genes, 2 cores used SNOW backend. results stored rowData NA’s genes highly variable Moran’s I computed genes. rank genes Moran’s I plot space follows: see genes strong positive spatial autocorrelation, don’t observe strong negative spatial autocorrelation. Let’s get additional information genes strongest positive spatial autocorrelation space using gget info : Let’s plot genes: genes indeed look spatially variable. However, spatial variability can simply due histological regions space, words, spatial distribution different cell types. many methods identify spatially variable genes, often involving Gaussian process modeling, far complex Moran’s I, SpatialDE (Svensson, Teichmann, Stegle 2018). However, methods usually don’t account histological regions, except C-SIDE (Cable et al. 2022), identifies spatially variable genes within cell types. leads question really meant “cell type”. 
remains see spatial methods made specifically identifying spatially variable genes compare methods don’t explicitly use spatial information simply perform differential analysis cell types often spatially defined histological regions. Another consideration using Moran’s extent strength spatial autocorrelation varies space. gene exhibits strong spatial autocorrelation one region, another? different histological regions analyzed separately cases? ways see whether Moran’s statistically significant, many methods explore spatial autocorrelation. discussed advanced ESDA Visium vignette.","code":"sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, colGraphName = \"visium\", BPPARAM = SnowParam(2)) #> Warning: : ... may be used in an incorrect context: 'fun(x[i, ], ...)' rowData(sfe_tissue) #> DataFrame with 15043 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000025902 ENSMUSG00000025902 Sox17 Gene Expression 0.03969957 #> ENSMUSG00000096126 ENSMUSG00000096126 Gm22307 Gene Expression 0.00107296 #> ENSMUSG00000033845 ENSMUSG00000033845 Mrpl15 Gene Expression 0.38197425 #> ENSMUSG00000025903 ENSMUSG00000025903 Lypla1 Gene Expression 0.28755365 #> ENSMUSG00000033813 ENSMUSG00000033813 Tcea1 Gene Expression 0.26502146 #> ... ... ... ... ... #> ENSMUSG00000064360 ENSMUSG00000064360 mt-Nd3 Gene Expression 56.445279 #> ENSMUSG00000064363 ENSMUSG00000064363 mt-Nd4 Gene Expression 123.991416 #> ENSMUSG00000064367 ENSMUSG00000064367 mt-Nd5 Gene Expression 14.645923 #> ENSMUSG00000064368 ENSMUSG00000064368 mt-Nd6 Gene Expression 0.109442 #> ENSMUSG00000064370 ENSMUSG00000064370 mt-Cytb Gene Expression 121.273605 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000025902 0.04460915 28.30429 NA NA #> ENSMUSG00000096126 0.00107296 932.00000 NA NA #> ENSMUSG00000033845 0.47048031 3.22458 NA NA #> ENSMUSG00000025903 0.34686963 4.19497 NA NA #> ENSMUSG00000033813 0.32388797 4.61140 0.0489758 19.2181 #> ... ... ... ... ... 
#> ENSMUSG00000064360 2.47976e+03 0.778314 0.410657 11.31069 #> ENSMUSG00000064363 1.45282e+04 0.944991 0.546964 13.62886 #> ENSMUSG00000064367 2.34858e+02 1.094895 0.480634 3.75345 #> ENSMUSG00000064368 1.31941e-01 11.015664 NA NA #> ENSMUSG00000064370 1.48225e+04 1.007833 0.621060 10.71784 df <- rowData(sfe_tissue)[hvgs,] ord <- order(df$moran_Vis5A, decreasing = TRUE) df[ord, c(\"symbol\", \"moran_Vis5A\")] #> DataFrame with 2000 rows and 2 columns #> symbol moran_Vis5A #> #> ENSMUSG00000064351 mt-Co1 0.764044 #> ENSMUSG00000050335 Lgals3 0.741474 #> ENSMUSG00000029304 Spp1 0.734937 #> ENSMUSG00000021939 Ctsb 0.708362 #> ENSMUSG00000004207 Psap 0.706552 #> ... ... ... #> ENSMUSG00000039911 Spsb1 -0.0333357 #> ENSMUSG00000015711 Prune -0.0354638 #> ENSMUSG00000042675 Ypel3 -0.0369055 #> ENSMUSG00000090262 Mpv17 -0.0412250 #> ENSMUSG00000020964 Sel1l -0.0443975 gget_info2 <- gget$info(rownames(df)[1:6]) rownames(gget_info2) <- gget_info2$primary_gene_name select(gget_info2, ncbi_description) #> ncbi_description #> Spp1 Enables extracellular matrix binding activity. Acts upstream of or within several processes, including cellular ion homeostasis; cellular response to leukemia inhibitory factor; and neutrophil chemotaxis. Located in apical part of cell and cytoplasm. Is expressed in several structures, including alimentary system; brain; metanephros; reproductive system; and skeleton. Human ortholog(s) of this gene implicated in several diseases, including autoimmune disease (multiple); biliary atresia; coronary artery disease (multiple); disease of cellular proliferation (multiple); and hepatitis. Orthologous to human SPP1 (secreted phosphoprotein 1). [provided by Alliance of Genome Resources, Apr 2022] #> Ftl1 Predicted to enable ferric iron binding activity; ferrous iron binding activity; and identical protein binding activity. Predicted to be involved in intracellular sequestering of iron ion. Predicted to be located in autolysosome. 
Predicted to be part of intracellular ferritin complex. Predicted to be active in cytoplasm. Is expressed in several structures, including central nervous system; ciliary body; liver; and retina nuclear layer. Human ortholog(s) of this gene implicated in basal ganglia disease; hyperferritinemia-cataract syndrome; neurodegeneration with brain iron accumulation 3; and neurodegenerative disease. Orthologous to human FTL (ferritin light chain). [provided by Alliance of Genome Resources, Apr 2022] #> Lgals3 Predicted to enable several functions, including IgE binding activity; advanced glycation end-product receptor activity; and signaling receptor binding activity. Involved in negative regulation of T cell receptor signaling pathway; negative regulation of endocytosis; and negative regulation of lymphocyte activation. Acts upstream of or within extracellular matrix organization and skeletal system development. Located in several cellular components, including external side of plasma membrane; glial cell projection; and immunological synapse. Is expressed in several structures, including alimentary system; genitourinary system; respiratory system; skeleton; and skin. Used to study fatty liver disease. Human ortholog(s) of this gene implicated in asthma. Orthologous to human LGALS3 (galectin 3). [provided by Alliance of Genome Resources, Apr 2022] #> Ctsb This gene encodes a member of the peptidase C1 family and preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to generate the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. Homozygous knockout mice for this gene exhibit reduced pancreatic damage following induced pancreatitis and reduced hepatocyte apoptosis in a model of liver injury. 
Pseudogenes of this gene have been identified in the genome. [provided by RefSeq, Aug 2015] #> Lgmn This gene encodes a member of the cysteine peptidase family C13 that plays an important role in the endosome/lysosomal degradation system. The encoded inactive preproprotein undergoes autocatalytic removal of the C-terminal inhibitory propeptide to generate the active endopeptidase that cleaves protein substrates on the C-terminal side of asparagine residues. Mice lacking the encoded protein exhibit defects in the lysosomal processing of proteins resulting in their accumulation in the lysosomes, and develop symptoms resembling hemophagocytic lymphohistiocytosis. [provided by RefSeq, Aug 2016] #> Mb Predicted to enable oxygen binding activity. Acts upstream of or within several processes, including brown fat cell differentiation; enucleate erythrocyte differentiation; and response to hypoxia. Is expressed in brown fat; heart; skeletal muscle; and somite. Human ortholog(s) of this gene implicated in acute kidney failure. Orthologous to human MB (myoglobin). 
[provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, rownames(df)[1:6], colGeometryName = \"spotPoly\", image = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig1_visium_basic.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Basic Visium exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocNeighbors_1.20.2 concordexR_1.2.0 #> [3] reticulate_1.36.1 dplyr_1.1.4 #> [5] sparseMatrixStats_1.14.0 stringr_1.5.1 #> [7] BiocParallel_1.36.0 SFEData_1.4.0 #> [9] bluster_1.12.0 patchwork_1.2.0 #> [11] scran_1.30.2 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [17] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [19] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [21] IRanges_2.36.0 S4Vectors_0.40.2 #> [23] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [25] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [27] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 sf_1.0-16 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 
httpuv_1.6.15 #> [17] sp_2.1-4 cowplot_1.1.3 #> [19] RColorBrewer_1.1-3 DBI_1.2.2 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 pheatmap_1.0.12 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 e1071_1.7-14 #> [45] systemfonts_1.0.6 dbscan_1.1-12 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 snow_0.4-4 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] xfun_0.43 HDF5Array_1.30.1 #> [57] withr_3.0.0 BiocManager_1.30.22 #> [59] fastmap_1.1.1 boot_1.3-30 #> [61] rhdf5filters_1.14.1 fansi_1.0.6 #> [63] spData_2.3.0 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] RSQLite_2.3.6 utf8_1.2.4 #> [73] generics_0.1.3 FNN_1.1.4 #> [75] class_7.3-22 httr_1.4.7 #> [77] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [79] spdep_1.3-3 uwot_0.2.2 #> [81] pkgconfig_2.0.3 scico_1.5.0 #> [83] gtable_0.3.5 blob_1.2.4 #> [85] XVector_0.42.0 htmltools_0.5.8.1 #> [87] scales_1.3.0 png_0.1-8 #> [89] knitr_1.45 rjson_0.2.21 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 BiocSingular_1.18.0 #> [107] dbplyr_2.5.0 beachmat_2.18.1 #> [109] xtable_1.8-4 cluster_2.1.6 #> [111] beeswarm_0.4.0 evaluate_0.23 #> [113] magick_2.8.3 cli_3.6.2 #> [115] locfit_1.5-9.9 compiler_4.3.3 #> [117] rlang_1.1.3 crayon_1.5.2 #> [119] labeling_0.4.3 classInt_0.4-10 #> [121] fs_1.6.4 ggbeeswarm_0.7.2 #> [123] stringi_1.8.3 viridisLite_0.4.2 #> [125] 
deldir_2.0-4 munsell_0.5.1 #> [127] Biostrings_2.70.3 Matrix_1.6-5 #> [129] ExperimentHub_2.10.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] highr_0.10 interactiveDisplayBase_1.40.0 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Spatial Visium exploratory data analysis","text":"vignette provides introduction exploratory spatial data analysis methods via Voyager package context Visium dataset.","code":"library(Voyager) library(SpatialFeatureExperiment) library(scater) library(scran) library(SFEData) library(sf) library(ggplot2) library(scales) library(patchwork) library(BiocParallel) library(bluster) library(dplyr) library(reticulate) theme_set(theme_bw(10)) # Specify Python version to use gget PY_PATH <- system(\"which python\", intern = TRUE) use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function # Load gget gget <- import(\"gget\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"Spatial Visium exploratory data analysis","text":"dataset used vignette paper Large-scale integration single-cell transcriptomic data captures transitional progenitor states mouse skeletal 
muscle regeneration (McKellar et al. 2021). Notexin injected tibialis anterior muscle mice induce injury, healing muscle collected 2, 5, 7 days post injury Visium analysis. dataset vignette timepoint day 2. vignette starts SpatialFeatureExperiment (SFE) object. gene count matrix directly downloaded GEO. 4992 spots, whether tissue , included. H&E image used nuclei myofiber segmentation. subset nuclei randomly selected regions 3 timepoints manually annotated train StarDist model segment rest nuclei, myofibers manually segmented. tissue boundary found thresholding OpenCV, small polygons removed likely debris. Spot polygons constructed spot centroid coordinates diameter Space Ranger output. in_tissue column colData indicates spot polygons intersect tissue polygons, based st_intersects(). Tissue boundary, nuclei, myofiber, Visium spot polygons stored sf data frames SFE object. See vignette SpatialFeatureExperiment details structure SFE object. SFE object dataset provided SFEData package; begin downloading data loading R. H&E image section: image can added SFE object plotted behind geometries, needs flipped align spots origin top left image bottom left geometries.","code":"(sfe <- McKellarMuscleData(\"full\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> class: SpatialFeatureExperiment #> dim: 15123 4992 #> metadata(0): #> assays(1): counts #> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ... #> ENSMUSG00000064368 ENSMUSG00000064370 #> rowData names(6): Ensembl symbol ... vars cv2 #> colnames(4992): AAACAACGAATAGTTC AAACAAGTATCTCCCA ... TTGTTTGTATTACACG #> TTGTTTGTGTAAATTC #> colData names(12): barcode col ... 
prop_mito in_tissue #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : imageX imageY #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: spotPoly (POLYGON) #> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT) #> #> Graphs: #> Vis5A: if (!file.exists(\"tissue_lowres_5a.jpeg\")) { download.file(\"https://raw.githubusercontent.com/pachterlab/voyager/main/vignettes/tissue_lowres_5a.jpeg\", destfile = \"tissue_lowres_5a.jpeg\") } sfe <- addImg(sfe, file = \"tissue_lowres_5a.jpeg\", sample_id = \"Vis5A\", image_id = \"lowres\", scale_fct = 1024/22208) sfe <- mirrorImg(sfe, sample_id = \"Vis5A\", image_id = \"lowres\")"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"spots-in-tissue","dir":"Articles","previous_headings":"Exploratory data analysis","what":"Spots in tissue","title":"Spatial Visium exploratory data analysis","text":"example dataset Visium spots whether tissue , spots intersect tissue used analyses. Total UMI counts (nCounts), number genes detected per spot (nGenes), proportion mitochondrially encoded counts (prop_mito) precomputed colData(sfe). plotSpatialFeature function can used visualize various attributes space: expression gene, colData values, geometry attributes colGeometry annotGeometry. Visium spots plotted polygons reflecting actual size relative tissue, rather points, case packages plot Visium data. plotting geometries performed hood geom_sf. tissue boundary found thresholding H&E image removing small polygons likely debris. in_tissue column colData(sfe) indicates Visium spot polygon intersects tissue polygon; can found SpatialFeatureExperiment::annotPred(). demonstrate use scran (Lun, Bach, Marioni 2016) normalization , although note necessarily best approach normalizing spatial transcriptomics data. 
problem normalize spatial transcriptomics data non-trivial , nCounts plot space shows , spatial autocorrelation evident. Furthermore, Visium, reverse transcription occurs situ spots, PCR amplification occurs cDNA dissociated spots. Artifacts may subsequently introduced amplification step, associated spatial origin. Spatial artifacts may arise diffusion transcripts tissue permeabilization. However, given total counts seem correspond histological regions, total counts may biological component hence treated technical artifact normalized away scRNA-seq data normalization methods. words, issue normalization spatial transcriptomics data, Visium particular, complex currently unsolved. Myofiber nuclei segmentation polygons available dataset annotGeometries field. Myofibers manually segmented, nuclei segmented StarDist trained manually segmented subset.","code":"names(colData(sfe)) #> [1] \"barcode\" \"col\" \"row\" \"x\" \"y\" \"dia\" #> [7] \"tissue\" \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" \"in_tissue\" sfe_tissue <- sfe[,colData(sfe)$in_tissue] sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0,] #clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) annotGeometryNames(sfe_tissue) #> [1] \"tissueBoundary\" \"myofiber_full\" \"myofiber_simplified\" #> [4] \"nuclei\" \"nuclei_centroid\""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"from-myofibers-and-nuclei-to-visium-spots","dir":"Articles","previous_headings":"Exploratory data analysis > Spots in tissue","what":"From myofibers and nuclei to Visium spots","title":"Spatial Visium exploratory data analysis","text":"plotSpatialFeature() function can also used plot attributes geometries, .e. non-geometry columns sf data frames rowGeometries, colGeometries, annotGeometries fields SFE object. 
rowGeometries colGeometries, columns associated sf data frames rather rowData colData, allowed one can specify columns associate geometries (see st_agr documentation st_sf). attribute annotGeometry plotted alongside gene expression colData colGeometry attribute, annotGeometry attribute plotted different color palette distinguish column associated values. myofiber polygons annotGeometries can plotted shown , colored cross section area observed tissue section. aes_use argument set color rather fill (default polygons) plot Visium spot outlines make myofiber polygons visible. fill argument set NA make Visium spots look hollow, size argument controls thickness outlines. annot_aes argument specifies column annotGeometry use specify values aesthetic, just like aes ggplot2 (aes_string precise, since tidyeval used ). annot_fixed argument (used ) can set fixed size, alpha, color, etc. annotGeometry. larger myofibers seem fewer total counts, possibly larger size myofibers dilutes transcripts. hints need normalization procedure. SpatialFeatureExperiment, can find number myofibers nuclei intersect Visium spot. predicate can anything implemented sf, example, number nuclei fully covered Visium spot can also found. default predicate st_intersects(). one--one mapping Visium spots myofibers. However, can relate attributes myofibers gene expression detected Visium spots. One way summarize attributes myofibers intersect (choose another better predicate implemented sf) spot, calculate mean, median, sum. can done annotSummary() function SpatialFeatureExperiment. default predicate st_intersects(), default summary function mean(). reveals relationship mean area myofibers intersecting Visium spot aspects spots, total counts gene expression. NAs designate spots intersecting myofibers, e.g. inflammatory region. Basic Visium vignette, encountered two mysterious branches two clusters nGenes vs. nCounts plot proportion mitochondrial counts vs. nCounts plot. 
Now see two clusters seem related myofiber size.","code":"plotSpatialFeature(sfe_tissue, features = \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", aes_use = \"color\", linewidth = 0.5, fill = NA, annot_aes = list(fill = \"area\")) colData(sfe_tissue)$n_myofibers <- annotNPred(sfe_tissue, colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\") plotSpatialFeature(sfe_tissue, features = \"n_myofibers\", colGeometryName = \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1) colData(sfe_tissue)$mean_myofiber_area <- annotSummary(sfe_tissue, \"spotPoly\", \"myofiber_simplified\", annotColNames = \"area\")[,1] # it always returns a data frame # The gray spots don't intersect any myofiber plotSpatialFeature(sfe_tissue, \"mean_myofiber_area\", \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1) plotColData(sfe_tissue, x = \"nCounts\", y = \"nGenes\", colour_by = \"mean_myofiber_area\") plotColData(sfe_tissue, x = \"nCounts\", y = \"prop_mito\", colour_by = \"mean_myofiber_area\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"myofiber-types","dir":"Articles","previous_headings":"Exploratory data analysis > Spots in tissue","what":"Myofiber types","title":"Spatial Visium exploratory data analysis","text":"Marker genes: Myh7 (Type I, slow twitch, aerobic), Myh2 (Type IIa, fast twitch, somewhat aerobic), Myh4 (Type IIb, fast twitch, anaerobic), Myh1 (Type IIx, fast twitch, anaerobic), protocol (Wang, Yue, Kuang 2017) can use gget search gget info modules gget package get Ensembl IDs additional information (example NCBI description) marker genes: first examine Type I myofibers. fast twitch muscle, don’t expect many slow twitch Type I myofibers. Row names sfe_tissue Ensembl IDs order avoid ambiguity sometimes multiple Ensembl IDs gene symbol genes aliases. However, gene symbols shorter human readable Ensembl IDs, better suited display plots. 
plotSpatialFeature() function functions Voyager, even row names recorded Ensembl IDs, features argument can take gene symbols swap_rownames argument indicating column rowData(sfe) stores gene symbols. Gene symbols also shown plots instead Ensembl IDs. one gene symbol matches multiple Ensembl IDs dataset, warning given. exprs_values argument specifies assay use, default “logcounts”, .e. log normalized data. default may may suitable practice given total UMI counts may biological relevance spatial data. Therefore, plot raw counts log normalized counts: marker gene type IIa myofibers shown . straightforward modify plotting display markers type IIb type IIx myofibers: Type IIa myofibers also tend clustered together left side tissue. SFE inherits SCE, non-spatial EDA plots scater package can also used: Plotting proportion mitochondrial counts vs. mean myofiber area, see two clusters, one higher proportion mitochondrial counts smaller area, another lower proportion mitochondrial counts average slightly larger area. 
Type IIa myofibers tend smaller area larger proportion mitochondrial counts.","code":"markers <- c(I = \"Myh7\", IIa = \"Myh2\", IIb = \"Myh4\", IIx = \"Myh1\") gget_search <- gget$search(list(\"Myh7\", \"Myh2\", \"Myh4\", \"Myh1\"), species=\"mouse\") gget_search <- gget_search[gget_search$gene_name %in% list(\"Myh7\", \"Myh2\", \"Myh4\", \"Myh1\"), ] gget_search #> ensembl_id gene_name #> 4 ENSMUSG00000033196 Myh2 #> 5 ENSMUSG00000053093 Myh7 #> 6 ENSMUSG00000056328 Myh1 #> 7 ENSMUSG00000057003 Myh4 #> ensembl_description #> 4 myosin, heavy polypeptide 2, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339710] #> 5 myosin, heavy polypeptide 7, cardiac muscle, beta [Source:MGI Symbol;Acc:MGI:2155600] #> 6 myosin, heavy polypeptide 1, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339711] #> 7 myosin, heavy polypeptide 4, skeletal muscle [Source:MGI Symbol;Acc:MGI:1339713] #> ext_ref_description biotype #> 4 myosin, heavy polypeptide 2, skeletal muscle, adult protein_coding #> 5 myosin, heavy polypeptide 7, cardiac muscle, beta protein_coding #> 6 myosin, heavy polypeptide 1, skeletal muscle, adult protein_coding #> 7 myosin, heavy polypeptide 4, skeletal muscle protein_coding #> synonym #> 4 MHC2A, M.... #> 5 B-MHC, M.... #> 6 A530084A.... #> 7 MHC2B, M.... #> url #> 4 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000033196 #> 5 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000053093 #> 6 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000056328 #> 7 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000057003 gget_info <- gget$info(gget_search$ensembl_id) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> Myh2 Acts upstream of or within actin-mediated cell contraction; plasma membrane repair; and response to activity. Located in several cellular components, including A band; Golgi apparatus; and actomyosin contractile ring. 
Is expressed in several structures, including alimentary system; forelimb bud mesenchyme; and skeletal musculature. Human ortholog(s) of this gene implicated in inclusion body myositis and proximal myopathy and ophthalmoplegia. Orthologous to human MYH2 (myosin heavy chain 2). [provided by Alliance of Genome Resources, Apr 2022] #> Myh7 Predicted to enable several functions, including ATP binding activity; ATP hydrolysis activity; and identical protein binding activity. Acts upstream of or within cardiac muscle hypertrophy in response to stress and transition between fast and slow fiber. Located in Z disc and stress fiber. Part of myosin complex. Is expressed in several structures, including diaphragm; eye; heart; musculature; and somite. Human ortholog(s) of this gene implicated in cardiomyopathy (multiple); congenital heart disease (multiple); distal myopathy 1; and hyaline body myopathy (multiple). Orthologous to human MYH7 (myosin heavy chain 7). [provided by Alliance of Genome Resources, Apr 2022] #> Myh1 Predicted to enable several functions, including ATP binding activity; actin filament binding activity; and calmodulin binding activity. Located in A band and intercalated disc. Is expressed in several structures, including gonad; gut; hemolymphoid system gland; integumental system; and skeletal musculature. Orthologous to human MYH1 (myosin heavy chain 1). [provided by Alliance of Genome Resources, Apr 2022] #> Myh4 Predicted to enable double-stranded RNA binding activity. Acts upstream of or within response to activity. Predicted to be located in myofibril. Predicted to be part of myosin complex. Is expressed in several structures, including brown fat; diaphragm; heart; limb segment; and skeletal musculature. Orthologous to human MYH4 (myosin heavy chain 4). 
[provided by Alliance of Genome Resources, Apr 2022] # Function specific for this vignette, with some hard coded values plot_counts_logcounts <- function(sfe, feature) { p1 <- plotSpatialFeature(sfe, feature, \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\"), swap_rownames = \"symbol\", exprs_values = \"counts\", aes_use = \"color\", linewidth = 0.5, fill = NA) + ggtitle(\"Raw counts\") p2 <- plotSpatialFeature(sfe, feature, \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\"), swap_rownames = \"symbol\", exprs_values = \"logcounts\", aes_use = \"color\", linewidth = 0.5, fill = NA) + ggtitle(\"Log normalized counts\") p1 + p2 + plot_annotation(title = feature) } plot_counts_logcounts(sfe_tissue, markers[\"I\"]) plot_counts_logcounts(sfe_tissue, markers[\"IIa\"]) plotColData(sfe_tissue, x = \"mean_myofiber_area\", y = \"prop_mito\", colour_by = markers[\"IIa\"], by_exprs_values = \"logcounts\", swap_rownames = \"symbol\") #> Warning: Removed 36 rows containing missing values or values outside the scale range #> (`geom_point()`)."},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"spatial-neighborhood-graphs","dir":"Articles","previous_headings":"","what":"Spatial neighborhood graphs","title":"Spatial Visium exploratory data analysis","text":"spatial neighborhood graph required compute spatial dependency metrics Moran’s Geary’s C. SpatialFeatureExperiment package wraps methods spdep find spatial neighborhood graphs, stored within SFE object (see spdep documentation gabrielneigh(), knearneigh(), poly2nb(), tri2nb()). Voyager package uses graphs spatial dependency analyses, based spdep first version, methods geospatial packages, also use spatial neighborhood graphs, may added later. Visium, spots hexagonal grid, spatial neighborhood graph straightforward. However, spatial technologies single cell resolution, e.g. 
MERFISH, different methods can used find spatial neighborhood graph. example, method “poly2nb” used myofibers, identifies myofiber polygons physically touch . zero.policy = TRUE allow singletons, .e. nodes without neighbors graph; inflamed region, singletons. yet benchmarked spatial neighborhood construction methods determine “best” different technologies; particular method used demonstration purposes may best practice: plotColGraph() function plots graph space associated colGeometry, along geometry interest. Similarly, plotAnnotGraph() function plots graph associated annotGeometry, along geometry interest. plotRowGraph yet since haven’t worked dataset spatial graphs related genes relevant, although SFE object supports row graphs.","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) annotGraph(sfe_tissue, \"myofiber_poly2nb\") <- findSpatialNeighbors(sfe_tissue, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE) plotColGraph(sfe_tissue, colGraphName = \"visium\", colGeometryName = \"spotPoly\") + theme_void() plotAnnotGraph(sfe_tissue, annotGraphName = \"myofiber_poly2nb\", annotGeometryName = \"myofiber_simplified\") + theme_void()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"exploratory-spatial-data-analysis","dir":"Articles","previous_headings":"","what":"Exploratory spatial data analysis","title":"Spatial Visium exploratory data analysis","text":"spatial autocorrelation metrics package can computed directly vector matrix rather SFE object. user interface emulates dimension reductions scater package (e.g. calculateUMAP() takes matrix SCE object returns matrix, runUMAP() takes SCE object adds results reducedDims field SCE object). calculate* functions take matrix SFE object directly return results (format results depends structure results), run* functions take SFE object add results object. addition, colData* functions compute metrics numeric variables colData. 
colGeometry* functions compute metrics numeric columns colGeometry. annotGeometry* functions compute metrics numeric columns annotGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"univariate-global","dir":"Articles","previous_headings":"","what":"Univariate global","title":"Spatial Visium exploratory data analysis","text":"Voyager supports many univariate global spatial autocorrelation implemented spdep ESDA: Moran’s Geary’s C, permutation testing Moran’s Geary’s C, Moran plot, correlograms. addition, beyond spdep, Voyager can cluster Moran plots correlograms. Plotting functions taking SFE objects implemented plot results ggplot2 customization options spdep plotting functions. functions calculateUnivariate(), runUnivariate(), colDataUnivariate(), colGeometryUnivariate(), annotGeometryUnivariate() compute univariate spatial statistics. argument type, indicates corresponding function names spdep, determines spatial statistics computed. univariate global methods Voyager listed : calling calculate*variate() run*variate(), type (2nd) argument takes either SFEMethod object (see SFEMethod() vignette SFEMethod) string matches entry name column data frame returned listSFEMethods(). demonstrate spatial autocorrelation gene expression, top highly variable genes (HVGs) used. HVGs found scran method. 
global statistic yields one result entire dataset.","code":"listSFEMethods(variate = \"uni\", scope = \"global\") #> name description #> 1 moran Moran's I #> 2 geary Geary's C #> 3 moran.mc Moran's I with permutation testing #> 4 geary.mc Geary's C with permutation testing #> 5 sp.mantel.mc Mantel-Hubert spatial general cross product statistic #> 6 moran.test Moran's I test #> 7 geary.test Geary's C test #> 8 globalG.test Global G test #> 9 sp.correlogram Correlogram #> 10 variogram Variogram with model #> 11 variogram_map Variogram map dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"morans-i","dir":"Articles","previous_headings":"Univariate global","what":"Moran’s I","title":"Spatial Visium exploratory data analysis","text":"several ways quantify spatial autocorrelation, common Moran’s I: \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}, \\] \\(n\\) number spots locations, \\(i\\) \\(j\\) different locations, spots Visium context, \\(x\\) variable values location, \\(w_{ij}\\) spatial weight, can inversely proportional distance spots indicator whether two spots neighbors, subject various definitions neighborhood whether normalize number neighbors. spdep package uses neighborhood. Moran’s I can understood Pearson correlation value location average value neighbors. Just like Pearson correlation, Moran’s I generally bound -1 1, positive value indicates positive spatial autocorrelation negative value indicates negative spatial autocorrelation. Upon visual inspection, total UMI counts per spot seem spatial autocorrelation. spatial neighborhood graph required compute Moran’s I, specified listw argument. matrices, rows features, gene count matrix. “moran” Moran’s I, K sample kurtosis. 
add results SFE object, specifically colData: colData, results added colFeatureData(sfe), features Moran’s calculated NA. column names featureData distinguishes different samples (’s one sample dataset), parsed plotting functions. add results SFE object, specifically geometries: “area” area cross section myofiber seen tissue section “eccentricity” eccentricity ellipse fitted myofiber. non-geometry column colGeometry, colGeometryUnivariate() like annotGeometryUnivariate() , none colGeometries dataset extra columns. gene expression, logcounts assay used default (use exprs_values argument change assay), though may may best practice. metrics computed large number features, parallel computing supported, BiocParallel, BPPARAM argument.","code":"# Directly use vector or matrix, and multiple features can be specified at once calculateUnivariate(t(colData(sfe_tissue)[,c(\"nCounts\", \"nGenes\")]), type = \"moran\", listw = colGraph(sfe_tissue, \"visium\")) #> DataFrame with 2 rows and 2 columns #> moran K #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036 sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"moran\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> nCounts 0.528705 3.00082 #> nGenes 0.384028 3.88036 # Remember zero.policy = TRUE since there're singletons sfe_tissue <- annotGeometryUnivariate(sfe_tissue, type = \"moran\", features = c(\"area\", \"eccentricity\"), annotGeometryName = \"myofiber_simplified\", annotGraphName = \"myofiber_poly2nb\", zero.policy = TRUE) head(attr(annotGeometry(sfe_tissue, \"myofiber_simplified\"), \"featureData\")) #> DataFrame with 6 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> lyr.1 NA NA #> area 0.327888 4.95675 #> perimeter NA NA #> eccentricity 0.110938 3.26913 #> theta NA NA #> sine_theta NA NA sfe_tissue <- runUnivariate(sfe_tissue, type = \"moran\", features = hvgs, 
colGraphName = \"visium\", BPPARAM = MulticoreParam(2)) rowData(sfe_tissue)[head(hvgs),] #> DataFrame with 6 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000029304 ENSMUSG00000029304 Spp1 Gene Expression 1.63722 #> ENSMUSG00000050708 ENSMUSG00000050708 Ftl1 Gene Expression 2.37981 #> ENSMUSG00000050335 ENSMUSG00000050335 Lgals3 Gene Expression 1.43189 #> ENSMUSG00000021939 ENSMUSG00000021939 Ctsb Gene Expression 2.73117 #> ENSMUSG00000021190 ENSMUSG00000021190 Lgmn Gene Expression 1.11278 #> ENSMUSG00000018893 ENSMUSG00000018893 Mb Gene Expression 2.11118 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000029304 60.1583 22.4430 0.734937 1.63516 #> ENSMUSG00000050708 162.1931 28.6384 0.665563 1.81841 #> ENSMUSG00000050335 48.0739 23.4471 0.741474 1.68098 #> ENSMUSG00000021939 131.6232 17.6455 0.708362 1.86896 #> ENSMUSG00000021190 21.4505 17.3228 0.659916 1.66838 #> ENSMUSG00000018893 74.1782 16.6428 0.675840 1.82510"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"gearys-c","dir":"Articles","previous_headings":"Univariate global","what":"Geary’s C","title":"Spatial Visium exploratory data analysis","text":"Another spatial autocorrelation metric Geary’s C, defined : \\[ C = \\frac{(n-1)}{2\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}(x_i - x_j)^2}{{\\sum_{i=1}^n (x_i - \\bar{x})^2}} \\] Geary’s C below 1 indicates positive spatial autocorrelation, above 1 indicates negative spatial autocorrelation. compute Geary’s C features interest replace type = \"moran\" previous section type = \"geary\", add results SFE object. example, colData ’s one column K since ’s Moran’s I Geary’s C. Moran’s I Geary’s C suggest positive spatial autocorrelation nCounts nGenes. univariate global methods, including permutation testing Moran’s I Geary’s C, correlograms, Moran scatter plot can also called functions runUnivariate, specifying type argument. 
See documentation runUnivariate see available methods see documentation corresponding spdep functions see extra arguments required method.","code":"sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"geary\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 3 columns #> moran_Vis5A K_Vis5A geary_Vis5A #> #> nCounts 0.528705 3.00082 0.474892 #> nGenes 0.384028 3.88036 0.605797"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"permutation-testing","dir":"Articles","previous_headings":"Univariate global","what":"Permutation testing","title":"Spatial Visium exploratory data analysis","text":"establish whether spatial autocorrelation statistically significant, moran.test() function spdep can used. provides p-value, p-value may accurate data normally distributed. gene expression data generally normally distributed data normalization doesn’t always work well, use permutation testing test significance Moran’s Geary’s C, wrapping moran.mc() spdep. “mc” stands Monte Carlo. nsim argument specifies number simulations. following adds results SFE object: Note test performed multiple features, p-values corrected multiple hypothesis testing. results can plotted: default, colorblind friendly palette dittoSeq used categorical variables. density plot Moran’s simulations values permuted disconnected spatial locations, vertical line actual Moran’s value. simulation indicates actual Moran’s much higher simulations values dissociated spatial locations permuted among locations, indicating spatial autocorrelation significant. Use type = \"geary.mc\" permutation testing Geary’s C. spdep package can also compute p-values Moran’s analytically, theory behind mean variance null distribution Moran’s assumes normal distribution data, gene expression data generally non-normal. 
However, according (Griffith 2010), large sample size (“preferably least 100”), mean variance Moran’s several iid non-normal simulated datasets (including negative binomial, commonly used model gene expression data) don’t seem deviate much values expected normally distributed data. Spatial transcriptomics datasets typically thousands spots cells, sample size likely large enough. Hence using analytical test non-normal data might bad. However, large sample size, minuscule difference can create significant p-values. perform analytical test Moran’s : Now compare p-values permutation analytical test; cases , default alternative hypothesis positive spatial autocorrelation: p-values permutation limited number permutations (1000 ). Either way, permutation analytical tests indicate significant positive spatial autocorrelation. limitation permutation testing Moran’s assumes permutation values among locations equally likely, necessarily true. instance, epidemiology, disease rate regions small population likely assumes extreme values (Assunção Reis 1999), analogous rare cell types lowly expressed genes histological space given divide total UMI counts per spot. extent happens may depend tissue, gene interest, technology, data normalization method.","code":"set.seed(29) sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", nsim = 1000, type = \"moran.mc\") colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"),] #> DataFrame with 2 rows and 9 columns #> moran_Vis5A K_Vis5A geary_Vis5A moran.mc_statistic_Vis5A #> #> nCounts 0.528705 3.00082 0.474892 0.528705 #> nGenes 0.384028 3.88036 0.605797 0.384028 #> moran.mc_parameter_Vis5A moran.mc_p.value_Vis5A #> #> nCounts 1001 0.000999001 #> nGenes 1001 0.000999001 #> moran.mc_alternative_Vis5A moran.mc_method_Vis5A #> #> nCounts greater Monte-Carlo simulati.. #> nGenes greater Monte-Carlo simulati.. #> moran.mc_res_Vis5A #> #> nCounts -0.02610680, 0.00305305,-0.01996753,... 
#> nGenes 0.02274607,-0.02127688, 0.00705138,... plotMoranMC(sfe_tissue, c(\"nCounts\", \"nGenes\")) sfe_tissue <- colDataUnivariate(sfe_tissue, features = c(\"nCounts\", \"nGenes\"), colGraphName = \"visium\", type = \"moran.test\") names(colFeatureData(sfe_tissue)) #> [1] \"moran_Vis5A\" \"K_Vis5A\" #> [3] \"geary_Vis5A\" \"moran.mc_statistic_Vis5A\" #> [5] \"moran.mc_parameter_Vis5A\" \"moran.mc_p.value_Vis5A\" #> [7] \"moran.mc_alternative_Vis5A\" \"moran.mc_method_Vis5A\" #> [9] \"moran.mc_res_Vis5A\" \"moran.test_statistic_Vis5A\" #> [11] \"moran.test_p.value_Vis5A\" \"moran.test_alternative_Vis5A\" #> [13] \"moran.test_method_Vis5A\" \"moran.test_Moran.I.statistic_Vis5A\" #> [15] \"moran.test_Expectation_Vis5A\" \"moran.test_Variance_Vis5A\" # permutation colFeatureData(sfe_tissue)[c(\"nCounts\", \"nGenes\"), c(\"moran.mc_p.value_Vis5A\", \"moran.test_p.value_Vis5A\")] #> DataFrame with 2 rows and 2 columns #> moran.mc_p.value_Vis5A moran.test_p.value_Vis5A #> #> nCounts 0.000999001 5.41958e-163 #> nGenes 0.000999001 2.82666e-87"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"correlogram","dir":"Articles","previous_headings":"Univariate global","what":"Correlogram","title":"Spatial Visium exploratory data analysis","text":"correlogram, spatial autocorrelation higher orders neighbors (e.g. second order neighbors neighbors neighbors) calculated see decays orders. Visium, regular hexagonal grid, order neighbors proxy distance. irregular patterns single cells, different methods find spatial neighbors may give different results. colData, Moran’s correlogram computed results can plotted plotCorrelogram: error bars twice standard deviation Moran’s value. standard deviation p-values (null hypothesis Moran’s 0) come moran.test() (Geary’s C correlogram, geary.test()); taken grain salt data normally distributed. p-values corrected multiple hypothesis testing across orders features. usual, . 
means p < 0.1, * means p < 0.05, ** means p < 0.01, *** means p < 0.001. , can done Geary’s C, colData, annotGeometry, etc.","code":"sfe_tissue <- runUnivariate(sfe_tissue, hvgs[1:2], colGraphName = \"visium\", order = 10, type = \"sp.correlogram\") plotCorrelogram(sfe_tissue, hvgs[1:2], swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"univariate-local","dir":"Articles","previous_headings":"","what":"Univariate local","title":"Spatial Visium exploratory data analysis","text":"Local statistics yield result location rather whole dataset, global statistics may obscure local heterogeneity. See (Fotheringham 2009) interesting discussion relationships global local spatial statistics. Local statistics stored localResults field SFE object, can accessed localResult() localResults() functions SpatialFeatureExperiment package. univariate local methods Voyager listed :","code":"listSFEMethods(variate = \"uni\", scope = \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"moran-scatter-plot","dir":"Articles","previous_headings":"Univariate local","what":"Moran scatter plot","title":"Spatial Visium exploratory data analysis","text":"Moran scatter plot (Anselin 1996), x axis value spot, y axis average value neighbors. slope fitted line Moran’s . Sometimes clusters appear plot, showing different kinds neighborhoods. 
gene expression, use one gene (log normalized value) demonstrate: dashed lines mark mean Myh2 spatially lagged Myh2. singletons . Visium spots lower Myh2 expression neighbors don’t express Myh2 spots don’t express Myh2 usually least neighbors . two main clusters spots whose neighbors express Myh2: high (average) expression whose neighbors also high expression, low expression whose neighbors also low expression. features may show different kinds clusters. can use k-means clustering identify clusters, though clustering method supported bluster package can used. can use gget search module get Ensembl ID Myh2: Plot clusters space can also done colData, annotGeometry, etc. Moran’s I permutation testing.","code":"sfe_tissue <- runUnivariate(sfe_tissue, \"Myh2\", colGraphName = \"visium\", type = \"moran.plot\", swap_rownames = \"symbol\") moranPlot(sfe_tissue, \"Myh2\", graphName = \"visium\", swap_rownames = \"symbol\") set.seed(29) clusts <- clusterMoranPlot(sfe_tissue, \"Myh2\", BLUSPARAM = KmeansParam(2), swap_rownames = \"symbol\") gget$search(\"Myh2\", species=\"mouse\") #> ensembl_id gene_name #> 1 ENSMUSG00000033196 Myh2 #> ensembl_description #> 1 myosin, heavy polypeptide 2, skeletal muscle, adult [Source:MGI Symbol;Acc:MGI:1339710] #> ext_ref_description biotype #> 1 myosin, heavy polypeptide 2, skeletal muscle, adult protein_coding #> synonym #> 1 MHC2A, M.... 
#> url #> 1 https://useast.ensembl.org/mus_musculus/Gene/Summary?g=ENSMUSG00000033196 moranPlot(sfe_tissue, \"Myh2\", graphName = \"visium\", color_by = clusts$ENSMUSG00000033196, swap_rownames = \"symbol\") colData(sfe_tissue)$Myh2_moranPlot_clust <- clusts$ENSMUSG00000033196 plotSpatialFeature(sfe_tissue, \"Myh2_moranPlot_clust\", colGeometryName = \"spotPoly\", image = \"lowres\", color = \"black\", linewidth = 0.1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"local-morans-i","dir":"Articles","previous_headings":"Univariate local","what":"Local Moran’s I","title":"Spatial Visium exploratory data analysis","text":"recap, global Moran’s I defined \\[ I = \\frac{n}{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij}} \\frac{\\sum_{i=1}^n \\sum_{j=1}^n w_{ij} (x_i - \\bar{x})(x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\] Local Moran’s I (Anselin 1995) defined \\[ I_i = (n-1)\\frac{(x_i - \\bar{x})\\sum_{j=1}^n w_{ij} (x_j - \\bar{x})}{\\sum_{i=1}^n (x_i - \\bar{x})^2}. \\] ’s similar global Moran’s I, values locations \\(i\\) summed ’s normalization sum spatial weights. useful plot log normalized Myh2 gene expression context interpret local results: see regions higher Myh2 expression also stronger spatial autocorrelation. interesting see spatial autocorrelation relates gene expression level, much finding variance relates mean expression gene, usually indicates overdispersion compared Poisson scRNA-seq Visium data: gene, Visium spots higher expression also tend higher local Moran’s I, may may apply genes. Local spatial analyses often return matrix data frame. plotLocalResult() function default column local spatial method, columns can plotted well. Use localResultAttrs() function see columns present, use attribute argument specify column plot. local spatial methods return p-values location, column name like Pr(z != E(Ii)), test two sided (default, can changed alternative argument runUnivariate() passed relevant underlying function spdep). 
Negative log p-value computed facilitate visualization, p-value corrected multiple hypothesis testing p.adjustSP() spdep, number tests number neighbors location rather total number locations (-log10p_adj). plot following plots p-values, divergent palette used show locations significant adjusting multiple testing significant different colors. center divergent palette p = 0.05, brown spots significant dark blue means really significant. “pysal” column displays quadrants relative means Moran plot. result similar k-means clustering shown .","code":"sfe_tissue <- runUnivariate(sfe_tissue, type = \"localmoran\", features = \"Myh2\", colGraphName = \"visium\", swap_rownames = \"symbol\") plotSpatialFeature(sfe_tissue, features = \"Myh2\", colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\", color = \"black\", linewidth = 0.1) plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\",divergent = TRUE, diverge_center = 0, image_id = \"lowres\", swap_rownames = \"symbol\", color = \"black\", linewidth = 0.1) df <- data.frame(myh2 = logcounts(sfe_tissue)[rowData(sfe_tissue)$symbol == \"Myh2\",], Ii = localResult(sfe_tissue, \"localmoran\", \"Myh2\", swap_rownames = \"symbol\")[,\"Ii\"]) ggplot(df, aes(myh2, Ii)) + geom_point(alpha = 0.3) + labs(x = \"Myh2 (log counts)\", y = \"localmoran\") localResultAttrs(sfe_tissue, \"localmoran\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\", attribute = \"-log10p_adj\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\", color = \"black\", linewidth = 0.1) plotLocalResult(sfe_tissue, \"localmoran\", features = \"Myh2\", colGeometryName = \"spotPoly\", attribute = \"pysal\", swap_rownames = \"symbol\", 
image_id = \"lowres\", color = \"black\", linewidth = 0.1)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"getis-ord-gi","dir":"Articles","previous_headings":"Univariate local","what":"Getis-Ord Gi*","title":"Spatial Visium exploratory data analysis","text":"Getis-Ord Gi* used find hotspots coldspots feature space. hotspot cluster high values space, coldspot cluster low values space. Getis-Ord Gi* essentially z-score spatially lagged value feature location \\(i\\) (\\(\\sum_j w_{ij}x_j\\)), \\(w_{ij}\\) spatial weight. original publication Getis-Ord Gi* 1992 (Getis Ord 1992), spatial weight distance-based binary weight indicating whether another location within certain distance location \\(i\\). Getis-Ord Gi excludes location \\(i\\) computation mean variance lagged value, Gi* includes location \\(i\\) . Usually Gi Gi* yield similar results. mean variance used z-score differ Gi Gi* described paper 1995 (J. K. Ord Getis 1995) derived (Getis Ord 1992). Binary weights recommended Getis-Ord Gi*. High values Gi* indicate hotspots, low values Gi* indicate coldspots. Plot pseudo-p-values simulation hotspots expected. warm color indicates adjusted \\(p < 0.05\\). Local results can also computed annotation geometries. hotspots coldspots expected. 
Warm color indicates adjusted \\(p < 0.05\\).","code":"colGraph(sfe_tissue, \"visium_B\") <- findVisiumGraph(sfe_tissue, style = \"B\") sfe_tissue <- runUnivariate(sfe_tissue, type = \"localG_perm\", features = \"Myh2\", colGraphName = \"visium_B\", include_self = TRUE, swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"localG_perm\", features = \"Myh2\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image_id = \"lowres\", swap_rownames = \"symbol\", color = \"black\", linewidth = 0.1) localResultAttrs(sfe_tissue, \"localG_perm\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"localG\" \"Gi\" \"E.Gi\" #> [4] \"Var.Gi\" \"StdDev.Gi\" \"Pr(z != E(Gi))\" #> [7] \"Pr(z != E(Gi)) Sim\" \"Pr(folded) Sim\" \"Skewness\" #> [10] \"Kurtosis\" \"-log10p Sim\" \"-log10p_adj Sim\" #> [13] \"cluster\" plotLocalResult(sfe_tissue, \"localG_perm\", features = \"Myh2\", attribute = \"-log10p_adj Sim\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\") annotGraph(sfe_tissue, \"myofiber_poly2nb_B\") <- findSpatialNeighbors(sfe_tissue, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE, style = \"B\") sfe_tissue <- annotGeometryUnivariate(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", annotGraphName = \"myofiber_poly2nb_B\", include_self = TRUE, zero.policy = TRUE) plotLocalResult(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", divergent = TRUE, diverge_center = 0) plotLocalResult(sfe_tissue, \"localG_perm\", \"area\", annotGeometryName = \"myofiber_simplified\", attribute = \"-log10p_adj Sim\", divergent = TRUE, diverge_center = -log10(0.05))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"local-spatial-heteroscedasticity-losh","dir":"Articles","previous_headings":"Univariate local","what":"Local spatial heteroscedasticity 
(LOSH)","title":"Spatial Visium exploratory data analysis","text":"LOSH (J. Keith Ord Getis 2012) defined \\[ H_i = \\frac{\\sum_j w_{ij}\\left| e_j \\right|^a}{h_1\\sum_j w_{ij}} \\] \\(h_1 = \\sum_i \\left| e_i \\right|^a/n\\), \\(e_j = x_j - \\bar{x}_j\\), \\[ \\bar{x}_j = \\frac{\\sum_k w_{jk}x_k}{\\sum_k w_{jk}}. \\] default, \\(a = 2\\) LOSH like local variance. See (J. Keith Ord Getis 2012) details interpretation. gene, isn’t clear whether LOSH relates gene expression levels. Voyager wrap LOSH.mc() perform permutation testing LOSH, time consuming. chi-squared approximation described 2012 LOSH paper account non-normality data approximate mean variance permutation distributions, p-values LOSH can quickly computed, LOSH.cs(). gene, local conditions mostly homogenous, except spots injury site. Warm color indicates adjusted \\(p < 0.05\\).","code":"sfe_tissue <- runUnivariate(sfe_tissue, \"LOSH.cs\", \"Myh2\", colGraphName = \"visium\", swap_rownames = \"symbol\") plotLocalResult(sfe_tissue, \"LOSH.cs\", features = \"Myh2\", colGeometryName = \"spotPoly\", swap_rownames = \"symbol\", image_id = \"lowres\") localResultAttrs(sfe_tissue, \"LOSH.cs\", \"Myh2\", swap_rownames = \"symbol\") #> [1] \"Hi\" \"E.Hi\" \"Var.Hi\" \"Z.Hi\" \"x_bar_i\" #> [6] \"ei\" \"Pr()\" \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe_tissue, \"LOSH.cs\", features = \"Myh2\", attribute = \"-log10p_adj\", colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = -log10(0.05), swap_rownames = \"symbol\", image_id = \"lowres\") + theme_void()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"caveats","dir":"Articles","previous_headings":"","what":"Caveats","title":"Spatial Visium exploratory data analysis","text":"H&E image can alter perception colors geometries. 2D data supported present, although principle, sf GEOS support 3D data. Spatial neighborhoods make sense within tissue section. 
multiple tissue sections, biological replica, different conditions? mouse brain, different biological replica can registered Allen Common Coordinate Framework (CCF) spatially comparable. Indeed, interesting see biological variability healthy wild type gene expression fine scaled region brain. However, CCF tissues without stereotypical structure, adipose skeletal muscle. don’t good solution spatially compare different tissue sections yet. Perhaps global spatial statistics whole section histological regions within section can compared. problem remains select informative metrics compare. Perhaps spatially-informed dimension reduction method, taking gene count matrix, also adjacency matrices spatial neighborhood graphs (different sections different blocks matrix) projecting cells Visium spots different sections shared low dimensional space can facilitate comparison. batch effect must corrected, dimension reduction interpretable, scalable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig2_visium.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Spatial Visium exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 dplyr_1.1.4 #> [3] bluster_1.12.0 BiocParallel_1.36.0 #> [5] patchwork_1.2.0 scales_1.3.0 #> [7] sf_1.0-16 SFEData_1.4.0 #> [9] scran_1.30.2 
scater_1.30.1 #> [11] ggplot2_3.5.1 scuttle_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SpatialFeatureExperiment_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] edgeR_4.0.16 MASS_7.3-60.0.1 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] cowplot_1.1.3 DBI_1.2.2 #> [21] RColorBrewer_1.1-3 abind_1.4-5 #> [23] zlibbioc_1.48.2 purrr_1.0.2 #> [25] RCurl_1.98-1.14 rappdirs_0.3.3 #> [27] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [29] irlba_2.3.5.1 terra_1.7-71 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] dbscan_1.1-12 tools_4.3.3 #> [49] ggnewscale_0.4.10 ragg_1.3.0 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] mgcv_1.9-1 xfun_0.43 #> [57] HDF5Array_1.30.1 withr_3.0.0 #> [59] BiocManager_1.30.22 fastmap_1.1.1 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 spData_2.3.0 #> [65] digest_0.6.35 rsvd_1.0.5 #> [67] R6_2.5.1 mime_0.12 #> [69] textshaping_0.3.7 colorspace_2.1-0 #> [71] wk_0.9.1 RSQLite_2.3.6 #> [73] utf8_1.2.4 generics_0.1.3 #> [75] class_7.3-22 httr_1.4.7 #> [77] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [79] spdep_1.3-3 pkgconfig_2.0.3 #> [81] scico_1.5.0 gtable_0.3.5 #> [83] blob_1.2.4 XVector_0.42.0 #> [85] htmltools_0.5.8.1 png_0.1-8 #> [87] 
SpatialExperiment_1.12.0 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 BiocSingular_1.18.0 #> [107] dbplyr_2.5.0 beachmat_2.18.1 #> [109] xtable_1.8-4 cluster_2.1.6 #> [111] beeswarm_0.4.0 evaluate_0.23 #> [113] isoband_0.2.7 magick_2.8.3 #> [115] cli_3.6.2 locfit_1.5-9.9 #> [117] compiler_4.3.3 rlang_1.1.3 #> [119] crayon_1.5.2 labeling_0.4.3 #> [121] classInt_0.4-10 fs_1.6.4 #> [123] ggbeeswarm_0.7.2 viridisLite_0.4.2 #> [125] deldir_2.0-4 munsell_0.5.1 #> [127] Biostrings_2.70.3 Matrix_1.6-5 #> [129] ExperimentHub_2.10.0 sparseMatrixStats_1.14.0 #> [131] bit64_4.0.5 Rhdf5lib_1.24.2 #> [133] KEGGREST_1.42.0 statmod_1.5.0 #> [135] shiny_1.8.1.1 highr_0.10 #> [137] interactiveDisplayBase_1.40.0 AnnotationHub_3.10.1 #> [139] igraph_2.0.3 memoise_2.0.1 #> [141] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Slide-seq V2 spatial transcriptomic tool measures genome-wide expression using DNA-barcoded beads patterned slide non-regular array. beads used current protocol diameter \\(10 \\mu m\\) thus larger single cell, number detected transcripts order magnitude higher compared previous iteration technology. vignette, use Voyager analyze dataset generated using Slide-Seq V2 technology. data described Dissecting treatment-naive ecosystem human melanoma brain metastasis (Biermann et al. 2022). raw counts cell metadata publicly available GEO. focus one human melanoma brain metastasis (MBM) samples provided SFEData package SpatialFeatureExperiment(SFE) object. 
SFE object contains raw counts, QC metrics number UMIs genes detected per barcode, centroid coordinates barcode sf POINT geometry. SFE object SFEData package includes information 27,566 features 29,536 beads/barcodes.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(scran) library(bluster) library(ggplot2) library(patchwork) library(spdep) library(BiocParallel) theme_set(theme_bw()) (sfe <- BiermannMelaMetasData(dataset = \"MBM05_rep1\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 27566 29536 #> metadata(0): #> assays(1): counts #> rownames(27566): A1BG A1BG-AS1 ... ZZZ3 snoZ196 #> rowData names(3): means vars cv2 #> colnames(29536): ACCACTCATTTCTC-1 GTTCANTCCACGTA-1 ... ACGCGCAATCGTAG-1 #> TTGTTCCGTTCATA-1 #> colData names(4): sample_id nCounts nGenes prop_mito #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : xcoord ycoord #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Slide-Seq V2 Exploratory Data Analysis","text":"begin performing exploratory data analysis barcodes tissue. pre-computed QC measures stored object. Total UMI counts (nCounts), number genes detected per spot (nGenes), proportion mitochondrially encoded counts (prop_mito). , plot total number UMI counts per barcode violin plot space. latter task, leverage function plotSpatialFeature() uses geom_sf() plot geometries applicable. first lines compute average number UMI counts per barcode average plotted red line violin plot. 
barcode represented sf POINT geometry plot , note many beads quite low UMI counts, small regions throughout tissue appear high counts. perhaps due high cellular density melanoma cells, can speculate without image tissue. Interestingly, barcodes zero counts. contrast many scRNA-seq dataset many cells zero counts. Given density points, may choose aggregate points hexagonal grid avoid overplotting. hexagon colored total number UMI counts space hexagon may represent one barcode. worthwhile note cell segmentation data included dataset. Even though Slide-Seq V2 profile gene expression single cell resolution, cell segmentation data can flexibly stored annotGeometries SFE object. geometries can plotted barcode-level data can used sf operations like finding number barcodes localized single cell. plot visualizes number UMI counts per barcode log scale. appears barcodes higher counts co-localized regions throughout tissue, however, regions rather small may suggest spatial autocorrelation. Next find number genes detected per barcode. , QC feature provided nGenes colData attribute barcodes. Similar number UMI counts per barcode, seem small regions higher number genes throughout tissue. may correspond regions cellular diversity high cellular density, might expected context melanoma. can compute degree number UMI counts per barcode depends spatial location measurement. relationship, spatial autocorrelation, can quantified using Moran’s index spatial autocorrelation, Moran’s . computation Moran’s requires first definition constitutes objects “near” . simply, represented spatial weights matrix. One possible representation adjacency matrix. matrix can computed polygonal data resulting matrix can binary, entries 1 polygons share border, 0 elsewhere (including diagonal). entries can weighted different ways, including length border shared two polygons. 
schema necessarily lend well spatial transcriptomic technologies, polygonal boundaries cell objects may correspond measurements count matrix, individual spots barcodes may correspond multiple neighborhoods cells. Certainly, interpretation spatial weights matrix change depending technology. case, can generate putative spatial graph using k-nearest neighbors algorithm. implemented findSpatialNeighbors() function argument method = \"knearneigh\" . store result colGraphs() slot SFE object. Now compute Moran's barcode QC metrics using colDataMoransI(). results substantiate visual check spatial autocorrelation. continue investigating QC metrics. proportion UMIs mapping mitochondrial genes useful metric assessing cell quality scRNA-seq data. examine QC metric plotting versus total number UMI counts barcode. keeping expectations, barcodes associated fewer counts appear associated higher proportions mitochondrial reads. exclude barcodes containing >10% mitochondrial reads subsequent analysis. second line removes barcodes zero counts, necessary dataset barcodes zero counts. 
keep just demonstrate method.","code":"names(colData(sfe)) #> [1] \"sample_id\" \"nCounts\" \"nGenes\" \"prop_mito\" avg <- as.data.frame(colData(sfe)) |> dplyr::summarise(across(-sample_id, mean)) violin <- plotColData(sfe, \"nCounts\") + geom_hline(aes(yintercept = nCounts), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"nCounts\", colGeometryName = \"centroids\", size = 0.2) + theme_void() violin + spatial as.data.frame(cbind(spatialCoords(sfe), colData(sfe))) |> ggplot(aes(xcoord, ycoord, z=nCounts)) + stat_summary_hex(fun = function(x) sum(x), bins=100) + scale_fill_distiller(palette = \"Blues\", direction = 1) + labs(fill='nCounts') + theme_bw() + coord_equal() + scale_x_continuous(expand = expansion()) + scale_y_continuous(expand = expansion()) + theme_void() colData(sfe)$log_nCounts <- log(colData(sfe)$nCounts) avg <- as.data.frame(colData(sfe)) |> dplyr::summarise(across(-sample_id, mean)) violin <- plotColData(sfe, \"log_nCounts\") + geom_hline(aes(yintercept = log_nCounts), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"log_nCounts\", colGeometryName = \"centroids\", size = 0.2) violin + spatial violin <- plotColData(sfe, \"nGenes\") + geom_hline(aes(yintercept = nGenes), avg, color=\"red\") + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, features = \"nGenes\", colGeometryName = \"centroids\", size = 0.2) violin + spatial colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' features_use <- c(\"nCounts\", \"nGenes\") sfe <- colDataMoransI(sfe, features_use, colGraphName = \"knn5\") colFeatureData(sfe)[features_use,] #> DataFrame with 2 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.0965909 48.6328 
#> nGenes 0.0957030 11.2037 violin <- plotColData(sfe, \"prop_mito\") + geom_hline(aes(yintercept = prop_mito), avg, color=\"red\") + theme(legend.position = \"top\") mito <- plotColData(sfe, x = \"nCounts\", y = \"prop_mito\") violin + mito # Spatial neighborhood graph is reconstructed when subsetting columns # Use drop = TRUE to drop the graph without reconstruction, whose indices are # no longer valid sfe_filt <- sfe[, colData(sfe)$prop_mito < 0.1] #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' sfe_filt <- sfe_filt[rowSums(counts(sfe_filt)) > 0,]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"data-normalization","dir":"Articles","previous_headings":"","what":"Data Normalization","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Normalization spatial transcriptomics data non-trivial requires thoughtful consideration. Similarly scRNA-seq data analysis, goal normalization remove effects technical variation derive quantity reflects biological variation. However, several questions arise considering best practices spatial data normalization. example, spatial methods average detect fewer UMIs single-cell counterparts, may preclude use normalization techniques log transformation shown . ’s , always evident whether spatial autocorrelation genes (QC measures) artifact technology, thus, whether normalization methods preserve spatial autocorrelation architecture. questions provide avenues active research development, currently unresolved. 
end, log-normalize data cell identify variable genes subsequent analysis.","code":"sfe_filt <- logNormCounts(sfe_filt) dec <- modelGeneVar(sfe_filt) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension Reduction and Clustering","title":"Slide-Seq V2 Exploratory Data Analysis","text":"Much like scRNA-seq analysis, perform principal component analysis (PCA) clustering. note method use spatial information. can plot variance explained PC. see first components explain variance data. principal components (PCs) can plotted space. notice PCs may show spatial structure correlates biological niches cells. Without cellular overlays, can speculate potential relevance barcodes seem separated PC, PC doe seem separate distinct neighborhoods barcodes. Now can cluster barcodes using graph-based clustering algorithm plot space. plot colored cluster id. naive interpretation plot shows distinct niches barcodes separated abundant, intervening types. 
may indicative biological processes hand, namely melanoma metastasis, ‘hotspots’ melanoma proliferation separated unaffected normal tissue.","code":"set.seed(29) sfe_filt <- runPCA(sfe_filt, ncomponents = 30, subset_row = hvgs, scale = TRUE, BSPARAM = BiocSingular::IrlbaParam()) # scale as in Seurat ElbowPlot(sfe_filt, ndims = 30) + theme_bw() spatialReducedDim(sfe_filt, \"PCA\", ncomponents = 4, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, scattermore = TRUE, pointsize = 0.5) colData(sfe_filt)$cluster <- clusterRows(reducedDim(sfe_filt, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotSpatialFeature(sfe_filt, \"cluster\", colGeometryName = \"centroids\") + guides(colour = guide_legend(override.aes = list(size=3)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"morans-i","dir":"Articles","previous_headings":"Dimension Reduction and Clustering","what":"Moran’s I","title":"Slide-Seq V2 Exploratory Data Analysis","text":"One avenue future analysis includes identifying genes differentially expressed cluster, can interrogated findMarkers() non-spatial context calculateMoransI() spatial context. spatial case, consideration given whether differences seen across tissue represent biological difference artifacts field view. run global Moran’s log normalized gene expression. Now, might ask: genes display spatial autocorrelation? Spatial variability can also investigated using differential expression testing known anatomical regions complemented spatial location. One potential drawback approach variability induced melanoma, rather native tissue architecture, may preclude identification typical structures. analyses can done stage: gene expression patterns, , differentiate neighborhoods melanoma cells? 
genes differentially expressed cluster?","code":"sfe_filt <- runMoransI(sfe_filt, features = hvgs, BPPARAM = MulticoreParam(2)) top_moran <- rownames(sfe_filt)[order(rowData(sfe_filt)$moran_sample01, decreasing = TRUE)[1:4]] plotSpatialFeature(sfe_filt, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, pointsize = 0.5)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig3_slideseq_v2.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"Slide-Seq V2 Exploratory Data Analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 BiocParallel_1.36.0 #> [3] spdep_1.3-3 sf_1.0-16 #> [5] spData_2.3.0 patchwork_1.2.0 #> [7] bluster_1.12.0 scran_1.30.2 #> [9] scater_1.30.1 ggplot2_3.5.1 #> [11] scuttle_1.12.0 SpatialExperiment_1.12.0 #> [13] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [15] Biobase_2.62.0 GenomicRanges_1.54.1 #> [17] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [19] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [21] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [23] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] 
yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] cowplot_1.1.3 DBI_1.2.2 #> [19] RColorBrewer_1.1-3 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] dqrng_0.3.2 pkgdown_2.0.9 #> [33] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [35] DelayedArray_0.28.0 tidyselect_1.2.1 #> [37] farver_2.1.1 ScaledMatrix_1.10.0 #> [39] viridis_0.6.5 BiocFileCache_2.10.2 #> [41] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [43] e1071_1.7-14 systemfonts_1.0.6 #> [45] tools_4.3.3 ggnewscale_0.4.10 #> [47] ragg_1.3.0 Rcpp_1.0.12 #> [49] glue_1.7.0 gridExtra_2.3 #> [51] SparseArray_1.2.4 xfun_0.43 #> [53] dplyr_1.1.4 HDF5Array_1.30.1 #> [55] withr_3.0.0 BiocManager_1.30.22 #> [57] fastmap_1.1.1 boot_1.3-30 #> [59] rhdf5filters_1.14.1 fansi_1.0.6 #> [61] digest_0.6.35 rsvd_1.0.5 #> [63] R6_2.5.1 mime_0.12 #> [65] textshaping_0.3.7 colorspace_2.1-0 #> [67] wk_0.9.1 scattermore_1.2 #> [69] RSQLite_2.3.6 hexbin_1.28.3 #> [71] utf8_1.2.4 generics_0.1.3 #> [73] class_7.3-22 httr_1.4.7 #> [75] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [77] pkgconfig_2.0.3 scico_1.5.0 #> [79] gtable_0.3.5 blob_1.2.4 #> [81] XVector_0.42.0 htmltools_0.5.8.1 #> [83] scales_1.3.0 png_0.1-8 #> [85] knitr_1.45 rjson_0.2.21 #> [87] curl_5.2.1 proxy_0.4-27 #> [89] cachem_1.0.8 rhdf5_2.46.1 #> [91] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [93] parallel_4.3.3 vipor_0.4.7 #> [95] AnnotationDbi_1.64.1 desc_1.4.3 #> [97] s2_1.1.6 pillar_1.9.0 #> [99] grid_4.3.3 vctrs_0.6.5 #> [101] promises_1.3.0 BiocSingular_1.18.0 #> [103] dbplyr_2.5.0 beachmat_2.18.1 #> [105] xtable_1.8-4 cluster_2.1.6 #> [107] beeswarm_0.4.0 evaluate_0.23 #> [109] magick_2.8.3 cli_3.6.2 #> [111] locfit_1.5-9.9 compiler_4.3.3 #> [113] rlang_1.1.3 crayon_1.5.2 #> [115] labeling_0.4.3 classInt_0.4-10 #> [117] fs_1.6.4 ggbeeswarm_0.7.2 #> [119] viridisLite_0.4.2 deldir_2.0-4 #> [121] 
munsell_0.5.1 Biostrings_2.70.3 #> [123] Matrix_1.6-5 ExperimentHub_2.10.0 #> [125] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [127] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [129] statmod_1.5.0 shiny_1.8.1.1 #> [131] interactiveDisplayBase_1.40.0 highr_0.10 #> [133] AnnotationHub_3.10.1 igraph_2.0.3 #> [135] memoise_2.0.1 bslib_0.7.0 #> [137] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"CosMX non-small cell lung cancer data","text":"Nanostring GeoMX DSP popular spatial transcriptomics technology formalin fixed paraffin embedded (FFPE) tissues, doesn’t single cell resolution. CosMX FISH based technology FFPE tissue (He2021-oy?) single cell resolution, vignette provides example analyze CosMX data voyager. Note FFPE common way preserve archive tissue, cases, samples available may FFPE. CosMX dataset non-small cell lung cancer used described (He2021-oy?). processed data available download Nanostring website. gene count matrix, cell metadata, cell segmentation polygon coordinates downloaded Nanostring website CSV files read R data frames. gene count matrix converted sparse matrix. cell metadata contains centroid coordinates cells. cell polygon data frames converted sf data frame df2sf() function SpatialFeatureExperiment (SFE). used construct SFE object. Cell segmentation available one z-plane. first biological replicate included SFEData package. biological replicate 980 features 100,290 cells. Take look cells space: single cell resolution, lot details can seen, although ’s artifact borders fields view (FOVs). 
Plot cell density","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) # devel version of plotExpression library(scran) library(bluster) library(ggplot2) library(patchwork) library(stringr) library(spdep) library(BiocParallel) library(BiocSingular) theme_set(theme_bw()) (sfe <- HeNSCLCData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 980 100290 #> metadata(0): #> assays(1): counts #> rownames(980): AATK ABL1 ... NegPrb22 NegPrb23 #> rowData names(3): means vars cv2 #> colnames(100290): 1_1 1_2 ... 30_4759 30_4760 #> colData names(17): Area AspectRatio ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : CenterX_global_px CenterY_global_px #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: plotGeometry(sfe, MARGIN = 2L, type = \"cellSeg\") plotCellBin2D(sfe, hex = TRUE)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"cells","dir":"Articles","previous_headings":"Quality control (QC)","what":"Cells","title":"CosMX non-small cell lung cancer data","text":"Single cell RNA-seq (scRNA-seq) technologies typically don’t quantify cell morphology, gene expression Visium doesn’t single cell resolution. single cell resolution smFISH based data, cell gene expression related QC metrics total number transcripts detected number genes detected, also cell morphology area (z-plane segmentation polygons provided) aspect ratio. Area relevant QC since can flag falsely undersegmented cells, .e. several cells falsely considered one cell segmentation program. 
However, since pre-defined gene panel used mitochondrially encoded genes quantified, scRNA-seq QC metric proportion mitochondrially encoded counts applicable. QC metrics precomputed stored colData Cell area, aspect ratio, marker stain intensities, .e. columns “sample_id” come Nanostring’s website. sf package can compute areas cell polygons. R, EBImage package can compute morphological metrics aspect ratio, eccentricity, orientation, etc., requires data converted raster. OpenCV can compute morphological metrics polygons without converting raster, needs called Python C++. Since math behind many basic morphological metrics pretty simple, may add Voyager future version. Since plotting 100,000 polygons slow plot isn’t large enough us see polygons anyway, use scattermore rasterize plot speed plotting. Instead plotting every single point, now ggplot merely displays rasterized image. Number transcript spots detected per cell make nCounts nGenes comparable across datasets, divide number genes probed. dataset, 960 genes, 20 negative controls. However, different genes may probed different datasets, can different tissues, make nCounts nGenes completely comparable across datasets. However, may still somewhat comparable, since genes highly expressed major cell types tissue tend selected gene panel. means cells mostly less 1 transcript count per gene average, surprising since cells express genes. cells detected express less 30% genes probed. Number genes (980) detected per cell Based spatial plot, seems nCounts nGenes biologically relevant, cells transcripts detected. nCounts relates nGenes ’s nature cells without transcripts? cells without transcripts central cavity. “empty” cells tend smaller cells also really large ones. Cell area distribution Larger cells likely found certain areas tissue. biological, -segmentation likely cell type tissue region. area relate total counts? may vaguely seem cells total counts tend larger (least z-plane), cells large low total counts. 
Negative control probes used dataset QC. calculate proportion transcripts attributed negative controls. NA’s empty cells, proportion low except outliers. prop_neg relate nCounts? looks kind like proportion mitochondrial counts vs. nCounts plot scRNA-seq, cells fewer total counts tend higher proportion mitochondrial counts. distribution obviously bimodal, since x-axis log transformed better visualize distribution, 0’s removed. ’s kind arbitrary; now ’ll remove cells 10% transcripts negative controls. removing low quality cells, 100,095 cells left.","code":"names(colData(sfe)) #> [1] \"Area\" \"AspectRatio\" \"Width\" #> [4] \"Height\" \"Mean.MembraneStain\" \"Max.MembraneStain\" #> [7] \"Mean.PanCK\" \"Max.PanCK\" \"Mean.CD45\" #> [10] \"Max.CD45\" \"Mean.CD3\" \"Max.CD3\" #> [13] \"Mean.DAPI\" \"Max.DAPI\" \"sample_id\" #> [16] \"nCounts\" \"nGenes\" # Function to plot violin plot for distribution and spatial at once plot_violin_spatial <- function(sfe, feature) { violin <- plotColData(sfe, feature, point_fun = function(...) list()) spatial <- plotSpatialFeature(sfe, feature, colGeometryName = \"centroids\", scattermore = TRUE) violin + spatial + plot_layout(widths = c(1, 2)) } plot_violin_spatial(sfe, \"nCounts\") summary(sfe$nCounts) #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 0.0 135.0 248.0 302.8 409.0 2475.0 n_panel <- 960 colData(sfe)$nCounts_normed <- sfe$nCounts/n_panel colData(sfe)$nGenes_normed <- sfe$nGenes/n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plot_violin_spatial(sfe, \"nGenes\") summary(sfe$nGenes) #> Min. 1st Qu. Median Mean 3rd Qu. Max. 
#> 0.0 75.0 119.0 127.1 171.0 500.0 plotColData(sfe, x = \"nCounts\", y = \"nGenes\", bins = 100) colData(sfe)$is_empty <- colData(sfe)$nCounts < 1 plotSpatialFeature(sfe, \"is_empty\", \"cellSeg\") plotColData(sfe, x = \"Area\", y = \"is_empty\") plot_violin_spatial(sfe, \"Area\") plotColData(sfe, x = \"nCounts\", y = \"Area\", bins = 100) + theme_bw() neg_inds <- str_detect(rownames(sfe), \"^NegPrb\") # Number of negative control probes sum(neg_inds) #> [1] 20 colData(sfe)$prop_neg <- colSums(counts(sfe)[neg_inds,])/colData(sfe)$nCounts plot_violin_spatial(sfe, \"prop_neg\") #> Warning: Removed 142 rows containing non-finite outside the scale range #> (`stat_ydensity()`). plotColData(sfe, x = \"nCounts\",y = \"prop_neg\", bins = 100) #> Warning: Removed 142 rows containing non-finite outside the scale range #> (`stat_bin2d()`). # The zeros are removed plotColDataHistogram(sfe, \"prop_neg\") + scale_x_log10() #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 59213 rows containing non-finite outside the scale range #> (`stat_bin()`). # Remove low quality cells (sfe <- sfe[,!sfe$is_empty & sfe$prop_neg < 0.1]) #> class: SpatialFeatureExperiment #> dim: 980 100095 #> metadata(0): #> assays(1): counts #> rownames(980): AATK ABL1 ... NegPrb22 NegPrb23 #> rowData names(3): means vars cv2 #> colnames(100095): 1_1 1_2 ... 30_4759 30_4760 #> colData names(21): Area AspectRatio ... 
is_empty prop_neg #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : CenterX_global_px CenterY_global_px #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"markers","dir":"Articles","previous_headings":"Quality control (QC) > Cells","what":"Markers","title":"CosMX non-small cell lung cancer data","text":"Nanostring provides cell stain marker intensities cell metadata. plot aspect ratio mean intensity cells stains markers, plotted . PanCK marker epithelial cells. CD45 leukocyte marker. CD3 T cell marker. Since takes quite plot 100,000 cells 6 times, scattermore really helps.","code":"names(colData(sfe)) #> [1] \"Area\" \"AspectRatio\" \"Width\" #> [4] \"Height\" \"Mean.MembraneStain\" \"Max.MembraneStain\" #> [7] \"Mean.PanCK\" \"Max.PanCK\" \"Mean.CD45\" #> [10] \"Max.CD45\" \"Mean.CD3\" \"Max.CD3\" #> [13] \"Mean.DAPI\" \"Max.DAPI\" \"sample_id\" #> [16] \"nCounts\" \"nGenes\" \"nCounts_normed\" #> [19] \"nGenes_normed\" \"is_empty\" \"prop_neg\" plotSpatialFeature(sfe, c(\"AspectRatio\", \"Mean.DAPI\", \"Mean.MembraneStain\", \"Mean.PanCK\", \"Mean.CD45\", \"Mean.CD3\"), colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"genes","dir":"Articles","previous_headings":"Quality control (QC)","what":"Genes","title":"CosMX non-small cell lung cancer data","text":"red line \\(y = x\\) expected Poisson data. Gene expression dataset variance expected Poisson, even gene lower expression. Zoom negative controls Among “high quality” cells, negative controls still higher variance relative mean compared Poisson. Negative controls vs. 
real genes negative controls lower mean “expression” vast majority real genes.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_neg <- neg_inds plotRowData(sfe, x = \"means\", y = \"vars\", bins = 50) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() as.data.frame(rowData(sfe)[neg_inds,]) |> ggplot(aes(means, vars)) + geom_point() + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() plotRowData(sfe, x = \"means\", y = \"is_neg\") + scale_y_log10() + annotation_logticks(sides = \"b\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"spatial-autocorrelation-in-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation in QC metrics","title":"CosMX non-small cell lung cancer data","text":"spatial neighborhood graph required spatial dependence analyses spdep. Without benchmark, don’t yet know type neighborhood graph best purpose. Methods find spatial neighborhood graphs spdep knearneigh() (k nearest neighbors), dnearneigh() (find cells within certain distance), poly2nb() (polygon contiguity) recommended larger datasets. cell-cell contact may biologically relevant, cell segmentation imperfect, leading non-contiguous cell segmentation polygons cells appear contiguous H&E, using poly2nb() find polygon contiguity neighbors without supplementing another kind neighborhood problematic. Delaunay triangulation deldir package, used spdep (tri2nb()), takes 4 5 minutes dataset size, run time increases much drastically linearly number cells increases. Sphere Interest (SOI) graph (soi.graph()) prunes edges triangulation long, take long . triangulation SOI graph, slower knearneigh(), dnearneigh(), poly2nb(), somewhat practical considerations. 
implementation gabrielneigh() relativeneigh() take impracticably long (hour terminated R session impatience) dataset recommended. Methods find approximate nearest neighbors Annoy (AnnoyParam()) HNSW (HnswParam()), supported bluster BiocNeighbors packages might speed finding graphs, haven’t formally benchmarked . See Chapter 14 Spatial Data Science proximity areal data detailed discussion different neighborhood graphs spdep. methods areal data first wrapped Voyager much spatial transcriptomics data analogous areal geospatial data, data several cells aggregated areas, happens Visium spots. Just like geospatial areal data, Visium aggregation areas arbitrary represent underlying spatial process. Although sometimes geographical areal units arbitrary, tissues generally hexagonal grids means Visium spot polygons arbitrary context. Regions interest (ROI) selection spatial transcriptomics methods, laser capture microdissection (LCM) GeoMX DSP obviously analogous geospatial areal data. aggregation also happens analyze smFISH-based data cell level, basic unit observation individual transcript spots. spdep caters areal data, gstat caters geostatistical data, continuous spatial process sampled point locations. ways, spatial transcriptomics data analogous geostatistical data. Visium samples supposed spatial biological process regular hexagonal grid, pretend Visium spots points. smFISH-based single cell resolution data, cells observed can thought sample underlying spatial biological process supervening specific locations cells. sense, cells samples, since smFISH based technologies attempt visualize cells tissue section. However, biological function tissue depend particular spatial arrangement individual cells (.e. supervenes particular spatial arrangement), cell types, specific cell locations observed can thought samples process, consider cell basic unit spatial process. 
Voyager 1.2.0 (Bioconductor 3.17), added semivariograms (gstat package) exploratory tool identify presence spatial autocorrelation, length scale, anisotropy (.e. different different directions). Covariates can specified computing variogram account spatial trends adjust another spatial variable. However, unlike Moran’s , semivariogram can’t identify negative spatial autocorrelation, although since spatial neighborhood graph typically encode spatial directions, spdep autocorrelation metrics can’t identify anisotropy. Another problem semivariogram assumes data intrinsically stationary, .e. semivariogram holds entire dataset, similarity two cells depends distance , may case spatial autocorrelation varies space evident genes local spatial analyses. Single cell smFISH based data also dissimilar areal geostatistical data important ways. geospatial areal data, data numerous basic units spatial process (e.g. people epidemiology) aggregated areas (e.g. cities), whereas histological space, cell arguably sensible basic unit biological spatial process individual mRNA molecules. Unlike geostatistical data, cells seen tissue section often polygons tessellating tissue section rather points. Furthermore, ideally samples underlying spatial process affect spatial process geostatistical data, cells play active roles biological spatial process. However, data analysis methods areal geostatistical data can still relevant EDA descriptive models (causal mechanistic) single cell smFISH data. Different types spatial neighborhood graphs cells may relevant different processes. instance, contiguity cell segmentation polygons relevant contact involved cell signaling, although cell segmentation imperfect. Positive spatial autocorrelation can arise contact activation, negative autocorrelation can arise contact inhibition. However, cells may also influenced longer range factors secreted ligands, morphogens, simpler spatial trends like distance artery vein. 
case, perhaps semivariogram using Euclidean distance cells spatial weights spatial autocorrelation metrics relevant EDA. interesting compare results different spatial neighborhood graphs spatial weights, spdep gstat. Perhaps one best method, different methods reveal different phenomena. problem choosing spatial neighborhood matrix long history far predating spatial transcriptomics. See (Getis 2009) brief discussion decades work around issue. Spatial autocorrelation metrics seek measure nearby things tend similar dissimilar, neighborhood graph edge weights define mean “nearby” areal data. Note Visium spot can contain several dozens cells, spatial neighborhood graphs Visium spots describe neighborhood relationships much longer length scales spatial neighborhood graphs single cells, spatial autocorrelation metrics using Visium graph different meanings cellular neighborhood graphs. now, just demonstrate software usage, use k nearest neighborhood graph distance based edge weights, commonly done graph based clustering scRNA-seq, although don’t yet know best value k scenario. purpose vignette, say use \\(k = 5\\), execution time isn’t outrageous. argument style = \"W\" row normalize adjacency matrix spatial neighborhood graph necessary Moran scatter plot. Inverse distance edge weights can take small values matter relative rather absolute values distance arbitrary unit; row normalizing adjacency matrix makes weighted average value neighbors comparable value cell . tissue, many cells appear contiguous, since cell segmentation imperfect, many false singletons, makes polygon contiguity neighbors poly2nb() problematic without modification. based distribution number neighbors based contiguity, \\(k = 5\\) doesn’t seem bad approximate contiguity. Now compute Moran’s cell QC metrics Positive spatial autocorrelation suggested, stronger nCounts nGenes. length scales spatial autocorrelation QC metrics? 
nice lagged neighborhood graphs can stored reused features rather recomputed feature spdep::sp.correlogram() called behind scene . takes minutes run, long typical song. Another way find length scale spatial autocorrelation bin cells bins different sizes find spatial autocorrelation bin size, probably faster finding lagged values higher higher neighborhoods since geom_bin2d() geom_hex() ggplot2 run pretty fast even large datasets. use semivariogram; gstat also bins data estimating semivariogram calculating semivariogram long distance much faster correlogram cell-cell neighborhood graphs. Note MulticoreParam() doesn’t work Windows; vignette built Linux. Use SnowParam() DoparParam() Windows. See ?BiocParallelParam available parallel processing backends. notice significant performance differences SnowParam() MulticoreParam() context. seem similar length scales, aspect ratios tend decay quickly. Moran’s scatter plot nCounts. first panel, density points plot, second, points influential fitting line highlighted red, still 2D histogram avoid overplotting. obvious clusters plot. 
Local Moran’s nCounts Cool, appears epithelial regions tend homogenous nCounts.","code":"system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' #> user system elapsed #> 5.755 0.029 5.791 features_use <- c(\"nCounts\", \"nGenes\", \"Area\", \"AspectRatio\") sfe <- colDataMoransI(sfe, features_use, colGraphName = \"knn5\") colFeatureData(sfe)[features_use,] #> DataFrame with 4 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.386655 6.80818 #> nGenes 0.434639 3.19599 #> Area 0.198152 8.96966 #> AspectRatio 0.256211 43.05666 system.time( sfe <- colDataUnivariate(sfe, \"sp.correlogram\", features = features_use, colGraphName = \"knn5\", order = 6, zero.policy = TRUE, BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 470.737 12.785 244.024 plotCorrelogram(sfe, features_use) sfe <- colDataUnivariate(sfe, \"moran.plot\", \"nCounts\", colGraphName = \"knn5\") p1 <- moranPlot(sfe, \"nCounts\", binned = TRUE, plot_influential = FALSE) p2 <- moranPlot(sfe, \"nCounts\", binned = TRUE) p1 / p2 + plot_layout(guides = \"collect\") sfe <- colDataUnivariate(sfe, \"localmoran\", \"nCounts\", colGraphName = \"knn5\") plotLocalResult(sfe, \"localmoran\", \"nCounts\", colGeometryName = \"cellSeg\", divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"data-normalization","dir":"Articles","previous_headings":"","what":"Data normalization","title":"CosMX non-small cell lung cancer data","text":"Given may relationship cell size total counts, total counts may biological thus purely treated technical, questions raised data normalization different standard scRNA-seq practices. instance, technical contributions total counts kind data? 
Furthermore, cell area, since part technical, z-plane cell segmentation polygons intersects cell, cell types, biological? Also, different methods data normalization affect spatial autocorrelation? spatial autocorrelation used ways normalizing data? Besides correcting technical effects making gene expression cells different total counts comparable, data normalization stabilizes variance tries make data normally distributed since many statistical methods assume normally distributed data. don’t know best practice normalize kind data, still normalize data downstream analyses.","code":"sfe <- logNormCounts(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"CosMX non-small cell lung cancer data","text":"run global Moran’s log normalized gene expression. real genes tend spatial autocorrelation negative controls? seems least shorter length scale captured k nearest neighbor graph, genes don’t strong spatial autocorrelation strong positive spatial autocorrelation. contrast, Moran’s negative controls closely packed around 0, indicating lack spatial autocorrelation, good sign, evidence technical artifact manifests spatial trend manifest negative controls. genes highest Moran’s ? highlight epithelial regions. regions spatially organized, short length scale used Moran’s correlogram shows Moran’s decays first order neighbors. wonder using longer length scale change results.","code":"# Note: on your computer, you can put progressbar = TRUE inside MulticoreParam() # to show progress bar. This applies to any BiocParallelParam. 
sfe <- runMoransI(sfe, features = rownames(sfe), BPPARAM = MulticoreParam(2)) plotRowData(sfe, x = \"moran_sample01\", y = \"is_neg\") + geom_hline(yintercept = 0, linetype = 2) top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"non-spatial-dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Non-spatial dimension reduction and clustering","title":"CosMX non-small cell lung cancer data","text":"first PC highlights epithelium. PC2 highlights T cells. PC4 might highlight leukocytes. Need check genes highest loadings find PCs mean. Non-spatial clustering locating clusters space analyses can done stage: many cell types neighborhood cell? subject different definitions neighborhood. cell types tend co-localize ? Find spatial regions based cell type colocalization, can done R package spicyR (Canete et al. 2022)","code":"set.seed(29) sfe <- runPCA(sfe, ncomponents = 30, scale = TRUE, BSPARAM = IrlbaParam()) ElbowPlot(sfe, ndims = 30) plotDimLoadings(sfe, dims = 1:6) spatialReducedDim(sfe, \"PCA\", 6, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2, scattermore = TRUE) colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\")[,1:15], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' data(\"ditto_colors\") plotPCA(sfe, ncomponents = 4, colour_by = \"cluster\") + scale_color_manual(values = ditto_colors) #> Scale for colour is already present. #> Adding another scale for colour, which will replace the existing scale. 
plotSpatialFeature(sfe, \"cluster\", colGeometryName = \"cellSeg\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential expression","title":"CosMX non-small cell lung cancer data","text":"Cluster marker genes found Wilcoxon rank sum test commonly done scRNA-seq. ’s already sorted p-values. Get significant marker cluster plot. Since ’re many points, used development version scater plot points, uninformative due overplotting make plot really slow. Plot top marker genes heatmap","code":"markers <- findMarkers(sfe, groups = colData(sfe)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[6]] #> DataFrame with 980 rows and 12 columns #> p.value FDR summary.AUC AUC.1 AUC.2 AUC.3 #> #> SERPINA1 0.00000e+00 0.00000e+00 0.900457 0.882702 0.917169 0.927197 #> LTF 0.00000e+00 0.00000e+00 0.885649 0.898378 0.896631 0.889493 #> SOD2 1.34767e-268 4.40240e-266 0.763204 0.708843 0.740953 0.759093 #> CXCL2 1.62703e-179 3.98621e-177 0.646570 0.739327 0.700563 0.752507 #> LAMP3 3.01063e-172 5.90084e-170 0.710243 0.718463 0.715373 0.716238 #> ... ... ... ... ... ... ... #> TPSAB1 1 1 0.01337636 0.541769 0.517367 0.530969 #> TPSB2 1 1 0.00641018 0.534170 0.519936 0.530758 #> VIM 1 1 0.16441559 0.588792 0.164416 0.398639 #> VWF 1 1 0.24200118 0.517581 0.242001 0.510094 #> XBP1 1 1 0.21892716 0.676944 0.618364 0.632037 #> AUC.4 AUC.5 AUC.7 AUC.8 AUC.9 AUC.10 #> #> SERPINA1 0.872056 0.928391 0.918108 0.924688 0.900457 0.860946 #> LTF 0.903540 0.905416 0.904550 0.902354 0.885649 0.897075 #> SOD2 0.703673 0.790085 0.690380 0.747167 0.763204 0.824461 #> CXCL2 0.731505 0.754922 0.743084 0.752041 0.738496 0.646570 #> LAMP3 0.714841 0.720568 0.722422 0.715502 0.710243 0.716974 #> ... ... ... ... ... ... ... 
#> TPSAB1 0.527268 0.532545 0.525186 0.522748 0.01337636 0.518232 #> TPSB2 0.520962 0.507907 0.524842 0.522807 0.00641018 0.524115 #> VIM 0.360806 0.469073 0.344926 0.360144 0.31454577 0.633386 #> VWF 0.511037 0.508046 0.515044 0.509359 0.50754675 0.508200 #> XBP1 0.616943 0.218927 0.611995 0.614587 0.59660992 0.493690 genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe, genes_use, x = \"cluster\", point_fun = function(...) list()) genes_use2 <- unique(unlist(lapply(markers, function(x) rownames(x)[1:5]))) plotGroupedHeatmap(sfe, genes_use2, group = \"cluster\", colour = scales::viridis_pal()(100))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"local-spatial-statistics-of-marker-genes","dir":"Articles","previous_headings":"","what":"Local spatial statistics of marker genes","title":"CosMX non-small cell lung cancer data","text":"Plot genes space Moran’s marker genes Local Moran’s marker genes seems histological regions tend spatially homogenous gene expression others. epithelial region tends homogenous. Run local spatial heteroscedasticity (LOSH) marker genes find local heterogeneity genes heterogeneous also highly expressed, COLA1 IGKC. However case genes. example, MZT2A quite ubiquitously expressed, heterogeneous regions others, KRT19 seem much heterogeneous ’s highly expressed. MZT2A, LOSH picked artifact edges FOVs, although apparent genes plotted . don’t information cell belongs FOV, FOV edge effects considered data normalization. 
interesting systematically see LOSH relates gene expression across genes, differs cell types gene functions.","code":"plotSpatialFeature(sfe, genes_use, colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE) rowData(sfe)[genes_use, \"moran_sample01\", drop = FALSE] #> DataFrame with 10 rows and 1 column #> moran_sample01 #> #> MZT2A 0.199130 #> COL4A1 0.213595 #> IGHM 0.293282 #> HLA-DPA1 0.242441 #> IGKC 0.425192 #> SERPINA1 0.254077 #> COL1A1 0.394780 #> IL7R 0.177655 #> TPSB2 0.206115 #> KRT19 0.770433 sfe <- runUnivariate(sfe, \"localmoran\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = genes_use, colGeometryName = \"centroids\", ncol = 2, divergent = TRUE, diverge_center = 0, scattermore = TRUE) sfe <- runUnivariate(sfe, \"LOSH\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"LOSH\", features = genes_use, colGeometryName = \"centroids\", ncol = 2, scattermore = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig4_cosmx.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"CosMX non-small cell lung cancer data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 BiocSingular_1.18.0 #> [3] BiocParallel_1.36.0 spdep_1.3-3 #> [5] sf_1.0-16 
spData_2.3.0 #> [7] stringr_1.5.1 patchwork_1.2.0 #> [9] bluster_1.12.0 scran_1.30.2 #> [11] scater_1.30.1 ggplot2_3.5.1 #> [13] scuttle_1.12.0 SpatialExperiment_1.12.0 #> [15] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [17] Biobase_2.62.0 GenomicRanges_1.54.1 #> [19] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [21] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [23] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [25] SFEData_1.4.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] metapod_1.10.1 httpuv_1.6.15 #> [17] sp_2.1-4 cowplot_1.1.3 #> [19] DBI_1.2.2 RColorBrewer_1.1-3 #> [21] abind_1.4-5 zlibbioc_1.48.2 #> [23] purrr_1.0.2 RCurl_1.98-1.14 #> [25] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [27] ggrepel_0.9.5 irlba_2.3.5.1 #> [29] terra_1.7-71 pheatmap_1.0.12 #> [31] units_0.8-5 RSpectra_0.16-1 #> [33] dqrng_0.3.2 pkgdown_2.0.9 #> [35] DelayedMatrixStats_1.24.0 codetools_0.2-20 #> [37] DelayedArray_0.28.0 tidyselect_1.2.1 #> [39] farver_2.1.1 ScaledMatrix_1.10.0 #> [41] viridis_0.6.5 BiocFileCache_2.10.2 #> [43] jsonlite_1.8.8 BiocNeighbors_1.20.2 #> [45] e1071_1.7-14 systemfonts_1.0.6 #> [47] tools_4.3.3 ggnewscale_0.4.10 #> [49] ragg_1.3.0 Rcpp_1.0.12 #> [51] glue_1.7.0 gridExtra_2.3 #> [53] SparseArray_1.2.4 mgcv_1.9-1 #> [55] xfun_0.43 dplyr_1.1.4 #> [57] HDF5Array_1.30.1 withr_3.0.0 #> [59] BiocManager_1.30.22 fastmap_1.1.1 #> [61] boot_1.3-30 rhdf5filters_1.14.1 #> [63] fansi_1.0.6 digest_0.6.35 #> [65] rsvd_1.0.5 R6_2.5.1 #> [67] mime_0.12 textshaping_0.3.7 #> [69] colorspace_2.1-0 wk_0.9.1 #> [71] scattermore_1.2 RSQLite_2.3.6 #> [73] hexbin_1.28.3 utf8_1.2.4 #> [75] generics_0.1.3 class_7.3-22 #> [77] httr_1.4.7 htmlwidgets_1.6.4 #> [79] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [81] scico_1.5.0 
gtable_0.3.5 #> [83] blob_1.2.4 XVector_0.42.0 #> [85] htmltools_0.5.8.1 scales_1.3.0 #> [87] png_0.1-8 knitr_1.45 #> [89] rjson_0.2.21 nlme_3.1-164 #> [91] curl_5.2.1 proxy_0.4-27 #> [93] cachem_1.0.8 rhdf5_2.46.1 #> [95] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [97] parallel_4.3.3 vipor_0.4.7 #> [99] AnnotationDbi_1.64.1 desc_1.4.3 #> [101] s2_1.1.6 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 fs_1.6.4 #> [121] ggbeeswarm_0.7.2 stringi_1.8.3 #> [123] viridisLite_0.4.2 deldir_2.0-4 #> [125] munsell_0.5.1 Biostrings_2.70.3 #> [127] Matrix_1.6-5 ExperimentHub_2.10.0 #> [129] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [131] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [133] statmod_1.5.0 shiny_1.8.1.1 #> [135] interactiveDisplayBase_1.40.0 highr_0.10 #> [137] AnnotationHub_3.10.1 igraph_2.0.3 #> [139] memoise_2.0.1 bslib_0.7.0 #> [141] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Xenium breast cancer dataset","text":"Xenium new technology 10X genomics single cell resolution smFISH based spatial transcriptomics. first Xenium dataset formalin fixed paraffin embedded (FFPE) human breast tumor, reported (Janesick et al. 2022) downloaded 10X website. gene count matrix downloaded HDF5 file read R SingleCellExperiment (SCE) object DropletUtils::read10xCounts(). gene count matrix originally DelayedArray, data loaded memory. now, matrix converted memory dgCMatrix. However, next release, like write another vignette disk analyses. challenge representing sf data frames disk, perhaps sedona SQLDataFrame. 
cell metadata (including centroid coordinates) cell segmentation polygons downloaded parquet files, compact way store columnar data CSV, read R data frames read_parquet arrow package. cell polygons converted sf data frame SpatialFeatureExperiment::df2sf(). SCE object converted SpatialFeatureExperiment (SFE) polygon geometry added SFE object, SFEData package. load packages used vignette. 118708 cells dataset, little CosMX dataset. SFE object doesn’t column names (.e. cell IDs). assign cell IDs. tissue, cell outlines, looks like Plot cell density space","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(ggplot2) library(stringr) library(scater) library(scuttle) library(BiocParallel) library(BiocSingular) library(bluster) library(scran) library(patchwork) theme_set(theme_bw()) (sfe <- JanesickBreastData(dataset = \"rep2\")) #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> class: SpatialFeatureExperiment #> dim: 541 118708 #> metadata(1): Samples #> assays(1): counts #> rownames(541): ABCC11 ACTA2 ... BLANK_0497 BLANK_0499 #> rowData names(6): ID Symbol ... vars cv2 #> colnames: NULL #> colData names(10): Sample Barcode ... 
nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : x_centroid y_centroid #> imgData names(1): sample_id #> #> unit: #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON), nucSeg (GEOMETRY) #> #> Graphs: #> sample01: colnames(sfe) <- seq_len(ncol(sfe)) plotGeometry(sfe, \"cellSeg\") plotCellBin2D(sfe, hex = TRUE)"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"cells","dir":"Articles","previous_headings":"Quality control","what":"Cells","title":"Xenium breast cancer dataset","text":"QC metrics precomputed stored colData Since ’re cells, better plot tissue larger, ’ll plot histogram QC metrics spatial plots separately, unlike CosMx vignette. divided nCounts total number genes probed, histogram comparable smFISH-based datasets. Compared FFPE CosMX non-small cell lung cancer dataset, transcripts per gene average larger proportion genes detected dataset, also FFPE. However, interpreted care, since two datasets different tissues different gene panels, may may indicate Xenium better detection efficiency CosMX. seem FOV artifacts. However, cell ID FOV information unavailable examine . standard examination look relationship nCounts nGenes: appear two branches. plot distribution cell area pixels. ’s long tail. nuclei much smaller cells. cell area distributed space? Cells sparse region tend larger dense region. may biological artifact cell segmentation algorithm . nuclei segmentations plotted instead cell segmentation. nuclei much smaller extent difficult see. ’s outlier near right edge section, throwing dynamic range plot. Upon inspection H&E image, outlier bit tissue debris doesn’t look like cell. can still cells dense, gland like regions tend larger nuclei. may biological, nuclei densely packed regions likely undersegmented, .e. multiple nuclei counted one nuclei segmentation program, . 
observations motivate examination relationship cell area nuclei area: , two branches, probably related cell density cell type. nucleus outlier also large cell area, though much outlier cell area. However, spatial outlier ’s unusually large compared neighbors (scroll two plots back). Next calculate proportion cell z-plane taken nucleus, examine distribution: distribution generated two peaks combined. histogram, seem cells without nuclei segmentation artifacts nucleus larger cell. However, many cells dataset possible just cells visible histogram. double check: cells without nuclei nuclei larger cells. plot nuclei proportion space: Cells histological regions larger proportions occupied nuclei. interesting check, controlling cell type, cell area, nucleus area, proportion cell occupied nucleus relate gene expression. However, problem performing analysis cell segmentation available one z-plane areas also relate z-plane intersects cell. plot 2D histogram better show density points plot: Smaller cells tend higher proportion occupied nucleus. can related cell type, limitation small nuclei can tissue. also examine relationship nucleus area proportion cell occupied nucleus: outlier obvious. 
cells small nuclei low proportion area occupied nucleus.","code":"names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" n_panel <- 313 colData(sfe)$nCounts_normed <- sfe$nCounts/n_panel colData(sfe)$nGenes_normed <- sfe$nGenes/n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\") plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"cellSeg\") plotColData(sfe, x=\"nCounts\", y=\"nGenes\", bins = 100) plotColDataHistogram(sfe, c(\"cell_area\", \"nucleus_area\"), scales = \"free_y\") plotSpatialFeature(sfe, \"cell_area\", colGeometryName = \"cellSeg\") plotSpatialFeature(sfe, \"nucleus_area\", colGeometryName = \"nucSeg\") plotColData(sfe, x=\"cell_area\", y=\"nucleus_area\", bins = 100) colData(sfe)$prop_nuc <- sfe$nucleus_area / sfe$cell_area plotColDataHistogram(sfe, \"prop_nuc\") # No nucleus sum(sfe$nucleus_area < 1) #> [1] 0 # Nucleus larger than cell sum(sfe$nucleus_area > sfe$cell_area) #> [1] 0 plotSpatialFeature(sfe, \"prop_nuc\", colGeometryName = \"cellSeg\") plotColData(sfe, x=\"cell_area\", y=\"prop_nuc\") plotColData(sfe, x=\"nucleus_area\", y=\"prop_nuc\", bins = 100)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"negative-controls","dir":"Articles","previous_headings":"Quality control","what":"Negative controls","title":"Xenium breast cancer dataset","text":"Since hundred genes plus negative control probes, row names SFE object can printed find negative control probes called. According Xenium paper (Janesick et al. 2022), 3 types controls: probe controls assess non-specific binding RNA, decoding controls assess misassigned genes, genomic DNA (gDNA) controls ensure signal RNA. paper explain detail control probes designed, explain blank probes . 
blank probes can used negative control. number 1, probe control number 2, decoding control must number 3, gDNA control Also make indicator whether feature sort negative control addPerCellQCMetrics() function scuttle package can conveniently add transcript counts, proportion total counts, number features detected subset features SCE object. SFE object, SFE inherits SCE. Next plot proportion transcript counts coming negative control. histogram dominated bin zero extreme outliers seen evident scale x axis. also plot histogram cells least 1 count negative control. NA’s come cells got segmented transcripts detected. vast majority cells less 1% transcript counts negative controls, outliers 50%. Next plot distribution number negative control counts per cell: counts low, mostly zero, outliers 10 counts types aggregated. outlier 50% counts negative controls must low total real transcript counts begin . scuttle package can detect outliers, default assigns anything zero outlier, since 3 median absolute deviations (MADs) away median, 0, MAD 0 since vast majority cells don’t negative control count. makes sense allow small proportion negative controls. use distribution just cells least 1 negative control count find outliers. distribution long tail definite outliers. code extracts outliers, based cells least one negative control count examine outliers located space: find outliers difficult see: analysis reveals outliers seem smaller. Outliers negative probe controls negative codeword controls also hard see plot, plots skipped . top left region tissue tends counts antisense controls. Now identified outliers, can remove along empty cells proceeding analysis: 1000 cells removed. Next check many negative control features detected per cell: 3 counts per cell per type. 
non-outliers, type around 1%, data looks good.","code":"rownames(sfe) #> [1] \"ABCC11\" \"ACTA2\" #> [3] \"ACTG2\" \"ADAM9\" #> [5] \"ADGRE5\" \"ADH1B\" #> [7] \"ADIPOQ\" \"AGR3\" #> [9] \"AHSP\" \"AIF1\" #> [11] \"AKR1C1\" \"AKR1C3\" #> [13] \"ALDH1A3\" \"ANGPT2\" #> [15] \"ANKRD28\" \"ANKRD29\" #> [17] \"ANKRD30A\" \"APOBEC3A\" #> [19] \"APOBEC3B\" \"APOC1\" #> [21] \"AQP1\" \"AQP3\" #> [23] \"AR\" \"AVPR1A\" #> [25] \"BACE2\" \"BANK1\" #> [27] \"BASP1\" \"BTNL9\" #> [29] \"C15orf48\" \"C1QA\" #> [31] \"C1QC\" \"C2orf42\" #> [33] \"C5orf46\" \"C6orf132\" #> [35] \"CAV1\" \"CAVIN2\" #> [37] \"CCDC6\" \"CCDC80\" #> [39] \"CCL20\" \"CCL5\" #> [41] \"CCL8\" \"CCND1\" #> [43] \"CCPG1\" \"CCR7\" #> [45] \"CD14\" \"CD163\" #> [47] \"CD19\" \"CD1C\" #> [49] \"CD247\" \"CD27\" #> [51] \"CD274\" \"CD3D\" #> [53] \"CD3E\" \"CD3G\" #> [55] \"CD4\" \"CD68\" #> [57] \"CD69\" \"CD79A\" #> [59] \"CD79B\" \"CD80\" #> [61] \"CD83\" \"CD86\" #> [63] \"CD8A\" \"CD8B\" #> [65] \"CD9\" \"CD93\" #> [67] \"CDC42EP1\" \"CDH1\" #> [69] \"CEACAM6\" \"CEACAM8\" #> [71] \"CENPF\" \"CLCA2\" #> [73] \"CLDN4\" \"CLDN5\" #> [75] \"CLEC14A\" \"CLEC9A\" #> [77] \"CLECL1\" \"CLIC6\" #> [79] \"CPA3\" \"CRHBP\" #> [81] \"CRISPLD2\" \"CSF3\" #> [83] \"CTH\" \"CTLA4\" #> [85] \"CTSG\" \"CTTN\" #> [87] \"CX3CR1\" \"CXCL12\" #> [89] \"CXCL16\" \"CXCL5\" #> [91] \"CXCR4\" \"CYP1A1\" #> [93] \"CYTIP\" \"DAPK3\" #> [95] \"DERL3\" \"DMKN\" #> [97] \"DNAAF1\" \"DNTTIP1\" #> [99] \"DPT\" \"DSC2\" #> [101] \"DSP\" \"DST\" #> [103] \"DUSP2\" \"DUSP5\" #> [105] \"EDN1\" \"EDNRB\" #> [107] \"EGFL7\" \"EGFR\" #> [109] \"EIF4EBP1\" \"ELF3\" #> [111] \"ELF5\" \"ENAH\" #> [113] \"EPCAM\" \"ERBB2\" #> [115] \"ERN1\" \"ESM1\" #> [117] \"ESR1\" \"FAM107B\" #> [119] \"FAM49A\" \"FASN\" #> [121] \"FBLIM1\" \"FBLN1\" #> [123] \"FCER1A\" \"FCER1G\" #> [125] \"FCGR3A\" \"FGL2\" #> [127] \"FLNB\" \"FOXA1\" #> [129] \"FOXC2\" \"FOXP3\" #> [131] \"FSTL3\" \"GATA3\" #> [133] \"GJB2\" \"GLIPR1\" #> [135] \"GNLY\" \"GPR183\" #> 
[137] \"GZMA\" \"GZMB\" #> [139] \"GZMK\" \"HAVCR2\" #> [141] \"HDC\" \"HMGA1\" #> [143] \"HOOK2\" \"HOXD8\" #> [145] \"HOXD9\" \"HPX\" #> [147] \"IGF1\" \"IGSF6\" #> [149] \"IL2RA\" \"IL2RG\" #> [151] \"IL3RA\" \"IL7R\" #> [153] \"ITGAM\" \"ITGAX\" #> [155] \"ITM2C\" \"JUP\" #> [157] \"KARS\" \"KDR\" #> [159] \"KIT\" \"KLF5\" #> [161] \"KLRB1\" \"KLRC1\" #> [163] \"KLRD1\" \"KLRF1\" #> [165] \"KRT14\" \"KRT15\" #> [167] \"KRT16\" \"KRT23\" #> [169] \"KRT5\" \"KRT6B\" #> [171] \"KRT7\" \"KRT8\" #> [173] \"LAG3\" \"LARS\" #> [175] \"LDHB\" \"LEP\" #> [177] \"LGALSL\" \"LIF\" #> [179] \"LILRA4\" \"LPL\" #> [181] \"LPXN\" \"LRRC15\" #> [183] \"LTB\" \"LUM\" #> [185] \"LY86\" \"LYPD3\" #> [187] \"LYZ\" \"MAP3K8\" #> [189] \"MDM2\" \"MEDAG\" #> [191] \"MKI67\" \"MLPH\" #> [193] \"MMP1\" \"MMP12\" #> [195] \"MMP2\" \"MMRN2\" #> [197] \"MNDA\" \"MPO\" #> [199] \"MRC1\" \"MS4A1\" #> [201] \"MUC6\" \"MYBPC1\" #> [203] \"MYH11\" \"MYLK\" #> [205] \"MYO5B\" \"MZB1\" #> [207] \"NARS\" \"NCAM1\" #> [209] \"NDUFA4L2\" \"NKG7\" #> [211] \"NOSTRIN\" \"NPM3\" #> [213] \"OCIAD2\" \"OPRPN\" #> [215] \"OXTR\" \"PCLAF\" #> [217] \"PCOLCE\" \"PDCD1\" #> [219] \"PDCD1LG2\" \"PDE4A\" #> [221] \"PDGFRA\" \"PDGFRB\" #> [223] \"PDK4\" \"PECAM1\" #> [225] \"PELI1\" \"PGR\" #> [227] \"PIGR\" \"PIM1\" #> [229] \"PLD4\" \"POLR2J3\" #> [231] \"POSTN\" \"PPARG\" #> [233] \"PRDM1\" \"PRF1\" #> [235] \"PTGDS\" \"PTN\" #> [237] \"PTPRC\" \"PTRHD1\" #> [239] \"QARS\" \"RAB30\" #> [241] \"RAMP2\" \"RAPGEF3\" #> [243] \"REXO4\" \"RHOH\" #> [245] \"RORC\" \"RTKN2\" #> [247] \"RUNX1\" \"S100A14\" #> [249] \"S100A4\" \"S100A8\" #> [251] \"SCD\" \"SCGB2A1\" #> [253] \"SDC4\" \"SEC11C\" #> [255] \"SEC24A\" \"SELL\" #> [257] \"SERHL2\" \"SERPINA3\" #> [259] \"SERPINB9\" \"SFRP1\" #> [261] \"SFRP4\" \"SH3YL1\" #> [263] \"SLAMF1\" \"SLAMF7\" #> [265] \"SLC25A37\" \"SLC4A1\" #> [267] \"SLC5A6\" \"SMAP2\" #> [269] \"SMS\" \"SNAI1\" #> [271] \"SOX17\" \"SOX18\" #> [273] \"SPIB\" \"SQLE\" #> [275] \"SRPK1\" 
\"SSTR2\" #> [277] \"STC1\" \"SVIL\" #> [279] \"TAC1\" \"TACSTD2\" #> [281] \"TCEAL7\" \"TCF15\" #> [283] \"TCF4\" \"TCF7\" #> [285] \"TCIM\" \"TCL1A\" #> [287] \"TENT5C\" \"TFAP2A\" #> [289] \"THAP2\" \"TIFA\" #> [291] \"TIGIT\" \"TIMP4\" #> [293] \"TMEM147\" \"TNFRSF17\" #> [295] \"TOMM7\" \"TOP2A\" #> [297] \"TPD52\" \"TPSAB1\" #> [299] \"TRAC\" \"TRAF4\" #> [301] \"TRAPPC3\" \"TRIB1\" #> [303] \"TUBA4A\" \"TUBB2B\" #> [305] \"TYROBP\" \"UCP1\" #> [307] \"USP53\" \"VOPP1\" #> [309] \"VWF\" \"WARS\" #> [311] \"ZEB1\" \"ZEB2\" #> [313] \"ZNF562\" \"NegControlProbe_00042\" #> [315] \"NegControlProbe_00041\" \"NegControlProbe_00039\" #> [317] \"NegControlProbe_00035\" \"NegControlProbe_00034\" #> [319] \"NegControlProbe_00033\" \"NegControlProbe_00031\" #> [321] \"NegControlProbe_00025\" \"NegControlProbe_00024\" #> [323] \"NegControlProbe_00022\" \"NegControlProbe_00019\" #> [325] \"NegControlProbe_00017\" \"NegControlProbe_00016\" #> [327] \"NegControlProbe_00014\" \"NegControlProbe_00013\" #> [329] \"NegControlProbe_00012\" \"NegControlProbe_00009\" #> [331] \"NegControlProbe_00004\" \"NegControlProbe_00003\" #> [333] \"NegControlProbe_00002\" \"antisense_PROKR2\" #> [335] \"antisense_ULK3\" \"antisense_SCRIB\" #> [337] \"antisense_TRMU\" \"antisense_MYLIP\" #> [339] \"antisense_LGI3\" \"antisense_BCL2L15\" #> [341] \"antisense_ADCY4\" \"NegControlCodeword_0500\" #> [343] \"NegControlCodeword_0501\" \"NegControlCodeword_0502\" #> [345] \"NegControlCodeword_0503\" \"NegControlCodeword_0504\" #> [347] \"NegControlCodeword_0505\" \"NegControlCodeword_0506\" #> [349] \"NegControlCodeword_0507\" \"NegControlCodeword_0508\" #> [351] \"NegControlCodeword_0509\" \"NegControlCodeword_0510\" #> [353] \"NegControlCodeword_0511\" \"NegControlCodeword_0512\" #> [355] \"NegControlCodeword_0513\" \"NegControlCodeword_0514\" #> [357] \"NegControlCodeword_0515\" \"NegControlCodeword_0516\" #> [359] \"NegControlCodeword_0517\" \"NegControlCodeword_0518\" #> [361] 
\"NegControlCodeword_0519\" \"NegControlCodeword_0520\" #> [363] \"NegControlCodeword_0521\" \"NegControlCodeword_0522\" #> [365] \"NegControlCodeword_0523\" \"NegControlCodeword_0524\" #> [367] \"NegControlCodeword_0525\" \"NegControlCodeword_0526\" #> [369] \"NegControlCodeword_0527\" \"NegControlCodeword_0528\" #> [371] \"NegControlCodeword_0529\" \"NegControlCodeword_0530\" #> [373] \"NegControlCodeword_0531\" \"NegControlCodeword_0532\" #> [375] \"NegControlCodeword_0533\" \"NegControlCodeword_0534\" #> [377] \"NegControlCodeword_0535\" \"NegControlCodeword_0536\" #> [379] \"NegControlCodeword_0537\" \"NegControlCodeword_0538\" #> [381] \"NegControlCodeword_0539\" \"NegControlCodeword_0540\" #> [383] \"BLANK_0006\" \"BLANK_0013\" #> [385] \"BLANK_0037\" \"BLANK_0069\" #> [387] \"BLANK_0072\" \"BLANK_0087\" #> [389] \"BLANK_0110\" \"BLANK_0114\" #> [391] \"BLANK_0120\" \"BLANK_0147\" #> [393] \"BLANK_0180\" \"BLANK_0186\" #> [395] \"BLANK_0272\" \"BLANK_0278\" #> [397] \"BLANK_0319\" \"BLANK_0321\" #> [399] \"BLANK_0337\" \"BLANK_0350\" #> [401] \"BLANK_0351\" \"BLANK_0352\" #> [403] \"BLANK_0353\" \"BLANK_0354\" #> [405] \"BLANK_0355\" \"BLANK_0356\" #> [407] \"BLANK_0357\" \"BLANK_0358\" #> [409] \"BLANK_0359\" \"BLANK_0360\" #> [411] \"BLANK_0361\" \"BLANK_0362\" #> [413] \"BLANK_0363\" \"BLANK_0364\" #> [415] \"BLANK_0365\" \"BLANK_0366\" #> [417] \"BLANK_0367\" \"BLANK_0368\" #> [419] \"BLANK_0369\" \"BLANK_0370\" #> [421] \"BLANK_0371\" \"BLANK_0372\" #> [423] \"BLANK_0373\" \"BLANK_0374\" #> [425] \"BLANK_0375\" \"BLANK_0376\" #> [427] \"BLANK_0377\" \"BLANK_0378\" #> [429] \"BLANK_0379\" \"BLANK_0380\" #> [431] \"BLANK_0381\" \"BLANK_0382\" #> [433] \"BLANK_0383\" \"BLANK_0384\" #> [435] \"BLANK_0385\" \"BLANK_0386\" #> [437] \"BLANK_0387\" \"BLANK_0388\" #> [439] \"BLANK_0389\" \"BLANK_0390\" #> [441] \"BLANK_0391\" \"BLANK_0392\" #> [443] \"BLANK_0393\" \"BLANK_0394\" #> [445] \"BLANK_0395\" \"BLANK_0396\" #> [447] \"BLANK_0397\" \"BLANK_0398\" #> 
[449] \"BLANK_0399\" \"BLANK_0400\" #> [451] \"BLANK_0401\" \"BLANK_0402\" #> [453] \"BLANK_0403\" \"BLANK_0404\" #> [455] \"BLANK_0405\" \"BLANK_0406\" #> [457] \"BLANK_0407\" \"BLANK_0408\" #> [459] \"BLANK_0409\" \"BLANK_0410\" #> [461] \"BLANK_0411\" \"BLANK_0412\" #> [463] \"BLANK_0413\" \"BLANK_0414\" #> [465] \"BLANK_0415\" \"BLANK_0416\" #> [467] \"BLANK_0417\" \"BLANK_0418\" #> [469] \"BLANK_0419\" \"BLANK_0420\" #> [471] \"BLANK_0421\" \"BLANK_0422\" #> [473] \"BLANK_0423\" \"BLANK_0424\" #> [475] \"BLANK_0425\" \"BLANK_0426\" #> [477] \"BLANK_0427\" \"BLANK_0428\" #> [479] \"BLANK_0429\" \"BLANK_0430\" #> [481] \"BLANK_0431\" \"BLANK_0432\" #> [483] \"BLANK_0433\" \"BLANK_0434\" #> [485] \"BLANK_0435\" \"BLANK_0436\" #> [487] \"BLANK_0437\" \"BLANK_0438\" #> [489] \"BLANK_0439\" \"BLANK_0440\" #> [491] \"BLANK_0441\" \"BLANK_0442\" #> [493] \"BLANK_0443\" \"BLANK_0444\" #> [495] \"BLANK_0445\" \"BLANK_0446\" #> [497] \"BLANK_0447\" \"BLANK_0448\" #> [499] \"BLANK_0449\" \"BLANK_0450\" #> [501] \"BLANK_0451\" \"BLANK_0452\" #> [503] \"BLANK_0453\" \"BLANK_0454\" #> [505] \"BLANK_0455\" \"BLANK_0456\" #> [507] \"BLANK_0457\" \"BLANK_0458\" #> [509] \"BLANK_0459\" \"BLANK_0460\" #> [511] \"BLANK_0461\" \"BLANK_0462\" #> [513] \"BLANK_0463\" \"BLANK_0464\" #> [515] \"BLANK_0465\" \"BLANK_0466\" #> [517] \"BLANK_0467\" \"BLANK_0468\" #> [519] \"BLANK_0469\" \"BLANK_0470\" #> [521] \"BLANK_0471\" \"BLANK_0472\" #> [523] \"BLANK_0473\" \"BLANK_0474\" #> [525] \"BLANK_0475\" \"BLANK_0476\" #> [527] \"BLANK_0477\" \"BLANK_0478\" #> [529] \"BLANK_0479\" \"BLANK_0480\" #> [531] \"BLANK_0481\" \"BLANK_0482\" #> [533] \"BLANK_0483\" \"BLANK_0484\" #> [535] \"BLANK_0485\" \"BLANK_0486\" #> [537] \"BLANK_0487\" \"BLANK_0488\" #> [539] \"BLANK_0489\" \"BLANK_0497\" #> [541] \"BLANK_0499\" is_blank <- str_detect(rownames(sfe), \"^BLANK_\") sum(is_blank) #> [1] 159 is_neg <- str_detect(rownames(sfe), \"^NegControlProbe\") sum(is_neg) #> [1] 20 is_neg2 <- 
str_detect(rownames(sfe), \"^NegControlCodeword\") sum(is_neg2) #> [1] 41 is_anti <- str_detect(rownames(sfe), \"^antisense\") sum(is_anti) #> [1] 8 is_any_neg <- is_blank | is_neg | is_neg2 | is_anti sfe <- addPerCellQCMetrics(sfe, subsets = list(blank = is_blank, negProbe = is_neg, negCodeword = is_neg2, anti = is_anti, any_neg = is_any_neg)) names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" #> [11] \"nCounts_normed\" \"nGenes_normed\" #> [13] \"prop_nuc\" \"sum\" #> [15] \"detected\" \"subsets_blank_sum\" #> [17] \"subsets_blank_detected\" \"subsets_blank_percent\" #> [19] \"subsets_negProbe_sum\" \"subsets_negProbe_detected\" #> [21] \"subsets_negProbe_percent\" \"subsets_negCodeword_sum\" #> [23] \"subsets_negCodeword_detected\" \"subsets_negCodeword_percent\" #> [25] \"subsets_anti_sum\" \"subsets_anti_detected\" #> [27] \"subsets_anti_percent\" \"subsets_any_neg_sum\" #> [29] \"subsets_any_neg_detected\" \"subsets_any_neg_percent\" #> [31] \"total\" cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_percent$\")] plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) #> Warning: Removed 285 rows containing non-finite outside the scale range #> (`stat_bin()`). plotColDataHistogram(sfe, cols_use, bins = 100, ncol = 3) + scale_x_log10() + annotation_logticks(sides = \"b\") #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 565577 rows containing non-finite outside the scale range #> (`stat_bin()`). 
cols_use2 <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_detected$\")] plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) + # Avoid decimal breaks on x axis unless there're too few breaks scale_x_continuous(breaks = scales::breaks_extended(Q = c(1,2,5))) get_neg_ctrl_outliers <- function(col, sfe) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\") outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } cols_use <- names(colData(sfe))[str_detect(names(colData(sfe)), \"_percent$\")] for (n in cols_use) { sfe <- get_neg_ctrl_outliers(n, sfe) } names(colData(sfe)) #> [1] \"Sample\" \"Barcode\" #> [3] \"transcript_counts\" \"control_probe_counts\" #> [5] \"control_codeword_counts\" \"cell_area\" #> [7] \"nucleus_area\" \"sample_id\" #> [9] \"nCounts\" \"nGenes\" #> [11] \"nCounts_normed\" \"nGenes_normed\" #> [13] \"prop_nuc\" \"sum\" #> [15] \"detected\" \"subsets_blank_sum\" #> [17] \"subsets_blank_detected\" \"subsets_blank_percent\" #> [19] \"subsets_negProbe_sum\" \"subsets_negProbe_detected\" #> [21] \"subsets_negProbe_percent\" \"subsets_negCodeword_sum\" #> [23] \"subsets_negCodeword_detected\" \"subsets_negCodeword_percent\" #> [25] \"subsets_anti_sum\" \"subsets_anti_detected\" #> [27] \"subsets_anti_percent\" \"subsets_any_neg_sum\" #> [29] \"subsets_any_neg_detected\" \"subsets_any_neg_percent\" #> [31] \"total\" \"is_blank_outlier\" #> [33] \"is_negProbe_outlier\" \"is_negCodeword_outlier\" #> [35] \"is_anti_outlier\" \"is_any_neg_outlier\" plotSpatialFeature(sfe, \"is_blank_outlier\", colGeometryName = \"cellSeg\") plotColData(sfe, y = \"is_blank_outlier\", x = \"cell_area\", point_fun = function(...) 
list()) plotSpatialFeature(sfe, \"is_anti_outlier\", colGeometryName = \"cellSeg\") inds_keep <- sfe$nCounts > 0 & sfe$nucleus_area < 400 & !sfe$is_anti_outlier & !sfe$is_blank_outlier & !sfe$is_negCodeword_outlier & !sfe$is_negProbe_outlier (sfe <- sfe[,inds_keep]) #> class: SpatialFeatureExperiment #> dim: 541 117503 #> metadata(1): Samples #> assays(1): counts #> rownames(541): ABCC11 ACTA2 ... BLANK_0497 BLANK_0499 #> rowData names(6): ID Symbol ... vars cv2 #> colnames(117503): 1 2 ... 118707 118708 #> colData names(36): Sample Barcode ... is_anti_outlier #> is_any_neg_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : x_centroid y_centroid #> imgData names(1): sample_id #> #> unit: #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON), nucSeg (GEOMETRY) #> #> Graphs: #> sample01: plotColDataHistogram(sfe, cols_use2, bins = 20, ncol = 3) + # Avoid decimal breaks on x axis unless there're too few breaks scale_x_continuous(breaks = scales::breaks_extended(3, Q = c(1,2,5)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"Xenium breast cancer dataset","text":"look mean variance gene Real genes generally higher mean expression across cells negative controls. real genes negative controls plotted different colors red line \\(y = x\\) expected data follows Poisson distribution. Negative controls real genes form mostly separate clusters. Negative controls stick close line, real genes overdispersed. 
Unlike CosMX dataset, negative controls don’t seem overdispersed.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_neg <- is_any_neg plotRowData(sfe, x = \"means\", y = \"is_neg\") + scale_y_log10() + annotation_logticks(sides = \"b\") plotRowData(sfe, x=\"means\", y=\"vars\", color_by = \"is_neg\") + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() + labs(color = \"Negative control\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"Xenium breast cancer dataset","text":"’s sparse dense region. poses question type neighborhood graph use, e.g. conceivable cells sparse region just singletons. Furthermore, unclear length scale influence might . might depend cell type contact secreted signals used cell type, length scale influence. k nearest neighbors used, neighbors dense region much closer together sparse region. distance based neighbors used, cells dense region neighbors cells sparse region, sparse region can break multiple compartments distance cutoff long enough. purpose demonstration, use k nearest neighbors \\(k = 5\\), inverse distance weighting. Note using neighbors leads longer computation time spatial autocorrelation metrics. Global Moran’s indicates positive spatial autocorrelation. strength spatial autocorrelation can vary spatially, also run local Moran’s . pointsize argument adjusts point size scattermore. default 0, meaning single pixels, since cells sparse region hard see way, increase pointsize. still plot polygons larger single panel plots, use scattermore multi-panel plots polygons panel invisible anyway due small size save time. Interestingly, nCounts homogeneous interior dense region, nGenes homogeneous edge dense region. 
expected, cell area homogeneous sparse region. However, nucleus area homogeneous interior dense region. Moran plot nCounts obvious clusters . lower panel, 2D histogram influential points plotted red.","code":"system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 6.706 0.028 6.738 sfe <- colDataMoransI(sfe, c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGraphName = \"knn5\") colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"),] #> DataFrame with 4 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts 0.422387 5.55509 #> nGenes 0.401395 3.13694 #> cell_area 0.628837 7.57098 #> nucleus_area 0.377248 6.88331 sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"cell_area\", \"nucleus_area\"), colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0, pointsize = 1) sfe <- colDataUnivariate(sfe, \"moran.plot\", \"nCounts\", colGraphName = \"knn5\") p1 <- moranPlot(sfe, \"nCounts\", binned = TRUE, plot_influential = FALSE) p2 <- moranPlot(sfe, \"nCounts\", binned = TRUE) p1 / p2 + plot_layout(guides = \"collect\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"Xenium breast cancer dataset","text":"default, gene expression, log normalized counts used spatial autocorrelation metrics, running Moran’s , normalize data. Use cores available speed . expected, generally negative controls tightly clustered around 0, real genes positive Moran’s , means generally technical artifact spatial trend. significantly negative Moran’s observed. 
negative spatial autocorrelation rare gene expression? two negative controls sizable Moran’s ? somewhat spatial trend antisense probe, detected upper left. However, might significantly affect results since 2 counts 1% counts cell. negative control codeword 1 count per cell cells negative control detected seem far . detected negative controls, detected one also one highest Moran’s among negative controls. However, negative control higher Moran’s among detected. genes highest Moran’s ? highlight histological regions, CosMX vignette. Moran’s relate gene expression level? highly expressed genes higher Moran’s , less expressed genes higher Moran’s well.","code":"sfe <- logNormCounts(sfe) system.time( sfe <- runMoransI(sfe, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 38.816 5.136 24.308 rowData(sfe)$is_neg <- is_any_neg plotRowData(sfe, x = \"moran_sample01\", y = \"is_neg\") ord <- order(rowData(sfe)$moran_sample01[is_any_neg], decreasing = TRUE)[1:2] top_neg <- rownames(sfe)[is_any_neg][ord] plotSpatialFeature(sfe, top_neg, colGeometryName = \"centroids\", scattermore = TRUE, pointsize = 1) head(sort(rowData(sfe)$means[is_any_neg], decreasing = TRUE), 15) #> antisense_PROKR2 antisense_SCRIB antisense_BCL2L15 #> 0.0192761036 0.0131741317 0.0066806805 #> antisense_TRMU antisense_MYLIP antisense_ULK3 #> 0.0042041480 0.0030807724 0.0028169494 #> BLANK_0485 antisense_ADCY4 antisense_LGI3 #> 0.0023403658 0.0019829281 0.0017871884 #> BLANK_0430 NegControlProbe_00035 NegControlProbe_00012 #> 0.0015063445 0.0010978443 0.0010382714 #> NegControlProbe_00033 NegControlProbe_00014 BLANK_0120 #> 0.0009446567 0.0009361463 0.0009276359 top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2, pointsize = 0.5) plotRowData(sfe, x = \"means\", y = 
\"moran_sample01\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"non-spatial-dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Non-spatial dimension reduction and clustering","title":"Xenium breast cancer dataset","text":"run non-spatial PCA scRNA-seq data spatial region explicitly used, PC’s highlight spatial regions due spatial autocorrelation gene expression histological regions different cell types. Non-spatial clustering locating clusters space Now scater can also rasterize plots lots points rasterise argument, different mechanism scattermore requires system dependencies. Plot location clusters space","code":"set.seed(29) sfe <- runPCA(sfe, ncomponents = 30, scale = TRUE, BSPARAM = IrlbaParam()) ElbowPlot(sfe, ndims = 30) plotDimLoadings(sfe, dims = 1:6) spatialReducedDim(sfe, \"PCA\", 6, colGeometryName = \"centroids\", divergent = TRUE, diverge_center = 0, ncol = 2, scattermore = TRUE, pointsize = 0.5) colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\")[,1:15], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' plotPCA(sfe, ncomponents = 4, colour_by = \"cluster\", rasterise = FALSE) plotSpatialFeature(sfe, \"cluster\", colGeometryName = \"cellSeg\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential expression","title":"Xenium breast cancer dataset","text":"Cluster marker genes found Wilcoxon rank sum test commonly done scRNA-seq. 
’s already sorted p-values: code extracts significant markers cluster: allows plotting top marker genes heatmap:","code":"markers <- findMarkers(sfe, groups = colData(sfe)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") markers[[6]] #> DataFrame with 541 rows and 16 columns #> p.value FDR summary.AUC AUC.1 AUC.2 AUC.3 #> #> TENT5C 1.63326e-296 8.83591e-294 0.967126 0.925332 0.970773 0.973994 #> SEC11C 7.21967e-230 1.95292e-227 0.910578 0.899842 0.926434 0.943356 #> MZB1 1.53225e-208 2.76316e-206 0.890901 0.954750 0.952539 0.953965 #> PRDM1 1.85488e-171 2.50872e-169 0.851328 0.902618 0.844973 0.845018 #> SLAMF7 1.36571e-121 1.47770e-119 0.797638 0.973023 0.928639 0.964871 #> ... ... ... ... ... ... ... #> TRAF4 1 1 0.190188 0.190188 0.498645 0.463686 #> USP53 1 1 0.237375 0.237375 0.515006 0.439300 #> VWF 1 1 0.148324 0.521143 0.492783 0.472879 #> ZEB2 1 1 0.227203 0.764394 0.227203 0.317134 #> ZNF562 1 1 0.286342 0.286342 0.426846 0.479020 #> AUC.4 AUC.5 AUC.7 AUC.8 AUC.9 AUC.10 AUC.11 #> #> TENT5C 0.978959 0.957859 0.975948 0.951552 0.970808 0.957569 0.967970 #> SEC11C 0.955726 0.912545 0.930013 0.929966 0.935488 0.925262 0.924159 #> MZB1 0.957403 0.947673 0.955833 0.909592 0.954841 0.954570 0.946052 #> PRDM1 0.886670 0.720187 0.838240 0.903536 0.876730 0.880933 0.863019 #> SLAMF7 0.970315 0.931636 0.966127 0.973165 0.967341 0.927913 0.951534 #> ... ... ... ... ... ... ... ... 
#> TRAF4 0.498581 0.510160 0.476258 0.267629 0.424212 0.485015 0.482499 #> USP53 0.367034 0.532702 0.506453 0.317515 0.461329 0.516505 0.492849 #> VWF 0.436092 0.501213 0.148324 0.548782 0.541546 0.545200 0.438237 #> ZEB2 0.296810 0.491759 0.278638 0.784944 0.645498 0.568365 0.301803 #> ZNF562 0.468102 0.525932 0.449265 0.297023 0.407262 0.512553 0.480637 #> AUC.12 AUC.13 AUC.14 #> #> TENT5C 0.953956 0.967126 0.979535 #> SEC11C 0.902004 0.910578 0.926748 #> MZB1 0.939414 0.890901 0.952151 #> PRDM1 0.834607 0.851328 0.899233 #> SLAMF7 0.944967 0.797638 0.971807 #> ... ... ... ... #> TRAF4 0.475006 0.352237 0.389822 #> USP53 0.552792 0.545264 0.333889 #> VWF 0.507755 0.508495 0.557926 #> ZEB2 0.368603 0.251563 0.771073 #> ZNF562 0.525134 0.543883 0.375170 genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe, genes_use, x = \"cluster\", point_fun = function(...) list()) genes_use2 <- unique(unlist(lapply(markers, function(x) rownames(x)[1:5]))) plotGroupedHeatmap(sfe, genes_use2, group = \"cluster\", colour = scales::viridis_pal()(100))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"local-spatial-statistics-of-marker-genes","dir":"Articles","previous_headings":"","what":"Local spatial statistics of marker genes","title":"Xenium breast cancer dataset","text":"First plot genes space reference Global Moran’s marker genes shown : marker genes positive spatial autocorrelation, stronger others. Local Moran’s marker genes shown : seems histological regions tend spatially homogeneous gene expression others. epithelial region tends homogeneous. genes, regions higher expression also higher local Moran’s , FOXA1 GATA3, genes, case, FGL2 LUM. Finally, assess local spatial heteroscedasticity (LOSH) marker genes find local heterogeneity: , just like CosMX dataset, LOSH higher gene highly expressed (e.g. CD3E, LUM, TENT5C) cases (e.g. FOXA1, GATA3). 
may due spatial distribution different cell types.","code":"plotSpatialFeature(sfe, genes_use, colGeometryName = \"centroids\", ncol = 3, pointsize = 0.3, scattermore = TRUE) setNames(rowData(sfe)[genes_use, \"moran_sample01\"], genes_use) #> FOXA1 FGL2 LUM ADIPOQ CD3E TENT5C CD93 GATA3 #> 0.7421765 0.2604219 0.6812312 0.5493112 0.4015241 0.2806543 0.3250982 0.6558350 #> MYLK APOC1 CPA3 MS4A1 LILRA4 KRT15 #> 0.4653625 0.2696177 0.1904912 0.2144728 0.1092981 0.5425155 sfe <- runUnivariate(sfe, \"localmoran\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"localmoran\", features = genes_use, colGeometryName = \"centroids\", ncol = 3, divergent = TRUE, diverge_center = 0, scattermore = TRUE, pointsize = 0.3) sfe <- runUnivariate(sfe, \"LOSH\", features = genes_use, colGraphName = \"knn5\", BPPARAM = MulticoreParam(2)) plotLocalResult(sfe, \"LOSH\", features = genes_use, colGeometryName = \"centroids\", ncol = 3, scattermore = TRUE, pointsize = 0.3)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig5_xenium.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Xenium breast cancer dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] patchwork_1.2.0 scran_1.30.2 #> [3] bluster_1.12.0 BiocSingular_1.18.0 #> [5] BiocParallel_1.36.0 scater_1.30.1 #> [7] 
scuttle_1.12.0 stringr_1.5.1 #> [9] ggplot2_3.5.1 SpatialFeatureExperiment_1.3.0 #> [11] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [13] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [15] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [17] IRanges_2.36.0 S4Vectors_0.40.2 #> [19] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [21] matrixStats_1.3.0 SFEData_1.4.0 #> [23] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] splines_4.3.3 later_1.3.2 #> [3] bitops_1.0-7 filelock_1.0.3 #> [5] tibble_3.2.1 lifecycle_1.0.4 #> [7] sf_1.0-16 edgeR_4.0.16 #> [9] lattice_0.22-6 magrittr_2.0.3 #> [11] limma_3.58.1 sass_0.4.9 #> [13] rmarkdown_2.26 jquerylib_0.1.4 #> [15] yaml_2.3.8 metapod_1.10.1 #> [17] httpuv_1.6.15 sp_2.1-4 #> [19] cowplot_1.1.3 DBI_1.2.2 #> [21] RColorBrewer_1.1-3 abind_1.4-5 #> [23] zlibbioc_1.48.2 purrr_1.0.2 #> [25] RCurl_1.98-1.14 rappdirs_0.3.3 #> [27] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [29] irlba_2.3.5.1 terra_1.7-71 #> [31] pheatmap_1.0.12 units_0.8-5 #> [33] RSpectra_0.16-1 dqrng_0.3.2 #> [35] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [37] codetools_0.2-20 DelayedArray_0.28.0 #> [39] tidyselect_1.2.1 farver_2.1.1 #> [41] ScaledMatrix_1.10.0 viridis_0.6.5 #> [43] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [45] BiocNeighbors_1.20.2 e1071_1.7-14 #> [47] systemfonts_1.0.6 tools_4.3.3 #> [49] ggnewscale_0.4.10 ragg_1.3.0 #> [51] Rcpp_1.0.12 glue_1.7.0 #> [53] gridExtra_2.3 SparseArray_1.2.4 #> [55] mgcv_1.9-1 xfun_0.43 #> [57] dplyr_1.1.4 HDF5Array_1.30.1 #> [59] withr_3.0.0 BiocManager_1.30.22 #> [61] fastmap_1.1.1 boot_1.3-30 #> [63] rhdf5filters_1.14.1 fansi_1.0.6 #> [65] spData_2.3.0 digest_0.6.35 #> [67] rsvd_1.0.5 R6_2.5.1 #> [69] mime_0.12 textshaping_0.3.7 #> [71] colorspace_2.1-0 wk_0.9.1 #> [73] scattermore_1.2 RSQLite_2.3.6 #> [75] hexbin_1.28.3 utf8_1.2.4 #> [77] generics_0.1.3 class_7.3-22 #> [79] httr_1.4.7 htmlwidgets_1.6.4 #> [81] S4Arrays_1.2.1 spdep_1.3-3 #> [83] pkgconfig_2.0.3 scico_1.5.0 #> [85] 
gtable_0.3.5 blob_1.2.4 #> [87] XVector_0.42.0 htmltools_0.5.8.1 #> [89] scales_1.3.0 png_0.1-8 #> [91] knitr_1.45 rjson_0.2.21 #> [93] nlme_3.1-164 curl_5.2.1 #> [95] proxy_0.4-27 cachem_1.0.8 #> [97] rhdf5_2.46.1 BiocVersion_3.18.1 #> [99] KernSmooth_2.23-22 parallel_4.3.3 #> [101] vipor_0.4.7 AnnotationDbi_1.64.1 #> [103] desc_1.4.3 s2_1.1.6 #> [105] pillar_1.9.0 grid_4.3.3 #> [107] vctrs_0.6.5 promises_1.3.0 #> [109] dbplyr_2.5.0 beachmat_2.18.1 #> [111] xtable_1.8-4 cluster_2.1.6 #> [113] beeswarm_0.4.0 evaluate_0.23 #> [115] magick_2.8.3 cli_3.6.2 #> [117] locfit_1.5-9.9 compiler_4.3.3 #> [119] rlang_1.1.3 crayon_1.5.2 #> [121] labeling_0.4.3 classInt_0.4-10 #> [123] fs_1.6.4 ggbeeswarm_0.7.2 #> [125] stringi_1.8.3 viridisLite_0.4.2 #> [127] deldir_2.0-4 munsell_0.5.1 #> [129] Biostrings_2.70.3 Matrix_1.6-5 #> [131] ExperimentHub_2.10.0 sparseMatrixStats_1.14.0 #> [133] bit64_4.0.5 Rhdf5lib_1.24.2 #> [135] KEGGREST_1.42.0 statmod_1.5.0 #> [137] shiny_1.8.1.1 interactiveDisplayBase_1.40.0 #> [139] highr_0.10 AnnotationHub_3.10.1 #> [141] igraph_2.0.3 memoise_2.0.1 #> [143] bslib_0.7.0 bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"MERFISH mouse liver dataset and considerations of large data","text":"SpatialFeatureExperiment (SFE) Voyager packages originally developed around relatively small Visium dataset proof concept, hence originally optimized large datasets. However, larger smFISH datasets hundreds thousands, sometimes million cells already produced soon produced routinely. Among studies using smFISH-based spatial transcriptomics technologies reported number cells per dataset, number cells per dataset increased past years (Moses Pachter 2022). anticipation large datasets, vignette produced using limited GitHub Actions resources (MacOS), 14 GB RAM 3 CPU cores 14 GB disk space, comparable laptops. 
therefore expect analyses vignette scale reasonably sized datasets. dataset use vignette MERFISH mouse liver dataset downloaded Vizgen website. use discuss issues large datasets upcoming features next release Voyager. gene count matrix cell metadata (including centroid coordinates) downloaded CSV files read R. 7 z-planes imaged, cell segmentation available one z-plane. cell polygons HDF5 files, one HDF5 file per field view (FOV), 1000 FOVs dataset. Converting HDF5 files sf data frame trivial. See vignette creating SpatialFeatureExperiment (SFE) object code used conversion, polygons included SFE object. cell metadata already cell volume. polygons used analyses, polygons can’t seen static plot hundreds thousands cells anyway, conversion optional. transcript spot locations available, yet work large point dataset. load packages used. 395,215 cells dataset. Plotting polygons takes , isn’t bad. However, wish save plot PDF. avoid problem, can either use scattermore = TRUE argument plotSpatialFeature() plot centroids since polygons hard see anyway. Cell density can vaguely seen plot . count number cells bins better visualize cell density. Cell density part homogeneous shows structure denser regions seem relate blood vessels.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(scater) library(ggplot2) library(patchwork) library(stringr) library(spdep) library(BiocParallel) library(BiocSingular) library(gstat) library(BiocNeighbors) library(sf) library(automap) theme_set(theme_bw()) (sfe <- VizgenLiverData()) #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache #> require(\"SpatialFeatureExperiment\") #> class: SpatialFeatureExperiment #> dim: 385 395215 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(395215): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... 
#> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(9): fov volume ... nCounts nGenes #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01: plotGeometry(sfe, \"cellSeg\") plotCellBin2D(sfe, bins = 300, hex = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"MERFISH mouse liver dataset and considerations of large data","text":"Plotting almost 400,000 polygons kind slow doable. nCounts kind looks like salt pepper. Using scattermore package can speed plotting large number points. non-interactive plot, cell polygons small see anyway, plotting cell centroid points fine. run server, plotting almost 400,000 polygons took around 23 seconds, using geom_scattermore() (scattermore = TRUE) took 2 seconds. Since geom_scattermore() rasterizes plot, plot pixelated zoomed . interactive data visualization useful ESDA, need static figures publications. Voyager 1.2.0 (Bioconductor 3.17), bounding box can specified zoom data. Much time making plot spent subsetting sf data frame bounding box. , spatial autocorrelation evident upper right region smaller cells, less rest patch. nCounts seems related cell size; larger cells seem total counts. Interactive data visualization currently beyond scope Voyager vignette. existing tools interactive visualization highly multiplexed imaging data, MERmaid (G. Wang et al. 2020) MERFISH data, TissUUmaps (Behanova et al. 2023), Visinity (Warchol et al. 2023), samui broswer (Sriworarat et al. 2023). Since aren’t many genes, genes negative control probes can displayed: number real genes 347. 
Next, plot distribution nCounts divided number genes panel, distribution comparable across datasets different numbers genes. Xenium dataset, mysterious regular notches histogram number genes detected. also plot number genes detected per cell, geom_scattermore(). Similarly nCounts, points look intermingled. Distribution cell volume space: Next, explore nCounts relates nGenes: two branches plot. cell size relate nCounts? lower branch larger cells don’t tend total counts, upper branch larger cells tend total counts. also examine cell size relates number genes detected: seem clusters possibly related cell type.","code":"names(colData(sfe)) #> [1] \"fov\" \"volume\" \"min_x\" \"max_x\" \"min_y\" \"max_y\" #> [7] \"sample_id\" \"nCounts\" \"nGenes\" system.time( print(plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\")) ) #> user system elapsed #> 19.499 0.900 20.442 system.time({ print(plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"centroids\", scattermore = TRUE)) }) #> user system elapsed #> 1.625 0.194 1.823 bbox_use <- c(xmin = 3000, xmax = 3500, ymin = 2500, ymax = 3000) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"cellSeg\", bbox = bbox_use) rownames(sfe) #> [1] \"Comt\" \"Ldha\" \"Pck1\" \"Akr1a1\" #> [5] \"Ugt2b1\" \"Acsl5\" \"Ugt2a3\" \"Igf1\" #> [9] \"Errfi1\" \"Serping1\" \"Adh4\" \"Hsd17b2\" #> [13] \"Tpi1\" \"Cyp1a2\" \"Acsl1\" \"Akr1d1\" #> [17] \"Alas1\" \"Aldh7a1\" \"G6pc\" \"Hsd17b12\" #> [21] \"Pdhb\" \"Gpd1\" \"Cyp7b1\" \"Pgam1\" #> [25] \"Hc\" \"Dld\" \"Cyp2c23\" \"Proz\" #> [29] \"Acss2\" \"Psap\" \"Cald1\" \"Hsd3b3\" #> [33] \"Galm\" \"Cxcl12\" \"Sardh\" \"Cebpa\" #> [37] \"Aldh3a2\" \"Gck\" \"Sdc1\" \"Pdha1\" #> [41] \"Npc2\" \"Hsd17b6\" \"Aqp1\" \"Adh7\" #> [45] \"Smpdl3a\" \"Egfr\" \"Pgm1\" \"Fasn\" #> [49] \"Ctsc\" \"Abcb4\" \"Fyb\" \"Alas2\" #> [53] \"Gpi1\" \"Fech\" \"Lsr\" \"Psmd3\" #> [57] \"Gm2a\" \"Pabpc1\" \"Cbr4\" \"Tkt\" #> [61] \"Tmem56\" \"Eif3f\" \"Cxadr\" \"Srd5a1\" #> [65] \"Cyp2c55\" 
\"Gnai2\" \"Gimap6\" \"Hsd3b2\" #> [69] \"Grn\" \"Rpp14\" \"Csnk1a1\" \"Egr1\" #> [73] \"Mpeg1\" \"Acsl4\" \"Hmgb1\" \"Mpp1\" #> [77] \"Lcp1\" \"Plvap\" \"Aldh1b1\" \"Oxsm\" #> [81] \"Dlat\" \"Csk\" \"Mcat\" \"Hsd17b7\" #> [85] \"Epas1\" \"Eif3a\" \"Nrp1\" \"Dek\" #> [89] \"H2afy\" \"Bpgm\" \"Hsd3b6\" \"Dnase1l3\" #> [93] \"Serpinh1\" \"Tinagl1\" \"Aldoc\" \"Cyp2c38\" #> [97] \"Dpt\" \"Mrc1\" \"Minpp1\" \"Fgf1\" #> [101] \"Alcam\" \"Gimap4\" \"Cav2\" \"Eng\" #> [105] \"Adgre1\" \"Shisa5\" \"Csf1r\" \"Esam\" #> [109] \"Unc93b1\" \"Cnp\" \"Clec14a\" \"Kdr\" #> [113] \"Adpgk\" \"Gca\" \"Pkm\" \"Mkrn1\" #> [117] \"Sdc3\" \"Acaca\" \"Gpr182\" \"Bmp2\" #> [121] \"Tfrc\" \"Timp3\" \"Calcrl\" \"Pfkl\" #> [125] \"Wnt2\" \"Cybb\" \"Icam1\" \"Cdh5\" #> [129] \"Sgms2\" \"Cd48\" \"Stk17b\" \"Tubb6\" #> [133] \"Vcam1\" \"Hgf\" \"Ramp1\" \"Arsb\" #> [137] \"Pld4\" \"Smarca4\" \"Fstl1\" \"Pfkm\" #> [141] \"Lhfp\" \"Lmna\" \"Cd300lg\" \"Laptm5\" #> [145] \"Timp2\" \"Slc25a37\" \"Fzd7\" \"Lyve1\" #> [149] \"Acacb\" \"Cyp1a1\" \"Eno3\" \"Cd83\" #> [153] \"Epcam\" \"Ltbp4\" \"Pgm2\" \"Mertk\" #> [157] \"Pth1r\" \"Itga2b\" \"Kctd12\" \"Srd5a3\" #> [161] \"Bmp5\" \"Pecam1\" \"G6pc3\" \"Cyp17a1\" #> [165] \"Stab2\" \"Cygb\" \"Col1a2\" \"Nid1\" #> [169] \"Cd44\" \"Ctnnal1\" \"Ephb4\" \"Elk3\" #> [173] \"Foxq1\" \"Cxcl14\" \"Fzd4\" \"Itgb2\" #> [177] \"Tcf7\" \"Srd5a2\" \"Aldh3b1\" \"Flt4\" #> [181] \"Selp\" \"Rbpj\" \"Ep300\" \"Rhoj\" #> [185] \"Fzd1\" \"Tcf7l2\" \"Ssh2\" \"Col6a1\" #> [189] \"Notch2\" \"Tcf4\" \"Tek\" \"Trim47\" #> [193] \"Tent5c\" \"Ncf1\" \"Lepr\" \"Pck2\" #> [197] \"Lmnb1\" \"Selplg\" \"Myh10\" \"Aldoart1\" #> [201] \"Podxl\" \"Kitl\" \"Tcf3\" \"Tspan13\" #> [205] \"Dll4\" \"Fzd8\" \"Lad1\" \"Procr\" #> [209] \"Ccr2\" \"Akr1c18\" \"Maml1\" \"Ms4a1\" #> [213] \"Hk3\" \"Bcam\" \"Fzd5\" \"Dkk3\" #> [217] \"Bank1\" \"Itgal\" \"Pgam2\" \"Axin2\" #> [221] \"Pfkp\" \"Meis2\" \"Jag1\" \"Gimap3\" #> [225] \"Rassf4\" \"Notch1\" \"Cd93\" \"Tet2\" #> [229] \"Tcf7l1\" \"Cd34\" 
\"Hvcn1\" \"Mal\" #> [233] \"Itgb7\" \"Wnt4\" \"Kit\" \"Gapdhs\" #> [237] \"Kcnj16\" \"Tnfrsf13c\" \"Hk1\" \"Pdgfra\" #> [241] \"Apobec3\" \"Slc34a2\" \"Vav1\" \"Lamp3\" #> [245] \"Meis1\" \"Lck\" \"Efnb2\" \"Notch4\" #> [249] \"Klrb1c\" \"Angpt2\" \"Vwf\" \"E2f2\" #> [253] \"Ccr1\" \"Angpt1\" \"B4galt6\" \"Cyp21a1\" #> [257] \"Pdpn\" \"Dll1\" \"Ammecr1\" \"Csf3r\" #> [261] \"Ndn\" \"Fgf2\" \"Runx1\" \"Mpl\" #> [265] \"Mecom\" \"Itgam\" \"Hoxb4\" \"Tox\" #> [269] \"Prickle2\" \"Acss1\" \"Cyp2b9\" \"Aldh3a1\" #> [273] \"Bmp7\" \"Gata2\" \"Il7r\" \"Satb1\" #> [277] \"Sfrp1\" \"Eno2\" \"Mrvi1\" \"Mki67\" #> [281] \"Nes\" \"Tmod1\" \"Ace\" \"Gfap\" #> [285] \"Tgfb2\" \"Tomt\" \"Flt3\" \"Sult2b1\" #> [289] \"Hkdc1\" \"Notch3\" \"Cdh11\" \"Il6\" #> [293] \"Hk2\" \"Mmrn1\" \"Vangl2\" \"Pou2af1\" #> [297] \"Hoxb5\" \"Jag2\" \"Aldh3b2\" \"Gypa\" #> [301] \"Lrp2\" \"Lef1\" \"Olr1\" \"Lox\" #> [305] \"Txlnb\" \"Slc12a1\" \"Aldh3b3\" \"Cxcr2\" #> [309] \"Nkd2\" \"Sult1e1\" \"Acsl6\" \"Ddx4\" #> [313] \"Ldhc\" \"Kcnj1\" \"Acsbg1\" \"Fzd3\" #> [317] \"F13a1\" \"Hsd11b2\" \"Dkk2\" \"Hsd17b1\" #> [321] \"Fzd2\" \"Cyp2b23\" \"Eno4\" \"Celsr2\" #> [325] \"Obscn\" \"Slamf1\" \"Akap14\" \"Gnaz\" #> [329] \"Cd177\" \"Tet1\" \"Cspg4\" \"Aldoart2\" #> [333] \"Cyp2b19\" \"Ryr2\" \"Ldhal6b\" \"Acsf3\" #> [337] \"Chodl\" \"Ivl\" \"Cyp11b1\" \"Sfrp2\" #> [341] \"Dkk1\" \"Cyp11a1\" \"1700061G19Rik\" \"Acsbg2\" #> [345] \"Olah\" \"Pdha2\" \"Hsd17b3\" \"Blank-0\" #> [349] \"Blank-1\" \"Blank-2\" \"Blank-3\" \"Blank-4\" #> [353] \"Blank-5\" \"Blank-6\" \"Blank-7\" \"Blank-8\" #> [357] \"Blank-9\" \"Blank-10\" \"Blank-11\" \"Blank-12\" #> [361] \"Blank-13\" \"Blank-14\" \"Blank-15\" \"Blank-16\" #> [365] \"Blank-17\" \"Blank-18\" \"Blank-19\" \"Blank-20\" #> [369] \"Blank-21\" \"Blank-22\" \"Blank-23\" \"Blank-24\" #> [373] \"Blank-25\" \"Blank-26\" \"Blank-27\" \"Blank-28\" #> [377] \"Blank-29\" \"Blank-30\" \"Blank-31\" \"Blank-32\" #> [381] \"Blank-33\" \"Blank-34\" \"Blank-35\" \"Blank-36\" 
#> [385] \"Blank-37\" n_panel <- 347 colData(sfe)$nCounts_normed <- sfe$nCounts / n_panel colData(sfe)$nGenes_normed <- sfe$nGenes / n_panel plotColDataHistogram(sfe, c(\"nCounts_normed\", \"nGenes_normed\")) plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"volume\", colGeometryName = \"centroids\", scattermore = TRUE) plotColData(sfe, x=\"nCounts\", y=\"nGenes\", bins = 100) plotColData(sfe, x=\"volume\", y=\"nCounts\", bins = 100) plotColData(sfe, x=\"volume\", y=\"nGenes\", bins = 100)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"negative-controls","dir":"Articles","previous_headings":"Quality control","what":"Negative controls","title":"MERFISH mouse liver dataset and considerations of large data","text":"Blank probes used negative controls. Total transcript counts blank probes: Number blank features detected per cell: Percentage blank features per cell: percentage interesting: within tissue, cells high percentage blank counts scattered like salt pepper, cells left edge tissue, edges FOVs, tissue doesn’t end. Also plot histograms: NA’s cells without transcript detected. Unlike Xenium dataset, cells least one blank count. log transforming, zeroes removed plot. small percentage blank counts acceptable. remove outlier based distribution percentage ’s greater zero. blank percentage relate total counts? outliers percentage blank counts low total counts. seemingly real cells sizable nCounts relatively high percentage blank counts. Since distribution percentage long tail, log transform finding outliers. proportion cells outliers? ’s cutoff outlier? 
Remove outliers empty cells: still 390,000 cells left removing outliers.","code":"is_blank <- str_detect(rownames(sfe), \"^Blank-\") sfe <- addPerCellQCMetrics(sfe, subset = list(blank = is_blank)) names(colData(sfe)) #> [1] \"fov\" \"volume\" \"min_x\" #> [4] \"max_x\" \"min_y\" \"max_y\" #> [7] \"sample_id\" \"nCounts\" \"nGenes\" #> [10] \"nCounts_normed\" \"nGenes_normed\" \"sum\" #> [13] \"detected\" \"subsets_blank_sum\" \"subsets_blank_detected\" #> [16] \"subsets_blank_percent\" \"total\" plotSpatialFeature(sfe, \"subsets_blank_sum\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"subsets_blank_detected\", colGeometryName = \"centroids\", scattermore = TRUE) plotSpatialFeature(sfe, \"subsets_blank_percent\", colGeometryName = \"centroids\", scattermore = TRUE) plotColDataHistogram(sfe, paste0(\"subsets_blank_\", c(\"sum\", \"detected\", \"percent\"))) #> Warning: Removed 1332 rows containing non-finite outside the scale range #> (`stat_bin()`). mean(sfe$subsets_blank_sum > 0) #> [1] 0.7648799 plotColDataHistogram(sfe, \"subsets_blank_percent\") + scale_x_log10() + annotation_logticks() #> Warning in scale_x_log10(): log-10 transformation introduced #> infinite values. #> Warning: Removed 92923 rows containing non-finite outside the scale range #> (`stat_bin()`). plotColData(sfe, x=\"nCounts\", y=\"subsets_blank_percent\", bins = 100) #> Warning: Removed 1332 rows containing non-finite outside the scale range #> (`stat_bin2d()`). 
get_neg_ctrl_outliers <- function(col, sfe, nmads = 3, log = FALSE) { inds <- colData(sfe)$nCounts > 0 & colData(sfe)[[col]] > 0 df <- colData(sfe)[inds,] outlier_inds <- isOutlier(df[[col]], type = \"higher\", nmads = nmads, log = log) outliers <- rownames(df)[outlier_inds] col2 <- str_remove(col, \"^subsets_\") col2 <- str_remove(col2, \"_percent$\") new_colname <- paste(\"is\", col2, \"outlier\", sep = \"_\") colData(sfe)[[new_colname]] <- colnames(sfe) %in% outliers sfe } sfe <- get_neg_ctrl_outliers(\"subsets_blank_percent\", sfe, log = TRUE) mean(sfe$is_blank_outlier) #> [1] 0.008944499 min(sfe$subsets_blank_percent[sfe$is_blank_outlier]) #> [1] 2.303523 (sfe <- sfe[, !sfe$is_blank_outlier & sfe$nCounts > 0]) #> class: SpatialFeatureExperiment #> dim: 385 390348 #> metadata(0): #> assays(1): counts #> rownames(385): Comt Ldha ... Blank-36 Blank-37 #> rowData names(3): means vars cv2 #> colnames(390348): 10482024599960584593741782560798328923 #> 111551578131181081835796893618918348842 ... #> 92389687374928708938472537234969690424 #> 96399783859933548456002372694492036651 #> colData names(18): fov volume ... total is_blank_outlier #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : center_x center_y #> imgData names(1): sample_id #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT), cellSeg (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"genes","dir":"Articles","previous_headings":"Quality control","what":"Genes","title":"MERFISH mouse liver dataset and considerations of large data","text":"look mean variance gene: genes display higher mean expression blanks, considerable overlap distribution, probably genes expressed lower levels fewer cells included. “real” genes negative controls plotted different colors: red line \\(y = x\\) expected data follows Poisson distribution. 
zoomed , blanks somewhat overdispersed.","code":"rowData(sfe)$means <- rowMeans(counts(sfe)) rowData(sfe)$vars <- rowVars(counts(sfe)) rowData(sfe)$is_blank <- is_blank plotRowData(sfe, x = \"means\", y = \"is_blank\") + scale_y_log10() + annotation_logticks(sides = \"b\") plotRowData(sfe, x = \"means\", y = \"vars\", colour_by = \"is_blank\") + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal() as.data.frame(rowData(sfe)[is_blank,]) |> ggplot(aes(means, vars)) + geom_point() + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks() + coord_equal()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"MERFISH mouse liver dataset and considerations of large data","text":", plot zoomed patch visually inspect cell-cell contiguity: quite cells contiguous cell, cell segmentation imperfect, purely using poly2nb() problematic. next release, might implement way blend polygon contiguity graph graph case singletons. now, use k nearest neighbors \\(k = 5\\), seems like reasonable approximation contiguity based visual inspection. Voyager 1.2.0 (Bioconductor 3.17), findSpatialNeighbors() default uses BiocNeighbors k nearest neighbors distance neighbors saving distances neighbors. bypasses time consuming step spdep calculating distance based edge weights, compute distance, hence greatly speeding computation. spatial neighborhood graph, can compute Moran’s QC metrics. Unlike smFISH-based datasets website, nCounts nGenes sizable negative Moran’s ’s, closer 0 volume. interesting compare metrics across different tissues, add datasets SFEData future releases. Also check local Moran’s , since little patch examined , regions may positive spatial autocorrelation. 
niches around smaller blood vessels positive local Moran’s nCounts nGenes. likely due homogeneous endothelial cells compared hepatocytes.","code":"plotGeometry(sfe, \"cellSeg\", bbox = bbox_use) system.time( colGraph(sfe, \"knn5\") <- findSpatialNeighbors(sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") ) #> user system elapsed #> 39.365 0.177 39.630 sfe <- colDataMoransI(sfe, c(\"nCounts\", \"nGenes\", \"volume\"), colGraphName = \"knn5\") colFeatureData(sfe)[c(\"nCounts\", \"nGenes\", \"volume\"),] #> DataFrame with 3 rows and 2 columns #> moran_sample01 K_sample01 #> #> nCounts -0.1084532 4.22513 #> nGenes -0.0922130 2.25923 #> volume -0.0195237 3.89406 sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = c(\"nCounts\", \"nGenes\", \"volume\"), colGraphName = \"knn5\") plotLocalResult(sfe, \"localmoran\", c(\"nCounts\", \"nGenes\", \"volume\"), colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"morans-i","dir":"Articles","previous_headings":"","what":"Moran’s I","title":"MERFISH mouse liver dataset and considerations of large data","text":"’s actually slow thought almost 400,000 cells. Moran’s ’s distributed real genes blank probes? blanks clustered tightly around 0. vast majority real genes positive spatial autocorrelation, quite strong. genes negative spatial autocorrelation, although may may statistically significant. Plot top genes positive spatial autocorrelation: Unlike smFISH-based cancer datasets dataset, genes highest Moran’s highlight different histological regions. probably zones hepatic lobule, blood vessels. interesting compare spatial autocorrelation marker genes among different tissues cell types. Negative Moran’s means nearby cells tend dissimilar . hard see plotting whole tissue section, use bounding box . gene negative Moran’s compared one Moran’s closest 0. 
expected, feature Moran’s closest 0 blank.","code":"sfe <- logNormCounts(sfe) system.time( sfe <- runMoransI(sfe, BPPARAM = MulticoreParam(2)) ) #> user system elapsed #> 114.823 11.824 67.100 plotRowData(sfe, x = \"moran_sample01\", y = \"is_blank\") + geom_hline(yintercept = 0, linetype = 2) top_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01, decreasing = TRUE)[1:6]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"centroids\", scattermore = TRUE, ncol = 2) bottom_moran <- rownames(sfe)[order(rowData(sfe)$moran_sample01)[1]] bottom_abs_moran <- rownames(sfe)[order(abs(rowData(sfe)$moran_sample01))[1]] plotSpatialFeature(sfe, c(bottom_moran, bottom_abs_moran), colGeometryName = \"cellSeg\", bbox = bbox_use)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"spatial-autocorrelation-at-larger-length-scales","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation at larger length scales","title":"MERFISH mouse liver dataset and considerations of large data","text":"k nearest neighbor graph used concerns 5 cells around cell, small neighborhood, small length scale. current release Voyager, correlogram can computed get sense length scale spatial autocorrelation. However, since finding lag values higher higher orders neighborhoods slow large number cells higher orders, correlogram helpful . section, use methods involving binning explore spatial autocorrelation larger length scales.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"binning","dir":"Articles","previous_headings":"Spatial autocorrelation at larger length scales","what":"Binning","title":"MERFISH mouse liver dataset and considerations of large data","text":"sf package can create polygons grid, can bin cells attributes gene expressions. make 100 100 hexagonal grid bounding box cell centroids. use grid bin QC metrics averaging values cells. 
Since bins completely covered tissue fewer cells, mean may less susceptible edge effect sum, bins near edge lower sums, may spuriously increase Moran’s . Plot binned values: ’s outlier bin evident plotting single cells. ’s still edge effect around blood vessels. might truly edge effect, endothelial cells tend lower values 3 variables . compute Moran’s binned data, contiguity neighborhoods. zero.policy = TRUE bins neighbors. larger length scale, Moran’s becomes positive. Comparing Moran’s across different sized bins can give sense length scale spatial autocorrelation. However, problems binning watch : Edge effect, especially using sum binning function use aggregate values binning using rectangular grid, whether use rook queen neighbors. Rook means two cells neighbors share edge, queen means neighbors even merely share vertex. binning can greatly speed computation spatial autocorrelation metrics larger datasets, can used smaller datasets find length scales spatial autocorrelation. hand, seen , Moran’s can flip signs different length scales, larger datasets, exploring spatial autocorrelation cell level still interesting.","code":"(bins <- st_make_grid(colGeometry(sfe, \"centroids\"), n = 100, square = FALSE)) #> Geometry set for 11165 features #> Geometry type: POLYGON #> Dimension: XY #> Bounding box: xmin: -137.2225 ymin: -158.8407 xmax: 10396.09 ymax: 9708.547 #> CRS: NA #> First 5 geometries: #> POLYGON ((-85.58866 -69.40817, -137.2225 -39.59... #> POLYGON ((-85.58866 109.4569, -137.2225 139.267... #> POLYGON ((-85.58866 288.3219, -137.2225 318.132... #> POLYGON ((-85.58866 467.1869, -137.2225 496.997... #> POLYGON ((-85.58866 646.052, -137.2225 675.8628... 
df <- cbind(colGeometry(sfe, \"centroids\"), colData(sfe)[,c(\"nCounts\", \"nGenes\", \"volume\")]) df_binned <- aggregate(df, bins, FUN = mean) # Remove bins not containing cells df_binned <- df_binned[!is.na(df_binned$nCounts),] # Not using facet_wrap to give each panel its own color scale plts <- lapply(c(\"nCounts\", \"nGenes\", \"volume\"), function(f) { ggplot(df_binned[,f]) + geom_sf(aes(fill = .data[[f]]), linewidth = 0) + scale_fill_distiller(palette = \"Blues\", direction = 1) + theme_void() }) wrap_plots(plts, nrow = 2) nb <- poly2nb(df_binned) listw <- nb2listw(nb, zero.policy = TRUE) calculateMoransI(t(as.matrix(st_drop_geometry(df_binned[,c(\"nCounts\", \"nGenes\", \"volume\")]))), listw = listw, zero.policy = TRUE) #> DataFrame with 3 rows and 2 columns #> moran K #> #> nCounts 0.490837 5.21703 #> nGenes 0.422267 16.10323 #> volume 0.352221 4.78100"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"semivariogram","dir":"Articles","previous_headings":"Spatial autocorrelation at larger length scales","what":"Semivariogram","title":"MERFISH mouse liver dataset and considerations of large data","text":"geostatistical data, underlying spatial process sampled known locations. Kriging uses Gaussian process interpolate values sample locations, semivariogram used model spatial dependency locations covariance Gaussian process. kriging, semivariogram can used exploratory data analysis tool find length scale anisotropy spatial autocorrelation. One classic R packages geostatistical tradition gstat, use find semivariograms, defined \\[ \\gamma(t) = \\frac 1 2 \\mathrm{Var}(X_t - X_0), \\] \\(X\\) value gene expression, \\(t\\) spatial vector. \\(X_0\\) value location interest, \\(X_t\\) value lagged \\(t\\). positive spatial autocorrelation, variance smaller among nearby values, variogram increase distance, eventually leveling distance beyond length scale spatial autocorrelation. “semi” comes 1/2. 
variogram Voyager v1.2.0 higher (Bioconductor 3.17 later) can computed runUnivariate() function. See vignette variograms variogram maps. First find empirical variogram assuming ’s directions. data binned distance intervals, much faster correlogram cell level. width argument controls bin size. cutoff argument maximum distance consider. use defaults. first argument formula; covariates can specified, done . different widths cutoffs, variogram can estimated different length scales. gstat package can also fit model empirical variogram. See vgm() different types models. automap package can choose model user, used Voyager. Unfortunately, gstat doesn’t scale 400,000 cells, although worked 100,000 cells smFISH-based datasets website. since variogram used explore larger length scales anyway, use binned data , problems binning apply. numbers plot number pairs distance bin. variogram 0 0 distance; variance within bin size, called nugget. variogram levels greater distance, value variogram levels sill. Range variogram leveling , indicating length scale spatial autocorrelation; range visual inspection appears closer 1000 model somehow indicates 423. variogram map can made see spatial autocorrelation may differ different directions, .e. anisotropy apparently ’s anisotropy shorter length scales, may artifact hexagonal bins. Going beyond 2000 (whatever unit), variance drops northwest southeast direction directions, perhaps related repetitiveness hepatic lobules general NE/SW direction blood vessels seen previous plots. variogram can also calculated specified angles, selected sides hexagon: variogram rises going beyond 2000 30 90 degrees drops 150 degrees. consistent variogram map. differences averaged omni-directional variogram. gstat fit anisotropy parameters, fitted curve omni-directional. fits pretty well 2000. nCounts, may differ QC metrics genes. anisotropy varies space? 
problem variogram ’s global, giving one result entire dataset, albeit nuanced just number Moran’s , kriging assumes data intrinsically stationary, meaning variogram model applies everywhere, spatial dependence depends lag two observations. Voyager 1.2.0 implements ggplot2 based plotting functions make better looking customizable plots variograms SFE objects. However, binned data SFE object. considering writing method spatially bin cell level data SFE object Bioconductor 3.18. gstat using lattice, predecessor ggplot2 make facetted plots superseded ggplot2. gstat one oldest R packages still CRAN, dating back days S (prequel R), although oldest archive CRAN 2003. spdep also really old; oldest archive CRAN 2002, ’s still active development. using time honored packages methods (Moran’s Geary’s C date back 1950s modern form date back 1969 (Cliff Ord 1969; Bivand 2013)) cool new spatial transcriptomics dataset, participating glorious tradition, develop spatial analysis tradition forms around spatial -omics data analysis.","code":"# as_Spatial since automap uses old fashioned sp, the predecessor of sf v <- autofitVariogram(nCounts ~ 1, as_Spatial(df_binned)) plot(v) v2 <- variogram(nCounts ~ 1, data = df_binned, width = 300, cutoff = 4500, map = TRUE) plot(v2) v3 <- variogram(nCounts ~ 1, df_binned, alpha = c(30, 90, 150)) v3_model <- fit.variogram(v3, vgm(\"Ste\")) plot(v3, v3_model)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"pca-for-larger-datasets","dir":"Articles","previous_headings":"","what":"PCA for larger datasets","title":"MERFISH mouse liver dataset and considerations of large data","text":"many ways PCA R, BiocSingular package makes number different methods available consistent user interface, supports memory data DelayedArray. According benchmark, stats::prcomp() shipped R rather slow larger datasets. fastest methods irlba::irlba() RSpectra::svds(), former supported BiocSingular. use IRLBA see long takes. 
Many PCA algorithms involve repeated matrix multiplications. R come optimized BLAS LAPACK, portability reasons. However, BLAS LAPACK used R can changed optimized one (’s ), speed matrix multiplication. ’s pretty quick almost 400,000 cells, aren’t many genes . Use elbow plot see variance explained PC: Plot top gene loadings PC Many genes seem related endothelium. Plot first 4 PCs space PC1 PC4 highlight major blood vessels, PC2 PC3 less spatial structure. CosMX Xenium datasets website, top PCs clear spatial structures despite absence spatial information non-spatial PCA clear spatial compartments cell types, seem case dataset except blood vessels. seen genes strong spatial structures. methods spatially informed PCA, MULTISPATI PCA (Dray, Saı̈d, Débias 2008) adespatial package, seeks maximize variance (non-spatial PCA) Moran’s PC. Unlike traditional PCs, eigenvalues, signifying variance explained, positive, MULTISPATI PCA can negative eigenvalues, signify negative spatial autocorrelation. PCs MULTISPATI PCA positive eigenvalues also spatially coherent non-spatial PCA. CosMX Xenium datasets website, spatial coherence MULTISPATI might make difference, might make difference dataset non-spatial PCs don’t show much spatial structure, least larger scale entire tissue section. Voyager 1.2.0 (Bioconductor 3.17) faster implementation MULTISPATI PCA adespatial, demonstrated dataset another vignette. PC2 PC3 don’t seem large scale spatial structure, may local spatial structure obvious plotting entire section, zoom bounding box: ’s spatial structure PC2 PC3 smaller scale, perhaps negative spatial autocorrelation. Like global Moran’s , PCA MULTISPATI PCA return one result entire dataset. contrast, geographically weighted PCA (GWPCA) (Harris et al. 2015) can account spatial heterogeneity. GWPCA runs PCA spatial location using nearby locations weighed kernel. different locations can different PCs, results can visualized “winning variables” PC, .e. 
plotting feature highest loading PC space. likely doesn’t scale 400,000 cells, still interesting performed spatially binned data. GWPCA might added Bioconductor 3.18 require changes user interface, GWPCA features rather cell embeddings.","code":"set.seed(29) system.time( sfe <- runPCA(sfe, ncomponents = 20, subset_row = !is_blank, exprs_values = \"logcounts\", scale = TRUE, BSPARAM = IrlbaParam()) ) #> user system elapsed #> 21.014 1.319 22.367 gc() #> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb) #> Ncells 16650376 889.3 32047514 1711.6 NA 32047514 1711.6 #> Vcells 250335884 1910.0 490341574 3741.1 16384 490265901 3740.5 ElbowPlot(sfe) plotDimLoadings(sfe) spatialReducedDim(sfe, \"PCA\", 4, colGeometryName = \"centroids\", scattermore = TRUE, divergent = TRUE, diverge_center = 0) spatialReducedDim(sfe, \"PCA\", ncomponents = 2:3, colGeometryName = \"cellSeg\", bbox = bbox_use, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"more-challenges-from-large-datasets","dir":"Articles","previous_headings":"","what":"More challenges from large datasets","title":"MERFISH mouse liver dataset and considerations of large data","text":", despite numerous cells, data loaded memory. data doesn’t fit memory? might write new vignette DelayedArray demonstrating memory data analysis Bioconductor 3.18. already supported SingleCellExperiment, SFE inherits . However, geometries, graphs, local results can take lot memory well. can possibly stored SQL databases operated SQLDataFrame. geometric operations can handled sedona, although options limited compared GEOS, performs geometric operations behind scene sf. Another question can raised large spatial transcriptomics data: still good idea analyze entire dataset ? must many interesting unique neighborhoods might get attention deserve whole dataset analyzed . 
, geographical space, national level data usually analyzed block resolution, although reason privacy subjects. County resolution often used, aren’t hundreds thousands counties. Many analyses done cities counties neighborhood resolution; using largest geographical unit isn’t always relevant. Back histological space: aggregate cells larger spatial units? decide scale spatial units (analogous nation vs state vs county etc) relevant? traditional anatomical ontologies, Allen Brain Atlas, isn’t available tissues. Also, single cell -omics data, traditional ontologies can improved. Furthermore, 3D thick section single cell resolution spatial transcriptomics data, STARmap (X. Wang et al. 2018) EASI-FISH (Y. Wang et al. 2021), although vast majority spatial -omics data thin sections pretty much de facto 2D. mostly live surface Earth, many 2D geospatial resources 3D. However, methods can principle applied 3D existing software primarily made 2D data might work. example, GEOS supports 3D data, principle 3D geometries sf work, although ’s little documentation . Also, k nearest neighbor, Moran’s , variograms, etc. principle work 3D, gstat officially supports 3D. challenges related 3D data: Even multiple z-planes imaged, resolution much lower z direction x y directions. z-plane treated attribute coordinate? make static plots 3D data publications? complicated plotting z-planes separately, since 3D block can sectioned direction. Also interactive visualization, need somehow see tissue. Finally, geospatial tradition one tradition relevant large spatial data, present Voyager works vector data. uncertain whether raster added later version, existing tools large raster data well, TileDB. 
traditions can relevant, astronomy image processing, beyond scope package.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig6_merfish.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"MERFISH mouse liver dataset and considerations of large data","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] SpatialFeatureExperiment_1.3.0 automap_1.1-9 #> [3] BiocNeighbors_1.20.2 gstat_2.1-1 #> [5] BiocSingular_1.18.0 BiocParallel_1.36.0 #> [7] spdep_1.3-3 sf_1.0-16 #> [9] spData_2.3.0 stringr_1.5.1 #> [11] patchwork_1.2.0 scater_1.30.1 #> [13] ggplot2_3.5.1 scuttle_1.12.0 #> [15] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [17] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [19] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [21] IRanges_2.36.0 S4Vectors_0.40.2 #> [23] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [25] matrixStats_1.3.0 SFEData_1.4.0 #> [27] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] xts_0.13.2 lifecycle_1.0.4 #> [7] edgeR_4.0.16 lattice_0.22-6 #> [9] magrittr_2.0.3 limma_3.58.1 #> [11] sass_0.4.9 rmarkdown_2.26 #> [13] jquerylib_0.1.4 yaml_2.3.8 #> [15] httpuv_1.6.15 sp_2.1-4 #> [17] cowplot_1.1.3 RColorBrewer_1.1-3 #> [19] DBI_1.2.2 abind_1.4-5 #> [21] zlibbioc_1.48.2 purrr_1.0.2 #> [23] 
RCurl_1.98-1.14 rappdirs_0.3.3 #> [25] GenomeInfoDbData_1.2.11 ggrepel_0.9.5 #> [27] irlba_2.3.5.1 terra_1.7-71 #> [29] units_0.8-5 RSpectra_0.16-1 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] e1071_1.7-14 systemfonts_1.0.6 #> [43] tools_4.3.3 ggnewscale_0.4.10 #> [45] ragg_1.3.0 Rcpp_1.0.12 #> [47] glue_1.7.0 gridExtra_2.3 #> [49] SparseArray_1.2.4 xfun_0.43 #> [51] dplyr_1.1.4 HDF5Array_1.30.1 #> [53] withr_3.0.0 BiocManager_1.30.22 #> [55] fastmap_1.1.1 boot_1.3-30 #> [57] rhdf5filters_1.14.1 bluster_1.12.0 #> [59] fansi_1.0.6 digest_0.6.35 #> [61] rsvd_1.0.5 R6_2.5.1 #> [63] mime_0.12 textshaping_0.3.7 #> [65] colorspace_2.1-0 wk_0.9.1 #> [67] scattermore_1.2 RSQLite_2.3.6 #> [69] hexbin_1.28.3 utf8_1.2.4 #> [71] generics_0.1.3 intervals_0.15.4 #> [73] FNN_1.1.4 class_7.3-22 #> [75] httr_1.4.7 htmlwidgets_1.6.4 #> [77] S4Arrays_1.2.1 pkgconfig_2.0.3 #> [79] scico_1.5.0 gtable_0.3.5 #> [81] blob_1.2.4 XVector_0.42.0 #> [83] htmltools_0.5.8.1 scales_1.3.0 #> [85] png_0.1-8 knitr_1.45 #> [87] rjson_0.2.21 spacetime_1.3-1 #> [89] curl_5.2.1 proxy_0.4-27 #> [91] cachem_1.0.8 zoo_1.8-12 #> [93] rhdf5_2.46.1 BiocVersion_3.18.1 #> [95] KernSmooth_2.23-22 parallel_4.3.3 #> [97] vipor_0.4.7 AnnotationDbi_1.64.1 #> [99] desc_1.4.3 s2_1.1.6 #> [101] reshape_0.8.9 pillar_1.9.0 #> [103] grid_4.3.3 vctrs_0.6.5 #> [105] promises_1.3.0 dbplyr_2.5.0 #> [107] beachmat_2.18.1 xtable_1.8-4 #> [109] cluster_2.1.6 beeswarm_0.4.0 #> [111] evaluate_0.23 magick_2.8.3 #> [113] cli_3.6.2 locfit_1.5-9.9 #> [115] compiler_4.3.3 rlang_1.1.3 #> [117] crayon_1.5.2 labeling_0.4.3 #> [119] classInt_0.4-10 plyr_1.8.9 #> [121] fs_1.6.4 ggbeeswarm_0.7.2 #> [123] stringi_1.8.3 stars_0.6-5 #> [125] viridisLite_0.4.2 deldir_2.0-4 #> [127] munsell_0.5.1 Biostrings_2.70.3 #> [129] Matrix_1.6-5 ExperimentHub_2.10.0 #> [131] 
sparseMatrixStats_1.14.0 bit64_4.0.5 #> [133] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [135] statmod_1.5.0 shiny_1.8.1.1 #> [137] highr_0.10 interactiveDisplayBase_1.40.0 #> [139] AnnotationHub_3.10.1 igraph_2.0.3 #> [141] memoise_2.0.1 bslib_0.7.0 #> [143] bit_4.0.5"},{"path":[]},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"seqFISH exploratory data analysis","text":"data used vignette described Integration spatial single-cell transcriptomic data elucidates mouse organogenesis. Briefly, seqFISH use profile 351 genes several mouse embryos 8-12 somite stage (ss). focus single biological replicate, embryo 3. raw processed counts corresponding metadata available download Marioni lab. Expression matrices, segmentation data, segmented cell vertices provided R objects can readily imported R environment. data relevant vignette converted SFE object available download Box. data added SFEData package Bioconductor available 3.17 release. begin downloading data loading R. rows count matrix correspond 351 barcoded genes measured seqFISH. Additionally, authors provide metadata, including field view z-slice cell. 
filter count matrix metadata include cells single z-slice.","code":"library(Voyager) library(SFEData) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(batchelor) library(scater) library(scran) library(bluster) library(purrr) library(tidyr) library(dplyr) library(fossil) library(ggplot2) library(patchwork) library(spdep) library(BiocParallel) theme_set(theme_bw()) # Only Bioc 3.17 and above sfe <- LohoffGastrulationData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache names(colData(sfe)) #> [1] \"uniqueID\" \"embryo\" #> [3] \"pos\" \"z\" #> [5] \"x_global\" \"y_global\" #> [7] \"x_global_affine\" \"y_global_affine\" #> [9] \"embryo_pos\" \"embryo_pos_z\" #> [11] \"Area\" \"UMAP1\" #> [13] \"UMAP2\" \"celltype_mapped_refined\" #> [15] \"sample_id\" mask <- colData(sfe)$z == 2 sfe <- sfe[,mask]"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"quality-control","dir":"Articles","previous_headings":"","what":"Quality control","title":"seqFISH exploratory data analysis","text":"begin quality control (QC) cells computing metrics common single-cell analysis store colData field SFE object. , compute number counts per cell. also compute average display violin plot. Notably, cells dataset fewer counts expected single-cell sequencing experiment cells higher counts seem dispersed throughout tissue. Fewer counts expected seqFISH experiments probing highly expressed genes may lead optical crowding multiple imaging rounds. Since counts collected several fields view, visualize number cells total counts field separately. variability total number counts field view. completely apparent accounts low number counts FOVs. example, FOV 22 fewest number cells, comparably counts detected regions cells (e.g. FOV 18). Next, compute number genes detected per cell, defined number genes non-zero counts. 
plot metric FOV done . Many cells fewer 100 detected genes. part reflects panel 351 probed genes chosen distinguish cell types developmental stages distinct cell types likely express small subset 351 genes. authors also note gene panel consists lowly expressed moderately expressed genes. Taken together, technical details can explain relatively low number counts genes per cell. , plot number genes detected per cell FOV. plot mirrors plot total counts. single FOV stands obvious outlier. authors provided cell type assignments metadata. can assess whether low quality cells tend located particular FOV. appears FOV 26 31 largest fraction low quality cells. Interestingly, correspond FOVs largest number cells overall. plot nCounts vs. nGenes FOV. scRNA-seq, gene expression variance seqFISH measurements overdispersed compared variance counts Poisson distributed. understand mean-variance relationship, compute mean variance gene among cells tissue. , perform calculation separately FOV red line represents line \\(y = x\\), mean-variance relationship expected Poisson distributed data. data deviate expectation FOV. 
case, variance greater expected.","code":"colData(sfe)$nCounts <- colSums(counts(sfe)) avg <- mean(colData(sfe)$nCounts) violin <- plotColData(sfe, \"nCounts\") + geom_hline(yintercept = avg, color='red') + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"seg_coords\") violin + spatial pos <- colData(sfe)$pos counts_spl <- split.data.frame(t(counts(sfe)), pos) # nCounts per FOV df <- map_dfr(counts_spl, rowSums, .id='pos') |> pivot_longer(cols=contains('embryo'), values_to = 'nCounts') |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> dplyr::filter(!is.na(nCounts)) cells_fov <- colData(sfe) |> as.data.frame() |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> ggplot(aes(pos,)) + geom_bar() + theme_minimal() + labs( x = \"\", y = \"Number of cells\") + theme(axis.text.x = element_text(angle = 90)) counts_fov <- ggplot(df, aes(pos, nCounts)) + geom_boxplot(outlier.size = 0.5) + theme_minimal() + labs(x = \"\", y = 'nCounts') + theme(axis.text.x = element_text(angle = 90)) cells_fov / counts_fov colData(sfe)$nGenes <- colSums(counts(sfe) > 0) avg <- mean(colData(sfe)$nGenes) violin <- plotColData(sfe, \"nGenes\") + geom_hline(yintercept = avg, color='red') + theme(legend.position = \"top\") spatial <- plotSpatialFeature(sfe, \"nGenes\", colGeometryName = \"seg_coords\") violin + spatial df <- map_dfr(counts_spl, ~ rowSums(.x > 0), .id='pos') |> pivot_longer(cols = contains('embryo'), values_to = 'nGenes') |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) |> filter(!is.na(nGenes)) |> merge(df) genes_fov <- ggplot(df, aes(pos, nGenes)) + geom_boxplot(outlier.size = 0.5) + theme_bw() + labs(x = \"\") + theme(axis.text.x = element_text(angle = 90)) genes_fov meta <- data.frame(colData(sfe)) meta <- meta |> group_by(pos) |> add_tally(name = \"nCells_FOV\") |> filter(celltype_mapped_refined %in% \"Low 
quality\") |> add_tally(name = \"nLQ_FOV\") |> mutate(prop_lq = nLQ_FOV/nCells_FOV) |> distinct(pos, prop_lq) |> ungroup() |> mutate(pos = factor(pos, levels = paste0(\"Pos\", seq_len(length(unique(pos)))-1))) prop_lq <- ggplot(meta, aes(pos, prop_lq)) + geom_bar(stat = 'identity' ) + theme(axis.text.x = element_text(angle = 90)) prop_lq count_vs_genes_p <- ggplot(df, aes(nCounts, nGenes)) + geom_point( alpha = 0.5, size = 1, fill = \"white\" ) + facet_wrap(~ pos) count_vs_genes_p gene_meta <- map_dfr(counts_spl, colMeans, .id = 'pos') |> pivot_longer(cols = -pos, names_to = 'gene', values_to = 'mean') gene_meta <- map_dfr(counts_spl, ~colVars(.x, useNames = TRUE), .id = 'pos') |> pivot_longer(-pos, names_to = 'gene', values_to='variance') |> full_join(gene_meta) #> Joining with `by = join_by(pos, gene)` ggplot(gene_meta, aes(mean, variance)) + geom_point( alpha = 0.5, size = 1, fill = \"white\" ) + facet_wrap(~ pos) + geom_abline(slope = 1, intercept = 0, color = \"red\") + scale_x_log10() + scale_y_log10() + annotation_logticks()"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"data-normalization-and-dimension-reduction","dir":"Articles","previous_headings":"","what":"Data normalization and dimension reduction","title":"seqFISH exploratory data analysis","text":"exploratory analysis indicates presence batch effects corresponding FOV. use normalization scheme batch aware. SFE object inherits SpatialExperiment SingleCellExperiment, classes, can take advantage normalization methods implemented scran batchelor R packages. first use multiBatchNorm() function scale data within batch. noted documentation, function uses median-based normalization ratio average counts batches. Batch correction dimension reduction accomplished using fastMNN() performs multi-sample PCA across multiple gene expression matrices project cells common low-dimensional space. 
function fastMNN returns batch-corrected matrix reducedDims slot SingleCellExperiment object. extract relevant data store SFE object. Now visualize first two PCs space. notice PCs may show spatial structure correlates biological niches cells. Unfortunately, FOV artifacts can still seen.","code":"sfe <- multiBatchNorm(sfe, batch = pos) sfe_red <- fastMNN(sfe, batch = pos, cos.norm = FALSE, d = 20) reducedDim(sfe, \"PCA\") <- reducedDim(sfe_red, \"corrected\") assay(sfe, \"reconstructed\") <- assay(sfe_red, \"reconstructed\") spatialReducedDim(sfe, \"PCA\", ncomponents = 2, divergent = TRUE, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"clustering","dir":"Articles","previous_headings":"","what":"Clustering","title":"seqFISH exploratory data analysis","text":"Much like single cell analysis, can use batch-corrected data cluster cells. implement graph-based clustering algorithm plot resulting clusters space. plot colored cluster ID cell types provided author. authors assigned cells types identified clustering step. case, clustering results seem recapitulate major cell niches previous annotations. can compute Rand index using function fossil package assess similarity two clustering results. value 1 suggest clustering results identical, value 0 suggest results agree . 
relatively large Rand index suggests cells often found cluster cases.","code":"colData(sfe)$cluster <- clusterRows(reducedDim(sfe, \"PCA\"), BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\") ) ) plotSpatialFeature(sfe, c(\"cluster\", \"celltype_mapped_refined\"), colGeometryName = \"seg_coords\") g1 <- as.numeric(colData(sfe)$cluster) g2 <- as.numeric(colData(sfe)$celltype_mapped_refined) rand.index(g1, g2) #> [1] 0.8486922"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"univariate-spatial-statistics","dir":"Articles","previous_headings":"","what":"Univariate Spatial Statistics","title":"seqFISH exploratory data analysis","text":"point, may interested identifying genes exhibit spatial variability, whose expression depends spatial location within tissue. Measures spatial autocorrelation can useful identifying genes display spatial variability. Among common measures Moran’s Geary’s C. latter case, less 1 indicates positive spatial autocorrelation, value larger 1 points negative spatial autocorrelation. former case, positive negative values Moran’s indicate positive negative spatial autocorrelation, respectively. tests require spatial neighborhood graph computation statistic. several ways define spatial neighbors findSpatialNeighbors() function wraps methods implemented spdep package. , compute k-nearest neighborhood graph. dist_type = \"idw\" weights edges graph inverse distance neighbors. also save variable genes use computations . use runUnivariate() function compute spatial autocorrelation metrics save results SFE object. mc type test implements permutation test statistic relies nsim argument computing p-value statistic. can plot results Monte Carlo simulations: vertical line represents observed value Moran’s density represents Moran’s computed permuted data. simulations suggest spatial autocorrelation feature significant. 
function can also used plot geary.mc results. Now, might ask: genes display spatial autocorrelation? appears genes highest spatial autocorrelation seem obvious expression patterns tissue. interesting see genes also differentially expressed clusters . Non-spatial differential gene expression can interrogated using findMarkers() function implemented scran package complex methods identifying spatially variable genes actively developed. analyses bring interesting considerations. one, unclear whether normalization scheme employed effectively removes FOV batch effects. said, may times FOV differences expected represent biological differences, example context tumor sample. remains seen normalization methods perform best cases, represents area research.","code":"colGraph(sfe, \"knn5\") <- findSpatialNeighbors( sfe, method = \"knearneigh\", dist_type = \"idw\", k = 5, style = \"W\") dec <- modelGeneVar(sfe) hvgs <- getTopHVGs(dec, n = 100) sfe <- runUnivariate( sfe, type = \"geary.mc\", features = hvgs, colGraphName = \"knn5\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- runUnivariate( sfe, type = \"moran.mc\", features = hvgs, colGraphName = \"knn5\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- colDataUnivariate( sfe, type = \"moran.mc\", features = c(\"nCounts\", \"nGenes\"), colGraphName = \"knn5\", nsim = 100) plotMoranMC(sfe, \"Meox1\") top_moran <- rownames(sfe)[order(-rowData(sfe)$moran.mc_statistic_sample01)[1:4]] plotSpatialFeature(sfe, top_moran, colGeometryName = \"seg_coords\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig7_seqfish.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"seqFISH exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocParallel_1.36.0 spdep_1.3-3 #> [3] sf_1.0-16 spData_2.3.0 #> [5] patchwork_1.2.0 fossil_0.4.0 #> [7] shapefiles_0.7.2 foreign_0.8-86 #> [9] maps_3.4.2 sp_2.1-4 #> [11] dplyr_1.1.4 tidyr_1.3.1 #> [13] purrr_1.0.2 bluster_1.12.0 #> [15] scran_1.30.2 scater_1.30.1 #> [17] ggplot2_3.5.1 scuttle_1.12.0 #> [19] batchelor_1.18.1 SpatialFeatureExperiment_1.3.0 #> [21] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [23] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [25] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [27] IRanges_2.36.0 S4Vectors_0.40.2 #> [29] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [31] matrixStats_1.3.0 SFEData_1.4.0 #> [33] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] later_1.3.2 bitops_1.0-7 #> [3] filelock_1.0.3 tibble_3.2.1 #> [5] lifecycle_1.0.4 edgeR_4.0.16 #> [7] lattice_0.22-6 magrittr_2.0.3 #> [9] limma_3.58.1 sass_0.4.9 #> [11] rmarkdown_2.26 jquerylib_0.1.4 #> [13] yaml_2.3.8 metapod_1.10.1 #> [15] httpuv_1.6.15 RColorBrewer_1.1-3 #> [17] cowplot_1.1.3 DBI_1.2.2 #> [19] ResidualMatrix_1.12.0 abind_1.4-5 #> [21] zlibbioc_1.48.2 RCurl_1.98-1.14 #> [23] rappdirs_0.3.3 GenomeInfoDbData_1.2.11 #> [25] ggrepel_0.9.5 irlba_2.3.5.1 #> [27] terra_1.7-71 units_0.8-5 #> [29] RSpectra_0.16-1 dqrng_0.3.2 #> [31] pkgdown_2.0.9 DelayedMatrixStats_1.24.0 #> [33] codetools_0.2-20 DelayedArray_0.28.0 #> [35] tidyselect_1.2.1 farver_2.1.1 #> [37] ScaledMatrix_1.10.0 viridis_0.6.5 #> [39] BiocFileCache_2.10.2 jsonlite_1.8.8 #> [41] BiocNeighbors_1.20.2 e1071_1.7-14 #> [43] systemfonts_1.0.6 tools_4.3.3 #> [45] ggnewscale_0.4.10 ragg_1.3.0 #> 
[47] Rcpp_1.0.12 glue_1.7.0 #> [49] gridExtra_2.3 SparseArray_1.2.4 #> [51] xfun_0.43 HDF5Array_1.30.1 #> [53] withr_3.0.0 BiocManager_1.30.22 #> [55] fastmap_1.1.1 boot_1.3-30 #> [57] rhdf5filters_1.14.1 fansi_1.0.6 #> [59] digest_0.6.35 rsvd_1.0.5 #> [61] R6_2.5.1 mime_0.12 #> [63] textshaping_0.3.7 colorspace_2.1-0 #> [65] wk_0.9.1 RSQLite_2.3.6 #> [67] utf8_1.2.4 generics_0.1.3 #> [69] class_7.3-22 httr_1.4.7 #> [71] htmlwidgets_1.6.4 S4Arrays_1.2.1 #> [73] pkgconfig_2.0.3 scico_1.5.0 #> [75] gtable_0.3.5 blob_1.2.4 #> [77] XVector_0.42.0 htmltools_0.5.8.1 #> [79] scales_1.3.0 png_0.1-8 #> [81] knitr_1.45 rjson_0.2.21 #> [83] curl_5.2.1 proxy_0.4-27 #> [85] cachem_1.0.8 rhdf5_2.46.1 #> [87] BiocVersion_3.18.1 KernSmooth_2.23-22 #> [89] parallel_4.3.3 vipor_0.4.7 #> [91] AnnotationDbi_1.64.1 desc_1.4.3 #> [93] s2_1.1.6 pillar_1.9.0 #> [95] grid_4.3.3 vctrs_0.6.5 #> [97] promises_1.3.0 BiocSingular_1.18.0 #> [99] dbplyr_2.5.0 beachmat_2.18.1 #> [101] xtable_1.8-4 cluster_2.1.6 #> [103] beeswarm_0.4.0 evaluate_0.23 #> [105] magick_2.8.3 cli_3.6.2 #> [107] locfit_1.5-9.9 compiler_4.3.3 #> [109] rlang_1.1.3 crayon_1.5.2 #> [111] labeling_0.4.3 classInt_0.4-10 #> [113] fs_1.6.4 ggbeeswarm_0.7.2 #> [115] viridisLite_0.4.2 deldir_2.0-4 #> [117] munsell_0.5.1 Biostrings_2.70.3 #> [119] Matrix_1.6-5 ExperimentHub_2.10.0 #> [121] sparseMatrixStats_1.14.0 bit64_4.0.5 #> [123] Rhdf5lib_1.24.2 KEGGREST_1.42.0 #> [125] statmod_1.5.0 shiny_1.8.1.1 #> [127] highr_0.10 interactiveDisplayBase_1.40.0 #> [129] AnnotationHub_3.10.1 igraph_2.0.3 #> [131] memoise_2.0.1 bslib_0.7.0 #> [133] bit_4.0.5"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"CODEX exploratory data analysis","text":"","code":"library(Voyager) library(SingleCellExperiment) library(SpatialExperiment) library(SpatialFeatureExperiment) library(batchelor) library(scater) library(scran) 
library(bluster) library(glue) library(purrr) library(tidyr) library(dplyr) library(ggplot2) library(gghighlight) library(patchwork) library(spdep) library(spatialDE) library(BiocParallel) theme_set(theme_bw())"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"dataset","dir":"Articles","previous_headings":"","what":"Dataset","title":"CODEX exploratory data analysis","text":"dataset used vignette paper Strategies Accurate Cell Type Identification CODEX Multiplexed Imaging Data (Hickey et al. 2021). data collected part HuBMAP consortium seeks characterize healthy human tissues make data broadly available. specifically, dataset characterizes 4 regions large intestine (colon) single donor. vignette focus data sigmoid colon. intestinal sections interrogated using multiplexed imaging method CO-Detection indEXing (CODEX). CODEX involves cyclical staining tissue DNA-barcoded antibodies. round experimentation, fluorescently labeled probes hybridize tissue bound DNA-conjugated antibodies subsequently imaged stripped tissue. present, technology quantifies 60 markers single experiment. Raw images generated process subjected image stitching, drift compensation, deconvolution, cycle concatenation using publicly available software. result pre-processing matrix contains location individual cells quantified markers cell. Cell types assigned described manuscript linked . Briefly, authors used hand-gating strategy define cell types create standard compare effect normalization methods clustering cell annotation. raw intensity data available download HuBMAP identifier HBM575.THQMM.284 cell type annotations provided supplementary data manuscript. data relevant vignette converted SFE object available download Box. data submitted SFEData package Bioconductor available future release. begin downloading data loading R. rows count matrix correspond 47 barcoded genes measured CODEX. Additionally, authors provide metadata cells, including cell type. 
turns column names unique cause errors downstream analysis. update column names ","code":"download.file(\"https://caltech.box.com/public/static/zfr8l20450n2z28lnp0ugdj471ph9eyx\",'./codex.Rds', mode='wb', method = 'wget', quiet = TRUE) sfe <- readRDS(\"./codex.Rds\") sfe #> class: SpatialFeatureExperiment #> dim: 47 19724 #> metadata(0): #> assays(1): protein #> rownames(47): MUC2 SOX9 ... CD49a CD163 #> rowData names(0): #> colnames(19724): 1 2 ... 182 184 #> colData names(9): cell_id cell_type ... fn sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : X Y #> imgData names(0): #> #> unit: full_res_image_pixels #> Geometries: #> colGeometries: centroids (POINT) #> #> Graphs: #> sample01: cellids <- glue(\"{colData(sfe)$fn}_{colData(sfe)$cell_id}\") colnames(sfe) <- cellids"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"exploratory-data-analysis","dir":"Articles","previous_headings":"Dataset","what":"Exploratory Data Analysis","title":"CODEX exploratory data analysis","text":"can see figure colonic epithelium enriched cells loose connective tissue muscle layers beneath epithelial layer sparsely populated. line known colon histology. epithelium enriched goblet cells invaginations project inwards towards connective tissue. Smooth muscle cells also prominent colon, bands muscle contract move colonic contents towards rectum. can visualize cell types space using plotSpatialFeature() function. highlight Goblet smooth muscle cells display relative distribution tissue. Since CODEX image processing relies segmentation, dot plot represents single cell. , cell represented centroid, can also visualized cell polygons cases segmentation mask available. goblet cells clearly define epithelial border tissue thick bands smooth muscle cells prominent mucosa. Next, compute gene level metrics 47 barcoded genes. contrast RNA-based methods, fields matrix represent intensities rather counts. 
appears sigmoid relationship mean variance protein expression. pattern reminiscent might expected intensity values derived Gamma distribution, continuous analog Negative Binomial distribution typically used describe count data scRNA-seq experiments. may implications CODEX data variance stabilized future. CODEX data subject noise several sources including segmentation artifacts, nonspecific staining, imperfect tissue processing. factors can limit accurate quantification signal intensity impede accurate cell annotation. authors dataset tested effects several normalization methods cell type annotation clustering found Z-score normalization marker resulted accurate identification rare common cell types. cell , demonstrate accomplish using standard matrix operations. normalized count matrix typically stored logcounts slot scRNA-seq data, instead store normalized matrix slot called normalizedIntensity.","code":"celldensity <- plotCellBin2D(sfe) celldensity spatial <- plotSpatialFeature(sfe, features='cell_type', colGeometryName = \"centroids\") + gghighlight(cell_type %in% c(\"Goblet\", \"SmoothMuscleME\")) #> Warning: Tried to calculate with group_by(), but the calculation failed. #> Falling back to ungrouped filter operation... spatial rowData(sfe)$mean <- rowMeans(assay(sfe)) rowData(sfe)$var <- rowVars(assay(sfe)) data.frame(rowData(sfe)) |> ggplot(aes(mean, var)) + geom_point() mtx <- assay(sfe, 'protein') assay(sfe, 'normalizedIntensity') <- (mtx - rowMeans(mtx))/rowSds(mtx) assays(sfe) #> List of length 2 #> names(2): protein normalizedIntensity"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"spatial-eda","dir":"Articles","previous_headings":"","what":"Spatial EDA","title":"CODEX exploratory data analysis","text":"Neighbor definition critical step computation metrics spatial dependency like Moran’s Geary’s C. definition neighbors complex, even cell polygons available. 
latter case, poly2nb method might appropriate assign two cells neighbors physically touch share border. may tenable cases cells sparse cells represented centroids, dataset. compute spatial neighborhood graph using knearneigh function implemented spdep. brief, Euclidean distances computed pair cells k nearest cells considered neighbors. following code cell, consider k=10 speed purposes, may ideal general. weights neighborhood matrix inverse-distance weighted, weight regions listed neighbors increases distance pairs points decreases. Setting style = \"W\" ensures weights row standardized. plotColGraph() function plots graph space along corresponding colGeometry, since many cells dataset, plotting neighborhood graph may useful many connections obscure overlapping lines. case, demonstrate use function . Next, explore univariate metrics global spatial autocorrelation. Since genes quantified study, compute metrics genes. larger datasets, may useful restrict analysis variable genes. use runUnivariate() function compute spatial autocorrelation metrics save results SFE object. results computations accessible rowData attribute SFE object. Next, plot results genes highest Moran’s statistic. vertical line plot represents observed Moran’s density represents Moran’s statistic random permutations data. plots suggest Moran’s statistic significant. can plot normalized intensity genes space. genes appear spatial distribution, also seems may overlap cell type. cells appear express genes interest seem spatially restricted known boundaries tissue. moranPlot() function plots spatial data spatially lagged values enables users assess similar observed values neighbors. variable centered, plot divided four quadrants defined horizontal line y = 0 vertical line x = 0. 
Points upper right (high-high) lower left (low-low) quadrants indicate positive spatial association, points lower right (high-low) upper left (low-high) quadrants include observations exhibit negative spatial association.","code":"colGraph(sfe, \"knn10\") <- findSpatialNeighbors( sfe, method = \"knearneigh\", dist_type = \"idw\", k = 10, style = \"W\") #> Warning in (function (to_check, X, clust_centers, clust_info, dtype, nn, : #> detected tied distances to neighbors, see ?'BiocNeighbors-ties' plotColGraph(sfe, colGraphName = \"knn10\", colGeometryName = 'centroids') sfe <- runUnivariate( sfe, type = \"moran.mc\", features = rownames(sfe), exprs_values = \"normalizedIntensity\", colGraphName = \"knn10\", nsim = 100, BPPARAM = MulticoreParam(2)) sfe <- runUnivariate( sfe, type = \"moran.plot\", features = rownames(sfe), exprs_values = \"normalizedIntensity\", colGraphName = \"knn10\") colnames(rowData(sfe)) #> [1] \"mean\" \"var\" #> [3] \"moran.mc_statistic_sample01\" \"moran.mc_parameter_sample01\" #> [5] \"moran.mc_p.value_sample01\" \"moran.mc_alternative_sample01\" #> [7] \"moran.mc_method_sample01\" \"moran.mc_res_sample01\" top_moran <- data.frame(rowData(sfe)) |> arrange(desc(moran.mc_statistic_sample01)) |> head(6) |> rownames() moran <- plotMoranMC(sfe, features = top_moran, facet_by = 'features') moran plotSpatialFeature( sfe, features=top_moran, colGeometryName = \"centroids\", exprs_values = \"normalizedIntensity\", scattermore = TRUE, pointsize = 1) moranPlot(sfe, top_moran[1])"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"differential-expression","dir":"Articles","previous_headings":"","what":"Differential Expression","title":"CODEX exploratory data analysis","text":"Moran’s global spatial autocorrelation metrics provide insight spatial patterns gene expression, necessarily limited structure imposed spatial weights matrix. complementary task might identify spatially variable (SV) genes. 
One method described SpatialDE: identification spatially variable genes. method described manuscript relies Gaussian process regression decomposes variability expression spatial non-spatial components. contrast Moran’s , covariance pair cells modeled function distance . Notably, require explicit specification neighborhood graph, rather parameter controls decay covariance distance increases. spatialDE package implemented R requires normalized matrix input. spatialDE() function package performs normalization steps running algorithm. data already normalized, use run() function directly run spatialDE. first convert centroid coordinates data frame required function. can plot normalized expression top 5 genes space. Perhaps unsurprisingly, expression top DE genes seems highlight spatial distribution known cell types tissue rather identify spatially restricted gene expression. related experimental design targeted genes chosen differentiate cell types. Perhaps genome-wide technologies, potential discovery new gene expression patterns plausible. open question whether results offer new information compared inferred typical DE expression methods. analyses represent minority types inferences can made protein expression data. interested investigate protein expression results compare inform data spatial scRNA-sequencing experiments. Already, work done obtain multimodal spatial measurements sample. Importantly however, considerations made types biases individual technology adds measurements. 
active areas research ripe future exploration.","code":"# Store coordinates in a data frame object coords <- centroids(sfe)$geometry |> purrr::map_dfr(\\(x) c(x = x[1], y = x[2])) # de_res <- spatialDE::run(assay(sfe,\"normalizedIntensity\"), coords, verbose=TRUE) # top_genes <- de_res |> # arrange(pval) |> # slice_head(n=6) |> # pull(g) # # plotSpatialFeature(sfe, top_genes, colGeometryName=\"centroids\", # exprs_values = \"normalizedIntensity\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session Info","title":"CODEX exploratory data analysis","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] BiocParallel_1.36.0 spatialDE_1.8.1 #> [3] spdep_1.3-3 sf_1.0-16 #> [5] spData_2.3.0 patchwork_1.2.0 #> [7] gghighlight_0.4.1 dplyr_1.1.4 #> [9] tidyr_1.3.1 purrr_1.0.2 #> [11] glue_1.7.0 bluster_1.12.0 #> [13] scran_1.30.2 scater_1.30.1 #> [15] ggplot2_3.5.1 scuttle_1.12.0 #> [17] batchelor_1.18.1 SpatialFeatureExperiment_1.3.0 #> [19] SpatialExperiment_1.12.0 SingleCellExperiment_1.24.0 #> [21] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [23] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [25] IRanges_2.36.0 S4Vectors_0.40.2 #> [27] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [29] matrixStats_1.3.0 Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] 
RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] memoise_2.0.1 DelayedMatrixStats_1.24.0 #> [15] RCurl_1.98-1.14 terra_1.7-71 #> [17] htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] BiocNeighbors_1.20.2 Rhdf5lib_1.24.2 #> [21] s2_1.1.6 SparseArray_1.2.4 #> [23] rhdf5_2.46.1 sass_0.4.9 #> [25] KernSmooth_2.23-22 bslib_0.7.0 #> [27] basilisk_1.14.3 htmlwidgets_1.6.4 #> [29] desc_1.4.3 cachem_1.0.8 #> [31] ResidualMatrix_1.12.0 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 Matrix_1.6-5 #> [37] R6_2.5.1 fastmap_1.1.1 #> [39] GenomeInfoDbData_1.2.11 digest_0.6.35 #> [41] colorspace_2.1-0 ggnewscale_0.4.10 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] filelock_1.0.3 fansi_1.0.6 #> [51] mgcv_1.9-1 abind_1.4-5 #> [53] compiler_4.3.3 proxy_0.4-27 #> [55] withr_3.0.0 backports_1.4.1 #> [57] viridis_0.6.5 DBI_1.2.2 #> [59] highr_0.10 HDF5Array_1.30.1 #> [61] MASS_7.3-60.0.1 DelayedArray_0.28.0 #> [63] rjson_0.2.21 classInt_0.4-10 #> [65] tools_4.3.3 units_0.8-5 #> [67] vipor_0.4.7 beeswarm_0.4.0 #> [69] nlme_3.1-164 rhdf5filters_1.14.1 #> [71] grid_4.3.3 checkmate_2.3.1 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] isoband_0.2.7 gtable_0.3.5 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 metapod_1.10.1 #> [81] sp_2.1-4 utf8_1.2.4 #> [83] XVector_0.42.0 ggrepel_0.9.5 #> [85] pillar_1.9.0 limma_3.58.1 #> [87] splines_4.3.3 lattice_0.22-6 #> [89] deldir_2.0-4 tidyselect_1.2.1 #> [91] locfit_1.5-9.9 knitr_1.45 #> [93] gridExtra_2.3 edgeR_4.0.16 #> [95] scattermore_1.2 xfun_0.43 #> [97] statmod_1.5.0 yaml_2.3.8 #> [99] boot_1.3-30 evaluate_0.23 #> [101] codetools_0.2-20 tibble_3.2.1 #> [103] cli_3.6.2 reticulate_1.36.1 #> [105] systemfonts_1.0.6 munsell_0.5.1 #> [107] jquerylib_0.1.4 Rcpp_1.0.12 #> [109] 
dir.expiry_1.10.0 png_0.1-8 #> [111] parallel_4.3.3 pkgdown_2.0.9 #> [113] basilisk.utils_1.14.1 sparseMatrixStats_1.14.0 #> [115] bitops_1.0-7 viridisLite_0.4.2 #> [117] scales_1.3.0 e1071_1.7-14 #> [119] crayon_1.5.2 scico_1.5.0 #> [121] rlang_1.1.3"},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/vig9_splitseq.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"SPLiT-seq basic quality control","text":"data vignette shipped cellatlas repository. count matrix metadata provided cellatlas/examples folder AnnData object. begin loading object converting SpatialFeatureExperiment object.","code":"library(stringr) library(Matrix) library(SpatialExperiment) library(SpatialFeatureExperiment) library(scater) library(scuttle) library(Voyager) if (!file.exists(\"splitseq.rds\")) download.file(\"https://github.com/pachterlab/voyager/raw/documentation-devel/vignettes/splitseq.rds\", destfile = \"splitseq.rds\") sce <- readRDS(\"splitseq.rds\") is_mito <- str_detect(rowData(sce)$gene_name, regex(\"^mt-\", ignore_case=TRUE)) sum(is_mito) #> [1] 37 sce <- addPerCellQCMetrics(sce, subsets = list(mito = is_mito)) names(colData(sce)) #> [1] \"sum\" \"detected\" \"subsets_mito_sum\" #> [4] \"subsets_mito_detected\" \"subsets_mito_percent\" \"total\" plotColData(sce, \"sum\") + plotColData(sce, \"detected\") + plotColData(sce, \"subsets_mito_percent\") #> Warning: Removed 7213 rows containing non-finite outside the scale range #> (`stat_ydensity()`). #> Warning: Removed 7213 rows containing missing values or values outside the scale range #> (`position_quasirandom()`). plotColData(sce, x = \"sum\", y = \"detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
plotColData(sce, x = \"sum\", y = \"subsets_mito_detected\", bins = 100) + scale_fill_distiller(palette = \"Blues\", direction = 1) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. sce <- sce[, which(sce$subsets_mito_percent < 20)] sce <- sce[rowSums(counts(sce)) > 0,] sce #> class: SingleCellExperiment #> dim: 18272 102057 #> metadata(0): #> assays(1): counts #> rownames(18272): ENSMUSG00000086053.2 ENSMUSG00000051285.18 ... #> ENSMUSG00000079808.4 ENSMUSG00000095041.8 #> rowData names(1): gene_name #> colnames(102057): AAACATCGAAACATCGACTTCATC AAACATCGAAACATCGAGTCTTGG ... #> TTCACGCATTCACGCATCATATTC TTCACGCATTCACGCATTCATCGC #> colData names(6): sum detected ... subsets_mito_percent total #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] Voyager_1.4.0 scater_1.30.1 #> [3] ggplot2_3.5.1 scuttle_1.12.0 #> [5] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [7] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [9] Biobase_2.62.0 GenomicRanges_1.54.1 #> [11] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [13] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [15] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [17] Matrix_1.6-5 stringr_1.5.1 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] 
ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 patchwork_1.2.0 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] abind_1.4-5 compiler_4.3.3 #> [51] proxy_0.4-27 withr_3.0.0 #> [53] BiocParallel_1.36.0 viridis_0.6.5 #> [55] DBI_1.2.2 highr_0.10 #> [57] HDF5Array_1.30.1 DelayedArray_0.28.0 #> [59] rjson_0.2.21 classInt_0.4-10 #> [61] bluster_1.12.0 tools_4.3.3 #> [63] units_0.8-5 vipor_0.4.7 #> [65] beeswarm_0.4.0 glue_1.7.0 #> [67] rhdf5filters_1.14.1 grid_4.3.3 #> [69] sf_1.0-16 cluster_2.1.6 #> [71] generics_0.1.3 gtable_0.3.5 #> [73] class_7.3-22 BiocSingular_1.18.0 #> [75] ScaledMatrix_1.10.0 sp_2.1-4 #> [77] utf8_1.2.4 XVector_0.42.0 #> [79] ggrepel_0.9.5 pillar_1.9.0 #> [81] limma_3.58.1 dplyr_1.1.4 #> [83] lattice_0.22-6 deldir_2.0-4 #> [85] tidyselect_1.2.1 locfit_1.5-9.9 #> [87] knitr_1.45 gridExtra_2.3 #> [89] edgeR_4.0.16 xfun_0.43 #> [91] statmod_1.5.0 stringi_1.8.3 #> [93] yaml_2.3.8 boot_1.3-30 #> [95] evaluate_0.23 codetools_0.2-20 #> [97] tibble_3.2.1 cli_3.6.2 #> [99] reticulate_1.36.1 systemfonts_1.0.6 #> [101] munsell_0.5.1 jquerylib_0.1.4 #> [103] Rcpp_1.0.12 png_0.1-8 #> [105] parallel_4.3.3 pkgdown_2.0.9 #> [107] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [109] viridisLite_0.4.2 scales_1.3.0 #> [111] e1071_1.7-14 purrr_1.0.2 #> [113] crayon_1.5.2 
scico_1.5.0 #> [115] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Basic analysis of 10X example Visium dataset","text":"introductory vignette SpatialFeatureExperiment data representation Voyager anlaysis package, demonstrate basic exploratory data analysis (EDA) spatial transcriptomics data. Basic knowledge R SingleCellExperiment assumed. vignette showcases packages Visium spatial gene expression system dataset, downloaded 10X website, Space Ranger output format. technology chosen due popularity, therefore availability numerous publicly available datasets analysis (Moses Pachter 2022). Voyager developed goal facilitating use geospatial methods spatial genomics, introductory vignette restricted non-spatial scRNA-seq EDA Visium dataset. another Visium introductory vignette using dataset SFEData package 10X website. load packages used vignette. download data 10X website. unfiltered gene count matrix: spatial information: Decompress downloaded content: outs directory Space Ranger output looks like: gene count matrix directory: spatial directory: outputs spatial directory explained 10X website. tissue_hires_image.png relatively high resolution image tissue, full resolution. tissue_lowres_image.png file low resolution image tissue, suitable quick plotting, shown : array dots framing tissue seen image fiducials, used align tissue image positions Visium spots, gene expression can matched spatial locations. alignment fiducials shown aligned_fiducials.jpg. Space Ranger can automatically detect spots tissue, spots highlighted detected_tissue_image.jpg. Inside scalefactors_json.json file: spot_diameter_fullres diameter Visium spot full resolution H&E image pixels. tissue_hires_scalef tissue_lowres_scalef ratio size high resolution (full resolution) low resolution H&E image full resolution image. 
fiducial_diameter_fullres diameter fiducial spot used align spots H&E image pixels full resolution image. tissue_positions_list.csv file contains information coordinates spots full resolution image whether spot tissue (in_tissue, 1 means yes 0 means ) automatically detected Space Ranger manually annotated Loupe browser. spatial_enrichment.csv file Moran’s (presumably spots tissue) p-value gene detected least 10 spots least 20 UMIs. read Space Ranger output R SFE object:","code":"library(Voyager) library(SpatialExperiment) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(ggplot2) library(scater) library(scuttle) library(scran) library(stringr) library(patchwork) library(bluster) library(rjson) theme_set(theme_bw()) if (!file.exists(\"visium_ob.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_raw_feature_bc_matrix.tar.gz\", destfile = \"visium_ob.tar.gz\") if (!file.exists(\"visium_ob_spatial.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_spatial.tar.gz\", destfile = \"visium_ob_spatial.tar.gz\") if (!dir.exists(\"outs\")) { dir.create(\"outs\") system(\"tar -xvf visium_ob.tar.gz -C outs\") system(\"tar -xvf visium_ob_spatial.tar.gz -C outs\") } list.dirs(\"outs\") #> [1] \"outs\" \"outs/raw_feature_bc_matrix\" #> [3] \"outs/spatial\" list.files(\"outs/raw_feature_bc_matrix\") #> [1] \"barcodes.tsv.gz\" \"features.tsv.gz\" \"matrix.mtx.gz\" list.files(\"outs/spatial\") #> [1] \"aligned_fiducials.jpg\" \"detected_tissue_image.jpg\" #> [3] \"scalefactors_json.json\" \"spatial_enrichment.csv\" #> [5] \"tissue_hires_image.png\" \"tissue_lowres_image.png\" #> [7] \"tissue_positions.csv\" fromJSON(file = \"outs/spatial/scalefactors_json.json\") #> $tissue_hires_scalef #> [1] 0.2 #> #> $tissue_lowres_scalef #> [1] 0.06 #> #> $fiducial_diameter_fullres #> [1] 118.9155 #> #> 
$spot_diameter_fullres #> [1] 73.61433 head(read.csv(\"outs/spatial/tissue_positions.csv\")) #> barcode in_tissue array_row array_col pxl_row_in_fullres #> 1 ACGCCTGACACGCGCT-1 0 0 0 8668 #> 2 TACCGATCCAACACTT-1 0 1 1 8611 #> 3 ATTAAAGCGGACGAGC-1 0 0 2 8554 #> 4 GATAAGGGACGATTAG-1 0 1 3 8498 #> 5 GTGCAAATCACCAATA-1 0 0 4 8441 #> 6 TGTTGGCTGGCGGAAG-1 0 1 5 8384 #> pxl_col_in_fullres #> 1 1102 #> 2 1200 #> 3 1102 #> 4 1200 #> 5 1102 #> 6 1200 head(read.csv(\"outs/spatial/spatial_enrichment.csv\")) #> Feature.ID Feature.Name Feature.Type I P.value #> 1 ENSMUSG00000001023 S100a5 Gene Expression 0.7709048 0 #> 2 ENSMUSG00000019874 Fabp7 Gene Expression 0.6987346 0 #> 3 ENSMUSG00000002985 Apoe Gene Expression 0.6945210 0 #> 4 ENSMUSG00000025739 Gng13 Gene Expression 0.6585750 0 #> 5 ENSMUSG00000090223 Pcp4 Gene Expression 0.6317032 0 #> 6 ENSMUSG00000053310 Nrgn Gene Expression 0.6033600 0 #> Adjusted.p.value Feature.Counts.in.Spots.Under.Tissue #> 1 0 9019 #> 2 0 13462 #> 3 0 67509 #> 4 0 5260 #> 5 0 45118 #> 6 0 10723 #> Median.Normalized.Average.Counts Barcodes.Detected.per.Feature #> 1 15.848669 1021 #> 2 20.679932 1170 #> 3 76.635169 1184 #> 4 8.803694 1050 #> 5 25.811125 1133 #> 6 6.075966 898 (sfe <- read10xVisiumSFE(samples = \".\", type = \"sparse\", data = \"raw\")) #> class: SpatialFeatureExperiment #> dim: 32285 4992 #> metadata(0): #> assays(1): counts #> rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... #> ENSMUSG00000095019 ENSMUSG00000095041 #> rowData names(8): symbol Feature.Type ... #> Median.Normalized.Average.Counts_sample01 #> Barcodes.Detected.per.Feature_sample01 #> colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... 
#> TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> sample01:"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"quality-control-qc","dir":"Articles","previous_headings":"","what":"Quality control (QC)","title":"Basic analysis of 10X example Visium dataset","text":"mouse olfactory bulb conventionally plotted horizontally. entire SFE object can transposed histologial space make olfactory bulb horizontal. Percentage mitochondrial counts spots outside tissue higher near tissue, especially left. 3 peaks, apparently histologically relevant. Also obvious outliers. unlike scRNA-seq data. Spots tissue wide range mitocondrial percentage. 
Spots tissue fall 3 clusters plot, seemingly related histological regions.","code":"is_mt <- str_detect(rowData(sfe)$symbol, \"^mt-\") sfe <- addPerCellQCMetrics(sfe, subsets = list(mito = is_mt)) names(colData(sfe)) #> [1] \"in_tissue\" \"array_row\" \"array_col\" #> [4] \"sample_id\" \"sum\" \"detected\" #> [7] \"subsets_mito_sum\" \"subsets_mito_detected\" \"subsets_mito_percent\" #> [10] \"total\" sfe <- SpatialFeatureExperiment::transpose(sfe) plotSpatialFeature(sfe, c(\"sum\", \"detected\", \"subsets_mito_percent\"), image_id = \"lowres\", maxcell = 5e4, ncol = 2) plotColData(sfe, \"sum\", x = \"in_tissue\", color_by = \"in_tissue\") + plotColData(sfe, \"detected\", x = \"in_tissue\", color_by = \"in_tissue\") + plotColData(sfe, \"subsets_mito_percent\", x = \"in_tissue\", color_by = \"in_tissue\") + plot_layout(guides = \"collect\") plotColData(sfe, x = \"sum\", y = \"subsets_mito_percent\", color_by = \"in_tissue\") + geom_density_2d() sfe_tissue <- sfe[,sfe$in_tissue] plotColData(sfe_tissue, x = \"sum\", y = \"detected\", bins = 75) #clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue, lowess = FALSE) hvgs <- getTopHVGs(dec, n = 2000)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"dimension-reduction-and-clustering","dir":"Articles","previous_headings":"","what":"Dimension reduction and clustering","title":"Basic analysis of 10X example Visium dataset","text":"clustering show dimension reduction plots Significant markers cluster can obtained follows: genes interesting view spatial context: spatial analyses dataset performed “advanced” version vignette.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat ElbowPlot(sfe_tissue, ndims = 30) names(rowData(sfe_tissue)) #> [1] \"symbol\" #> 
[2] \"Feature.Type\" #> [3] \"I_sample01\" #> [4] \"P.value_sample01\" #> [5] \"Adjusted.p.value_sample01\" #> [6] \"Feature.Counts.in.Spots.Under.Tissue_sample01\" #> [7] \"Median.Normalized.Average.Counts_sample01\" #> [8] \"Barcodes.Detected.per.Feature_sample01\" plotDimLoadings(sfe_tissue, dims = 1:5, swap_rownames = \"symbol\", ncol = 3) set.seed(29) colData(sfe_tissue)$cluster <- clusterRows(reducedDim(sfe_tissue, \"PCA\")[,1:3], BLUSPARAM = SNNGraphParam( cluster.fun = \"leiden\", cluster.args = list( resolution_parameter = 0.5, objective_function = \"modularity\"))) plotPCA(sfe_tissue, ncomponents = 5, colour_by = \"cluster\") plotSpatialFeature(sfe_tissue, features = \"cluster\", colGeometryName = \"spotPoly\", image_id = \"lowres\") spatialReducedDim(sfe_tissue, \"PCA\", ncomponents = 5, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, ncol = 2, image_id = \"lowres\", maxcell = 5e4) markers <- findMarkers(sfe_tissue, groups = colData(sfe_tissue)$cluster, test.type = \"wilcox\", pval.type = \"all\", direction = \"up\") genes_use <- vapply(markers, function(x) rownames(x)[1], FUN.VALUE = character(1)) plotExpression(sfe_tissue, rowData(sfe_tissue)[genes_use, \"symbol\"], x = \"cluster\", colour_by = \"cluster\", swap_rownames = \"symbol\") plotSpatialFeature(sfe_tissue, genes_use, colGeometryName = \"spotPoly\", ncol = 2, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Basic analysis of 10X example Visium dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: 
/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] rjson_0.2.21 bluster_1.12.0 #> [3] patchwork_1.2.0 stringr_1.5.1 #> [5] scran_1.30.2 scater_1.30.1 #> [7] scuttle_1.12.0 ggplot2_3.5.1 #> [9] SpatialFeatureExperiment_1.3.0 SpatialExperiment_1.12.0 #> [11] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 #> [13] Biobase_2.62.0 GenomicRanges_1.54.1 #> [15] GenomeInfoDb_1.38.8 IRanges_2.36.0 #> [17] S4Vectors_0.40.2 BiocGenerics_0.48.1 #> [19] MatrixGenerics_1.14.0 matrixStats_1.3.0 #> [21] Voyager_1.4.0 #> #> loaded via a namespace (and not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] terra_1.7-71 htmltools_0.5.8.1 #> [19] S4Arrays_1.2.1 BiocNeighbors_1.20.2 #> [21] Rhdf5lib_1.24.2 s2_1.1.6 #> [23] SparseArray_1.2.4 rhdf5_2.46.1 #> [25] sass_0.4.9 spData_2.3.0 #> [27] KernSmooth_2.23-22 bslib_0.7.0 #> [29] htmlwidgets_1.6.4 desc_1.4.3 #> [31] cachem_1.0.8 igraph_2.0.3 #> [33] lifecycle_1.0.4 pkgconfig_2.0.3 #> [35] rsvd_1.0.5 Matrix_1.6-5 #> [37] R6_2.5.1 fastmap_1.1.1 #> [39] GenomeInfoDbData_1.2.11 digest_0.6.35 #> [41] colorspace_2.1-0 ggnewscale_0.4.10 #> [43] dqrng_0.3.2 RSpectra_0.16-1 #> [45] irlba_2.3.5.1 textshaping_0.3.7 #> [47] beachmat_2.18.1 labeling_0.4.3 #> [49] fansi_1.0.6 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 BiocParallel_1.36.0 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 R.utils_2.12.3 #> [59] HDF5Array_1.30.1 
MASS_7.3-60.0.1 #> [61] DelayedArray_0.28.0 classInt_0.4-10 #> [63] tools_4.3.3 units_0.8-5 #> [65] vipor_0.4.7 beeswarm_0.4.0 #> [67] R.oo_1.26.0 glue_1.7.0 #> [69] rhdf5filters_1.14.1 grid_4.3.3 #> [71] sf_1.0-16 cluster_2.1.6 #> [73] generics_0.1.3 isoband_0.2.7 #> [75] gtable_0.3.5 R.methodsS3_1.8.2 #> [77] class_7.3-22 BiocSingular_1.18.0 #> [79] ScaledMatrix_1.10.0 metapod_1.10.1 #> [81] sp_2.1-4 utf8_1.2.4 #> [83] XVector_0.42.0 ggrepel_0.9.5 #> [85] pillar_1.9.0 limma_3.58.1 #> [87] dplyr_1.1.4 lattice_0.22-6 #> [89] deldir_2.0-4 tidyselect_1.2.1 #> [91] locfit_1.5-9.9 knitr_1.45 #> [93] gridExtra_2.3 edgeR_4.0.16 #> [95] xfun_0.43 statmod_1.5.0 #> [97] DropletUtils_1.22.0 stringi_1.8.3 #> [99] yaml_2.3.8 boot_1.3-30 #> [101] evaluate_0.23 codetools_0.2-20 #> [103] tibble_3.2.1 cli_3.6.2 #> [105] systemfonts_1.0.6 munsell_0.5.1 #> [107] jquerylib_0.1.4 Rcpp_1.0.12 #> [109] parallel_4.3.3 pkgdown_2.0.9 #> [111] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [113] viridisLite_0.4.2 scales_1.3.0 #> [115] e1071_1.7-14 purrr_1.0.2 #> [117] crayon_1.5.2 scico_1.5.0 #> [119] rlang_1.1.3 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Spatial analysis of 10X example Visium dataset","text":"introductory vignette, performed basic non-spatial analyses mouse olfactory bulb Visium dataset 10X website. vignette, perform spatial analyses histological space well gene expression space. load packages used vignette: download data 10X website. unfiltered gene count matrix: spatial information: Decompress downloaded content: Contents outs directory Space Ranger explained introductory vignette. read data R SFE object. 
add QC metrics, already plotted introductory vignette.","code":"library(Voyager) library(SpatialFeatureExperiment) library(SingleCellExperiment) library(ggplot2) library(scater) library(scuttle) library(scran) library(stringr) library(patchwork) library(bluster) library(rjson) library(EBImage) library(terra) library(rlang) library(sf) library(rmapshaper) library(dplyr) library(BiocParallel) library(BiocNeighbors) library(reticulate) theme_set(theme_bw()) # Specify Python version to use gget PY_PATH <- system(\"which python\", intern = TRUE) use_python(PY_PATH) py_config() #> python: /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python #> libpython: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/libpython3.10.dylib #> pythonhome: /Users/runner/hostedtoolcache/Python/3.10.14/x64:/Users/runner/hostedtoolcache/Python/3.10.14/x64 #> version: 3.10.14 (main, Mar 20 2024, 15:17:04) [Clang 13.0.0 (clang-1300.0.29.30)] #> numpy: /Users/runner/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/numpy #> numpy_version: 1.26.4 #> #> NOTE: Python version was forced by use_python() function gget <- import(\"gget\") if (!file.exists(\"visium_ob.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_raw_feature_bc_matrix.tar.gz\", destfile = \"visium_ob.tar.gz\") if (!file.exists(\"visium_ob_spatial.tar.gz\")) download.file(\"https://cf.10xgenomics.com/samples/spatial-exp/2.0.0/Visium_Mouse_Olfactory_Bulb/Visium_Mouse_Olfactory_Bulb_spatial.tar.gz\", destfile = \"visium_ob_spatial.tar.gz\") if (!dir.exists(\"outs\")) { dir.create(\"outs\") system(\"tar -xvf visium_ob.tar.gz -C outs\") system(\"tar -xvf visium_ob_spatial.tar.gz -C outs\") } (sfe <- read10xVisiumSFE(samples = \".\", type = \"sparse\", data = \"raw\")) #> class: SpatialFeatureExperiment #> dim: 32285 4992 #> metadata(0): #> assays(1): counts #> rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ... 
#> ENSMUSG00000095019 ENSMUSG00000095041 #> rowData names(8): symbol Feature.Type ... #> Median.Normalized.Average.Counts_sample01 #> Barcodes.Detected.per.Feature_sample01 #> colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ... #> TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1 #> colData names(4): in_tissue array_row array_col sample_id #> reducedDimNames(0): #> mainExpName: NULL #> altExpNames(0): #> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres #> imgData names(4): sample_id image_id data scaleFactor #> #> unit: full_res_image_pixel #> Geometries: #> colGeometries: spotPoly (POLYGON) #> #> Graphs: #> sample01: is_mt <- str_detect(rowData(sfe)$symbol, \"^mt-\") sfe <- addPerCellQCMetrics(sfe, subsets = list(mito = is_mt))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"tissue-segmentation","dir":"Articles","previous_headings":"","what":"Tissue segmentation","title":"Spatial analysis of 10X example Visium dataset","text":"Space Ranger can automatically detect spots tissue Loupe browser can used manually annotate spots tissue, may interesting get tissue outline polygon, know much spot overlaps tissue plot outline. tissue boundary polygon can manually annotated QuPath, saves polygon GeoJSON can directly read R st_read(). can segment tissue computationally. R generally isn’t great image processing, packages can perform segmentation, EBImage, based house C C++ code, imager, based CImg. don’t full resolution image. perform tissue segmentation high resolution downsampled image scale make coordinates tissue boundary match spots. EBImage package used . Compared OpenCV, EBImage slow full resolution image, fine downsized image. rendered static webpage, image static, run interactively, image shown interactive widget can zoom pan. show RGB channels separately tissue can discerned thresholding. tall peak right background. much lower peaks around 0.6 0.85 must tissue. 
capture faint bluish region, blue channel used thresholding. threshold chosen based histogram experimenting nearby values. use opening operation (erosion followed dilation) denoise small holes tissue, can removed closing operation (dilation followed erosion): larger holes tissue mask, may real holes faint regions nuclei missed thresholding. might large enough affect Visium spots intersect tissue. Now main piece tissue clear. must object largest area. However, two small pieces belong tissue top left. debris fiducials can removed setting pixels mask outside bounding box main piece 0. assign different value contiguous object bwlabel(), use computeFeatures.shape() find area among shape features (e.g. perimeter) object. remove small pieces debris. Object number 797 piece debris bottom left. pieces area 100 pixels tissue. Since debris really small bits tissue, boundary debris tissue can blurry. two distinguished morphology H&E image proximity main tissue. remove debris mask Since holes mask faint regions tissue missed thresholding, holes filled segmentation process took lot manual oversight, choosing threshold, choosing kernel size shape opening closing operations, deciding whether fill holes, deciding debris tissue.","code":"img <- readImage(\"outs/spatial/tissue_hires_image.png\") display(img) img2 <- img colorMode(img2) <- Grayscale display(img2, all = TRUE) hist(img) mask <- img2[,,3] < 0.87 display(mask) kern <- makeBrush(3, shape='disc') mask_open <- opening(mask, kern) display(mask_open) mask_close <- closing(mask_open, kern) display(mask_close) mask_label <- bwlabel(mask_close) fts <- computeFeatures.shape(mask_label) head(fts) #> s.area s.perimeter s.radius.mean s.radius.sd s.radius.min s.radius.max #> 1 39 25 3.428773 1.3542219 1.4176036 5.762777 #> 2 20 14 2.032665 0.3439068 1.5000000 2.500000 #> 3 8 8 1.144123 0.4370160 0.7071068 1.581139 #> 4 14 10 1.689175 0.2160726 1.5811388 2.121320 #> 5 15 12 1.716761 0.4684015 1.0000000 2.236068 #> 6 9 8 1.207107 
0.2071068 1.0000000 1.414214 summary(fts[,\"s.area\"]) #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 8.0 14.0 51.0 595.9 345.0 496326.0 max_ind <- which.max(fts[,\"s.area\"]) inds <- which(as.array(mask_label) == max_ind, arr.ind = TRUE) head(inds) #> row col #> [1,] 1168 562 #> [2,] 1169 562 #> [3,] 1170 562 #> [4,] 1158 563 #> [5,] 1159 563 #> [6,] 1160 563 row_inds <- c(seq_len(min(inds[,1])-1), seq(max(inds[,1])+1, nrow(mask_label), by = 1)) col_inds <- c(seq_len(min(inds[,2])-1), seq(max(inds[,2])+1, nrow(mask_label), by = 1)) mask_label[row_inds, ] <- 0 mask_label[,col_inds] <- 0 display(mask_label) unique(as.vector(mask_label)) #> [1] 0 421 425 429 430 438 450 458 461 469 473 483 487 505 523 633 640 642 651 #> [20] 678 739 741 757 762 775 778 789 791 797 805 810 813 820 821 822 826 831 838 #> [39] 839 840 843 845 848 849 861 862 863 fts2 <- fts[unique(as.vector(mask_label))[-1],] fts2 <- fts2[order(fts2[,\"s.area\"], decreasing = TRUE),] plot(fts2[,1][-1], type = \"l\", ylab = \"Area\") head(fts2, 10) #> s.area s.perimeter s.radius.mean s.radius.sd s.radius.min s.radius.max #> 421 496326 3151 395.118732 68.6493949 234.1605637 485.715835 #> 450 217 55 7.840627 1.2883202 5.0010247 10.458197 #> 849 211 63 7.961248 2.0228566 3.8569142 12.189753 #> 797 182 56 7.547362 2.3368359 3.0772370 11.839919 #> 461 136 54 7.186020 3.2751628 0.9255555 12.479805 #> 741 92 56 6.365661 2.8382829 1.3273276 11.653219 #> 840 69 33 4.503526 1.6417370 1.6026264 7.076974 #> 862 63 37 4.854424 2.4445530 0.6361407 8.837838 #> 839 45 25 3.305562 0.7074306 1.9320455 4.526897 #> 775 32 22 2.887407 1.1755375 0.5300865 4.543636 #display(mask_label == 797) mask_label[mask_label %in% c(797, as.numeric(rownames(fts2)[fts2[,1] < 100]))] <- 0 mask_label <- fillHull(mask_label) display(paintObjects(mask_label, img, col=c(\"red\", \"yellow\"), opac=c(1, 
0.3)))"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"convert-tissue-mask-to-polygon","dir":"Articles","previous_headings":"","what":"Convert tissue mask to polygon","title":"Spatial analysis of 10X example Visium dataset","text":"Now tissue mask, convert polygon. OpenCV can directly perform conversion, isn’t comprehensive R wrapper OpenCV, conversion convoluted R. first convert Image object raster implemented terra, core R package geospatial raster data. terra can convert raster polygon. image downsized, polygon look quite pixelated. mitigate pixelation save memory, ms_simplify() function used simplify polygon, keeping small proportion vertices. st_simplify() function sf can also simplify polygons, can’t specify proportion vertices keep. adding geometry SFE object, needs scaled match coordinates spots mouse olfactory bulb conventionally plotted horizontally. entire SFE object can transposed histologial space make olfactory bulb horizontal. can use geometric operations find spots intersect tissue, spots covered tissue, much spot intersects tissue. Discrepancies Space Ranger’s annotation annotation based tissue segmentation : Spots margin can intersect tissue without covered . can also get geometries intersections tissue Visium spots, calculate percentage spot tissue. However, percentage may useful tissue segmentation subject error. percentage may useful pathologist annotated histological regions objects nuclei myofibers. spots intersect tissue, total counts relate percentage spot tissue? 
Spots fully covered tissue lower total UMI counts, can due fully tissue cell types lower total counts histological region near edge, spots fully covered tissue also low UMI counts.","code":"raster2polygon <- function(seg, keep = 0.2) { r <- rast(as.array(seg), extent = ext(0, nrow(seg), 0, ncol(seg))) |> trans() |> flip() r[r < 1] <- NA contours <- st_as_sf(as.polygons(r, dissolve = TRUE)) simplified <- ms_simplify(contours, keep = keep) list(full = contours, simplified = simplified) } tb <- raster2polygon(mask_label) scale_factors <- fromJSON(file = \"outs/spatial/scalefactors_json.json\") tb$simplified$geometry <- tb$simplified$geometry / scale_factors$tissue_hires_scalef tissueBoundary(sfe) <- tb$simplified plotSpatialFeature(sfe, \"sum\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, color = \"black\"), image_id = \"lowres\") + theme_void() sfe <- SpatialFeatureExperiment::transpose(sfe) plotSpatialFeature(sfe, \"sum\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, color = \"black\"), image_id = \"lowres\") # Which spots intersect tissue sfe$int_tissue <- annotPred(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", pred = st_intersects) sfe$cov_tissue <- annotPred(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", pred = st_covered_by) sfe$diff_sr <- case_when(sfe$in_tissue == sfe$int_tissue ~ \"same\", sfe$in_tissue & !sfe$int_tissue ~ \"Space Ranger\", sfe$int_tissue & !sfe$in_tissue ~ \"segmentation\") |> factor(levels = c(\"Space Ranger\", \"same\", \"segmentation\")) plotSpatialFeature(sfe, \"diff_sr\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, size = 0.5, color = \"black\")) + scale_fill_brewer(type = \"div\", palette = 4) #> Scale for fill is already present. #> Adding another scale for fill, which will replace the existing scale. 
sfe$diff_int_cov <- sfe$int_tissue != sfe$cov_tissue plotSpatialFeature(sfe, \"diff_int_cov\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(fill = NA, size = 0.5, color = \"black\")) spot_ints <- annotOp(sfe, colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", op = st_intersection) sfe$pct_tissue <- st_area(spot_ints) / st_area(spotPoly(sfe)) * 100 sfe_tissue <- sfe[,sfe$int_tissue] plotColData(sfe_tissue, x = \"pct_tissue\", y = \"sum\", color_by = \"diff_int_cov\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"spatial-autocorrelation-of-qc-metrics","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of QC metrics","title":"Spatial analysis of 10X example Visium dataset","text":"","code":"colGraph(sfe_tissue, \"visium\") <- findVisiumGraph(sfe_tissue) qc_features <- c(\"sum\", \"detected\", \"subsets_mito_percent\") sfe_tissue <- colDataUnivariate(sfe_tissue, \"moran.mc\", qc_features, nsim = 200) plotMoranMC(sfe_tissue, qc_features) sfe_tissue <- colDataUnivariate(sfe_tissue, \"sp.correlogram\", qc_features, order = 8) plotCorrelogram(sfe_tissue, qc_features) sfe_tissue <- colDataUnivariate(sfe_tissue, \"localmoran\", qc_features) plotLocalResult(sfe_tissue, \"localmoran\", qc_features, ncol = 2, colGeometryName = \"spotPoly\", divergent = TRUE, diverge_center = 0, image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- colDataUnivariate(sfe_tissue, \"LOSH\", qc_features) plotLocalResult(sfe_tissue, \"LOSH\", qc_features, ncol = 2, colGeometryName = \"spotPoly\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- colDataUnivariate(sfe_tissue, \"moran.plot\", qc_features) moranPlot(sfe_tissue, \"subsets_mito_percent\")"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"spatial-autocorrelation-of-gene-expression","dir":"Articles","previous_headings":"","what":"Spatial autocorrelation of gene expression","title":"Spatial analysis 
of 10X example Visium dataset","text":"Normalize data scran method, find highly variable genes Find Moran’s highly variable genes: vast majority genes positive Moran’s . ’ll find genes highest Moran’s : can use gget info module gget package get additional information genes, descriptions, synonyms, transcripts collection reference databases including Ensembl, UniProt NCBI , showing gene descriptions NCBI: Plot genes highest Moran’s : global Moran’s seems tissue structure. genes negative Moran’s might statistically significant: 2000 highly variable genes 2000 tests, longer significant correcting multiple testing. global Moran’s relate gene expression level? Genes highly expressed overall tend higher Moran’s .","code":"#clusters <- quickCluster(sfe_tissue) #sfe_tissue <- computeSumFactors(sfe_tissue, clusters=clusters) #sfe_tissue <- sfe_tissue[, sizeFactors(sfe_tissue) > 0] sfe_tissue <- logNormCounts(sfe_tissue) dec <- modelGeneVar(sfe_tissue) hvgs <- getTopHVGs(dec, n = 2000) sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, BPPARAM = MulticoreParam(2)) plotRowDataHistogram(sfe_tissue, \"moran_sample01\") #> Warning: Removed 30285 rows containing non-finite outside the scale range #> (`stat_bin()`). top_moran <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_sample01, decreasing = TRUE)[1:9]] gget_info <- gget$info(top_moran) rownames(gget_info) <- gget_info$primary_gene_name select(gget_info, ncbi_description) #> ncbi_description #> S100a5 Predicted to enable calcium-dependent protein binding activity; metal ion binding activity; and protein homodimerization activity. Located in neuronal cell body. Orthologous to human S100A5 (S100 calcium binding protein A5). [provided by Alliance of Genome Resources, Apr 2022] #> Fabp7 Predicted to enable fatty acid binding activity. Acts upstream of or within cell proliferation in forebrain; neurogenesis; and prepulse inhibition. 
Located in several cellular components, including cell projection; cell-cell junction; and neuronal cell body. Is expressed in several structures, including central nervous system; cranial nerve; gut; peripheral nervous system; and retina. Orthologous to human FABP7 (fatty acid binding protein 7). [provided by Alliance of Genome Resources, Apr 2022] #> Apoe This gene encodes a member of the apolipoprotein A1/A4/E family of proteins. This protein is involved in the transport of lipoproteins in the blood. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. Homozygous knockout mice for this gene accumulate high levels of cholesterol in the blood and develop atherosclerosis. Different alleles of this gene have been associated with either increased risk or a protective effect for Alzheimer's disease in human patients. This gene maps to chromosome 7 in a cluster with the related apolipoprotein C1, C2 and C4 genes. [provided by RefSeq, Apr 2015] #> Gng13 Predicted to enable G-protein beta-subunit binding activity. Acts upstream of or within phospholipase C-activating G protein-coupled receptor signaling pathway and sensory perception of taste. Located in dendrite. Part of heterotrimeric G-protein complex. Is expressed in brain; gonad; gut; and liver. Orthologous to human GNG13 (G protein subunit gamma 13). [provided by Alliance of Genome Resources, Apr 2022] #> Pcp4 Enables calmodulin binding activity. Predicted to be involved in several processes, including calmodulin dependent kinase signaling pathway; negative regulation of protein kinase activity; and positive regulation of dopamine secretion. Predicted to be located in axon and neurofilament. Predicted to be part of protein-containing complex. Predicted to be active in cytoplasm. 
Is expressed in several structures, including alimentary system; central nervous system; genitourinary system; peripheral nervous system; and sensory organ. Orthologous to human PCP4 (Purkinje cell protein 4). [provided by Alliance of Genome Resources, Apr 2022] #> Mtco2 Predicted to enable copper ion binding activity and oxidoreductase activity. Predicted to contribute to cytochrome-c oxidase activity. Predicted to be involved in ATP synthesis coupled electron transport; positive regulation of cellular biosynthetic process; and positive regulation of necrotic cell death. Located in mitochondrion. Is expressed in embryo; epiblast; heart; liver; and metanephros. Orthologous to several human genes including MTCO2P12 (MT-CO2 pseudogene 12). [provided by Alliance of Genome Resources, Apr 2022] #> Ptgds Enables prostaglandin-D synthase activity and retinoid binding activity. Involved in prostaglandin biosynthetic process and regulation of circadian sleep/wake cycle, sleep. Acts upstream of or within negative regulation of male germ cell proliferation. Located in extracellular region. Is expressed in several structures, including alimentary system; genitourinary system; integumental system; nervous system; and sensory organ. Human ortholog(s) of this gene implicated in carotid artery disease. Orthologous to human PTGDS (prostaglandin D2 synthase). [provided by Alliance of Genome Resources, Apr 2022] #> Mtnd4 Predicted to enable NADH dehydrogenase (ubiquinone) activity and ubiquinone binding activity. Predicted to contribute to NADH dehydrogenase activity. Predicted to be involved in several processes, including electron transport coupled proton transport; mitochondrial electron transport, NADH to ubiquinone; and mitochondrial respiratory chain complex I assembly. Located in mitochondrion. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; Parkinson's disease; macular degeneration; and schizophrenia. 
Orthologous to human MT-ND4 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 4). [provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, top_moran, ncol = 3, image_id = \"lowres\", maxcell = 5e4, swap_rownames = \"symbol\") neg_moran <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_sample01, decreasing = FALSE)[1:9]] # Display NCBI descriptions for these genes gget_info_neg <- gget$info(neg_moran) rownames(gget_info_neg) <- gget_info_neg$primary_gene_name select(gget_info_neg, ncbi_description) #> ncbi_description #> Hibch Predicted to enable 3-hydroxyisobutyryl-CoA hydrolase activity. Predicted to be involved in valine catabolic process. Predicted to act upstream of or within branched-chain amino acid catabolic process. Located in mitochondrion. Orthologous to human HIBCH (3-hydroxyisobutyryl-CoA hydrolase). [provided by Alliance of Genome Resources, Apr 2022] #> Syngr2 Predicted to be involved in regulated exocytosis and synaptic vesicle membrane organization. Predicted to act upstream of or within protein targeting. Located in synaptic vesicle. Is expressed in several structures, including genitourinary system; heart; liver; lung; and spleen. Orthologous to human SYNGR2 (synaptogyrin 2). [provided by Alliance of Genome Resources, Apr 2022] #> Entpd5 Enables guanosine-diphosphatase activity and uridine-diphosphatase activity. Involved in several processes, including positive regulation of glycolytic process; protein N-linked glycosylation; and regulation of phosphatidylinositol 3-kinase signaling. Located in endoplasmic reticulum. Is expressed in several structures, including genitourinary system; gut; hemolymphoid system gland; liver; and nose. Orthologous to human ENTPD5 (ectonucleoside triphosphate diphosphohydrolase 5 (inactive)). [provided by Alliance of Genome Resources, Apr 2022] #> Fyco1 Predicted to enable metal ion binding activity. 
Predicted to be involved in plus-end-directed vesicle transport along microtubule and positive regulation of autophagosome maturation. Predicted to be located in Golgi apparatus. Predicted to be active in autophagosome; late endosome; and lysosome. Is expressed in central nervous system; early conceptus; and retina. Human ortholog(s) of this gene implicated in cataract 18. Orthologous to human FYCO1 (FYVE and coiled-coil domain autophagy adaptor 1). [provided by Alliance of Genome Resources, Apr 2022] #> Ptpn18 Enables non-membrane spanning protein tyrosine phosphatase activity. Acts upstream of or within blastocyst formation. Located in cytoplasm and nucleus. Is expressed in several structures, including alimentary system; brain; genitourinary system; immune system; and liver and biliary system. Orthologous to human PTPN18 (protein tyrosine phosphatase non-receptor type 18). [provided by Alliance of Genome Resources, Apr 2022] #> Cbl Enables SH3 domain binding activity and ephrin receptor binding activity. Involved in regulation of platelet-derived growth factor receptor-alpha signaling pathway. Acts upstream of or within regulation of Rap protein signal transduction. Located in Golgi apparatus and cilium. Part of flotillin complex. Is expressed in male reproductive system and urinary system. Human ortholog(s) of this gene implicated in acute myeloid leukemia; juvenile myelomonocytic leukemia; lung non-small cell carcinoma; and myeloid neoplasm. Orthologous to human CBL (Cbl proto-oncogene). [provided by Alliance of Genome Resources, Apr 2022] #> Dusp18 Predicted to enable protein tyrosine phosphatase activity and protein tyrosine/serine/threonine phosphatase activity. Predicted to be involved in peptidyl-threonine dephosphorylation and peptidyl-tyrosine dephosphorylation. Predicted to act upstream of or within protein targeting to membrane; protein targeting to mitochondrion; and response to antibiotic. 
Predicted to be located in mitochondrial inner membrane; mitochondrial intermembrane space; and nucleoplasm. Predicted to be extrinsic component of mitochondrial inner membrane and intrinsic component of mitochondrial inner membrane. Is expressed in central nervous system; dorsal root ganglion; olfactory epithelium; and retina. Orthologous to human DUSP18 (dual specificity phosphatase 18). [provided by Alliance of Genome Resources, Apr 2022] #> Tsc22d3 Enables MRF binding activity. Acts upstream of or within several processes, including negative regulation of activation-induced cell death of T cells; negative regulation of skeletal muscle tissue development; and negative regulation of transcription by RNA polymerase II. Located in cytoplasm and nucleus. Is expressed in several structures, including early conceptus; genitourinary system; nervous system; sensory organ; and viscerocranium. Orthologous to human TSC22D3 (TSC22 domain family member 3). [provided by Alliance of Genome Resources, Apr 2022] #> Plekhg2 Predicted to enable guanyl-nucleotide exchange factor activity. Predicted to be involved in regulation of actin filament polymerization. Orthologous to human PLEKHG2 (pleckstrin homology and RhoGEF domain containing G2). 
[provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, neg_moran, ncol = 3, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4) sfe_tissue <- runUnivariate(sfe_tissue, \"moran.mc\", neg_moran, colGraphName = \"visium\", nsim = 200, alternative = \"less\") plotMoranMC(sfe_tissue, neg_moran, swap_rownames = \"symbol\") rowData(sfe_tissue)[neg_moran, c(\"moran_sample01\", \"moran.mc_p.value_sample01\")] #> DataFrame with 9 rows and 2 columns #> moran_sample01 moran.mc_p.value_sample01 #> #> ENSMUSG00000041426 -0.0531915 0.00497512 #> ENSMUSG00000048277 -0.0451179 0.00497512 #> ENSMUSG00000021236 -0.0445148 0.00497512 #> ENSMUSG00000025241 -0.0419121 0.00995025 #> ENSMUSG00000026126 -0.0399917 0.01990050 #> ENSMUSG00000034342 -0.0393964 0.00497512 #> ENSMUSG00000047205 -0.0381599 0.02487562 #> ENSMUSG00000031431 -0.0369456 0.02985075 #> ENSMUSG00000037552 -0.0368969 0.00995025 sfe_tissue <- addPerFeatureQCMetrics(sfe_tissue) names(rowData(sfe_tissue)) #> [1] \"symbol\" #> [2] \"Feature.Type\" #> [3] \"I_sample01\" #> [4] \"P.value_sample01\" #> [5] \"Adjusted.p.value_sample01\" #> [6] \"Feature.Counts.in.Spots.Under.Tissue_sample01\" #> [7] \"Median.Normalized.Average.Counts_sample01\" #> [8] \"Barcodes.Detected.per.Feature_sample01\" #> [9] \"moran_sample01\" #> [10] \"K_sample01\" #> [11] \"moran.mc_statistic_sample01\" #> [12] \"moran.mc_parameter_sample01\" #> [13] \"moran.mc_p.value_sample01\" #> [14] \"moran.mc_alternative_sample01\" #> [15] \"moran.mc_method_sample01\" #> [16] \"moran.mc_res_sample01\" #> [17] \"mean\" #> [18] \"detected\" plotRowData(sfe_tissue, x = \"mean\", y = \"moran_sample01\") + scale_x_log10() + annotation_logticks(sides = \"b\") + geom_density2d() #> Warning in scale_x_log10(): log-10 transformation introduced infinite values. #> log-10 transformation introduced infinite values. #> Warning: Removed 30285 rows containing non-finite outside the scale range #> (`stat_density2d()`). 
#> Warning: Removed 30285 rows containing missing values or values outside the scale range #> (`geom_point()`)."},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"apply-spatial-analysis-methods-to-gene-expression-space","dir":"Articles","previous_headings":"","what":"Apply spatial analysis methods to gene expression space","title":"Spatial analysis of 10X example Visium dataset","text":"Spatial statistics require spatial neighborhood graph can also applied k nearest neighbor graph histological space gene expression space. done depth vignette. store results “moran_ns”, confused spatial Moran’s results. genes tend similar neighbors 10 nearest neighbor graph PCA space gene expression rather histological space: Although Moran’s computed histological space, genes highest Moran’s PCA space also show spatial structure, different cell types reside different spatial regions.","code":"sfe_tissue <- runPCA(sfe_tissue, ncomponents = 30, subset_row = hvgs, scale = TRUE) # scale as in Seurat foo <- findKNN(reducedDim(sfe_tissue, \"PCA\")[,1:10], k=10, BNPARAM=AnnoyParam()) # Split by row foo_nb <- asplit(foo$index, 1) dmat <- 1/foo$distance # Row normalize the weights dmat <- sweep(dmat, 1, rowSums(dmat), FUN = \"/\") glist <- asplit(dmat, 1) # Sort based on index ord <- lapply(foo_nb, order) foo_nb <- lapply(seq_along(foo_nb), function(i) foo_nb[[i]][ord[[i]]]) class(foo_nb) <- \"nb\" glist <- lapply(seq_along(glist), function(i) glist[[i]][ord[[i]]]) listw <- list(style = \"W\", neighbours = foo_nb, weights = glist) class(listw) <- \"listw\" attr(listw, \"region.id\") <- colnames(sfe_tissue) colGraph(sfe_tissue, \"knn10\") <- listw sfe_tissue <- runMoransI(sfe_tissue, features = hvgs, BPPARAM = MulticoreParam(2), colGraphName = \"knn10\", name = \"moran_ns\") top_moran2 <- rownames(sfe_tissue)[order(rowData(sfe_tissue)$moran_ns_sample01, decreasing = TRUE)[1:9]] # Display NCBI descriptions for these genes gget_info2 <- 
gget$info(top_moran2) rownames(gget_info2) <- gget_info2$primary_gene_name select(gget_info2, ncbi_description) #> ncbi_description #> Mtco1 Enables cytochrome-c oxidase activity. Predicted to be involved in electron transport coupled proton transport; mitochondrial electron transport, cytochrome c to oxygen; and response to oxidative stress. Located in mitochondrial inner membrane. Part of mitochondrial respiratory chain complex IV. Is expressed in several structures, including brown fat; heart; liver; metanephros; and skeletal muscle. Orthologous to human MT-CO1 (mitochondrially encoded cytochrome c oxidase I). [provided by Alliance of Genome Resources, Apr 2022] #> Mtco2 Predicted to enable copper ion binding activity and oxidoreductase activity. Predicted to contribute to cytochrome-c oxidase activity. Predicted to be involved in ATP synthesis coupled electron transport; positive regulation of cellular biosynthetic process; and positive regulation of necrotic cell death. Located in mitochondrion. Is expressed in embryo; epiblast; heart; liver; and metanephros. Orthologous to several human genes including MTCO2P12 (MT-CO2 pseudogene 12). [provided by Alliance of Genome Resources, Apr 2022] #> Apoe This gene encodes a member of the apolipoprotein A1/A4/E family of proteins. This protein is involved in the transport of lipoproteins in the blood. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. Homozygous knockout mice for this gene accumulate high levels of cholesterol in the blood and develop atherosclerosis. Different alleles of this gene have been associated with either increased risk or a protective effect for Alzheimer's disease in human patients. This gene maps to chromosome 7 in a cluster with the related apolipoprotein C1, C2 and C4 genes. 
[provided by RefSeq, Apr 2015] #> Mtnd4 Predicted to enable NADH dehydrogenase (ubiquinone) activity and ubiquinone binding activity. Predicted to contribute to NADH dehydrogenase activity. Predicted to be involved in several processes, including electron transport coupled proton transport; mitochondrial electron transport, NADH to ubiquinone; and mitochondrial respiratory chain complex I assembly. Located in mitochondrion. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; Parkinson's disease; macular degeneration; and schizophrenia. Orthologous to human MT-ND4 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 4). [provided by Alliance of Genome Resources, Apr 2022] #> Fabp7 Predicted to enable fatty acid binding activity. Acts upstream of or within cell proliferation in forebrain; neurogenesis; and prepulse inhibition. Located in several cellular components, including cell projection; cell-cell junction; and neuronal cell body. Is expressed in several structures, including central nervous system; cranial nerve; gut; peripheral nervous system; and retina. Orthologous to human FABP7 (fatty acid binding protein 7). [provided by Alliance of Genome Resources, Apr 2022] #> mt-Nd2 Predicted to enable NADH dehydrogenase (ubiquinone) activity; ionotropic glutamate receptor binding activity; and protein kinase binding activity. Acts upstream of or within reactive oxygen species metabolic process. Located in mitochondrion. Is expressed in early conceptus and secondary oocyte. Human ortholog(s) of this gene implicated in Leber hereditary optic neuropathy; multiple sclerosis; myocardial infarction; neurodegenerative disease (multiple); and urinary bladder cancer. Orthologous to human MT-ND2 (mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2). 
[provided by Alliance of Genome Resources, Apr 2022] #> Ptn Predicted to enable several functions, including glycosaminoglycan binding activity; signaling receptor binding activity; and syndecan binding activity. Involved in several processes, including decidualization; learning or memory; and regulation of nervous system development. Acts upstream of or within bone mineralization. Located in extracellular region. Is expressed in several structures, including alimentary system; genitourinary system; nervous system; respiratory system; and sensory organ. Human ortholog(s) of this gene implicated in adrenal carcinoma. Orthologous to human PTN (pleiotrophin). [provided by Alliance of Genome Resources, Apr 2022] #> Apod The protein encoded by this gene is a component of high-density lipoprotein (HDL), but is unique in that it shares greater structural similarity to lipocalin than to other members of the apolipoprotein family, and has a wider tissue expression pattern. The encoded protein is involved in lipid metabolism, and ablation of this gene results in defects in triglyceride metabolism. Elevated levels of this gene product have been observed in multiple tissues of Niemann-Pick disease mouse models, as well as in some tumors. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Aug 2014] #> Atp1a2 Predicted to enable several functions, including ATP binding activity; ATP hydrolysis activity; and alkali metal ion binding activity. Involved in several processes, including cellular response to steroid hormone stimulus; locomotory exploration behavior; and response to auditory stimulus. Acts upstream of or within several processes, including forebrain development; regulation of blood circulation; and regulation of muscle contraction. Located in T-tubule; cell projection; and neuronal cell body. Is expressed in several structures, including genitourinary system; heart; musculature; nervous system; and sensory organ. 
Used to study familial hemiplegic migraine 2. Human ortholog(s) of this gene implicated in alternating hemiplegia of childhood; benign neonatal seizures; familial hemiplegic migraine 2; hypertension; and migraine with aura. Orthologous to human ATP1A2 (ATPase Na+/K+ transporting subunit alpha 2). [provided by Alliance of Genome Resources, Apr 2022] plotSpatialFeature(sfe_tissue, top_moran2, ncol = 3, swap_rownames = \"symbol\", image_id = \"lowres\", maxcell = 5e4)"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_10x_spatial.html","id":"session-info","dir":"Articles","previous_headings":"","what":"Session info","title":"Spatial analysis of 10X example Visium dataset","text":"","code":"sessionInfo() #> R version 4.3.3 (2024-02-29) #> Platform: x86_64-apple-darwin20 (64-bit) #> Running under: macOS Ventura 13.6.6 #> #> Matrix products: default #> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib #> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 #> #> locale: #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 #> #> time zone: UTC #> tzcode source: internal #> #> attached base packages: #> [1] stats4 stats graphics grDevices utils datasets methods #> [8] base #> #> other attached packages: #> [1] reticulate_1.36.1 BiocNeighbors_1.20.2 #> [3] BiocParallel_1.36.0 dplyr_1.1.4 #> [5] rmapshaper_0.5.0 sf_1.0-16 #> [7] rlang_1.1.3 terra_1.7-71 #> [9] EBImage_4.44.0 rjson_0.2.21 #> [11] bluster_1.12.0 patchwork_1.2.0 #> [13] stringr_1.5.1 scran_1.30.2 #> [15] scater_1.30.1 scuttle_1.12.0 #> [17] ggplot2_3.5.1 SingleCellExperiment_1.24.0 #> [19] SummarizedExperiment_1.32.0 Biobase_2.62.0 #> [21] GenomicRanges_1.54.1 GenomeInfoDb_1.38.8 #> [23] IRanges_2.36.0 S4Vectors_0.40.2 #> [25] BiocGenerics_0.48.1 MatrixGenerics_1.14.0 #> [27] matrixStats_1.3.0 SpatialFeatureExperiment_1.3.0 #> [29] Voyager_1.4.0 #> #> loaded via a namespace (and 
not attached): #> [1] RColorBrewer_1.1-3 jsonlite_1.8.8 #> [3] wk_0.9.1 magrittr_2.0.3 #> [5] ggbeeswarm_0.7.2 magick_2.8.3 #> [7] farver_2.1.1 rmarkdown_2.26 #> [9] fs_1.6.4 zlibbioc_1.48.2 #> [11] ragg_1.3.0 vctrs_0.6.5 #> [13] spdep_1.3-3 memoise_2.0.1 #> [15] DelayedMatrixStats_1.24.0 RCurl_1.98-1.14 #> [17] htmltools_0.5.8.1 S4Arrays_1.2.1 #> [19] curl_5.2.1 Rhdf5lib_1.24.2 #> [21] s2_1.1.6 SparseArray_1.2.4 #> [23] rhdf5_2.46.1 sass_0.4.9 #> [25] spData_2.3.0 KernSmooth_2.23-22 #> [27] bslib_0.7.0 htmlwidgets_1.6.4 #> [29] desc_1.4.3 cachem_1.0.8 #> [31] igraph_2.0.3 lifecycle_1.0.4 #> [33] pkgconfig_2.0.3 rsvd_1.0.5 #> [35] Matrix_1.6-5 R6_2.5.1 #> [37] fastmap_1.1.1 GenomeInfoDbData_1.2.11 #> [39] digest_0.6.35 colorspace_2.1-0 #> [41] ggnewscale_0.4.10 dqrng_0.3.2 #> [43] RSpectra_0.16-1 irlba_2.3.5.1 #> [45] textshaping_0.3.7 beachmat_2.18.1 #> [47] labeling_0.4.3 fansi_1.0.6 #> [49] mgcv_1.9-1 abind_1.4-5 #> [51] compiler_4.3.3 proxy_0.4-27 #> [53] withr_3.0.0 tiff_0.1-12 #> [55] viridis_0.6.5 DBI_1.2.2 #> [57] highr_0.10 R.utils_2.12.3 #> [59] HDF5Array_1.30.1 MASS_7.3-60.0.1 #> [61] DelayedArray_0.28.0 classInt_0.4-10 #> [63] tools_4.3.3 units_0.8-5 #> [65] vipor_0.4.7 beeswarm_0.4.0 #> [67] R.oo_1.26.0 glue_1.7.0 #> [69] dbscan_1.1-12 nlme_3.1-164 #> [71] rhdf5filters_1.14.1 grid_4.3.3 #> [73] cluster_2.1.6 generics_0.1.3 #> [75] isoband_0.2.7 gtable_0.3.5 #> [77] R.methodsS3_1.8.2 class_7.3-22 #> [79] BiocSingular_1.18.0 ScaledMatrix_1.10.0 #> [81] metapod_1.10.1 sp_2.1-4 #> [83] utf8_1.2.4 XVector_0.42.0 #> [85] ggrepel_0.9.5 pillar_1.9.0 #> [87] limma_3.58.1 splines_4.3.3 #> [89] lattice_0.22-6 deldir_2.0-4 #> [91] tidyselect_1.2.1 locfit_1.5-9.9 #> [93] knitr_1.45 gridExtra_2.3 #> [95] V8_4.4.2 edgeR_4.0.16 #> [97] xfun_0.43 statmod_1.5.0 #> [99] DropletUtils_1.22.0 fftwtools_0.9-11 #> [101] stringi_1.8.3 geojsonsf_2.0.3 #> [103] yaml_2.3.8 boot_1.3-30 #> [105] evaluate_0.23 codetools_0.2-20 #> [107] tibble_3.2.1 cli_3.6.2 #> [109] 
systemfonts_1.0.6 munsell_0.5.1 #> [111] jquerylib_0.1.4 Rcpp_1.0.12 #> [113] png_0.1-8 parallel_4.3.3 #> [115] pkgdown_2.0.9 jpeg_0.1-10 #> [117] sparseMatrixStats_1.14.0 bitops_1.0-7 #> [119] SpatialExperiment_1.12.0 viridisLite_0.4.2 #> [121] scales_1.3.0 e1071_1.7-14 #> [123] purrr_1.0.2 crayon_1.5.2 #> [125] scico_1.5.0 cowplot_1.1.3"},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Visium Processing Workflows with Voyager","text":"Pros: Commercial kit Provided many core facilities widely available spatial transcriptomics technologies Transcriptome wide Formalin fixed, paraffin embedded (FFPE) tissue compatible Can panel proteins addition RNA Accompanied H&E fluorescent images tissue morphology lower resolution, data size manageable larger tissue areas larger number samples Cons: Lower resolution – 55 \\(\\mu\\)m spot diameter 100 \\(\\mu\\)m center center Relatively low detection efficiency transcripts full length, protocol adapted long read sequencing","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"dowload-data-and-create-a-spatialfeatureexperiment-object","dir":"Articles","previous_headings":"Getting Started","what":"Download Data and Create a SpatialFeatureExperiment object","title":"Visium Processing Workflows with Voyager","text":"Several publicly available Visium datasets available 10X Genomics website. 
vignettes provide examples processing raw data using workflow includes seqspec, gget, kallisto/bustools generate count matrix demonstrate read output typical Visium experiment SpatialFeatureExperiment object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/visium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Visium Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using variety Visium datasets. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"pros-and-cons","dir":"Articles","previous_headings":"","what":"Pros and cons","title":"Xenium Processing Workflows with Voyager","text":"Pros: Commercial kit Single cell resolution High detection efficiency Formalin fixed, paraffin embedded (FFPE) tissue compatible Provides subcellular transcript localization information Compatible H&E immunofluorescence Cons: curated panel usually hundred genes required. However, 10X provides curated gene panels common applications oncology, neuroscience, development, well panel design services. Data size harder manage larger tissue areas number samples. spatial analysis methods can scale hundreds thousands millions cells.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"getting-started","dir":"Articles","previous_headings":"","what":"Getting Started","title":"Xenium Processing Workflows with Voyager","text":"10x Genomics publicly released Xenium human breast cancer dataset website. tutorial processing output various spatial transcriptomics technologies SpatialFeatureExperiment(SFE) object use Voyager Getting Started page. 
output files format Xenium data may change technology developed released.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/articles/xenium_landing.html","id":"analysis-workflows","dir":"Articles","previous_headings":"","what":"Analysis Workflows","title":"Xenium Processing Workflows with Voyager","text":"vignettes demonstrate workflows can implemented Voyager using variety Xenium datasets. analysis tasks include basic quality control, spatial exploratory data analysis, identification spatially variable genes, computation global local spatial statistics. Accompanying Colab notebooks linked available.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Lambda Moses. Author, maintainer. Kayla Jackson. Author. Laura Luebbert. Author. Sina Booeshaghi. Author. Lior Pachter. Author, reviewer.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Moses L, Einarsson PH, Jackson K, Luebbert L, Booeshaghi S, Antonsson S, Melsted P, Pachter L (2023). “Voyager: exploratory single-cell genomics data analysis geospatial statistics.” bioRxiv. 
doi:10.1101/2023.07.20.549945.","code":"@Article{, title = {Voyager: exploratory single-cell genomics data analysis with geospatial statistics}, author = {Lambda Moses and Pétur Helgi Einarsson and Kayla Jackson and Laura Luebbert and Sina Booeshaghi and Sindri Antonsson and Páll Melsted and Lior Pachter}, journal = {bioRxiv}, year = {2023}, doi = {10.1101/2023.07.20.549945}, }"},{"path":"https://pachterlab.github.io/voyager/dev/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"From geospatial to spatial omics","text":"SpatialFeatureExperiment Voyager can installed Bioconductor version 3.16 higher:","code":"if (!requireNamespace(\"BiocManager\")) install.packages(\"BiocManager\") BiocManager::install(version = \"3.17\") # Or a higher version in the future BiocManager::install(\"Voyager\")"},{"path":"https://pachterlab.github.io/voyager/dev/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"From geospatial to spatial omics","text":"Voyager: exploratory single-cell genomics data analysis geospatial statistics Lambda Moses, Pétur Helgi Einarsson, Kayla Jackson, Laura Luebbert, . Sina Booeshaghi, Sindri Antonsson, Páll Melsted, Lior Pachter bioRxiv 2023.07.20.549945; doi: https://doi.org/10.1101/2023.07.20.549945","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot the elbow plot or scree plot for PCA — ElbowPlot","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"Apparently, apparent way plot PC elbow plot extracting variance explained attribute dimred slot, even OSCA book makes elbow plot way, find kind cumbersome compared Seurat. 
writing function make elbow plot SCE less cumbersome.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"","code":"ElbowPlot( sce, ndims = 20, nfnega = 0, reduction = \"PCA\", sample_id = \"all\", facet = FALSE, ncol = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"sce SingleCellExperiment object, anything inherits SingleCellExperiment. ndims Number components positive eigenvalues, PCs non-spatial PCA. nfnega Number nega eigenvalues eigenvectors compute. indicate negative spatial autocorrelation. reduction Name dimension reduction use. must attribute called either \"percentVar\" \"eig\" eigenvalues. Defaults \"PCA\". sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. facet Logical, whether facet samples multiple samples present. relevant spatial PCA run separately sample, gives different results running jointly samples. ncol Number columns facets facetting.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"ggplot object. 
y axis eigenvalues percentage variance explained relevant.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ElbowPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot the elbow plot or scree plot for PCA — ElbowPlot","text":"","code":"library(SFEData) library(scater) #> Loading required package: SingleCellExperiment #> Loading required package: SummarizedExperiment #> Loading required package: MatrixGenerics #> Loading required package: matrixStats #> #> Attaching package: ‘MatrixGenerics’ #> The following objects are masked from ‘package:matrixStats’: #> #> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, #> colCounts, colCummaxs, colCummins, colCumprods, colCumsums, #> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, #> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats, #> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds, #> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, #> colWeightedMeans, colWeightedMedians, colWeightedSds, #> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet, #> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, #> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps, #> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, #> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks, #> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars, #> rowWeightedMads, rowWeightedMeans, rowWeightedMedians, #> rowWeightedSds, rowWeightedVars #> Loading required package: GenomicRanges #> Loading required package: stats4 #> Loading required package: BiocGenerics #> #> Attaching package: ‘BiocGenerics’ #> The following objects are masked from ‘package:stats’: #> #> IQR, mad, sd, var, xtabs #> The following objects are masked from ‘package:base’: #> #> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append, #> as.data.frame, basename, cbind, colnames, 
dirname, do.call, #> duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted, #> lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin, #> pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table, #> tapply, union, unique, unsplit, which.max, which.min #> Loading required package: S4Vectors #> #> Attaching package: ‘S4Vectors’ #> The following object is masked from ‘package:utils’: #> #> findMatches #> The following objects are masked from ‘package:base’: #> #> I, expand.grid, unname #> Loading required package: IRanges #> Loading required package: GenomeInfoDb #> Loading required package: Biobase #> Welcome to Bioconductor #> #> Vignettes contain introductory material; view with #> 'browseVignettes()'. To cite Bioconductor, see #> 'citation(\"Biobase\")', and for packages 'citation(\"pkgname\")'. #> #> Attaching package: ‘Biobase’ #> The following object is masked from ‘package:MatrixGenerics’: #> #> rowMedians #> The following objects are masked from ‘package:matrixStats’: #> #> anyMissing, rowMedians #> Loading required package: scuttle #> Loading required package: ggplot2 sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache #> require(“SpatialFeatureExperiment”) sfe <- runPCA(sfe, ncomponents = 10, exprs_values = \"counts\") ElbowPlot(sfe, ndims = 10)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":null,"dir":"Reference","previous_headings":"","what":"SFEMethod class — SFEMethod","title":"SFEMethod class — SFEMethod","text":"S4 class used wrap spatial analysis methods, taking inspiration caret tidymodels packages.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SFEMethod class — SFEMethod","text":"","code":"SFEMethod( name, fun, reorganize_fun, package, variate 
= c(\"uni\", \"bi\", \"multi\"), scope = c(\"global\", \"local\"), title = NULL, default_attr = NA, args_not_check = NA, joint = FALSE, use_graph = TRUE, use_matrix = FALSE, dest = c(\"reducedDim\", \"colData\") ) # S4 method for SFEMethod info(x, type) # S4 method for SFEMethod is_local(x) # S4 method for SFEMethod fun(x) # S4 method for SFEMethod reorganize_fun(x) # S4 method for SFEMethod args_not_check(x) # S4 method for SFEMethod is_joint(x) # S4 method for SFEMethod use_graph(x) # S4 method for SFEMethod use_matrix(x)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SFEMethod class — SFEMethod","text":"name Name method, used user-facing functions specify method use, \"moran\" Moran's . fun Function run method. See Details. reorganize_fun Function reorganize results add SFE object. See Details. package Name package whose implementation method used , used check package installed. variate many variables method works , must one \"uni\" univariate, \"bi\" bivariate, \"multi\" multivariate. scope Either \"global\", returning one result entire dataset, \"local\", returning one result spatial location. multivariate methods, irrelevant. title Descriptive title show plotting results. default_attr local methods return multiple fields, local Moran values p-values, default field use plotting. args_not_check character vector indicating argument checked comparing parameters previous run. joint Logical, whether makes sense run method multiple samples jointly. TRUE, fun must able handle adjacency matrix listw argument straightforward way concatenate listw objects multiple samples. use_graph Logical, indicate whether method uses spatial neighborhood graph unifying user facing functions argument asking graph though methods require graph. use_matrix Logical, whether function slot fun takes matrix input. argument used bivariate methods. 
dest Whether results appropriate reducedDim colData. used multivariate methods. overrides \"local\" field info. x SFEMethod object type One names info slot, see slot documentation.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"SFEMethod class — SFEMethod","text":"constructor returns SFEMethod object. getters return content corresponding slots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SFEMethod class — SFEMethod","text":"fun slot specified : methods, must arguments x vector, listw listw object specifying spatial neighborhood graph, zero.policy specifying cells without neighbors (default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA), optionally method specific arguments ... pass underlying imported function. original function implementing method package different argument names orders, write thin wrapper rearrange /rename arguments. univariate methods use spatial neighborhood graph, first two arguments must x listw. univariate methods use spatial neighborhood graph, variogram, first two arguments must x numeric vector coords_df sf data frame cell locations optionally regressors. formula argument optional can defaults specifying regressors use. bivariate methods, first three arguments must x, y, listw. multivariate methods, argument x mandatory, matrix input. arguments must present can optional defaults: listw ncomponents set number dimensions output. reorganize_fun slot specified : Univariate methods meant run separately gene, input reorganize_fun argument list outputs; element list corresponds output gene. univariate global methods, different fields result columns data frame one row results multiple features data frame. 
arguments , name rename primary field informative name needed, ... arguments specific methods. output reorganize_fun DataFrame whose rows correspond genes columns correspond fields output. univariate local methods, arguments , nb neighborhood list used multiple testing correction, p.adjust.method method correct multiple testing p.adjust, .... output reorganize_fun list reorganized output. element list corresponds gene, reorganized content element can vector, matrix, data frame, must dimensions genes. element vector, row matrix data frame corresponds cell. multivariate methods whose results go reducedDim, reorganize_fun one argument raw output. output reorganize_fun cell embedding matrix ready added reducedDim. relevant information gene loadings eigenvalues added attributes cell embedding matrix. multivariate methods whose results can go colData, arguments , nb, p.adjust.method. Unlike univariate local counterpart, takes raw output instead list outputs. output reorganize_fun vector data frame ready added colData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"slots","dir":"Reference","previous_headings":"","what":"Slots","title":"SFEMethod class — SFEMethod","text":"info named character vector specifying information method. fun function implementing method. See Details. reorganize_fun Function convert output fun format store SFE object. See Details. misc Miscellaneous information method interacts rest package. 
named list.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/SFEMethod.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"SFEMethod class — SFEMethod","text":"","code":"moran <- SFEMethod( name = \"moran\", title = \"Moran's I\", package = \"spdep\", variate = \"uni\", scope = \"global\", fun = function(x, listw, zero.policy = NULL) spdep::moran(x, listw, n = length(listw$neighbours), S0 = spdep::Szero(listw), zero.policy = zero.policy), reorganize_fun = Voyager:::.moran2df )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Bivariate spatial statistics — calculateBivariate","title":"Bivariate spatial statistics — calculateBivariate","text":"functions perform bivariate spatial analysis. version, bivariate global method supported lee, lee.mc, lee.test spdep, cross variograms gstat (use cross_variogram cross_variogram_map type argument, see variogram-internal). Global Lee statistic computed implementation much faster spdep. Bivariate local methods supported lee (use locallee type argument) localmoran_bv bivariate version Local Moran spdep.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Bivariate spatial statistics — calculateBivariate","text":"","code":"# S4 method for ANY calculateBivariate( x, y = NULL, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... ) # S4 method for SpatialFeatureExperiment calculateBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... 
) runBivariate( x, type, feature1, feature2 = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, p.adjust.method = \"BH\", name = NULL, overwrite = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Bivariate spatial statistics — calculateBivariate","text":"x numeric matrix whose rows features/genes, numeric vector (y must specified), SpatialFeatureExperiment (SFE) object matrix assay. y numeric matrix whose rows features/genes, numeric vector. Bivariate statics computed pairwise combinations row names x row names y, except cross variogram combinations within x y also computed. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. coords_df sf data frame specifying location cell. used method specified type uses spatial neighborhood graph. Must specified otherwise. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA returnDF Logical, results added SFE object, whether results formatted DataFrame. p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. ... 
arguments passed S4 method (convenience wrappers like calculateMoransI) method used compute metrics specified argument type (general functions like calculateUnivariate). See documentation functions name specified type spdep package method specific arguments. variograms, see .variogram. feature1 ID symbol first genes SFE object, argument x. feature2 ID symbol second genes SFE object, argument x. Mandatory length feature1 1. colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. SFE method calculateUnivariate, specify location cells methods take spatial neighborhood graph variogram. geometry type POINT, spatialCoords(x) used instead. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. exprs_values Integer scalar string indicating assay x contains expression values. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. overwrite Logical, whether overwrite existing results name. Defaults FALSE.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Bivariate spatial statistics — calculateBivariate","text":"calculateBivariate function returns correlation matrix global Lee, results pair genes methods. Global results stored SFE object. methods return one result pair genes, return pairwise results 2 genes jointly. 
Local results stored localResults field SFE object, name concatenation two gene names separated two underscores (__).","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateBivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Bivariate spatial statistics — calculateBivariate","text":"","code":"library(SFEData) library(scater) library(scran) library(SpatialFeatureExperiment) library(SpatialExperiment) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) gs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gs, fdr.threshold = 0.01) g <- colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Matrix method mat <- logcounts(sfe)[hvgs[1:5],] df <- df2sf(spatialCoords(sfe), spatialCoordsNames(sfe)) out <- calculateBivariate(mat, type = \"lee\", listw = g) out <- calculateBivariate(mat, type = \"cross_variogram\", coords_df = df) # SFE method out <- calculateBivariate(sfe, type = \"lee\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\") out2 <- calculateBivariate(sfe, type = \"lee.test\", feature1 = \"Myh1\", feature2 = \"Myh2\", swap_rownames = \"symbol\") sfe <- runBivariate(sfe, type = \"locallee\", feature1 = \"Myh1\", feature2 = \"Myh2\", swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Multivariate spatial data analysis — calculateMultivariate","title":"Multivariate spatial data analysis — calculateMultivariate","text":"functions perform multivariate spatial data analysis, usually spatially informed dimension 
reduction.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Multivariate spatial data analysis — calculateMultivariate","text":"","code":"# S4 method for ANY,SFEMethod calculateMultivariate( x, type, listw = NULL, transposed = FALSE, zero.policy = TRUE, p.adjust.method = \"BH\", ... ) # S4 method for ANY,character calculateMultivariate(x, type, listw = NULL, transposed = FALSE, ...) # S4 method for SpatialFeatureExperiment,ANY calculateMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = \"logcounts\", sample_action = c(\"joint\", \"separate\"), BPPARAM = SerialParam(), ... ) runMultivariate( x, type, colGraphName = 1L, subset_row = NULL, exprs_values = \"logcounts\", sample_action = c(\"joint\", \"separate\"), BPPARAM = SerialParam(), name = NULL, dest = c(\"reducedDim\", \"colData\"), ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Multivariate spatial data analysis — calculateMultivariate","text":"x numeric matrix whose rows features/genes, SpatialFeatureExperiment (SFE) object matrix assay. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. transposed Logical, whether matrix genes columns cells rows. 
zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. ... Extra arguments passed specific multivariate method. example, see multispati_rsp arguments MULTISPATI PCA. See localC arguments \"localC_multi\" \"localC_perm_multi\". colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. subset_row Vector specifying subset features use dimensionality reduction. can character vector row names, integer vector row indices logical vector. exprs_values Integer scalar string indicating assay x contains expression values. sample_action Character, either \"joint\" \"separate\". Spatial methods depend spatial coordinates /spatial neighborhood graph, SpatialExperiment uses sample_id keep coordinates different samples separate. spatial methods can sensibly run jointly multiple samples. case, \"joint\" run method jointly samples, \"separate\" run method separately sample concatenate results. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. parallelize computation across multiple samples large number samples. cautious using optimized BLAS matrix operations supports multithreading. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. dest Character, either \"reducedDim\" \"colData\". output multivariate method matrix array, spatially informed dimension reduction, option \"reducedDim\", results stored reducedDim SFE object. output vector, multivariate version localC, stored colData. 
Data frame output, localC_perm, can stored either reducedDim colData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Multivariate spatial data analysis — calculateMultivariate","text":"calculateMultivariate, matrix cell embeddings whose attributes include loadings eigenvalues relevant, ready added SFE object reducedDim setter. run*, SpatialFeatureExperiment object results added. See Details results stored.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Multivariate spatial data analysis — calculateMultivariate","text":"argument type, package supports \"multispati\" MULTISPATI PCA, \"localC_multi\" multivariate generalization Geary's C, \"localC_perm_multi\" multivariate Geary's C permutation testing, \"gwpca\" geographically weighted PCA.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Multivariate spatial data analysis — calculateMultivariate","text":"Dray, S., Said, S. Debias, F. (2008) Spatial ordination vegetation data using generalization Wartenberg's multivariate spatial correlation. Journal vegetation science, 19, 45-56. Anselin, L. (2019), Local Indicator Multivariate Spatial Association: Extending Geary's c. Geogr Anal, 51: 133-150. 
doi:10.1111/gean.12164","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateMultivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Multivariate spatial data analysis — calculateMultivariate","text":"","code":"# example code library(SFEData) library(scater) library(scran) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) gvs <- modelGeneVar(sfe) hvgs <- getTopHVGs(gvs, fdr.threshold = 0.05) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- runMultivariate(sfe, \"multispati\", subset_row = hvgs)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":null,"dir":"Reference","previous_headings":"","what":"Univariate spatial stiatistics — calculateUnivariate","title":"Univariate spatial stiatistics — calculateUnivariate","text":"functions compute univariate spatial statistics, global local, matrices, data frames, SFE objects. SFE objects, statistics can computed numeric columns colData, colGeometries, annotGeometries, results stored within SFE object. calculateMoransI runMoransI convenience wrappers calculateUnivariate runUnivariate respectively.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Univariate spatial stiatistics — calculateUnivariate","text":"","code":"# S4 method for ANY,SFEMethod calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... ) # S4 method for ANY,character calculateUnivariate( x, type, listw = NULL, coords_df = NULL, BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, p.adjust.method = \"BH\", name = NULL, ... 
) # S4 method for SpatialFeatureExperiment,ANY calculateUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... ) # S4 method for ANY calculateMoransI( x, ..., BPPARAM = SerialParam(), zero.policy = NULL, name = \"moran\" ) # S4 method for SpatialFeatureExperiment calculateMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), zero.policy = NULL, returnDF = TRUE, include_self = FALSE, p.adjust.method = \"BH\", swap_rownames = NULL, name = NULL, ... ) colDataUnivariate( x, type, features, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colDataMoransI( x, features, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colGeometryUnivariate( x, type, features, colGeometryName = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) colGeometryMoransI( x, features, colGeometryName = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) annotGeometryUnivariate( x, type, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) annotGeometryMoransI( x, features, annotGeometryName = 1L, annotGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... 
) runUnivariate( x, type, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, overwrite = FALSE, ... ) runMoransI( x, features = NULL, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", exprs_values = \"logcounts\", BPPARAM = SerialParam(), swap_rownames = NULL, zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) reducedDimUnivariate( x, type, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... ) reducedDimMoransI( x, dimred = 1L, components = 1L, colGraphName = 1L, sample_id = \"all\", BPPARAM = SerialParam(), zero.policy = NULL, include_self = FALSE, p.adjust.method = \"BH\", name = NULL, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Univariate spatial stiatistics — calculateUnivariate","text":"x numeric matrix whose rows features/genes, SpatialFeatureExperiment (SFE) object matrix assay. type SFEMethod object, string matching name SFEMethod object. methods mentioned correspond SFEMethod objects already implemented Voyager package. Use listSFEMethods see methods available. can implement new SFEMethod objects apply Voyager functions spatial analysis methods. part inspired caret, parsnip, BiocSingular packages. listw Weighted neighborhood graph spdep listw object. used method specified type use spatial neighborhood graph, variogram. coords_df sf data frame specifying location cell. used method specified type uses spatial neighborhood graph. Must specified otherwise. BPPARAM BiocParallelParam object specifying whether computing metric numerous genes shall parallelized. 
zero.policy default NULL, use global option value; TRUE assign zero lagged value zones without neighbours, FALSE assign NA returnDF Logical, results added SFE object, whether results formatted DataFrame. p.adjust.method Method correct multiple testing, passed p.adjustSP. Methods allowed p.adjust.methods. name Name use store results, defaults name SFEMethod object passed argument type. Can set distinguish results method different parameters. ... arguments passed S4 method (convenience wrappers like calculateMoransI) method used compute metrics specified argument type (general functions like calculateUnivariate). See documentation functions name specified type spdep package method specific arguments. variograms, see .variogram. features Genes (calculate* SFE method run*) numeric columns colData(x) (colData*) colGeometry (colGeometry*) annotGeometry (annotGeometry*) univariate metric computed. Default NULL. NULL, metric computed genes values assay specified argument exprs_values. can parallelized argument BPPARAM. genes, row names SFE object Ensembl IDs, gene symbol can used converted IDs behind scene column rowData can specified swap_rownames. However, one symbol matches multiple IDs, warning given first match used. Internally, results always stored Ensembl ID rather symbol. colGraphName Name listw graph SFE object corresponds entities represented columns gene count matrix. Use colGraphNames look names available graphs cells/spots. Note multiple sample_ids, assumed graph name. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. SFE method calculateUnivariate, specify location cells methods take spatial neighborhood graph variogram. geometry type POINT, spatialCoords(x) used instead. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. 
exprs_values Integer scalar string indicating assay x contains expression values. include_self Logical, whether spatial neighborhood graph include edges location . Getis-Ord Gi* localG localG_perm, used method. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. annotGeometryName Name annotGeometry sf data frame whose numeric columns interest used compute metric. Use annotGeometryNames look names sf data frames associated annotations. annotGraphName Name listw graph SFE object corresponds annotGeometry interest. Use annotGraphNames look names available annotation graphs. overwrite Logical, whether overwrite existing results name. Defaults FALSE. dimred Name dimension reduction, can seen reducedDimNames. components Numeric vector components dimension reduction compute spatial statistics .","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Univariate spatial stiatistics — calculateUnivariate","text":"calculateUnivariate, returnDF = TRUE, DataFrame, otherwise list element results feature. run*, SpatialFeatureExperiment object results added. See Details results stored.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Univariate spatial stiatistics — calculateUnivariate","text":"univariate methods package spdep supported . methods global, meaning returning one result spatial locations dataset: moran, geary, moran.mc, geary.mc, moran.test, geary.test, globalG.test, sp.correlogram. variogram variogram map gstat package also supported. following methods local, meaning location results: moran.plot, localmoran, localmoran_perm, localC, localC_perm, localG, localG_perm, LOSH, LOSH.mc, LOSH.cs. 
GWmodel::gwss method supported soon, not supported yet. Global results genes stored rowData. colGeometry annotGeometry, results added attribute data frame called featureData, DataFrame analogous rowData gene count matrix, can accessed geometryFeatureData function. New column names featureData follow rules rowData. colData, results can accessed colFeatureData function. Local results stored field localResults field SFE object, can accessed localResults localResult. results p-values, -log10 p adjusted -log10 p added. Note multiple testing correction, p.adjustSP used. results stored SFE object, parameters used compute results well construct spatial neighborhood graph also added. localResults, parameters added metadata field params localResults sorted name, defaults name SFEMethod object specified type argument. global methods, parameters results genes metadata rowData(x), organized name (metadata(rowData(x))$params[[name]]). colData, global method parameters stored metadata colData field params (metadata(colData(x))$params[[name]]). geometries, global method parameters attribute named \"params\" corresponding sf data frame (attr(df, \"params\")[[name]]).","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Univariate spatial stiatistics — calculateUnivariate","text":"Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion, p. 17. Anselin, L. (1995), Local Indicators Spatial Association-LISA. Geographical Analysis, 27: 93-115. doi:10.1111/j.1538-4632.1995.tb00338.x Ord, J. K., & Getis, A. 2012. Local spatial heteroscedasticity (LOSH), Annals Regional Science, 48 (2), 529-539. Ord, J. K. Getis, A. 1995 Local spatial autocorrelation statistics: distributional issues application. 
Geographical Analysis, 27, 286-306","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/calculateUnivariate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Univariate spatial stiatistics — calculateUnivariate","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) features_use <- rownames(sfe)[1:5] # Moran's I moran_results <- calculateMoransI(sfe, features = features_use, colGraphName = \"visium\", exprs_values = \"counts\" ) # This does not advocate for computing Moran's I on raw counts. # Just an example for function usage. sfe <- runMoransI(sfe, features = features_use, colGraphName = \"visium\", exprs_values = \"counts\" ) # Look at the results head(rowData(sfe)) #> DataFrame with 6 rows and 8 columns #> Ensembl symbol type means #> #> ENSMUSG00000025902 ENSMUSG00000025902 Sox17 Gene Expression 0.007612179 #> ENSMUSG00000096126 ENSMUSG00000096126 Gm22307 Gene Expression 0.000200321 #> ENSMUSG00000033845 ENSMUSG00000033845 Mrpl15 Gene Expression 0.075921474 #> ENSMUSG00000025903 ENSMUSG00000025903 Lypla1 Gene Expression 0.057491987 #> ENSMUSG00000033813 ENSMUSG00000033813 Tcea1 Gene Expression 0.052283654 #> ENSMUSG00000002459 ENSMUSG00000002459 Rgs20 Gene Expression 0.000200321 #> vars cv2 moran_Vis5A K_Vis5A #> #> ENSMUSG00000025902 0.008757912 151.1411 -0.0424335 13.32749 #> ENSMUSG00000096126 0.000200321 4992.0000 NaN NaN #> ENSMUSG00000033845 0.114250804 19.8212 0.2485804 5.41594 #> ENSMUSG00000025903 0.080645121 24.3985 0.0070062 9.46309 #> ENSMUSG00000033813 0.073603279 26.9256 0.1592157 8.51384 #> ENSMUSG00000002459 0.000200321 4992.0000 NA NA # Local Moran's I sfe <- runUnivariate(sfe, type = \"localmoran\", features = features_use, colGraphName = \"visium\", exprs_values = 
\"counts\" ) head(localResult(sfe, \"localmoran\", features_use[1])) #> Ii E.Ii Var.Ii Z.Ii Pr(z != E(Ii)) #> AAATTACCTATCGATG -0.02897069 -0.001345388 0.01609308 -0.2177647 0.82761246 #> AACATATCAACTGGTG -0.29141104 -0.001345388 0.01609308 -2.2865292 0.02222332 #> AAGATTGGCGGAACGT 0.10224949 -0.001345388 0.01958757 0.7401981 0.45917982 #> AAGGGACAGATTCTGT -0.02897069 -0.001345388 0.01609308 -0.2177647 0.82761246 #> AATATCGAGGGTTCTC 0.10224949 -0.001345388 0.01609308 0.8166176 0.41414701 #> AATGATGATACGCTAT 0.10224949 -0.001345388 0.01609308 0.8166176 0.41414701 #> mean median pysal -log10p -log10p_adj #> AAATTACCTATCGATG Low-High Low-High Low-High 0.08217298 0.0000000 #> AACATATCAACTGGTG Low-High Low-High Low-High 1.65319110 0.8080931 #> AAGATTGGCGGAACGT Low-Low Low-Low Low-Low 0.33801720 0.0000000 #> AAGGGACAGATTCTGT Low-High Low-High Low-High 0.08217298 0.0000000 #> AATATCGAGGGTTCTC Low-Low Low-Low Low-Low 0.38284547 0.0000000 #> AATGATGATACGCTAT Low-Low Low-Low Low-Low 0.38284547 0.0000000 # For colData sfe <- colDataUnivariate(sfe, type = \"localmoran\", features = \"nCounts\", colGraphName = \"visium\" ) head(localResult(sfe, \"localmoran\", \"nCounts\")) #> Ii E.Ii Var.Ii Z.Ii #> AAATTACCTATCGATG 0.53682603 -0.0073375879 0.087243111 1.8423152 #> AACATATCAACTGGTG 0.20017125 -0.0008174853 0.009783652 2.0319883 #> AAGATTGGCGGAACGT 0.13533683 -0.0002992400 0.004361215 2.0538630 #> AAGGGACAGATTCTGT 0.67946203 -0.0182482408 0.214584793 1.5061757 #> AATATCGAGGGTTCTC -0.01287299 -0.0009633914 0.011528171 -0.1109218 #> AATGATGATACGCTAT 0.15331553 -0.0306802864 0.356207210 0.3082880 #> Pr(z != E(Ii)) mean median pysal -log10p #> AAATTACCTATCGATG 0.06542906 High-High High-High High-High 1.18422931 #> AACATATCAACTGGTG 0.04215484 High-High High-High High-High 1.37515260 #> AAGATTGGCGGAACGT 0.03998896 High-High Low-High High-High 1.39805992 #> AAGGGACAGATTCTGT 0.13202207 High-High High-High High-High 0.87935347 #> AATATCGAGGGTTCTC 0.91167838 High-Low High-Low High-Low 
0.04015835 #> AATGATGATACGCTAT 0.75786321 High-High High-Low High-High 0.12040917 #> -log10p_adj #> AAATTACCTATCGATG 0.33913127 #> AACATATCAACTGGTG 0.53005456 #> AAGATTGGCGGAACGT 0.61990867 #> AAGGGACAGATTCTGT 0.03425543 #> AATATCGAGGGTTCTC 0.00000000 #> AATGATGATACGCTAT 0.00000000 # For annotGeometries annotGraph(sfe, \"myofiber_tri2nb\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3L, method = \"tri2nb\", dist_type = \"idw\", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, type = \"localG\", features = \"area\", annotGraphName = \"myofiber_tri2nb\", annotGeometryName = \"myofiber_simplified\", zero.policy = TRUE ) head(localResult(sfe, \"localG\", \"area\", annotGeometryName = \"myofiber_simplified\" )) #> localG Gi E(Gi) V(Gi) Z(Gi) #> 1018 -2.3083710 0.0001426229 0.0002238002 1.236681e-09 -2.3083710 #> 1021 -0.8140180 0.0002393084 0.0002665443 1.119477e-09 -0.8140180 #> 1024 0.0508039 0.0002301134 0.0002280492 1.650888e-09 0.0508039 #> 1041 -0.1700897 0.0002715145 0.0002773569 1.179830e-09 -0.1700897 #> 1052 0.1547597 0.0002185310 0.0002133753 1.109810e-09 0.1547597 #> 1058 -0.3688569 0.0002047116 0.0002174315 1.189189e-09 -0.3688569 #> Pr(z != E(Gi)) -log10p -log10p_adj cluster #> 1018 0.02097851 1.67822538 0.9000741 High #> 1021 0.41563466 0.38128824 0.0000000 High #> 1024 0.95948178 0.01796327 0.0000000 High #> 1041 0.86493956 0.06301424 0.0000000 High #> 1052 0.87701073 0.05699509 0.0000000 Low #> 1058 0.71223439 0.14737706 0.0000000 Low"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":null,"dir":"Reference","previous_headings":"","what":"Find clusters of correlogram patterns — clusterCorrelograms","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"Cluster correlograms find patterns length scales spatial autocorrelation. correlograms clustered must computed method number lags. 
Correlograms clustered jointly across samples.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"","code":"clusterCorrelograms( sfe, features, BLUSPARAM, sample_id = \"all\", method = \"I\", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"sp.correlogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"sfe SpatialFeatureExperiment object correlograms computed features interest. features Features whose correlograms cluster. BLUSPARAM BlusterParam object specifying algorithm use. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. method \"corr\" correlation, \"I\" Moran's I, \"C\" Geary's C. colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. 
name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"data frame 3 columns: feature features, cluster factor cluster membership features within sample, sample_id sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterCorrelograms.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find clusters of correlogram patterns — clusterCorrelograms","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) sfe <- runUnivariate(sfe, type = \"sp.correlogram\", features = rownames(sfe)[inds], exprs_values = \"counts\", order = 5 ) clust <- clusterCorrelograms(sfe, features = rownames(sfe)[inds], BLUSPARAM = KmeansParam(2) )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Find clusters on the Moran plot — clusterMoranPlot","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"Moran plot plots value location x axis, average neighbors locations y axis. 
Sometimes clusters can seen Moran plot, indicating different types neighborhoods.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"","code":"clusterMoranPlot( sfe, features, BLUSPARAM, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, swap_rownames = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"sfe SpatialFeatureExperiment object Moran plot computed feature interest. Moran plot feature computed feature sample_id, calculated stored rowData. See calculateUnivariate. features Features whose Moran plot cluster. Features whose Moran plots computed skipped, warning. BLUSPARAM BlusterParam object specifying algorithm use. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"data frame column factor cluster membership feature. 
column names features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterMoranPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find clusters on the Moran plot — clusterMoranPlot","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Compute moran plot sfe <- runUnivariate(sfe, type = \"moran.plot\", features = rownames(sfe)[1], exprs_values = \"counts\" ) clusts <- clusterMoranPlot(sfe, rownames(sfe)[1], BLUSPARAM = KmeansParam(2) )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":null,"dir":"Reference","previous_headings":"","what":"Cluster variograms of multiple features — clusterVariograms","title":"Cluster variograms of multiple features — clusterVariograms","text":"function clusters variograms features across samples find patterns decays spatial autocorrelation. fitted variograms clustered different samples can different distance bins.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cluster variograms of multiple features — clusterVariograms","text":"","code":"clusterVariograms( sfe, features, BLUSPARAM, n = 20, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"variogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cluster variograms of multiple features — clusterVariograms","text":"sfe SpatialFeatureExperiment object variograms computed features interest. 
features Features whose variograms cluster. BLUSPARAM BlusterParam object specifying algorithm use. n Number points fitted variogram line. sample_id Sample(s) SFE object whose cells/spots use. Can \"all\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry look features. annotGeometryName Name annotGeometry look features. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name variogram results stored, default \"variogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cluster variograms of multiple features — clusterVariograms","text":"data frame 3 columns: feature features, cluster factor cluster membership features within sample, sample_id sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/clusterVariograms.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cluster variograms of multiple features — clusterVariograms","text":"","code":"library(SFEData) library(scater) library(bluster) library(Matrix) #> #> Attaching package: ‘Matrix’ #> The following object is masked from ‘package:S4Vectors’: #> #> expand sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) # Just the highly expressed genes gs <- order(Matrix::rowSums(counts(sfe)), decreasing = TRUE)[1:10] genes <- rownames(sfe)[gs] sfe <- runUnivariate(sfe, \"variogram\", features = genes) clusts <- clusterVariograms(sfe, genes, BLUSPARAM = HclustParam(), swap_rownames = \"symbol\") # Plot the clustering plotVariogram(sfe, genes, color_by = 
clusts, group = \"feature\", use_lty = FALSE, swap_rownames = \"symbol\", show_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":null,"dir":"Reference","previous_headings":"","what":"Get metadata of colData, rowData, and geometries — colFeatureData","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"Results spatial analyses columns colData, rowData, geometries stored metadata, can accessed metadata function. colFeatureData function allows users directly access results.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"","code":"colFeatureData(sfe) rowFeatureData(sfe) geometryFeatureData(sfe, type, MARGIN = 2L) reducedDimFeatureData(sfe, dimred)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"sfe SFE object. type geometry, can name (character) index (integer) MARGIN Integer, 1 means rowGeometry, 2 means colGeometry, 3 means annotGeometry. Defaults 2, colGeometry. 
dimred Name dimension reduction, can seen reducedDimNames.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"DataFrame.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/colFeatureData.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get metadata of colData, rowData, and geometries — colFeatureData","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) # Moran's I for colData sfe <- colDataMoransI(sfe, \"nCounts\") colFeatureData(sfe) #> DataFrame with 12 rows and 2 columns #> moran_Vis5A K_Vis5A #> #> barcode NA NA #> col NA NA #> row NA NA #> x NA NA #> y NA NA #> ... ... ... 
#> sample_id NA NA #> nCounts 0.675416 1.67027 #> nGenes NA NA #> prop_mito NA NA #> in_tissue NA NA"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":null,"dir":"Reference","previous_headings":"","what":"Colorblind friendly palette from dittoSeq — ditto_colors","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"Just get palette without install dependencies dittoSeq.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"","code":"ditto_colors"},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"character vector hex colors palette. 40 colors.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/ditto_colors.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Colorblind friendly palette from dittoSeq — ditto_colors","text":"dittoSeq package.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":null,"dir":"Reference","previous_headings":"","what":"Get beginning and end of palette to center a divergent palette — getDivergeRange","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"function longer used internally unnecessary scico divergent palettes. 
can useful using divergent palettes outside scico one must specify beginning end midpoint, override default palette.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"","code":"getDivergeRange(values, diverge_center = 0)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"values Numeric vector colored. diverge_center Value center , defaults 0.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"numeric vector length 2, first element beginning, second end. 
values 0 1.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getDivergeRange.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get beginning and end of palette to center a divergent palette — getDivergeRange","text":"","code":"v <- rnorm(10) getDivergeRange(v, diverge_center = 0) #> [1] 0.1643015 1.0000000"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":null,"dir":"Reference","previous_headings":"","what":"Get parameters used in spatial methods — getParams","title":"Get parameters used in spatial methods — getParams","text":"getParams function allows users access parameters used compute results may stored colFeatureData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get parameters used in spatial methods — getParams","text":"","code":"getParams( sfe, name, local = FALSE, colData = FALSE, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get parameters used in spatial methods — getParams","text":"sfe SpatialFeatureExperiment object. name Name used store results. local Logical, whether results interest come local spatial method. colData Logical, whether results computed column colData(sfe). colGeometryName get results colGeometry. annotGeometryName get results annotGeometry; colGeometry precedence argument ignored colGeometryName specified. reducedDimName Name dimension reduction, can seen reducedDimNames. 
colGeometryName annotGeometryName precedence reducedDimName.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get parameters used in spatial methods — getParams","text":"named list showing parameters","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/getParams.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get parameters used in spatial methods — getParams","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- colDataMoransI(sfe, \"nCounts\") getParams(sfe, \"moran\", colData = TRUE) #> $name #> [1] \"moran\" #> #> $package #> [1] \"spdep\" #> #> $version #> [1] ‘1.3.3’ #> #> $zero.policy #> NULL #> #> $include_self #> [1] FALSE #> #> $graph_params #> $graph_params$FUN #> [1] \"findVisiumGraph\" #> #> $graph_params$package #> $graph_params$package[[1]] #> [1] \"SpatialFeatureExperiment\" #> #> $graph_params$package[[2]] #> [1] ‘1.3.0’ #> #> #> $graph_params$args #> $graph_params$args$style #> [1] \"W\" #> #> $graph_params$args$zero.policy #> NULL #> #> $graph_params$args$sample_id #> [1] \"Vis5A\" #> #> #>"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":null,"dir":"Reference","previous_headings":"","what":"List all spatial methods in Voyager package — listSFEMethods","title":"List all spatial methods in Voyager package — listSFEMethods","text":"package ships many spatial statistics methods SFEMethod objects. user can adapt uniform user interface package spatial methods creating new SFEMethod objects. 
function lists names methods within Voyager, use type argument calculateUnivariate, calculateBivariate, calculateMultivariate.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"List all spatial methods in Voyager package — listSFEMethods","text":"","code":"listSFEMethods(variate = c(\"uni\", \"bi\", \"multi\"), scope = c(\"global\", \"local\"))"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"List all spatial methods in Voyager package — listSFEMethods","text":"variate Uni-, bi-, multi-variate. scope whether local global.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"List all spatial methods in Voyager package — listSFEMethods","text":"data frame column name another brief description.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listSFEMethods.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"List all spatial methods in Voyager package — listSFEMethods","text":"","code":"listSFEMethods(\"uni\", \"local\") #> name description #> 1 localmoran Local Moran's I #> 2 localmoran_perm Local Moran's I permutation testing #> 3 localC Local Geary's C #> 4 localC_perm Local Geary's C permutation testing #> 5 localG Getis-Ord Gi(*) #> 6 localG_perm Getis-Ord Gi(*) with permutation testing #> 7 LOSH Local spatial heteroscedasticity #> 8 LOSH.mc Local spatial heteroscedasticity permutation testing #> 9 LOSH.cs Local spatial heteroscedasticity Chi-square test #> 10 moran.plot Moran scatter plot"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert listw 
into sparse adjacency matrix — listw2sparse","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"Edge weights used adjacency matrix. Most elements matrix 0, using sparse matrix greatly reduces memory use.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"","code":"listw2sparse(listw)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"listw listw object spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"sparse dgCMatrix, whose row represents cell spot whose columns represent neighbors. matrix symmetric. region.id present listw object, row column names output matrix.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/listw2sparse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert listw into sparse adjacency matrix — listw2sparse","text":"","code":"library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache g <- findVisiumGraph(sfe) mat <- listw2sparse(g)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"Values Moran's I can take depends spatial neighborhood graph. 
bounds Moran's I given graph, C, given minimum maximum eigenvalues double centered -- i.e. subtracting column means row means -- adjacency matrix \((I - \mathbb{11}^T/n)C(I - \mathbb{11}^T/n)\), \(\mathbb 1\) vector 1's. implementation follows implementation adespatial uses RSpectra package quickly find minimum maximum eigenvalues without performing unnecessary work find full spectrum done base R's eigen.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"","code":"moranBounds(listw)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"listw listw object spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"numeric vector minimum maximum Moran's I given spatial neighborhood graph.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"double centering, adjacency matrix no longer sparse, function can take lot memory larger datasets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"de Jong, P., Sprenger, C., & van Veen, F. (1984). 
extreme values Moran's Geary's C. Geographical Analysis, 16(1), 17-24.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranBounds.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute the bounds of Moran's I given spatial neighborhood graph — moranBounds","text":"","code":"# example code library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache g <- findVisiumGraph(sfe) moranBounds(g) #> Imin Imax #> -0.5825787 0.9725069"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":null,"dir":"Reference","previous_headings":"","what":"Use ggplot to plot the moran.plot results — moranPlot","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"function uses ggplot2 plot Moran plot. plot aesthetically pleasing base R version implemented spdep. addition, contours plotted show point density plot, points can colored variable, clusters. contours may also filled influential points plotted. filled, viridis E option used.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"","code":"moranPlot( sfe, feature, graphName = 1L, sample_id = \"all\", contour_color = \"cyan\", color_by = NULL, colGeometryName = NULL, annotGeometryName = NULL, plot_singletons = TRUE, binned = FALSE, filled = FALSE, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, bins = 100, binwidth = NULL, hex = FALSE, plot_influential = TRUE, bins_contour = NULL, name = \"moran.plot\", ... 
)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"sfe SpatialFeatureExperiment object. feature Name one variable show plot. converted sentence case x axis lower case y axis appended \"Spatially lagged\". One feature time since colors color_by may specific feature (e.g. clusterMoranPlot). graphName Name colGraph annotGraph, spatial neighborhood graph used compute Moran plot. determine points singletons plot differently plot. sample_id One sample_id sample whose graph plot. contour_color Color point density contours, can changed contours stand points. color_by Variable color points . can name column colData, gene, name column colGeometry specified colGeometryName. can vector length number cells/spots sample_id interest. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. plot_singletons Logical, whether plot items spatial neighbors. binned Logical, whether plot 2D histograms. argument precedence filled. filled Logical, whether plot filled contours non-influential points plot influential points points. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. binwidth Width bins, passed geom_bin2d geom_hex. hex Logical, whether use geom_hex. 
Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. plot_influential Logical, whether plot influential points different palette binned = TRUE. bins_contour Number bins point density contour. Use smaller number make sparser contours. name Name Moran plot results stored. default \"moran.plot\". ... arguments pass geom_density2d.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/moranPlot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Use ggplot to plot the moran.plot results — moranPlot","text":"","code":"library(SpatialFeatureExperiment) library(SingleCellExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData(\"full\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[, colData(sfe)$in_tissue] sfe <- logNormCounts(sfe) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- runUnivariate(sfe, type = \"moran.plot\", features = \"Myh1\", swap_rownames = \"symbol\") clust <- clusterMoranPlot(sfe, \"Myh1\", BLUSPARAM = KmeansParam(2), swap_rownames = \"symbol\") moranPlot(sfe, \"Myh1\", graphName = \"visium\", color_by = clust[, 1], swap_rownames = \"symbol\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"sample SFE object separate spatial neighborhood graph. 
Spatial analyses performed jointly multiple samples require combined spatial neighborhood graph different samples, different samples disconnected components graph. combined adjacency matrix can used MULTISPATI PCA.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"","code":"multi_listw2sparse(listws)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"listws list listw objects.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"sparse dgCMatrix combined spatial neighborhood graph, original spatial neighborhood graphs samples diagonal. input SFE object, rows columns match column names SFE object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multi_listw2sparse.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert multiple listw graphs into a single sparse adjacency matrix — multi_listw2sparse","text":"","code":"# example code"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":null,"dir":"Reference","previous_headings":"","what":"A faster implementation of MULTISPATI PCA — multispati_rsp","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"implementation uses RSpectra package efficiently compute small subset eigenvalues eigenvectors, small subset typically used. 
Hence much faster memory efficient original implementation adespatial. However, implementation support row column weighting standard ones PCA, adespatial implementation general.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"","code":"multispati_rsp(x, listw, nfposi = 30L, nfnega = 30L, scale = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"x matrix whose columns features rows cells. listw listw object, spatial neighborhood graph cells x. length must equal number rows x. nfposi Number positive eigenvalues eigenvectors compute. nfnega Number negative eigenvalues eigenvectors compute. indicate negative spatial autocorrelation. scale Logical, whether scale data.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"matrix cell embeddings spatial PC, attribute loading eigenvectors gene loadings, attribute eig eigenvalues.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"Eigen decomposition fail feature variance zero leading NaN scaled matrix.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"Dray, S., Said, S. Debias, F. 
(2008) Spatial ordination vegetation data using generalization Wartenberg's multivariate spatial correlation. Journal vegetation science, 19, 45-56.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/multispati_rsp.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A faster implementation of MULTISPATI PCA — multispati_rsp","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) inds <- order(rowSums(logcounts(sfe)), decreasing = TRUE)[1:50] mat <- logcounts(sfe)[inds,] g <- findVisiumGraph(sfe) out <- multispati_rsp(t(mat), listw = g, nfposi = 10, nfnega = 10)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cell density as 2D histogram — plotCellBin2D","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"function plots cell density histological space 2D histograms, especially helpful larger smFISH-based datasets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"","code":"plotCellBin2D( sfe, sample_id = \"all\", bins = 200, binwidth = NULL, hex = FALSE, ncol = NULL, bbox = NULL )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"sfe SpatialFeatureExperiment object. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. bins Number bins. Can vector length 2 specify x y axes separately. 
binwidth Width bins, passed geom_bin2d geom_hex. hex Logical, whether use hexagonal bins. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCellBin2D.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cell density as 2D histogram — plotCellBin2D","text":"","code":"library(SFEData) sfe <- HeNSCLCData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache plotCellBin2D(sfe)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"function recommended instead plotColDataHistogram coloring multiple categories log transforming y axis, causes problems stacked 
histograms.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"","code":"plotColDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = \"free\", ncol = 1, position = \"identity\" ) plotRowDataFreqpoly( sce, feature, color_by = NULL, subset = NULL, bins = 100, binwidth = NULL, linewidth = 1.2, scales = \"free\", ncol = 1, position = \"identity\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"sce SingleCellExperiment object. feature Names columns colData rowData plot. multiple features specified, plotted separate facets. color_by Name categorical column colData rowData color polygons. subset Name logical column plot subset data. bins Number bins. Overridden binwidth. Defaults 100. binwidth width bins. Can specified numeric value function calculates width unscaled x. , \"unscaled x\" refers original x values data, application scale transformation. specifying function along grouping structure, function called per group. default use number bins bins, covering range data. always override value, exploring multiple widths find best illustrate stories data. bin width date variable number days time; bin width time variable number seconds. linewidth Line width polygons, defaults thicker 1.2. scales scales fixed (\"fixed\"), free (\"free\", default), free one dimension (\"free_x\", \"free_y\")? ncol Number columns facetting. position Position adjustment, either string naming adjustment (e.g. \"jitter\" use position_jitter), result call position adjustment function. 
Use latter need change settings adjustment.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataFreqpoly.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot frequency polygons for colData and rowData columns — plotColDataFreqpoly","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache plotColDataFreqpoly(sfe, c(\"nCounts\", \"nGenes\"), color_by = \"in_tissue\", bins = 50) plotColDataFreqpoly(sfe, \"nCounts\", subset = \"in_tissue\") sfe2 <- sfe[, sfe$in_tissue] plotColDataFreqpoly(sfe2, c(\"nCounts\", \"nGenes\"), bins = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot histograms for colData and rowData columns — plotColDataHistogram","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"Plot histograms colData rowData columns","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"","code":"plotColDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = \"free\", ncol = 1, position = \"stack\", ... ) plotRowDataHistogram( sce, feature, fill_by = NULL, facet_by = NULL, subset = NULL, bins = 100, binwidth = NULL, scales = \"free\", ncol = 1, position = \"stack\", ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"sce SingleCellExperiment object. 
feature Names columns colData rowData plot. multiple features specified, plotted separate facets. fill_by Name categorical column colData rowData fill histogram. facet_by Column colData rowData facet . multiple features plotted, features different facets. case, setting facet_by call facet_grid features rows categories facet_by columns. subset Name logical column plot subset data. bins Numeric vector giving number bins vertical horizontal directions. Set 100 default. binwidth width bins. Can specified numeric value function calculates width unscaled x. , \"unscaled x\" refers original x values data, application scale transformation. specifying function along grouping structure, function called per group. default use number bins bins, covering range data. always override value, exploring multiple widths find best illustrate stories data. bin width date variable number days time; bin width time variable number seconds. scales scales fixed (\"fixed\"), free (\"free\", default), free one dimension (\"free_x\", \"free_y\")? ncol Number columns facetting. position Position adjustment, either string naming adjustment (e.g. \"jitter\" use position_jitter), result call position adjustment function. Use latter need change settings adjustment. ... arguments passed layer(). often aesthetics, used set aesthetic fixed value, like colour = \"red\" size = 3. 
may also parameters paired geom/stat.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"ggplot object","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColDataHistogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot histograms for colData and rowData columns — plotColDataHistogram","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache plotColDataHistogram(sfe, c(\"nCounts\", \"nGenes\"), fill_by = \"in_tissue\", bins = 50, position = \"stack\") plotColDataHistogram(sfe, \"nCounts\", subset = \"in_tissue\") sfe2 <- sfe[, sfe$in_tissue] plotColDataHistogram(sfe2, c(\"nCounts\", \"nGenes\"), bins = 50)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot spatial graphs — plotColGraph","title":"Plot spatial graphs — plotColGraph","text":"ggplot version spdep::plot.nb, reducing boilerplate SFE objects.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot spatial graphs — plotColGraph","text":"","code":"plotColGraph( sfe, colGraphName = 1L, colGeometryName = 1L, sample_id = \"all\", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL ) plotAnnotGraph( sfe, annotGraphName = 1L, annotGeometryName = 1L, sample_id = \"all\", weights = FALSE, segment_size = 0.5, geometry_size = 0.5, ncol = NULL, bbox = NULL 
)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot spatial graphs — plotColGraph","text":"sfe SpatialFeatureExperiment object. colGraphName Name graph associated columns gene count matrix plotted. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. weights Whether plot weights. TRUE, transparency (alpha) segments represent edge weights. segment_size Thickness segments represent graph edges. geometry_size Point size (POINT geometries) line thickness (LINESTRING POLYGON) plot geometry background. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. annotGraphName Name annotation graph plot. 
annotGeometryName Name annotGeometry, associated graph specified annotGraphName, spatial coordinates graph nodes context.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot spatial graphs — plotColGraph","text":"ggplot2 object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotColGraph.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot spatial graphs — plotColGraph","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(sf) #> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) plotColGraph(sfe, colGraphName = \"visium\", colGeometryName = \"spotPoly\") # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, \"myofiber_simplified\") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, \"myofiber_simplified\") <- ag annotGraph(sfe, \"myofibers\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3, method = \"tri2nb\", dist_type = \"idw\" ) plotAnnotGraph(sfe, annotGraphName = \"myofibers\", annotGeometryName = \"myofiber_simplified\", weights = TRUE )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot correlogram — plotCorrelogram","title":"Plot correlogram — plotCorrelogram","text":"Use ggplot2 plot correlograms computed runUnivariate, pulling results rowData. Correlograms multiple genes error bars can plotted, can colored numeric categorical column rowData vector length nrow SFE object. coloring useful correlograms clustered show types length scales patterns decay spatial autocorrelation. 
method = \"\", error bars twice standard deviation estimated Moran's value.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot correlogram — plotCorrelogram","text":"","code":"plotCorrelogram( sfe, features, sample_id = \"all\", method = \"I\", color_by = NULL, facet_by = c(\"sample_id\", \"features\"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, plot_signif = TRUE, p_adj_method = \"BH\", divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = \"sp.correlogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot correlogram — plotCorrelogram","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. method \"corr\" correlation, \"\" Moran's , \"C\" Geary's C color_by Name column rowData(sfe) featureData colData (see colFeatureData), colGeometry, annotGeometry color correlogram feature. Alternatively, vector length features, data frame clusterCorrelograms. facet_by Whether facet sample_id (default) features. facetting sample_id, different features plotted facet comparison. facetting features, different samples compared feature. Ignored one sample specified. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. 
reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. plot_signif Logical, whether plot significance symbols: p < 0.001: ***, p < 0.01: **, p < 0.05 *, p < 0.1: ., otherwise symbol. p-values two sided, based assumption estimated Moran's normally distributed mean randomized version data. mean variance come moran.test Moran's geary.test Geary's C. Take results grain salt data normally distributed. p_adj_method Multiple testing correction method p.adjust, correct multiple testing (number lags times number features) Moran's estimates plot_signif = TRUE. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot correlogram — plotCorrelogram","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCorrelogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot correlogram — plotCorrelogram","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(bluster) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) inds <- c(1, 3, 4, 5) features <- rownames(sfe)[inds] sfe <- runUnivariate(sfe, type = \"sp.correlogram\", features = features, exprs_values = \"counts\", order = 5 ) clust <- clusterCorrelograms(sfe, features = features, BLUSPARAM = KmeansParam(2) ) # Color by 
features plotCorrelogram(sfe, features) # Color by something else plotCorrelogram(sfe, features, color_by = clust$cluster) # Facet by features plotCorrelogram(sfe, features, facet_by = \"features\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cross variogram — plotCrossVariogram","title":"Plot cross variogram — plotCrossVariogram","text":"Equivalent gstat::plot.gstatVariogram, using ggplot2 customizable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cross variogram — plotCrossVariogram","text":"","code":"plotCrossVariogram(res, show_np = TRUE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cross variogram — plotCrossVariogram","text":"res Cross variogram results one sample, calculateBivariate. Global bivariate results stored SFE object. show_np Logical, whether show number pairs cells distance bin.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cross variogram — plotCrossVariogram","text":"ggplot object. 
Unfortunately figured way collect facet labels top entire plot.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cross variogram — plotCrossVariogram","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = \"cross_variogram\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\") plotCrossVariogram(res)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot cross variogram map — plotCrossVariogramMap","title":"Plot cross variogram map — plotCrossVariogramMap","text":"Equivalent gstat::plot.gstatVariogram, using ggplot2 customizable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot cross variogram map — plotCrossVariogramMap","text":"","code":"plotCrossVariogramMap(res, plot_np = FALSE)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot cross variogram map — plotCrossVariogramMap","text":"res Cross variogram results one sample, calculateBivariate. Global bivariate results stored SFE object. 
plot_np Logical, whether plot number pairs distance bin instead variance.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot cross variogram map — plotCrossVariogramMap","text":"ggplot object.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotCrossVariogramMap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot cross variogram map — plotCrossVariogramMap","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] sfe <- logNormCounts(sfe) res <- calculateBivariate(sfe, type = \"cross_variogram_map\", feature1 = c(\"Myh1\", \"Myh2\", \"Csrp3\"), swap_rownames = \"symbol\", width = 500, cutoff = 2000) plotCrossVariogramMap(res)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot top PC loadings of genes — plotDimLoadings","title":"Plot top PC loadings of genes — plotDimLoadings","text":"Just like Seurat's VizDimLoadings function. found equivalent SCE find useful. trying reproduce Seurat function exactly. instance, like Seurat imposes ggplot theme, like cowplot theme. 
Maybe rewrite base R now using Tidyverse.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot top PC loadings of genes — plotDimLoadings","text":"","code":"plotDimLoadings( sce, dims = 1:4, nfeatures = 10, swap_rownames = NULL, reduction = \"PCA\", balanced = TRUE, ncol = 2, sample_id = \"all\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot top PC loadings of genes — plotDimLoadings","text":"sce SingleCellExperiment object, anything inherits SingleCellExperiment. dims Numeric vector specifying PCs plot. MULTISPATI, PCs negative eigenvalues right columns embedding loading matrices. See ElbowPlot. nfeatures Number genes plot. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. reduction Name dimension reduction use. must attribute called either \"percentVar\" \"eig\" eigenvalues. Defaults \"PCA\". balanced Return equal number genes + - scores. FALSE, returns top genes ranked scores absolute values. ncol Number columns facetted plot. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot top PC loadings of genes — plotDimLoadings","text":"ggplot object. 
Loadings different PCs plotted different facets one ggplot object returned.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotDimLoadings.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot top PC loadings of genes — plotDimLoadings","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- runPCA(sfe, ncomponents = 10, exprs_values = \"counts\") plotDimLoadings(sfe, dims = 1:2)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot geometries without coloring — plotGeometry","title":"Plot geometries without coloring — plotGeometry","text":"Different samples plotted separate facets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot geometries without coloring — plotGeometry","text":"","code":"plotGeometry( sfe, type, MARGIN = 2L, sample_id = \"all\", ncol = NULL, bbox = NULL, image_id = NULL, maxcell = 5e+05 )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot geometries without coloring — plotGeometry","text":"sfe SpatialFeatureExperiment object. type Name geometry associated MARGIN interest compute graph. MARGIN Just like apply, 1 stands row, 2 stands column. , addition, 3 stands annotation, query annotGeometries, nuclei segmentation Visium data sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. 
bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot geometries without coloring — plotGeometry","text":"ggplot object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotGeometry.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot geometries without coloring — plotGeometry","text":"","code":"library(SFEData) sfe1 <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe2 <- McKellarMuscleData(\"small2\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> downloading 1 resources #> retrieving 1 resource #> loading from cache sfe <- cbind(sfe1, sfe2) sfe <- removeEmptySpace(sfe) plotGeometry(sfe, \"spotPoly\") plotGeometry(sfe, \"myofiber_simplified\", MARGIN = 3)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot local results — plotLocalResult","title":"Plot local results — plotLocalResult","text":"Plot results local spatial analyses space, local Getis-Ord Gi* 
values.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot local results — plotLocalResult","text":"","code":"plotLocalResult( sfe, name, features, attribute = NULL, sample_id = \"all\", colGeometryName = NULL, annotGeometryName = NULL, ncol = NULL, ncol_sample = NULL, annot_aes = list(), annot_fixed = list(), bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = \"black\", fill = \"gray80\", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, type = name, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot local results — plotLocalResult","text":"sfe SpatialFeatureExperiment object. name local spatial results. Use localResultNames see types results already calculated. features Character vector vectors. see features results given type, see localResultFeatures. attribute field local results type features. result feature vector, argument ignored. result data frame matrix, column name result, \"Ii\" local Moran's . local spatial analysis method, default attribute. See Details. Use localResultAttrs. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. ncol Number columns plotting multiple features. 
Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. 
linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. type SFEMethod object string corresponding name one objects environment. localResult interest manually added outside runUnivariate runBivariate, method recorded, type argument can used specify method properly get title labels. default, argument set argument name. method parameters recorded, type argument ignored. ... 
arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot local results — plotLocalResult","text":"ggplot2 object plotting one feature. patchwork object plotting multiple features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Plot local results — plotLocalResult","text":"Many local spatial analyses return data frame matrix results, whose columns can statistic interest location, variance, expected value permutation, p-value, etc. attribute argument specifies column use multiple columns. defaults local method supported package mean: localmoran localmoran_perm Ii, local Moran's statistic location. localC_perm localC, local Geary C statistic location. localG localG_perm localG, local Getis-Ord Gi Gi* statistic. include_self = TRUE calculateUnivariate runUnivariate called, Gi*. Otherwise Gi. LOSH LOSH.mc Hi, local spatial heteroscedasticity moran.plot wx, average value neighbor location. Moran plot best plotted scatter plot wx vs x. See moranPlot. local methods listed return vectors results. instance, localC returns vector default, local Geary's C statistic.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Plot local results — plotLocalResult","text":"function shares internals plotSpatialFeature, important differences. plotSpatialFeature, annotGeometry indeed used annotation protagonist colGeometry, since easy directly use ggplot2 plot data annotGeometry sf data frames overlaying annotGeometry colGeometry involves complicated code. contrast, function, local results annotGeometry can plotted separately without anything related colGeometry. 
Note annotGeometry local results plotted without colGeometry, annot_* arguments ignored. Use arguments aesthetics colGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotLocalResult.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot local results — plotLocalResult","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- sfe[,sfe$in_tissue] colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) feature_use <- rownames(sfe)[1] sfe <- logNormCounts(sfe) sfe <- runUnivariate(sfe, \"localmoran\", feature_use) # Which types of results are available? localResultNames(sfe) #> [1] \"localmoran\" # Which features for localmoran? localResultFeatures(sfe, \"localmoran\") #> [1] \"ENSMUSG00000025902\" # Which columns does the localmoran results have? localResultAttrs(sfe, \"localmoran\", feature_use) #> [1] \"Ii\" \"E.Ii\" \"Var.Ii\" \"Z.Ii\" #> [5] \"Pr(z != E(Ii))\" \"mean\" \"median\" \"pysal\" #> [9] \"-log10p\" \"-log10p_adj\" plotLocalResult(sfe, \"localmoran\", feature_use, \"Ii\", colGeometryName = \"spotPoly\" ) # For annotGeometry # Make sure it's type POLYGON annotGeometry(sfe, \"myofiber_simplified\") <- sf::st_buffer(annotGeometry(sfe, \"myofiber_simplified\"), 0) annotGraph(sfe, \"poly2nb_myo\") <- findSpatialNeighbors(sfe, type = \"myofiber_simplified\", MARGIN = 3, method = \"poly2nb\", zero.policy = TRUE ) sfe <- annotGeometryUnivariate(sfe, \"localmoran\", features = \"area\", annotGraphName = \"poly2nb_myo\", annotGeometryName = \"myofiber_simplified\", zero.policy = TRUE ) plotLocalResult(sfe, \"localmoran\", \"area\", \"Ii\", annotGeometryName = \"myofiber_simplified\", size = 0.3, color = \"cyan\" ) plotLocalResult(sfe, \"localmoran\", \"area\", \"Z.Ii\", annotGeometryName = \"myofiber_simplified\" ) # don't use annot_* 
arguments when annotGeometry is plotted without colGeometry"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot Moran/Geary Monte Carlo results — plotMoranMC","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"Plot simulations density plot histogram compared observed Moran's Geary's C, ggplot2 looks nicer. Unlike plotting function spdep, function can also plot feature different samples facets plot different features samples together comparison.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"","code":"plotMoranMC( sfe, features, sample_id = \"all\", facet_by = c(\"sample_id\", \"features\"), ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, ptype = c(\"density\", \"histogram\", \"freqpoly\"), swap_rownames = NULL, name = \"moran.mc\", ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. facet_by Whether facet sample_id (default) features. facetting sample_id, different features plotted facet comparison. facetting features, different samples compared feature. Ignored one sample specified. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. 
Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. ptype Plot type, one \"density\", \"histogram\", \"freqpoly\". swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name Monte Carlo results stored, defaults \"moran.mc\". Geary's C Monte Carlo, default \"geary.mc\". ... arguments passed geom_density, geom_histogram, geom_freqpoly, depending ptype.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"ggplot2 object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotMoranMC.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot Moran/Geary Monte Carlo results — plotMoranMC","text":"","code":"library(SpatialFeatureExperiment) library(SFEData) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache colGraph(sfe, \"visium\") <- findVisiumGraph(sfe) sfe <- colDataUnivariate(sfe, type = \"moran.mc\", \"nCounts\", nsim = 100) plotMoranMC(sfe, \"nCounts\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot gene expression in space — plotSpatialFeature","title":"Plot gene expression in space — plotSpatialFeature","text":"Unlike Seurat ggspavis, plotting functions package uses geom_sf whenever 
applicable.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot gene expression in space — plotSpatialFeature","text":"","code":"plotSpatialFeature( sfe, features, colGeometryName = 1L, sample_id = \"all\", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, annot_aes = list(), annot_fixed = list(), exprs_values = \"logcounts\", bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NA, annot_divergent = FALSE, annot_diverge_center = NA, size = 0.5, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = \"black\", fill = \"gray80\", swap_rownames = NULL, scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot gene expression in space — plotSpatialFeature","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. 
annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. exprs_values Integer scalar string indicating assay x contains expression values. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. 
color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. ... arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot gene expression in space — plotSpatialFeature","text":"ggplot2 object plotting one feature. 
patchwork object plotting multiple features.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Plot gene expression in space — plotSpatialFeature","text":"documentation function, \"feature\" can gene (whatever entity corresponds rows gene count matrix), column colData, column colGeometry sf data frame specified colGeometryName argument. light theme, continuous variables, Blues palette colorbrewer used divergent = FALSE, roma palette scico package divergent = TRUE. dark theme, nuuk palette scico used divergent = FALSE, berlin palette scico used divergent = TRUE. discrete variables, dittoSeq palette used. annotation, YlOrRd colorbrewer palette used continuous variables light theme. dark theme, acton palette scico used divergent = FALSE vanimo palette scico used divergent = TRUE. end dittoSeq palette used discrete variables. individual palette colorblind friendly, plotting continuous variables coloring colGeometry annotGeometry simultaneously, combination two palettes guaranteed colorblind friendly. addition, plotting image behind geometries, colors image may distort color perception values geometries. theme_void used spatial plots package, units spatial coordinates often arbitrary. 
can overridden show axes using different theme normally done ggplot2.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotSpatialFeature.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot gene expression in space — plotSpatialFeature","text":"","code":"library(SFEData) library(sf) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache # features can be genes or colData or colGeometry columns plotSpatialFeature(sfe, c(\"nCounts\", rownames(sfe)[1]), exprs_values = \"counts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\" ) # Change fixed aesthetics plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"tissueBoundary\", annot_fixed = list(color = \"blue\", size = 0.3, fill = NA), alpha = 0.7 ) # Make the myofiber segmentations a valid POLYGON geometry ag <- annotGeometry(sfe, \"myofiber_simplified\") ag <- st_buffer(ag, 0) ag <- ag[!st_is_empty(ag), ] annotGeometry(sfe, \"myofiber_simplified\") <- ag # Also plot an annotGeometry variable plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", annot_aes = list(fill = \"area\") ) # Use a bounding box to zoom in bbox <- c(xmin = 5500, ymin = 13500, xmax = 6000, ymax = 14000) plotSpatialFeature(sfe, \"nCounts\", colGeometryName = \"spotPoly\", annotGeometryName = \"myofiber_simplified\", bbox = bbox, annot_fixed = list(linewidth = 0.3))"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot variogram — plotVariogram","title":"Plot variogram — plotVariogram","text":"function plots variogram feature fitted variogram models, showing nugget, range, sill model. 
Unlike plotting functions package automap uses lattice, function uses ggplot2 make prettier customizable plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot variogram — plotVariogram","text":"","code":"plotVariogram( sfe, features, sample_id = \"all\", color_by = NULL, group = c(\"none\", \"sample_id\", \"features\", \"angles\"), use_lty = TRUE, show_np = TRUE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, divergent = FALSE, diverge_center = NULL, swap_rownames = NULL, name = \"variogram\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot variogram — plotVariogram","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. color_by Name column rowData(sfe) featureData colData (see colFeatureData), colGeometry, annotGeometry color correlogram feature. Alternatively, vector length features, data frame clusterCorrelograms. group samples, features, angles show facet comparison multiple. Default \"none\", meaning facet contain one variogram. grouping multiple variograms facet, text model, nugget, sill, range variograms shown. use_lty Logical, whether use linetype point shape distinguish different features samples facet. FALSE, different features samples distinguished patterns shown . show_np Logical, whether show number pairs cells distance bin. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. 
Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name variogram results stored, default \"variogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot variogram — plotVariogram","text":"ggplot object. empirical variogram distance bin plotted points, fitted variogram model plotted line feature. number next point number pairs cells distance bin.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogram.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot variogram — plotVariogram","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- colDataUnivariate(sfe, \"variogram\", features = \"nCounts\", model = \"Sph\") plotVariogram(sfe, \"nCounts\") # Anisotropy, will get a message sfe <- colDataUnivariate(sfe, \"variogram\", features = \"nCounts\", model = \"Sph\", alpha = c(30, 90, 150), name = \"variogram_anis\") #> gstat does not fit anisotropic variograms. Variogram model is fitted to the whole dataset. 
# Facet by angles by default plotVariogram(sfe, \"nCounts\", name = \"variogram_anis\") # Plot angles with different colors plotVariogram(sfe, \"nCounts\", group = \"angles\", name = \"variogram_anis\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot variogram maps — plotVariogramMap","title":"Plot variogram maps — plotVariogramMap","text":"Plot variogram maps show variogram directions grid distances x y coordinates.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot variogram maps — plotVariogramMap","text":"","code":"plotVariogramMap( sfe, features, sample_id = \"all\", plot_np = FALSE, ncol = NULL, colGeometryName = NULL, annotGeometryName = NULL, reducedDimName = NULL, swap_rownames = NULL, name = \"variogram_map\" )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot variogram maps — plotVariogramMap","text":"sfe SpatialFeatureExperiment object. features Features plot, must rownames gene count matrix, colnames colData colGeometry, colnames cell embeddings reducedDim, numeric indices dimension reduction components. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. plot_np Logical, whether plot number pairs distance bin instead variance. ncol Number columns facetting. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. reducedDimName Name dimension reduction, can seen reducedDimNames. colGeometryName annotGeometryName precedence reducedDimName. 
swap_rownames Column name rowData(object) used identify features instead rownames(object) labeling plot elements. found rowData, rownames gene count matrix used. name Name correlogram results stored, default \"sp.correlogram\".","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot variogram maps — plotVariogramMap","text":"ggplot object.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/plotVariogramMap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot variogram maps — plotVariogramMap","text":"","code":"library(SFEData) sfe <- McKellarMuscleData() #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- colDataUnivariate(sfe, \"variogram_map\", features = \"nCounts\", width = 500, cutoff = 5000) plotVariogramMap(sfe, \"nCounts\")"},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot dimension reduction components in space — spatialReducedDim","title":"Plot dimension reduction components in space — spatialReducedDim","text":"plotting value projection gene expression cell principal component space. 
present, function work 3D array geographically weighted PCA (GWPCA), future version deal GWPCA results.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot dimension reduction components in space — spatialReducedDim","text":"","code":"spatialReducedDim( sfe, dimred, ncomponents = NULL, components = ncomponents, colGeometryName = 1L, sample_id = \"all\", ncol = NULL, ncol_sample = NULL, annotGeometryName = NULL, annot_aes = list(), annot_fixed = list(), exprs_values = \"logcounts\", bbox = NULL, image_id = NULL, maxcell = 5e+05, aes_use = c(\"fill\", \"color\", \"shape\", \"linetype\"), divergent = FALSE, diverge_center = NULL, annot_divergent = FALSE, annot_diverge_center = NULL, size = 0, shape = 16, linewidth = 0, linetype = 1, alpha = 1, color = NA, fill = \"gray80\", scattermore = FALSE, pointsize = 0, bins = NULL, summary_fun = sum, hex = FALSE, dark = FALSE, ... )"},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot dimension reduction components in space — spatialReducedDim","text":"sfe SpatialFeatureExperiment object. dimred string integer scalar indicating reduced dimension result reducedDims(sfe) plot. ncomponents numeric scalar indicating number dimensions plot, starting first dimension. Alternatively, numeric vector specifying dimensions plotted. components numeric scalar vector specifying dimensions plotted. Use instead ncomponents plotting one dimension. colGeometryName Name colGeometry sf data frame whose numeric columns interest used compute metric. Use colGeometryNames look names sf data frames associated cells/spots. sample_id Sample(s) SFE object whose cells/spots use. Can \"\" compute metric samples; metric computed separately sample. ncol Number columns plotting multiple features. 
Defaults NULL, means using logic facet_wrap, used patchwork's wrap_plots default. ncol_sample plotting multiple samples facets, many columns facets. distinct ncols, multiple features. plotting multiple features multiple samples, result multi-panel plot panel plot feature facetted samples. annotGeometryName Name annotGeometry SFE object, annotate gene expression plot. annot_aes named list plotting parameters annotation sf data frame. names geom (ggplot2, color fill), values column names annotation sf data frame. Tidyeval supported. annot_fixed Similar annot_aes, fixed aesthetic settings, color = \"gray\". defaults relevant defaults function. exprs_values Integer scalar string indicating assay x contains expression values. bbox bounding box specify smaller region plot, useful dataset large. Can named numeric vector names \"xmin\", \"xmax\", \"ymin\", \"ymax\", order. plotting multiple samples, matrix sample IDs column names \"xmin\", \"ymin\", \"xmax\", \"ymax\" row names. multiple samples plotted bbox vector rather matrix, bounding box used samples. may see points edge geometries intersection bounding box geometry happens point . NULL, entire tissue plotted. image_id ID image plot behind geometries. NULL, plotting images. Use imgData see image IDs present. maxcell Maximum number pixels plot image. image larger, resampled less number pixels save memory faster plotting. recommend reducing number plotting multiple facets. aes_use Aesthetic use discrete variables. continuous variables, always \"fill\" polygons point shapes 21-25. discrete variables, can fill, color, shape, linetype, whenever applicable. specified value changed applicable equivalent. example, geometry point \"linetype\" specified, \"shaped\" used instead. divergent Logical, whether divergent palette used. diverge_center divergent = TRUE, center palette diverge. NULL, centering. annot_divergent Just divergent, annotGeometry case different. 
annot_diverge_center Just diverge_center, annotGeometry case different. size Fixed size points. points defaults 0.5. Ignored size_by specified. shape Fixed shape points, ignored shape_by specified applicable. linewidth Width lines, including outlines polygons. polygons, defaults 0, meaning outlines. linetype Fixed line type, ignored linetype_by specified applicable. alpha Transparency. color Fixed color colGeometry color_by specified applicable, annotGeometry annot_color_by specified applicable. fill Similar color, fill. scattermore Logical, whether use scattermore package greatly speed plotting numerous points. used POINT colGeometries. geometry POINT, centroids used. Recommended plotting hundreds thousands cells cell polygons seen plotted due large number cells small plot size plotting multiple panels multiple features. pointsize Radius rasterized point scattermore. Default 0 single pixels (fastest). bins binning colGeometry space due large number cells spots, number bins, passed geom_bin2d geom_hex. NULL (default), colGeometry plotted without binning. binning, point geometry recommended. geometry point, centroids used. summary_fun Function summarize feature value colGeometry binned. hex Logical, whether use geom_hex. Note geom_hex broken ggplot2 version 3.4.0. Please update ggplot2 getting horizontal stripes hex = TRUE. dark Logical, whether use dark theme. using dark theme, palette lighter color represent higher values glowing dark. intended plotting gene expression top fluorescent images. ... arguments passed wrap_plots.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot dimension reduction components in space — spatialReducedDim","text":"plotSpatialFeature. ggplot2 object plotting one component. 
patchwork object plotting multiple components.","code":""},{"path":[]},{"path":"https://pachterlab.github.io/voyager/dev/reference/spatialReducedDim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot dimension reduction components in space — spatialReducedDim","text":"","code":"library(SFEData) library(scater) sfe <- McKellarMuscleData(\"small\") #> see ?SFEData and browseVignettes('SFEData') for documentation #> loading from cache sfe <- logNormCounts(sfe) sfe <- runPCA(sfe, ncomponents = 2) spatialReducedDim(sfe, \"PCA\", ncomponents = 2, \"spotPoly\", annotGeometryName = \"tissueBoundary\", divergent = TRUE, diverge_center = 0 ) # Basically PC1 separates spots not on tissue from those on tissue."},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute variograms — variogram-internal","title":"Compute variograms — variogram-internal","text":"Wrapper automap::autofitVariogram facilitate computing variograms multiple genes SFE objects EDA tool. functions written conform uniform format univariate methods called internally. functions exported, documentation written show users extra arguments use calling calculateUnivariate runUnivariate.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute variograms — variogram-internal","text":"","code":".variogram(x, coords_df, formula = x ~ 1, scale = TRUE, ...) .variogram_bv(x, y, coords_df, scale = TRUE, map = FALSE, ...) .cross_variogram(x, y, coords_df, scale = TRUE, ...) .cross_variogram_map(x, y, coords_df, width, cutoff, scale = TRUE, ...) 
.variogram_map(x, coords_df, formula = x ~ 1, width, cutoff, scale = TRUE, ...)"},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute variograms — variogram-internal","text":"x numeric vector whose variogram computed. coords_df sf data frame geometry regressors variogram modeling. formula formula defining response vector (possible) regressors, case absence regressors, use x ~ 1. scale Logical, whether scale x. Defaults TRUE variogram easier interpret comparable features different magnitudes length scale spatial autocorrelation interest. ... arguments passed automap::autofitVariogram model variogram alpha anisotropy. Note gstat fit anisotropic models get warning specify alpha. Nevertheless, plotting empirical anisotropic variograms comparing variogram fitted entire dataset can useful EDA tool. y bivariate, another numeric vector whose variogram computed. map logical; TRUE, cutoff width given, variogram map returned. requires package sp. 
Alternatively, map can passed, class SpatialDataFrameGrid (see sp docs) width width subsequent distance intervals data point pairs grouped semivariance estimates cutoff spatial separation distance point pairs included semivariance estimates; default, length diagonal box spanning data divided three.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/reference/variogram-internal.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute variograms — variogram-internal","text":"autofitVariogram object.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-131-05152023","dir":"Changelog","previous_headings":"","what":"Version 1.3.1 (05/15/2023)","title":"Version 1.3.1 (05/15/2023)","text":"Removed functions arguments deprecated 1.2.0","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-127-09192023","dir":"Changelog","previous_headings":"","what":"Version 1.2.7 (09/19/2023)","title":"Version 1.2.7 (09/19/2023)","text":"Polygon boundaries show despite linewidth = 0 Windows users. Set color = NA polygons linewidth = 0 default work Windows.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-126-09192023","dir":"Changelog","previous_headings":"","what":"Version 1.2.6 (09/19/2023)","title":"Version 1.2.6 (09/19/2023)","text":"Fixed bug plotColGraph one multiple samples plotted. Allow 16 bit images spatial plotting functions. Removed adespatial Suggests ’s used reference unit tests got removed CRAN.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-125-08182023","dir":"Changelog","previous_headings":"","what":"Version 1.2.5 (08/18/2023)","title":"Version 1.2.5 (08/18/2023)","text":"Use imgRaster getter rather S4 -@image get images plot, latter longer work SFE 1.2.3 wraps SpatRaster images saving RDS. 
Reading RDS won’t unwrap images need unwrapped ’re needed.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-124-07042023","dir":"Changelog","previous_headings":"","what":"Version 1.2.4 (07/04/2023)","title":"Version 1.2.4 (07/04/2023)","text":"Remove useNames = NA warning calling MULTISPATI; warning comes generic colVars. Use algebraic eigenvalues MULTISPATI either nfposi nfnega 0 Added bins_contour argument moranPlot change number bins cell density contours","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-123-05042023","dir":"Changelog","previous_headings":"","what":"Version 1.2.3 (05/04/2023)","title":"Version 1.2.3 (05/04/2023)","text":"Fix bug plotting feature illegal name alongside another feature legal name Make sure runBivariate calculateBivariate use gene symbols results even Ensembl IDs specified swap_rownames set Change secondary sequential palette light theme YlOrRd ’s distinguishable Blues primary palette low values","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-122-04262023","dir":"Changelog","previous_headings":"","what":"Version 1.2.2 (04/26/2023)","title":"Version 1.2.2 (04/26/2023)","text":"minor bugs: runBivariate gets correct feature names feature1 specified swap_rownames used show gene symbol Correct output cross variogram maps one pair genes Added default_attr localmoran_bv’s SFEMethod Don’t plot attribute localResult vector ’s default attr plotting multiple features, panels follow order features specified Allow illegal characters names colData reducedDims plots Plot one component spatialReducedDim components argument Deprecate plotColDataBin2D plotRowDataBin2D","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1112-04222023","dir":"Changelog","previous_headings":"","what":"Version 1.1.12 (04/22/2023)","title":"Version 1.1.12 (04/22/2023)","text":"Plot image behind 
geometries functions plot geometries Added dark theme support functions plot geometries","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1111-04052023","dir":"Changelog","previous_headings":"","what":"Version 1.1.11 (04/05/2023)","title":"Version 1.1.11 (04/05/2023)","text":"Added MULTISPATI PCA Added multivariate local Geary’s C Anselin 2019 Added calculateMultivariate unified user interface multivariate spatial analyses Variogram variogram map gstat related plotting functions Allow non-standard names local results plotLocalResult","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1110-03072023","dir":"Changelog","previous_headings":"","what":"Version 1.1.10 (03/07/2023)","title":"Version 1.1.10 (03/07/2023)","text":"Record parameters used get spatial results Force users use new name running method different parameters","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-119-02122023","dir":"Changelog","previous_headings":"","what":"Version 1.1.9 (02/12/2023)","title":"Version 1.1.9 (02/12/2023)","text":"Deprecated show_symbol argument, replacing swap_rownames consistent scater","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-117","dir":"Changelog","previous_headings":"","what":"Version 1.1.7","title":"Version 1.1.7","text":"Added bbox argument spatial plotting functions zoom bounding box","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-1010-02232023","dir":"Changelog","previous_headings":"","what":"Version 1.0.10 (02/23/2023)","title":"Version 1.0.10 (02/23/2023)","text":"Added plotColDataFreqpoly y axis needs log transformed. 
doesn’t work stacked histograms using position = “identity” causes bars covered.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-109-02032023","dir":"Changelog","previous_headings":"","what":"Version 1.0.9 (02/03/2023)","title":"Version 1.0.9 (02/03/2023)","text":"Fixed bug hardcoded ncol plotDimLoadings.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-108-01262023","dir":"Changelog","previous_headings":"","what":"Version 1.0.8 (01/26/2023)","title":"Version 1.0.8 (01/26/2023)","text":"Flipped divergent palettes warm color means high value.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-107-01112023","dir":"Changelog","previous_headings":"","what":"Version 1.0.7 (01/11/2023)","title":"Version 1.0.7 (01/11/2023)","text":"Fixed bug assigning local results sample colData, colGeometry, annotGeometry.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-105-12022022","dir":"Changelog","previous_headings":"","what":"Version 1.0.5 (12/02/2022)","title":"Version 1.0.5 (12/02/2022)","text":"Removed aes_string(), deprecated. Fixed bug show_symbol = TRUE “symbol” column absent rowData.","code":""},{"path":"https://pachterlab.github.io/voyager/dev/news/index.html","id":"version-100-11022022","dir":"Changelog","previous_headings":"","what":"Version 1.0.0 (11/02/2022)","title":"Version 1.0.0 (11/02/2022)","text":"First version Bioconductor Univariate local global spatial statistics based spdep Plotting functions: gene expression metadata space, results local spatial analyses, plot dimension reductions space, plot correlograms Monte Carlo simulation results","code":""}]