diff --git a/articles/Intro_qsvaR.html b/articles/Intro_qsvaR.html
index 9be159e..f3eb38d 100644
--- a/articles/Intro_qsvaR.html
+++ b/articles/Intro_qsvaR.html
@@ -99,7 +99,7 @@

Leonardo Development
lcolladotor@gmail.com -

26 November 2024

+

27 November 2024

Source: vignettes/Intro_qsvaR.Rmd
@@ -618,9 +618,9 @@

Reproducibility
 library("knitr")
 knit("Intro_qsvaR.Rmd", tangle = TRUE)

Date the vignette was generated.

-
#> [1] "2024-11-26 20:48:38 UTC"
+
#> [1] "2024-11-27 17:02:29 UTC"

Wallclock time spent generating the vignette.

-
#> Time difference of 54.017 secs
+
#> Time difference of 54.052 secs

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 #>  setting  value
@@ -632,7 +632,7 @@ 

Reproducibility
 #> collate en_US.UTF-8
 #> ctype en_US.UTF-8
 #> tz UTC
-#> date 2024-11-26
+#> date 2024-11-27
 #> pandoc 3.1.1 @ /usr/local/bin/ (via rmarkdown)
 #>
 #> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
@@ -711,7 +711,7 @@

Reproducibility
 #> plyr 1.8.9 2023-10-02 [1] RSPM (R 4.3.0)
 #> png 0.1-8 2022-11-29 [1] RSPM (R 4.3.0)
 #> purrr 1.0.2 2023-08-10 [2] RSPM (R 4.3.0)
-#> qsvaR * 1.9.1 2024-11-26 [1] Bioconductor
+#> qsvaR * 1.9.1 2024-11-27 [1] Bioconductor
 #> R6 2.5.1 2021-08-19 [2] RSPM (R 4.3.0)
 #> ragg 1.3.0 2024-03-13 [2] RSPM (R 4.3.0)
 #> Rcpp 1.0.12 2024-01-09 [2] RSPM (R 4.3.0)
diff --git a/index.html b/index.html
index 882f351..792b157 100644
--- a/index.html
+++ b/index.html
@@ -120,21 +120,29 @@

Example
     "https://s3.us-east-2.amazonaws.com/libd-brainseq2/rse_tx_unfiltered.Rdata",
     x = bfc
 )
+#> adding rname 'https://s3.us-east-2.amazonaws.com/libd-brainseq2/rse_tx_unfiltered.Rdata'
 ## Now that we have the data in our computer, we can load it.
 load(rse_file, verbose = TRUE)
 #> Loading objects:
 #>   rse_tx

-

In this next step we subset for the transcripts associated with degradation. These were determined by Joshua M. Stolz et al, 2022. We have provided three models to choose from. Here the names "cell_component", "top1500", and "standard" refer to models that were determined to be effective in removing degradation effects. The "standard" model involves taking the union of the top 1000 transcripts associated with degradation from the interaction model and the main effect model. The "top1500" model is the same as the "standard" model except the union of the top 1500 genes associated with degradation is selected. The most effective of our models, "cell_component", involved deconvolution of the degradation matrix to determine the proportion of cell types within our studied tissue. These proportions were then added to our model.matrix() and the union of the top 1000 transcripts in the interaction model, the main effect model, and the cell proportions model were used to generate this model of qSVs. In this example we will choose "cell_component" when using the getDegTx() and select_transcripts() functions.

-
-The above venn diagram shows the overlap between transcripts in each of the previously mentioned models.

-The above venn diagram shows the overlap between transcripts in each of the previously mentioned models. -

-
+

In this next step, we subset to the transcripts associated with degradation. qsvaR provides significant transcripts determined in four different linear models of transcript expression against degradation time, brain region, and potentially cell-type proportions:

+
+  1. exp ~ DegradationTime + Region
+  2. exp ~ DegradationTime * Region
+  3. exp ~ DegradationTime + Region + CellTypeProp
+  4. exp ~ DegradationTime * Region + CellTypeProp
+

select_transcripts() returns degradation-associated transcripts and supports two parameters. First, top_n controls how many significant transcripts to extract from each model. When cell_component = TRUE, all four models are used; otherwise, just the first two are used. The union of significant transcripts from all used models is returned.
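The union behaviour described above can be illustrated with a minimal base R sketch. It does not call qsvaR: the ranked lists, the transcript IDs, and the helper `union_top_n()` are all hypothetical and exist only to show why the returned set can be smaller than `top_n` times the number of models when the per-model lists overlap.

```r
# Hypothetical ranked transcript lists, most significant first.
# The IDs are invented for illustration; they are not qsvaR data.
ranked <- list(
    main             = c("tx1", "tx2", "tx3", "tx4"),
    interaction      = c("tx3", "tx5", "tx1", "tx6"),
    main_cell        = c("tx7", "tx1", "tx2", "tx8"),
    interaction_cell = c("tx5", "tx9", "tx3", "tx2")
)

# Hypothetical helper: take the top_n entries of each model in use and
# return their union. With cell_component = FALSE only the first two
# models are used, mirroring the description in the text.
union_top_n <- function(ranked, top_n, cell_component = FALSE) {
    used <- if (cell_component) ranked else ranked[c("main", "interaction")]
    Reduce(union, lapply(used, head, n = top_n))
}

two_models <- union_top_n(ranked, top_n = 2)
four_models <- union_top_n(ranked, top_n = 2, cell_component = TRUE)

# Overlap between models means the union holds fewer than
# top_n * (number of models) transcripts.
length(two_models)
length(four_models)
```

In this toy input, transcripts shared between models are counted once, so the four-model union is shorter than the eight entries selected in total.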

+

As an example, we’ll subset our RangedSummarizedExperiment to the union of the top 1000 significant transcripts derived from each of the four models.

-## Next we get the degraded transcripts for qSVA from the "cell_component"
-## model
-DegTx <- getDegTx(rse_tx, type = "cell_component")
+#   Subset 'rse_tx' to the top 1000 significant transcripts from the four
+#   degradation models
+DegTx <- getDegTx(
+    rse_tx,
+    sig_transcripts = select_transcripts(top_n = 1000, cell_component = TRUE)
+)
+#> Using 2496 degradation-associated transcripts.
 
 ## Now we can compute the Principal Components (PCs) of the degraded
 ## transcripts
@@ -147,19 +155,25 @@ 

Example
 )
 k <- k_qsvs(DegTx, mod, "tpm")
 print(k)
-#> [1] 34

+#> [1] 20

Now that we have our PCs and know how many of them we need, we can generate our qSVs.

 ## Obtain the k qSVs
 qsvs <- get_qsvs(pcTx, k)
 dim(qsvs)
-#> [1] 900  34
+#> [1] 900 20
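
As a rough illustration of why the result has one row per sample and k columns, the sketch below keeps the first k principal components of a simulated expression matrix using base R only. It is not qsvaR's implementation: the simulated data, the `log2(x + 1)` transform, and the object names are assumptions made for this example.

```r
set.seed(1)

# Simulated expression values: 50 transcripts (rows) by 12 samples (columns).
# These numbers are invented; they stand in for a TPM-like assay.
expr <- matrix(
    rexp(50 * 12, rate = 0.1),
    nrow = 50,
    dimnames = list(paste0("tx", 1:50), paste0("sample", 1:12))
)

# PCA with samples as observations, so the scores have one row per sample.
# The log2(x + 1) transform is an assumption for this sketch.
pcs <- prcomp(t(log2(expr + 1)))

# Keep the first k components; in the workflow above k comes from k_qsvs().
k <- 3
qsvs_sketch <- pcs$x[, seq_len(k), drop = FALSE]

# One row per sample, one column per retained component.
dim(qsvs_sketch)
```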

This can be done in one step with our wrapper function qSVA(), which combines all of the previously mentioned functions.

 ## Example use of the wrapper function qSVA()
-qsvs_wrapper <- qSVA(rse_tx = rse_tx, type = "cell_component", mod = mod, assayname = "tpm")
+qsvs_wrapper <- qSVA(
+    rse_tx = rse_tx,
+    sig_transcripts = select_transcripts(top_n = 1000, cell_component = TRUE),
+    mod = mod,
+    assayname = "tpm"
+)
+#> Using 2496 degradation-associated transcripts.
 dim(qsvs_wrapper)
-#> [1] 900  34
+#> [1] 900 20

Differential Expression @@ -190,20 +204,20 @@

Differential Expression
 ## Explore the top results
 head(sigTx)
-#>                         logFC   AveExpr         t      P.Value    adj.P.Val
-#> ENST00000553142.5 -0.06547988 2.0390889 -5.999145 2.921045e-09 0.0005786386
-#> ENST00000552074.5 -0.12911383 2.4347985 -5.370828 1.009549e-07 0.0099992338
-#> ENST00000510632.1  0.08994392 0.9073516  4.920042 1.037016e-06 0.0473143146
-#> ENST00000530589.1 -0.10297938 2.4271711 -4.918806 1.043399e-06 0.0473143146
-#> ENST00000572236.1 -0.05358333 0.6254025 -4.819980 1.697403e-06 0.0473143146
-#> ENST00000450454.6  0.08446871 1.0042440  4.816539 1.726143e-06 0.0473143146
+#>                         logFC  AveExpr         t      P.Value    adj.P.Val
+#> ENST00000484223.1 -0.17439018 1.144051 -6.685583 4.099898e-11 8.121610e-06
+#> ENST00000344423.9  0.09212678 1.837102  6.449533 1.855943e-10 1.838246e-05
+#> ENST00000399808.4  0.28974369 4.246788  6.320041 4.165237e-10 2.233477e-05
+#> ENST00000467370.5  0.06313938 0.301711  6.307179 4.509956e-10 2.233477e-05
+#> ENST00000264657.9  0.09913353 2.450684  5.933186 4.280565e-09 1.375288e-04
+#> ENST00000415912.6  0.09028757 1.736581  5.918230 4.671963e-09 1.375288e-04
 #>                           B
-#> ENST00000553142.5 10.200286
-#> ENST00000552074.5  6.767821
-#> ENST00000510632.1  4.524039
-#> ENST00000530589.1  4.518145
-#> ENST00000572236.1  4.051142
-#> ENST00000450454.6  4.035041
+#> ENST00000484223.1 14.338379
+#> ENST00000344423.9 12.865110
+#> ENST00000399808.4 12.077344
+#> ENST00000467370.5 11.999896
+#> ENST00000264657.9  9.811110
+#> ENST00000415912.6  9.726142

Finally, you can compare the resulting t-statistics from your differential expression model against the degradation time t-statistics, adjusting for the six different brain regions. This type of plot is called a DEqual plot and was shown in the initial qSVA framework paper (Jaffe et al, PNAS, 2017). We are looking for two patterns, exemplified here in Figure 1 (cartoon shown earlier). A direct positive correlation with degradation shown in Figure 1 on the right tells us that there is signal in the data associated with qSVs. An example of nonconfounded data, or data that has been modeled, can be seen in Figure 1 on the right with its lack of relationship between the x and y variables.

Cartoon showing patterns in DEqual plots
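
The two patterns can be mimicked numerically with simulated t-statistics. The sketch below uses base R only and is not the package's DEqual implementation: every vector is simulated, and the variable names and effect sizes are assumptions chosen for illustration.

```r
set.seed(20241127)
n_tx <- 1000

# Simulated degradation-time t-statistics, one per transcript.
deg_t <- rnorm(n_tx)

# Simulated differential expression t-statistics:
# a confounded analysis partly tracks degradation, while an adjusted
# analysis is unrelated to it. Both are invented for this sketch.
de_t_confounded <- 0.6 * deg_t + rnorm(n_tx, sd = 0.8)
de_t_adjusted <- rnorm(n_tx)

# A DEqual-style comparison summarizes the relationship between the two
# sets of t-statistics, here with a Pearson correlation.
cor_confounded <- cor(deg_t, de_t_confounded)
cor_adjusted <- cor(deg_t, de_t_adjusted)

c(confounded = cor_confounded, adjusted = cor_adjusted)
```

A clearly positive correlation corresponds to the confounded pattern, and a correlation near zero corresponds to the pattern with no relationship between the x and y variables.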

diff --git a/pkgdown.yml b/pkgdown.yml
index f6f3e2b..1416518 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -3,5 +3,5 @@ pkgdown: 2.0.9
 pkgdown_sha: ~
 articles:
   Intro_qsvaR: Intro_qsvaR.html
-last_built: 2024-11-26T20:47Z
+last_built: 2024-11-27T17:01Z
diff --git a/reference/figures/README-DEqual-1.png b/reference/figures/README-DEqual-1.png
index 0a9a1e3..b59d5e3 100644
Binary files a/reference/figures/README-DEqual-1.png and b/reference/figures/README-DEqual-1.png differ
diff --git a/reference/figures/README-DEqual-no-qSVs-1.png b/reference/figures/README-DEqual-no-qSVs-1.png
index 59628a4..6dc8e35 100644
Binary files a/reference/figures/README-DEqual-no-qSVs-1.png and b/reference/figures/README-DEqual-no-qSVs-1.png differ