Skip to content

Commit

Permalink
markdown files for simulation analysis
Browse files Browse the repository at this point in the history
  • Loading branch information
princyparsana committed Feb 12, 2019
1 parent 4720979 commit 34644f1
Show file tree
Hide file tree
Showing 6 changed files with 332 additions and 140 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
---
title: "Simulation matched to sample and gene numbers in empirical analysis with GTEx"
author: "Princy Parsana"
date: "February 8, 2019"
output: html_document
---

```{r setup, include=FALSE}
library(Matrix)
```
Here, we first simulated data from a multivariate gaussian distribution based on an underlying scale-free network structure in the precision matrix. We tried to match the number of samples to the number of samples in our empirical experiments done with GTEx data. This simulation included 5000 genes across 350 samples. We found that the number of false discoveries is much higher in confounded data. Further, this analysis shows that PC correction increased the number of true positives in the inferred networks - numbers that were similar to networks obtained from original simulated data(without confounding).
```{r}
dat.theta <- readRDS("simulated_network.Rds")
sim.net <- readRDS("networks_data.Rds")
sim.confounded.net <- readRDS("networks_confounded.Rds")
sim.corrected.net <- readRDS("networks_corrected.Rds")
a=3
## Count the total number of edges in the network
true_ecount <- sum(dat.theta == 1 & col(dat.theta) < row(dat.theta))
confounded_ecount <- sapply(sim.confounded.net, function(x) sum(x == 1 & col(x) < row(x)))
sim_ecount <- sapply(sim.net,function(x) sum(x == 1 & col(x) < row(x)))
corrected_ecount <- sapply(sim.corrected.net, function(x) sum(x == 1 & col(x) < row(x)))
gc()
## count the number of edges common with the simulated network
confounded_common <- sapply(sim.confounded.net, function(x,y) {sum((x+y) == 2 & col(x) < row(x))}, dat.theta)
sim_common <- sapply(sim.net, function(x,y) {sum((x+y) == 2 & col(x) < row(x))}, dat.theta)
corrected_common <- sapply(sim.corrected.net, function(x,y) {sum((x+y) == 2 & col(x) < row(x))}, dat.theta)
print(paste("The number of edges in the true network:", true_ecount))
print(paste("The number of edges in the inferred network", sim_ecount[a]))
print(paste("The number of edges in the confounded network", confounded_ecount[a]))
print(paste("The number of edges in the corrected network", corrected_ecount[a]))
print(paste("The number common edges in the confounded and true network", confounded_common[a]))
print(paste("The number common edges in the inferred and true network", sim_common[a]))
print(paste("The number common edges in the inferred and true network", corrected_common[a]))
```
Next, we checked the proportion of edges from original simulated data networks that were also identified in PC corrected networks, and proportion of edges from original simulated data networks that were also identified in confounded networks. We find that the networks obtained from PC corrected data were more similar to those obtained from original simulated data compared to confounded data.
```{r}
sim_corrected_common <- mapply(function(x,y){
sum(x+y == 2 & col(x) < row(x))
}, sim.net, sim.corrected.net)
sim_corrected_common/(sim_ecount+1)
sim_confounded_common <- mapply(function(x,y){
sum(x+y == 2 & col(x) < row(x))
}, sim.net, sim.confounded.net)
sim_confounded_common/(sim_ecount+1)
```
```{r, fig.width=6, fig.height=6}
plot(sim_ecount, sim_common, col = "darkred", pch = "*", cex = 1.5, xlab = "# of edges", ylab = "True positives")
points(corrected_ecount, corrected_common, col = "cornflowerblue", pch = "*", cex = 1.5)
points(confounded_ecount, confounded_common, col = "darkgreen", pch = "*", cex = 1.5)
legend("topleft", c("original", "confounded", "PC corrected"), col = c("darkred", "cornflowerblue", "darkgreen"), pch = "*")
```


Large diffs are not rendered by default.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

0 comments on commit 34644f1

Please sign in to comment.