-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
132 lines (90 loc) · 5.38 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
```
```{r packages, include=FALSE}
require(devtools)
#install_github("https://github.com/mason-linscott/epcTools")
require(dplyr)
require(bayou)
require(phytools)
require(geiger)
require(BAMMtools)
require(dplyr)
require(PCMBase)
require(castor)
require(adephylo)
require(dentist)
require(spatstat.utils)
require(doParallel)
require(epcTools)
require(dentist)
require(optimx)
```
# epcTools
<!-- badges: start -->
[![status](https://img.shields.io/badge/status:-v0.1c-green)]()
[![status: support](https://img.shields.io/badge/support:-yes-green)]()
<!-- badges: end -->
epcTools (**e**co**p**hysiological **c**onstrained trait **Tools**), is a R package designed to identify how the environment influences parameters of Ornstein-Uhlenbeck models of trait evolution. This approach is currently oriented around ecophysiologically constrained traits, or traits whose expression is modulated by a certain level of environmental resource availability, but it can be used for other time varying processes. The current implementation is for single traits.
## Installation
'epcTools' can be installed from this github repository. Simply enter the following code:
``` r
# install.packages("devtools")
devtools::install_github("mason-linscott/epcTools")
```
## General usage
The general workflow for empirical datasets is as follows: create an EPC cache object -> run epc.max.lik search
While it may sound simple, all the data needs to be in the right format. We have included an example empirical dataset with the package to help users understand the required input format.
```{r workflow}
data(diatom_data)
#diatom_cache<-cache.epc(mtree=diatom_data$diatom_tree,epc_params = "alpha",environment_data = #diatom_data$diatom_environment,n_slice = 10, trait_data = diatom_data$diatom_traits[,1,drop=FALSE])
#diatom_epc_search<-epc.max.lik(diatom_cache)
```
## A simulated example
We will need to simulate a phylogeny and environmental vector. We can then use these two to simulate an EPC cache object which will vary with the trait of interest.
The sim.epc function requires several arguments. First, the tree and environmental values but also including other parameters documented elsewhere such as the type of EPC relationship and what variable(s) are affected by the environment. Here we simulate a cache object with alpha and theta varying with the environment and ten time slices.
```{r simulate}
#Tree and environment
library(epcTools)
tree_example<-epcTools::sim.tree(100,200,100,1)
env_example<-(c(6,8,11,13,9,8,5,4,8,10))
#Parameters to simulate under
base_sig2=0.5
base_b0_t=1
base_b1_t=2
base_b0_a=0.01
base_b1_a=0.03
base_slice=10
sim0.5f_200b_at_linear<-sim.epc(tree_example,epc_params=c("alpha","theta"),m_type="linear",X=env_example,n_slice=base_slice,1,b0_a=base_b0_a,b1_a=base_b1_a,sig2=base_sig2,b0_t=base_b0_t,b1_t=base_b1_t)
```
## Visualize how the traits are varying with the environment
```{r phenogram}
#EPC phenogram
sim.epc.phenogram(sim0.5f_200b_at_linear[[1]],10)
```
## epcTools maximum likelihood search
Now, lets see if we can recover the parameters we simulated our model under. This requires invoking the epc.max.lik() function. Note that this function can take a vector of starting parameters fed by the user or find one automatically using the start.searcher function.
```{r search}
#EPC phenogram
example_results<-epc.max.lik(sim0.5f_200b_at_linear[[1]])
summary(example_results)
```
Looking at the summarized output, we can see that the maximum likelihood estimate did indeed recover the linear relationship of the environment on alpha and theta. However, how confident can we be in these paremeter estimates? To solve this issue, 'epcTools' uses 'dentist' to generate the 95% CI around each of the ML parameter estimates which we can see in the summary function. However, how do these 95% CI estimates look on a likelihood surface? Are the parameter estimates on a ridge (not desirable) or on a well defined peak (desirable). Let's see what these look like:
## Dentist parameter visualization
```{r visualization}
#Dentist output
plot(example_results)
```
Well, some parameters certainly look more peak like than others but that is expected. Sig2 and alpha are known to have a ridge like relationship with each other and this is true for the constituent EPC parameters that go into either of these (b0a and b1a here). Notice, however, that the slope parameter (b1a) has a much more defined peak than the intercept parameter (b0a) which suggests support for a linear relationship. To really understand if this is real however, we need to compare the AIC of the EPC-AT model to a single peak OU or BM model. Here, we will just do OU to save time.
## EPC-AT vs. OU comparison
```{r OU comparison}
#First, we have to change the cache epc params to be "OU" to run an OU search. Then we will skip dentist as we only want the AIC output.
sim0.5f_200b_at_linear[[1]]$epc_params<-"OU"
OU_example<-epc.max.lik(sim0.5f_200b_at_linear[[1]],skip_dentist = TRUE)
cat(paste0("EPC-AT: ",example_results$fit$AIC, "OU: ",OU_example$AIC))
```
As you can see, the EPC-AT model has a much lower AIC than the OU model, indiciating that there is strong support for an EPC-environment relationship on the trait.