(work in progress)
Kaphi settings are read in from a user-created YAML file. This document outlines the format that is required by the YAML parser.
The examples given within this document are specific to the birth-death speciation model. Additional examples of YAML files can be found in Kaphi/pkg/examples.
priors:
  parameter_1:
    dist: ''
    hyperparameters:
      -
constraints:
  - ''
proposals:
  parameter_1:
    dist: ''
    parameters:
      -
smc:
  nparticle: 1000
  nsample: 5
  ess.tolerance: 50.0
  final.accept.rate: 0.05
  final.epsilon: 0.05
  quality: 0.95
  step.tolerance: 1.e-5
  norm.mode: 'NONE'
distances:
  'kernel.dist':
    package: 'Kaphi'
    weight: 0.8
    decay.factor: 0.2
    rbf.variance: 100.0
    sst.control: 1.0
A prior distribution is chosen for each model parameter, from which the particle values are sampled.
Each parameter's prior is specified by the following:
- parameter name
- dist - one of the distributions in the R stats package. Use only the root name of the distribution (e.g. for the gamma distribution, write 'gamma' rather than 'pgamma', 'rgamma', 'dgamma', or 'qgamma').
- hyperparameters - a list of the hyperparameter names and associated values for the chosen distribution.
The table lists the supported distributions. The example that follows it assigns a gamma prior to the parameter lambda and a uniform prior to mu.
Distribution | Keyword | Parameters |
---|---|---|
Beta | beta | shape1, shape2, ncp |
Binomial | binom | size, prob |
Cauchy | cauchy | location, scale |
Chi-squared | chisq | df, ncp |
Exponential | exp | rate |
F distribution | f | df1, df2, ncp |
Gamma | gamma | rate, shape |
Geometric | geom | prob |
Hypergeometric | hyper | m, n, k |
Log-normal | lnorm | meanlog, sdlog |
Multinomial | multinom | size, prob |
Negative binomial | nbinom | size, prob, mu |
Normal | norm | mean, sd |
Poisson | pois | lambda |
Student's t | t | df, ncp |
Uniform | unif | min, max |
Weibull | weibull | shape, scale |
priors:
  lambda:
    dist: 'gamma'
    hyperparameters:
      - rate: 1.0
      - shape: 2.0
  mu:
    dist: 'unif'
    hyperparameters:
      - min: 0.0
      - max: 0.05
constraints:
  - 'mu < lambda'
The priors are followed by the constraints section. Here the user can enter strings describing conditions that must hold during sampling; leave the section empty if there are none.
As shown in the example above, each constraint has the form "param1 operator param2".
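To make the interplay between priors and constraints concrete, here is a minimal Python sketch. It is not Kaphi's R implementation. It assumes the priors and constraints shown earlier have already been loaded into a dictionary, handles only the two distributions used in that example, and assumes that a draw violating a constraint is discarded and redrawn. The names `draw`, `satisfies`, and `sample_particle` are hypothetical.

```python
import random

# Hypothetical in-memory form of the priors/constraints example.
config = {
    "priors": {
        "lambda": {"dist": "gamma",
                   "hyperparameters": [{"rate": 1.0}, {"shape": 2.0}]},
        "mu": {"dist": "unif",
               "hyperparameters": [{"min": 0.0}, {"max": 0.05}]},
    },
    "constraints": ["mu < lambda"],
}

OPERATORS = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
}


def draw(prior, rng):
    """Draw one value from a prior (gamma and unif only in this sketch)."""
    # Merge the list of single-key mappings into one dictionary.
    hp = {k: v for item in prior["hyperparameters"] for k, v in item.items()}
    if prior["dist"] == "gamma":
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        return rng.gammavariate(hp["shape"], 1.0 / hp["rate"])
    if prior["dist"] == "unif":
        return rng.uniform(hp["min"], hp["max"])
    raise ValueError("unsupported distribution: " + prior["dist"])


def satisfies(constraint, values):
    """Evaluate a 'param1 operator param2' string against sampled values."""
    left, op, right = constraint.split()
    return OPERATORS[op](values[left], values[right])


def sample_particle(config, rng, max_tries=1000):
    """Assumed behaviour: redraw from the priors until all constraints hold."""
    for _ in range(max_tries):
        values = {name: draw(p, rng) for name, p in config["priors"].items()}
        if all(satisfies(c, values) for c in config["constraints"]):
            return values
    raise RuntimeError("no draw satisfied the constraints")


rng = random.Random(1)
particle = sample_particle(config, rng)
print(particle)
```

With these priors almost every draw already satisfies `mu < lambda`, since mu is capped at 0.05; the constraint matters more when the prior ranges overlap heavily.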
A proposal distribution is specified for each parameter. The values sampled from these distributions dictate how a particle is updated.
The format is similar to that of the priors, with the exception that parameters are specified rather than hyperparameters.
proposals:
  lambda:
    dist: 'norm'
    parameters:
      - mean: 0
      - sd: 0.1
  mu:
    dist: 'norm'
    parameters:
      - mean: 0
      - sd: 0.1
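As a rough illustration of how a proposal distribution can drive a particle update, the Python sketch below adds a normally distributed step to each parameter value. This is an assumption made for illustration (a simple additive random walk), not a description of Kaphi's R code, and the function name `propose` is hypothetical.

```python
import random

# Proposal settings mirroring the example: a zero-mean normal step per parameter.
proposals = {
    "lambda": {"dist": "norm", "parameters": [{"mean": 0}, {"sd": 0.1}]},
    "mu": {"dist": "norm", "parameters": [{"mean": 0}, {"sd": 0.1}]},
}


def propose(particle, proposals, rng):
    """Return a new particle: each value plus a draw from its proposal."""
    updated = {}
    for name, value in particle.items():
        spec = proposals[name]
        if spec["dist"] != "norm":
            raise ValueError("this sketch only handles 'norm' proposals")
        # Merge the list of single-key mappings into one dictionary.
        params = {k: v for item in spec["parameters"] for k, v in item.items()}
        step = rng.gauss(params["mean"], params["sd"])
        updated[name] = value + step  # assumed additive update
    return updated


rng = random.Random(42)
current = {"lambda": 0.5, "mu": 0.02}
print(propose(current, proposals, rng))
```

Note that an additive step can move a value outside the support of its prior (for example a negative mu), so a real sampler has to reject or otherwise handle such proposals.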
The Sequential Monte Carlo (SMC) settings adjust the behaviour of the functions in smcABC.R.
smc:
  nparticle: 1000
  nsample: 5
  ess.tolerance: 50.0
  final.accept.rate: 0.05
  final.epsilon: 0.05
  quality: 0.95
  step.tolerance: 1.e-5
  norm.mode: 'NONE'
Parameter | Description |
---|---|
nparticle | The number of particles to be run. |
nsample | The number of samples per particle. |
ess.tolerance | Effective Sample Size tolerance. |
final.accept.rate | The accept rate that will stop the algorithm. |
final.epsilon | The value of epsilon that will stop the algorithm. |
quality | A tuning parameter (alpha) that corresponds to the ratio of current effective sample size (ESS) over previous ESS. If alpha is near 1, then ESS is held constant over time and particles stay diverse but are slower to converge. If alpha is near 0, then particles can collapse to a single point that is not necessarily accurate. |
step.tolerance | The convergence tolerance for finding the next epsilon. |
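The ESS-related settings are easier to reason about with the usual definition of effective sample size, ESS = 1 / sum(w_i^2) for normalized weights w_i. The Python sketch below uses that standard formula. How Kaphi applies `ess.tolerance` is not stated here, so the resampling rule in `needs_resampling` is an assumption, and all function names are hypothetical.

```python
def effective_sample_size(weights):
    """Standard ESS estimate: 1 / sum of squared normalized weights."""
    total = sum(weights)
    if total == 0:
        return 0.0
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def target_ess(previous_ess, quality):
    """ESS the next iteration aims for: quality (alpha) times the previous ESS."""
    return quality * previous_ess


def needs_resampling(weights, ess_tolerance):
    """Assumed rule: resample once the ESS falls below ess.tolerance."""
    return effective_sample_size(weights) < ess_tolerance


# 1000 equally weighted particles, as with nparticle: 1000.
weights = [1.0] * 1000
ess = effective_sample_size(weights)
print(ess, target_ess(ess, 0.95), needs_resampling(weights, 50.0))
```

Equal weights give the largest possible ESS (the number of particles), while a single dominant weight drives the ESS toward 1, which is the collapse that a `quality` value near 0 permits.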
Distance metrics are used to compare two trees. Kaphi was created to be used with the kernel distance, but many other distances are accepted (see Distance Metrics). Several different distance metrics may be used in combination, as shown in the examples below.
There are two ways to specify distance metrics: as a YAML dictionary (first example) or as a string describing a linear combination of distances (second example).
distances:
  'kernel.dist':
    package: 'Kaphi'
    weight: 0.8
    decay.factor: 0.2
    rbf.variance: 100.0
    sst.control: 1.0
  'sackin':
    package: 'Kaphi'
    weight: 0.05
    use.branch.lengths: FALSE
  'colless':
    package: 'Kaphi'
    weight: 0.10
  'KF.dist':
    package: 'phangorn'
    weight: 0.05
    check.labels: 'FALSE'
distances:
  '0.8*kernel.dist(decay.factor=0.2,rbf.variance=100.0,sst.control=1.0)+0.1*sackin+0.3*colless'
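To show how the string form relates to the dictionary form, the following Python sketch splits a linear-combination string into per-metric weights and arguments. It is an illustration only, not Kaphi's parser: the function names are hypothetical, a missing weight is assumed to mean 1.0, and the string form carries no `package` entry, so none is produced.

```python
import re

TERM = re.compile(
    r"^\s*(?:(?P<weight>[0-9.]+)\s*\*)?\s*"  # optional 'weight*'
    r"(?P<name>[A-Za-z][\w.]*)"              # metric name, e.g. kernel.dist
    r"\s*(?:\((?P<args>[^()]*)\))?\s*$"      # optional '(key=value,...)'
)


def split_terms(expression):
    """Split on '+' signs that are not inside parentheses."""
    terms, depth, current = [], 0, []
    for ch in expression:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "+" and depth == 0:
            terms.append("".join(current))
            current = []
        else:
            current.append(ch)
    terms.append("".join(current))
    return terms


def parse_value(text):
    """Convert numeric arguments to float; keep anything else as a string."""
    try:
        return float(text)
    except ValueError:
        return text.strip()


def parse_distances(expression):
    """Map each metric name to its weight and keyword arguments."""
    result = {}
    for term in split_terms(expression):
        match = TERM.match(term)
        if match is None:
            raise ValueError("cannot parse term: " + term)
        weight = match.group("weight")
        entry = {"weight": float(weight) if weight else 1.0}  # assumed default
        if match.group("args"):
            for pair in match.group("args").split(","):
                key, value = pair.split("=")
                entry[key.strip()] = parse_value(value)
        result[match.group("name")] = entry
    return result


expression = ("0.8*kernel.dist(decay.factor=0.2,rbf.variance=100.0,"
              "sst.control=1.0)+0.1*sackin+0.3*colless")
for name, settings in parse_distances(expression).items():
    print(name, settings)
```

Splitting on `+` only at parenthesis depth zero keeps the argument list of `kernel.dist` intact even if an argument value were to contain a plus sign.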