-
Notifications
You must be signed in to change notification settings - Fork 1
/
02-model-choices.qmd
187 lines (94 loc) · 13.9 KB
/
02-model-choices.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
# Modeling choices, notes and best practices
## Selecting maximum age (population and data)
The comments below are based on a team discussion on how to select a population and data maximum age for modeling:
**Choice 1, Population length bins:** should extend beyond something like the 97.5% quantile of the length at age distribution for the largest age (easy to add more bins to see if it makes any difference).
**Choice 2, Max age:** should be large enough for growth to be asymptotic, and at least as big as the largest data age bins. It's easy to test the impact of changing to different value, just requires updating ageing error matrices. Look at the stable age-distribution plot and examine the number of fish in the plus group. Additionally, the period of time when data are available (after extensive exploitation), the data plus group may be lower.
**Choice 3, Data length bins** should probably be big enough that the plus group doesn't have much more in it than the previous bin. We should check to make sure that the selection of length and age bins are consistent with each other. Typically, we often have max length bins where there are only a few fish, but a larger proportion of data in the data age plus group.
**Choice 4, Data age bins** should probably be big enough that the plus group doesn't have much more in it than the previous bin. In regards to population maximum age, there is no negative repercussions within SS3 for having a larger value, beyond a slower run time. Revising the age bins based on the data, rather than a priori rules about how to set this, may be considered "data mining". Could create a check within the survey and PacFIN comp codes that creates a flag when the data plus group has more than a certain percentage (i.e., 5\%). Also, add a warning to [{r4ss}](https://github.com/r4ss/r4ss) about the percentage in the population and data plus groups.
## Determining prior for natural mortality
Owen's Advice on $M$; July 6, 2022
1. I prefer using age data alone to estimate the natural mortality rate (see accompanying document: M2017_for_Methods_Review_Hamel; Hamel and Cope (in review)), except in cases where getting a reasonable estimate of the maximum age is problematic.
2. The age based prior is simply: $$\text{lognormal}(\ln(5.4/\text{max age}), 0.31).$$
3. The fixed value for $M$ if not estimating is even more simply the median of that prior: $$M = 5.4/\text{max age}$$
4. Can explore a range approaches for M estimates and priors [here](http://barefootecologist.com.au/shiny_m); reference is Cope and Hamel, in press.
## Length-weight relationships
Use the [getWLpars()](https://pfmc-assessments.github.io/PacFIN.Utilities/reference/getWLpars.html) function in {PacFIN.Utilities} which provides estimates for the parameters required by Stock Synthesis. It is common to rely only on data from the WCGBT Survey to get the parameters used in the model under the assumption that this survey is most representative of the population as a whole. These parameters should always be fixed in the model as no data type is available in SS3 to accurately estimate them internally.
## Estimating ageing error
See the [{nwfscAgeingError}](https://github.com/pfmc-assessments/nwfscAgeingError) package.
## Maturity
Talk to Melissa Head about maturity. In 2017, she provided estimated maturity curves based on samples from the NWFSC survey for many species.
Data from the NWFSC survey on maturity includes a column indicating mature or immature and another indicating spawning and not spawning. The latter considers all "mature" fish with over 25% atresia as not spawning (along with all immature fish). The spawning/not spawning column is the one we commonly use to estimate the maturity curve since that is really what we care about. In some cases a simple logistic will fit, but if there is much skip spawning/atresia for older/larger females, a logistic type curve which asymptotes to a lower value or a non-parametric fit is more appropriate. A column with percent atresia is also provided if you with to use a percentage other than 25% for the cutoff. Finally, the mature/immature column can be used instead if the atresia/skip spawning is taken into account in specifying the fecundity relationship.
An additional column has been added to the NWFSC survey table indicating uncertainty in the designation. This can be used to weight or exclude data.
Note: John Field has expressed concern that we are too focused on recent samples from the NWFSC survey, so if you aren't going to include samples from past collections, think about a justification for that choice.
## Fecundity
[Dick et al. (2017)](http://www.sciencedirect.com/science/article/pii/S0165783616303745) has estimates for the $a$ and $b$ parameters in the functional form $F = aL^b$ for many rockfish. The estimates are based on length in mm and predicted number of eggs, fitted in log-space. If you use length in cm (like in SS3), and don't want huge spawning biomass values (eggs), you can convert the values in the paper to units of cm and millions, billions, or trillions of eggs. First, find the values "$\exp(a)$" (referred to as $a'$ below) and "$b$" from Table 6 in Dick et al. for your species (or subgenus, if no estimate for your species is reported). If your subgenus is not reported, you can use the "*Sebastes*" values.
The correct value of the "$a$" parameter in Stock Synthesis (using fecundity at length option 2: eggs=a*L^b) with length in cm and spawning output in **millions of eggs** is:
$$a = \frac{a'\cdot10^b}{1000}$$
The division is by 1 thousand instead of 1 million because recruitment in SS3 is in thousands. The value of $b$ is unchanged, and can be used directly in the assessment.
Other options include spawning output in **billions of eggs** via:
$$a = \frac{a'\cdot10^b}{10^6}$$
or spawning output in **trillions of eggs** (e.g. 2017 Yellowtail Rockfish):
$$a = \frac{a'\cdot10^b}{10^9}.$$
The [{r4ss}](https://github.com/r4ss/r4ss) figures in the "Bio" tab show the resulting fecundity relationship as a function of length, weight, and age so you can use that to determine whether your parameter values produce the correct final relationship by comparing to original studies on the relationship between fecundity and length.
## Modeling selectivity
There was a CAPAM workshop on selectivity in 2013. Report, presentations, and recordings are available at [here](http://www.capamresearch.org/current-projects/selectivity) and the associated Fisheries Research special issue is [here](https://www.sciencedirect.com/journal/fisheries-research/vol/158/suppl/C).
### Conceptual reasoning for using different approaches to selectivity
**The following set of circumstances might cause selectivity to be dome-shaped:**
1. Contact selectivity causing older fish to outswim the trawl, or escape the gillnet/hooks
2. Incomplete spatial coverage in terms of depth or untrawlable habitat
3. Spatial heterogeneity in fishing intensity (see Sampson paper). This probably applies more to fishery selectivity than surveys.
**Reasons for justifying asymptotic selectivity**
1. It can help estimate $L_\infty$ and variability in growth because the mode of the length comps is often representative of where the oldest fish are piling up.
2. It prevents the estimation of a large amounts of cryptic biomass
### General advice on selectivity in SS3
1. Start with a functional form that is commonly used (double normal)
2. Find some initial values using either the Excel spreadsheets or the [{r4ss}](https://github.com/r4ss/r4ss) widgets
3. Put the initial values in the model and run without estimating anything until you get a model that runs. This can be done by setting the maximum phase in the starter file to 0 or (better), by using the command line inputs: `-maxfn 0 -nohess`.
4. Read the output into R using SS_output and plot the selectivity in {r4ss} using either `SS_plots(model, plot=2)` or `SSplotSelex(model, subplot=1)`, where `model` is the object created by `SS_output()`.
5. Set the PHASE for as many parameters to negative values as possible so that you start with estimating relatively few parameters (such as parameters 1 and 3 of the double normal, which control the peak and ascending slope).
### Guidelines for SS3 double normal initial setup
- Fix parameters 5 and 6 (negative phase).
- If selectivity is thought to be zero at the youngest/smallest or the oldest/biggest fish set the value to zero (e.g -15)
- If selectivity is thought to be larger than zero at the youngest/smallest or the oldest/biggest fish set the value to -999 (does not scale the selectivity for the youngest or oldest age, independently from the normal curve).
- Fix the plateau (parameter 2) to be small values (e.g. -15).
- Set the initial value for the peak (parameter 1) at the age/length equal to the mode of the composition data
- Set the ascending (parameter 3) and descending (parameter 4) slopes at $log(8 \cdot (a_{peak}-a_{min}))$ and $log(8 \cdot (a_{max}-a_{peak}))$ (substitute min and max lengths and length at peak when modeling length-base selectivity).
- Don't estimate selectivity at youngest age/size (parameter 5) unless there are observations of fish in the smallest age- or length-bins, either fix at -5 or -999
- Use the double normal instead of the logistic for asymptotic selectivity to have flexibility of dome shape without major changes to control file. This also provides control over selectivity at the youngest age. To force a logistic shape, you can do one of the following (where parameters 2, 4, and 6 should not be estimated under any of the options):
- Fix descending slope (parameter 4) at a large number (e.g. 15)
- Alternative 1: Fix plateau (parameter 2) to a large number (e.g. 15)
- Alternative 2: Fix selectivity of the oldest age (parameter 6) at a large number (e.g. 15).
## Modeling recruitment deviations
1. Choices to be made:
a. allow recruitment deviations or not?
b. range of years?
c. breaking into "early", "main", "late" vectors?
i. early and late vectors are intended to add uncertainty to the model for years with little or no data with information about recruitment
2. What was done in the 2011 assessments? Graphical description is [here](https://docs.google.com/a/noaa.gov/file/d/0B5lSVg5Eq86uX250WHJXUWt0T1E/edit).
3. Guidance on bias adjustment settings
a. Multiple simulation analyses have shown that applying the bias adjustment settings given by the `r4ss::SS_fitbiasramp()` perform well on average. However, there's no guarantee that this will work well in any given circumstance. User discretion is advised.
4. What to do about $\sigma_R$.
1. Simulation in [Methot and Taylor (2011)](http://www.nrcresearchpress.com/doi/abs/10.1139/f2011-092) looked at a few options.
i. Estimating $\sigma_R$ . **Performed well under ideal circumstances.**
ii. Tune $\sigma_R$ to match the observed variability in recruitment deviations. $$ {\sigma_R}^2 = \text{sd}(r')^2 $$ **Performed less well.**
iii. Tune $\sigma_R$ so that $${\sigma_R}^2 = \text{sd}(r')^2 + \text{mean}(\text{SE}(r'))^2$$ where $\text{sd}(r')$ is the standard deviation of the vector of estimated recruitments over a range of years that seem reasonably well informed by the data, and $\text{mean}(\text{SE}(r'))$ is the mean of the estimated variability of those values. **Performed best.**
5. MCMC
a. If you can get MCMC to work for your model, all the worry about bias adjustment goes away. You would need to either estimate $\sigma_R$ in the MCMC and hope it converges, or fix it at a value that has been tuned to the MLE.
6. Autocorrelated recruitment deviations
a. Estimate `# SR_autocorr` (Stock-Recruit parameter in control file)
b. Here are results from hake in 2019 from the ss_new file: `-1 1 -0.155106 0 99 0 6 0 0 0 0 0 0 0 # SR_autocorr`
c. [Johnson et al. (2016)](https://www.sciencedirect.com/science/article/abs/pii/S0165783616301928) estimated autocorrelation in a simulation context
## Rockfish steepness profile
Up until the 2019 assessment cycle, a meta-analysis approach was used to develop a prior for steepness for West Coast rockfish species. This method was originally developed to develop this prior by Martin Dorn in 2007, and updated by Dorn in 2009/2011. It was then revised and recoded by James Thorson, who updated it in 2013/2015/2017, with Chantel Wetzel conducting the update in 2019. When the meta-analysis was updated post the 2017 assessment cycle, the estimated mean from the prior distribution was considered unreasonably high, particularly for rockfish species, and the new prior was not approved by the SSC for use in future West Coast rockfish assessments. In the instance that the SSC does not approve a new estimate (or methodology), the approved approach reverts to the previous approved estimate. In this instance that is the prior estimated for use in the 2017 assessment cycle of a mean = 0.72 with a standard deviation of 0.158.
Below is detailed information regarding the meta-analysis approach applied between 2007 - 2017:
- The method is fully described in [Thorson, Dorn, and Hamel (2018) Fisheries Research](https://doi.org/10.1016/j.fishres.2018.03.014)
- The method can be replicated for any data set 2007/2009/2011/2013/2015/2017 using R package [{SebastesSteepness}](https://github.com/James-Thorson/SebastesSteepness)
- The rationale for excluding a species when developing a prior for a given focal species (termed "Type-C" meta-analysis) is explained [in this paper](http://www.nrcresearchpress.com/doi/abs/10.1139/f04-245)
- If you are assessing any of the following species you will need to obtain a Type-C prior: aurora, black, bocaccio, canary, chilipepper, darkblotched, gopher, splitnose, widow, yelloweye, yellowtail north
- The estimated prior by year was as follows:
- 2007: mean = 0.58, sd = 0.181
- 2009: mean = 0.69, sd = 0.218
- 2011: mean = 0.76, sd = 0.170
- 2013: mean = 0.78, sd = 0.152
- 2015: mean = 0.77, sd = 0.147
- 2017: mean = 0.72, sd = 0.158