forked from SurgicalInformatics/healthyr_book
-
Notifications
You must be signed in to change notification settings - Fork 0
/
10_survival.Rmd
636 lines (487 loc) · 24.1 KB
/
10_survival.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
# Time-to-event data and survival{#chap10-h1}
\index{time-to-event / survival@\textbf{time-to-event / survival}}
> The reports of my death have been greatly exaggerated.
> Mark Twain
In healthcare, we deal with a lot of binary outcomes.
Death yes/no or disease recurrence yes/no for instance.
These outcomes are often easily analysed using binary logistic regression as described in the previous chapter.
When the time taken for the outcome to occur is important, we need a different approach.
For instance, in patients with cancer, the time taken until recurrence of the cancer is often just as important as the fact it has recurred.
## The Question
We will again use the classic "Survival from Malignant Melanoma" dataset included in the **boot** package which we have used previously.
The data consist of measurements made on patients with malignant melanoma.
Each patient had their tumour removed by surgery at the Department of Plastic Surgery, University Hospital of Odense, Denmark, during the period 1962 to 1977.
We are interested in the association between tumour ulceration and survival after surgery.
```{r echo=FALSE, message=FALSE}
library(knitr)
library(kableExtra)
mykable = function(x, caption = "CAPTION", ...){
kable(x, row.names = FALSE, align = c("l", "l", "r", "r", "r", "r", "r", "r", "r"),
booktabs = TRUE, caption = caption,
linesep = c("", "", "\\addlinespace"), ...) %>%
kable_styling(latex_options = c("scale_down", "hold_position"))
}
```
## Get and check the data
```{r message=FALSE}
library(tidyverse)
library(finalfit)
melanoma <- boot::melanoma #F1 here for help page with data dictionary
```
```{r eval=FALSE}
glimpse(melanoma)
missing_glimpse(melanoma)
ff_glimpse(melanoma)
```
As was seen before, all variables are coded as numeric and some need recoding to factors.
This is done below for those we are interested in.
## Death status
`status` is the patient's status at the end of the study.
* 1 indicates that they had died from melanoma;
* 2 indicates that they were still alive and;
* 3 indicates that they had died from causes unrelated to their melanoma.
There are three options for coding this.
* Overall survival: considering all-cause mortality, comparing 2 (alive) with 1 (died melanoma)/3 (died other);
* Cause-specific survival: considering disease-specific mortality comparing 2 (alive)/3 (died other) with 1 (died melanoma);
* Competing risks: comparing 2 (alive) with 1 (died melanoma) accounting for 3 (died other); see more below.
## Time and censoring
\index{time-to-event / survival@\textbf{time-to-event / survival}!censoring}
`time` is the number of days from surgery until either the occurrence of the event (death) or the last time the patient was known to be alive.
For instance, if a patient had surgery and was seen to be well in a clinic 30 days later, but there had been no contact since, then the patient's status would be considered alive at 30 days.
This patient is censored from the analysis at day 30, an important feature of time-to-event analyses.
## Recode the data
```{r}
library(dplyr)
library(forcats)
melanoma <- melanoma %>%
mutate(
# Overall survival
status_os = if_else(status == 2, 0, # "still alive"
1), # "died of melanoma" or "died of other causes"
# Diease-specific survival
status_dss = if_else(status == 2, 0, # "still alive"
if_else(status == 1, 1, # "died of melanoma"
0)), # "died of other causes is censored"
# Competing risks regression
status_crr = if_else(status == 2, 0, # "still alive"
if_else(status == 1, 1, # "died of melanoma"
2)), # "died of other causes"
# Label and recode other variables
age = ff_label(age, "Age (years)"), # ff_label table friendly labels
thickness = ff_label(thickness, "Tumour thickness (mm)"),
sex = factor(sex) %>%
fct_recode("Male" = "1",
"Female" = "0") %>%
ff_label("Sex"),
ulcer = factor(ulcer) %>%
fct_recode("No" = "0",
"Yes" = "1") %>%
ff_label("Ulcerated tumour")
)
```
## Kaplan Meier survival estimator
\index{time-to-event / survival@\textbf{time-to-event / survival}!Kaplan Meier estimator}
\index{Kaplan Meier estimator}
\index{time-to-event / survival@\textbf{time-to-event / survival}!log-rank test}
\index{log-rank test}
We will use the excellent **survival** package to produce the Kaplan Meier (KM) survival estimator (@therneau2000, @therneau2020).
This is a non-parametric statistic used to estimate the survival function from time-to-event data.
```{r}
library(survival)
survival_object <- melanoma %$%
Surv(time, status_os)
# Explore:
head(survival_object) # + marks censoring, in this case "Alive"
# Expressing time in years
survival_object <- melanoma %$%
Surv(time/365, status_os)
```
\index{functions@\textbf{functions}!Surv}
### KM analysis for whole cohort
### Model
The survival object is the first step to performing univariable and multivariable survival analyses.
If you want to plot survival stratified by a single grouping variable, you can substitute "survival_object ~ 1" by "survival_object ~ factor"
```{r}
# Overall survival in whole cohort
my_survfit <- survfit(survival_object ~ 1, data = melanoma)
my_survfit # 205 patients, 71 events
```
\index{functions@\textbf{functions}!survfit}
### Life table
\index{time-to-event / survival@\textbf{time-to-event / survival}!life table}
A life table is the tabular form of a KM plot, which you may be familiar with.
It shows survival as a proportion, together with confidence limits.
The whole table is shown with, `summary(my_survfit)`.
```{r}
summary(my_survfit, times = c(0, 1, 2, 3, 4, 5))
# 5 year overall survival is 73%
```
## Kaplan Meier plot
\index{time-to-event / survival@\textbf{time-to-event / survival}!Kaplan Meier plot}
We can plot survival curves using the **finalfit** wrapper for the package **survminer**.
There are numerous options available on the help page.
You should always include a number-at-risk table under these plots as it is essential for interpretation.
As can be seen, the probability of dying is much greater if the tumour was ulcerated, compared to those that were not ulcerated.
```{r, fig.width = 4, fig.height = 4}
dependent_os <- "Surv(time/365, status_os)"
explanatory <- c("ulcer")
melanoma %>%
surv_plot(dependent_os, explanatory, pval = TRUE)
```
\index{functions@\textbf{functions}!surv\_plot}
\index{plotting@\textbf{plotting}!surv\_plot}
## Cox proportional hazards regression
\index{time-to-event / survival@\textbf{time-to-event / survival}!Cox proportional hazards regression}
\index{Cox proportional hazards regression}
The Cox proportional hazards model is a regression model similar to those we have already dealt with.
It is commonly used to investigate the association between the time to an event (such as death) and a set of explanatory variables.
Cox proportional hazards regression can be performed using `survival::coxph()` or the all-in-one `finalfit()` function.
The latter produces a table containing counts (proportions) for factors, mean (SD) for continuous variables and a univariable and multivariable CPH regression.
### `coxph()`
CPH using the `coxph()` function produces a similar output to `lm()` and `glm()`, so it should be familiar to you now.
It can be passed to `summary()` as below, and also to `broom::tidy()` if you want to get the results into a tibble.
```{r}
library(survival)
coxph(Surv(time, status_os) ~ age + sex + thickness + ulcer, data = melanoma) %>%
summary()
```
The output shows the number of patients and the number of events.
The coefficient can be exponentiated and interpreted as a **hazard ratio**, `exp(coef)`.
Helpfully, 95% confidence intervals are also provided.
A hazard is the term given to the rate at which events happen.
The probability that an event will happen over a period of time is the hazard multiplied by the time interval.
An assumption of CPH is that hazards are constant over time (see below).
For a given predictor then, the hazard in one group (say males) would be expected to be a constant proportion of the hazard in another group (say females).
The ratio of these hazards is, unsurprisingly, the hazard ratio.
The hazard ratio differs from the relative risk and odds ratio.
The hazard ratio represents the difference in the risk of an event at any given time, whereas the relative risk or odds ratio usually represents the cumulative risk over a period of time.
### `finalfit()`
Alternatively, a CPH regression can be run with **finalfit** functions.
This is convenient for model fitting, exploration and the export of results.
```{r eval=FALSE}
dependent_os <- "Surv(time, status_os)"
dependent_dss <- "Surv(time, status_dss)"
dependent_crr <- "Surv(time, status_crr)"
explanatory <- c("age", "sex", "thickness", "ulcer")
melanoma %>%
finalfit(dependent_os, explanatory)
```
```{r include=FALSE}
dependent_os <- "Surv(time, status_os)"
dependent_dss <- "Surv(time, status_dss)"
dependent_crr <- "Surv(time, status_crr)"
explanatory <- c("age", "sex", "thickness", "ulcer")
melanoma %>%
finalfit(dependent_os, explanatory) %>%
mykable(caption = "Univariable and multivariable Cox Proportional Hazards: Overall survival following surgery for melanoma by patient and tumour variables.")
```
The labelling of the final table can be adjusted as desired.
```{r eval=FALSE}
melanoma %>%
finalfit(dependent_os, explanatory, add_dependent_label = FALSE) %>%
rename("Overall survival" = label) %>%
rename(" " = levels) %>%
rename(" " = all)
```
```{r echo=FALSE}
melanoma %>%
finalfit(dependent_os, explanatory, add_dependent_label = FALSE) %>%
rename("Overall survival" = label) %>%
rename(" " = levels) %>%
rename(" " = all) %>%
mykable(caption = "Univariable and multivariable Cox Proportional Hazards: Overall survival following surgery for melanoma by patient and tumour variables (tidied).")
```
### Reduced model
If you are using a backwards selection approach or similar, a reduced model can be directly specified and compared.
The full model can be kept or dropped.
```{r eval=FALSE}
explanatory_multi <- c("age", "thickness", "ulcer")
melanoma %>%
finalfit(dependent_os, explanatory,
explanatory_multi, keep_models = TRUE)
```
```{r echo=FALSE}
explanatory_multi <- c("age", "thickness", "ulcer")
melanoma %>%
finalfit(dependent_os, explanatory,
explanatory_multi, keep_models = TRUE) %>%
mykable(caption = "Cox Proportional Hazards: Overall survival following surgery for melanoma with reduced model.")
```
### Testing for proportional hazards
\index{time-to-event / survival@\textbf{time-to-event / survival}!assumptions}
\index{time-to-event / survival@\textbf{time-to-event / survival}!testing for proportional hazards}
An assumption of CPH regression is that the hazard (think risk) associated with a particular variable does not change over time.
For example, is the magnitude of the increase in risk of death associated with tumour ulceration the same in the early post-operative period as it is in later years?
The `cox.zph()` function from the **survival** package allows us to test this assumption for each variable.
The plot of scaled Schoenfeld residuals should be a horizontal line.
The included hypothesis test identifies whether the gradient differs from zero for each variable.
No variable significantly differs from zero at the 5% significance level.
```{r, fig.width = 4, fig.height = 4}
explanatory <- c("age", "sex", "thickness", "ulcer", "year")
melanoma %>%
coxphmulti(dependent_os, explanatory) %>%
cox.zph() %>%
{zph_result <<- .} %>%
plot(var=5)
zph_result
```
\index{functions@\textbf{functions}!cox.zph}
### Stratified models
\index{time-to-event / survival@\textbf{time-to-event / survival}!stratified models}
One approach to dealing with a violation of the proportional hazards assumption is to stratify by that variable.
Including a `strata()` term will result in a separate baseline hazard function being fit for each level in the stratification variable. It will be no longer possible to make direct inference on the effect associated with that variable.
This can be incorporated directly into the explanatory variable list.
```{r eval=FALSE}
explanatory <- c("age", "sex", "ulcer", "thickness",
"strata(year)")
melanoma %>%
finalfit(dependent_os, explanatory)
```
```{r echo=FALSE}
explanatory <- c("age", "sex", "ulcer", "thickness",
"strata(year)")
melanoma %>%
finalfit(dependent_os, explanatory) %>%
mykable(caption = "Cox Proportional Hazards: Overall survival following surgery for melanoma stratified by year of surgery.")
```
\index{functions@\textbf{functions}!strata}
### Correlated groups of observations
\index{time-to-event / survival@\textbf{time-to-event / survival}!correlated groups}
\index{time-to-event / survival@\textbf{time-to-event / survival}!mixed effects}
\index{time-to-event / survival@\textbf{time-to-event / survival}!random effects}
\index{time-to-event / survival@\textbf{time-to-event / survival}!multilevel}
\index{time-to-event / survival@\textbf{time-to-event / survival}!cluster}
\index{time-to-event / survival@\textbf{time-to-event / survival}!frailty}
As a general rule, you should always try to account for any higher structure in your data within the model.
For instance, patients may be clustered within particular hospitals.
There are two broad approaches to dealing with correlated groups of observations.
Adding a `cluster()` term is similar to a generalised estimating equations (GEE) approach (something we're not covering in this book).
Here, a standard CPH model is fitted but the standard errors of the estimated hazard ratios are adjusted to account for correlations.
A `frailty()` term implies a mixed effects model, where specific random effects term(s) are directly incorporated into the model.
Both approaches achieve the same goal in different ways.
Volumes have been written on GEE vs mixed effects models and we won't rehearse them in this introductory book.
We favour the latter approach because of its flexibility and our preference for mixed effects modelling in generalised linear modelling.
Note `cluster()` and `frailty()` terms cannot be combined in the same model.
```{r eval=FALSE}
# Simulate random hospital identifier
melanoma <- melanoma %>%
mutate(hospital_id = c(rep(1:10, 20), rep(11, 5)))
# Cluster model
explanatory <- c("age", "sex", "thickness", "ulcer",
"cluster(hospital_id)")
melanoma %>%
finalfit(dependent_os, explanatory)
```
```{r echo=FALSE}
# Simulate random hospital identifier
melanoma <- melanoma %>%
mutate(hospital_id = c(rep(1:10, 20), rep(11, 5)))
# Cluster model
explanatory <- c("age", "sex", "thickness", "ulcer",
"cluster(hospital_id)")
melanoma %>%
finalfit(dependent_os, explanatory) %>%
mykable(caption = "Cox Proportional Hazards: Overall survival following surgery for melanoma with robust standard errors (cluster model).")
```
```{r eval=FALSE}
# Frailty model
explanatory <- c("age", "sex", "thickness", "ulcer",
"frailty(hospital_id)")
melanoma %>%
finalfit(dependent_os, explanatory)
```
```{r echo=FALSE}
# Frailty model
explanatory <- c("age", "sex", "thickness", "ulcer",
"frailty(hospital_id)")
melanoma %>%
finalfit(dependent_os, explanatory) %>%
mykable(caption = "Cox Proportional Hazards: Overall survival following surgery for melanoma (frailty model).")
```
The `frailty()` method here is being superseded by the **coxme** package, and we look forward to incorporating this in the future.
### Hazard ratio plot
\index{time-to-event / survival@\textbf{time-to-event / survival}!hazard ratio plot}
\index{plots@\textbf{plots}!hazard ratio plot}
A plot of any of the above models can be produced using the `hr_plot()` function.
```{r eval=FALSE}
melanoma %>%
hr_plot(dependent_os, explanatory)
```
```{r fig.height=3, fig.width=7, message=FALSE, warnings=FALSE, fig.cap="Hazard ratio plot", include=FALSE}
library(ggplot2)
melanoma %>%
hr_plot(dependent_os, explanatory, table_text_size = 3.5,
title_text_size = 16,
plot_opts=list(xlab("HR, 95% CI"), theme(axis.title = element_text(size=12))))
```
## Competing risks regression
\index{time-to-event / survival@\textbf{time-to-event / survival}!competing risks regression}
Competing-risks regression is an alternative to CPH regression.
It can be useful if the outcome of interest may not be able to occur simply because something else (like death) has happened first.
For instance, in our example it is obviously not possible for a patient to die from melanoma if they have died from another disease first.
By simply looking at cause-specific mortality (deaths from melanoma) and considering other deaths as censored, bias may result in estimates of the influence of predictors.
The approach by Fine and Gray is one option for dealing with this.
It is implemented in the package **cmprsk**.
The `crr()` syntax differs from `survival::coxph()` but `finalfit` brings these together.
It uses the `finalfit::ff_merge()` function, which can join any number of models together.
```{r eval=FALSE}
explanatory <- c("age", "sex", "thickness", "ulcer")
dependent_dss <- "Surv(time, status_dss)"
dependent_crr <- "Surv(time, status_crr)"
melanoma %>%
# Summary table
summary_factorlist(dependent_dss, explanatory,
column = TRUE, fit_id = TRUE) %>%
# CPH univariable
ff_merge(
melanoma %>%
coxphmulti(dependent_dss, explanatory) %>%
fit2df(estimate_suffix = " (DSS CPH univariable)")
) %>%
# CPH multivariable
ff_merge(
melanoma %>%
coxphmulti(dependent_dss, explanatory) %>%
fit2df(estimate_suffix = " (DSS CPH multivariable)")
) %>%
# Fine and Gray competing risks regression
ff_merge(
melanoma %>%
crrmulti(dependent_crr, explanatory) %>%
fit2df(estimate_suffix = " (competing risks multivariable)")
) %>%
select(-fit_id, -index) %>%
dependent_label(melanoma, "Survival")
```
```{r echo=FALSE}
explanatory <- c("age", "sex", "thickness", "ulcer")
dependent_dss <- "Surv(time, status_dss)"
dependent_crr <- "Surv(time, status_crr)"
melanoma %>%
# Summary table
summary_factorlist(dependent_dss, explanatory, column = TRUE, fit_id = TRUE) %>%
# CPH univariable
ff_merge(
melanoma %>%
coxphmulti(dependent_dss, explanatory) %>%
fit2df(estimate_suffix = " (DSS CPH univariable)")
) %>%
# CPH multivariable
ff_merge(
melanoma %>%
coxphmulti(dependent_dss, explanatory) %>%
fit2df(estimate_suffix = " (DSS CPH multivariable)")
) %>%
# Fine and Gray competing risks regression
ff_merge(
melanoma %>%
crrmulti(dependent_crr, explanatory) %>%
fit2df(estimate_suffix = " (competing risks multivariable)")
) %>%
select(-fit_id, -index) %>%
dependent_label(melanoma, "Survival") %>%
mykable(caption = "Cox Proportional Hazards and competing risks regression combined.")
```
## Summary
So here we have presented the various aspects of time-to-event analysis which are commonly used when looking at survival.
There are many other applications, some of which may not be obvious: for instance we use CPH for modelling length of stay in hospital.
Stratification can be used to deal with non-proportional hazards in a particular variable.
Hierarchical structure in your data can be accommodated with cluster or frailty (random effects) terms.
Competing risks regression may be useful if your outcome is in competition with another, such as all-cause death, but is currently limited in its ability to accommodate hierarchical structures.
## Dates in R
### Converting dates to survival time
In the melanoma example dataset, we already had the time in a convenient format for survival analysis - survival time in days since the operation.
This section shows how to convert dates into "days from event".
First we will generate a dummy operation date and censoring date based on the melanoma data.
```{r, message = FALSE}
library(lubridate)
first_date <- ymd("1966-01-01") # create made-up dates for operations
last_date <- first_date +
days(nrow(melanoma)-1) # every day from 1-Jan 1966
operation_date <-
seq(from = first_date,
to = last_date, by = "1 day") # create dates
melanoma$operation_date <- operation_date # add sequence to melanoma dataset
```
Now we will create a 'censoring' date by adding `time` from the melanoma dataset to our made up operation date.
Remember the censoring date is either when an event occurred (e.g., death) or the last known alive status of the patient.
```{r}
melanoma <- melanoma %>%
mutate(censoring_date = operation_date + days(time))
# (Same as doing:):
melanoma$censoring_date <- melanoma$operation_date + days(melanoma$time)
```
Now consider if we only had the `operation date` and `censoring date`.
We want to create the `time` variable.
```{r}
melanoma <- melanoma %>%
mutate(time_days = censoring_date - operation_date)
```
The `Surv()` function expects a number (`numeric` variable), rather than a `date` object, so we'll convert it:
```{r eval=FALSE}
# This doesn't work
# Surv(melanoma$time_days, melanoma$status==1)
melanoma <- melanoma %>%
mutate(time_days_numeric = as.numeric(time_days))
# This works as exepcted.
Surv(melanoma$time_days_numeric, melanoma$status.factor == "Died")
```
## Exercises
### Exercise {#chap10-ex1}
Using the above scripts, perform a univariable Kaplan Meier analysis to determine if `ulcer` influences overall survival. Hint: `survival_object ~ ulcer`.
Try modifying the plot produced (see Help for ggsurvplot). For example:
* Add in a median survival line: `surv.median.line="hv"`
* Alter the plot legend: `legend.title = "Ulcer Present", legend.labs = c("No", "Yes")`
* Change the y-axis to a percentage: `ylab = "Probability of survival (%)", surv.scale = "percent"`
* Display follow-up up to 10 years, and change the scale to 1 year: `xlim = c(0,10), break.time.by = 1)`
### Exercise {#chap10-ex2}
Create a new CPH model, but now include the variable `thickness` as a variable.
* How would you interpret the output?
* Is it an independent predictor of overall survival in this model?
* Are CPH assumptions maintained?
## Solutions
Solution to Exercise \@ref(chap10-ex1):
```{r, echo=F, fig.width=6, fig.height=6, message = FALSE}
# Fit survival model
my_survfit.solution <- survfit(survival_object ~ ulcer, data = melanoma)
summary(my_survfit.solution, times=c(0,1,2,3,4,5))
# Plot results
library(survminer)
my_survplot.solution = ggsurvplot(my_survfit.solution,
data = melanoma,
palette = "Dark2",
risk.table = TRUE,
ggtheme = theme_bw(),
conf.int = TRUE,
pval=TRUE,
# Add in a medial survival line.
surv.median.line="hv",
# Alter the plot legend (change the names)
legend.title = "Ulcer Present",
legend.labs = c("No", "Yes"),
# Change the y-axis to a percentage
ylab = "Probability of survival (%)",
surv.scale = "percent",
# Display follow-up up to 10 years, and change the scale to 1 year
xlab = "Time (years)",
# present narrower X axis, but not affect survival estimates.
#xlim = c(0,10),
# break X axis in time intervals by 1 year
break.time.by = 1)
my_survplot.solution
```
Solution to Exercise \@ref(chap10-ex2):
```{r, eval=F}
# Fit model
my_hazard = coxph(survival_object ~ sex + ulcer + age + thickness, data=melanoma)
summary(my_hazard)
# Melanoma thickness has a HR 1.11 (1.03 to 1.18).
# This is interpretted as a 11% increase in the
# risk of death at any time for each 1 mm increase in thickness.
# Check assumptions
ph = cox.zph(my_hazard)
ph
# GLOBAL shows no overall violation of assumptions.
# Plot Schoenfield residuals to evaluate PH
plot(ph, var=4)
```