ard_survival_survfit fails when stratification variable level includes a "=" #252

Open
dchiu911 opened this issue Jan 6, 2025 · 1 comment · May be fixed by #253


dchiu911 commented Jan 6, 2025

What happened?

In L384 we see that the pivoting of the survfit result on the strata levels splits on every "=" delimiter. However, a strata level can itself contain "=", e.g. when comparing age "<60" vs. ">=60". The reprex below demonstrates the error thrown.

library(survival)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(cardx)
lung2 <- lung %>% 
  mutate(age_bin = factor(ifelse(age < 60, "<60", ">=60")))
survfit(Surv(time, status) ~ age_bin, data = lung2) %>% 
  ard_survival_survfit(times = 100)
#> Error in `tidyr::separate_wider_delim()`:
#> ! Expected 2 pieces in each element of `strata`.
#> ! 5 values were too long.
#> ℹ Use `too_many = "debug"` to diagnose the problem.
#> ℹ Use `too_many = "drop"/"merge"` to silence this message.

Created on 2025-01-06 with reprex v2.1.1
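To make the failure mode concrete, here is a minimal base R sketch of what happens when a strata label is split on every "=". The two label strings are assumptions modelled on the reprex (the form `variable=level`); they are typed in by hand rather than extracted from a fitted object.

```r
# Assumed strata labels of the form "variable=level", as in the reprex above
strata <- c("age_bin=<60", "age_bin=>=60")

# Splitting on every literal "=" gives a different number of pieces per label
pieces <- strsplit(strata, "=", fixed = TRUE)
n_pieces <- lengths(pieces)

print(pieces)
print(n_pieces)
```

The second label yields three pieces instead of two, which is consistent with the "Expected 2 pieces" error reported by `tidyr::separate_wider_delim()`.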

A more conservative approach to splitting the stratification variable name and its levels could be to separate by the first "=" sign:

ret %>% tidyr::separate_wider_regex("strata",
                                    patterns = c(
                                      group1 = ".*",
                                      "(?<=[[:alnum:]])\\=(?=.*)",
                                      group1_level = ".*"
                                    ))

There are probably better ways to write the regex.
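One possible simplification, offered as a sketch rather than as the fix adopted in cardx: match the variable name as "everything up to the first `=`" with `[^=]+`, so no lookbehind is needed. The `ret` data frame below is a hypothetical stand-in with only a `strata` column; the column names `group1` and `group1_level` follow the snippet above.

```r
library(tidyr)

# Hypothetical stand-in for the intermediate result; only `strata` is modelled
ret <- data.frame(strata = c("age_bin=<60", "age_bin=>=60"))

out <- separate_wider_regex(
  ret,
  "strata",
  patterns = c(
    group1 = "[^=]+",     # variable name: everything before the first "="
    "=",                  # the first "=" is the delimiter (unnamed, so dropped)
    group1_level = ".*"   # the level: everything after it, further "=" included
  )
)

print(out)
```

This assumes the variable name itself never contains "=". A non-syntactic name that does would still be split in the wrong place.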

@ddsjoberg
Collaborator

Thanks @dchiu911 for the report! I can't recall the exact details, but I thought we were using stats::terms() to parse the variable names. @edelarua when you have a moment, can you investigate?
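For reference, a rough sketch of how `stats::terms()` could be used to avoid pattern-matching on "=" altogether: take the variable name from the model formula, then strip the known `name=` prefix from each label. This is an illustration under assumptions, not a description of what cardx does. The label strings are typed in by hand, and `strata_labels`, `prefix`, and `levels_only` are hypothetical names.

```r
# The formula from the reprex; terms() only parses it, so Surv() is not evaluated
fit_formula <- Surv(time, status) ~ age_bin

# Right-hand-side variable names recovered from the formula
strata_vars <- attr(stats::terms(fit_formula), "term.labels")

# Assumed strata labels of the form "variable=level"
strata_labels <- c("age_bin=<60", "age_bin=>=60")

# Remove the known "variable=" prefix instead of searching for a delimiter
prefix <- paste0(strata_vars[1], "=")
stopifnot(all(startsWith(strata_labels, prefix)))
levels_only <- substring(strata_labels, nchar(prefix) + 1)

print(strata_vars)
print(levels_only)
```

Because the prefix is known in advance, any "=" inside the level is left untouched. Models with several stratification variables would need the labels split per variable first, which this sketch does not cover.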
