Error with rbind when using different dataset (it works when using same data.frame) #828

Open
Melkiades opened this issue Feb 22, 2024 · 8 comments

Comments

@Melkiades
Contributor

Melkiades commented Feb 22, 2024

library(rtables)
#> Loading required package: formatters
#> Loading required package: magrittr
#> 
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#> 
#>     str
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
tbl <- basic_table() %>%
  summarize_row_groups() %>%
  build_table(starwars)
tbl2 <- basic_table() %>%
  split_rows_by("vs") %>%
  analyze("cyl") %>%
  build_table(mtcars %>% mutate(vs = factor(vs)))
rbind(tbl, tbl2)
#> Error in chk_compat_cinfos(x[[1]], xi): Column structures not compatible: 2nd column structure has non-matching, non-null column counts

Created on 2024-02-22 with reprex v2.1.0

Similarly:

library(rtables)
#> Loading required package: formatters
#> Loading required package: magrittr
#> 
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#> 
#>     str
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
tbl <- basic_table() %>%
  build_table(mtcars)
tbl2 <- basic_table() %>%
  split_rows_by("vs") %>%
  analyze("cyl") %>%
  build_table(mtcars %>% mutate(vs = factor(vs)))
rbind(tbl, tbl2)
#> Error in (function (cond) : error in evaluating the argument 'x' in selecting a method for function 'toString': all(mapdf$row_num == rinfo$abs_rownumber) is not TRUE

Created on 2024-02-22 with reprex v2.1.0

@lanlanlaura

Hi @Melkiades,
Have you solved this problem? I have the same problem. Thank you.

@Melkiades
Contributor Author

Not directly. Let's say that we usually use rbind when we do not know how to create a table in a single layout, but ideally it should be used only when the tables contain different data. For example, the second issue can be solved with:

tbl <- basic_table() %>%
  # build_table(mtcars)
# tbl2 <- basic_table() %>%
  split_rows_by("vs") %>%
  analyze("cyl") %>%
  build_table(mtcars %>% mutate(vs = factor(vs)))

In general, keep in mind to use nested = FALSE to restart the split process (i.e. to unnest the rows) and table_names = "<something>" to fit more than one analyze call into the same table (not strictly necessary). I am taking a look at it anyway.
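
As a rough illustration of the suggestion above, the sketch below builds two row sections in a single layout, so no rbind is needed. It relies only on the built-in mtcars data. The choice of variables (mpg overall, cyl by vs), the use of mean as the analysis function, and the table name "cyl_by_vs" are assumptions made for illustration and are not part of the original report.

```r
library(rtables)

# Illustrative data: mtcars with vs turned into a factor, as in the reprex
cars <- transform(mtcars, vs = factor(vs))

lyt <- basic_table() |>
  # First section: overall mean of mpg (variable choice is an assumption)
  analyze("mpg", afun = mean, format = "xx.x") |>
  # nested = FALSE restarts the row structure at the top level
  split_rows_by("vs", nested = FALSE) |>
  # Second section: mean of cyl within each level of vs
  analyze("cyl", afun = mean, format = "xx.x", table_names = "cyl_by_vs")

tbl <- build_table(lyt, cars)
tbl
```

Because both sections come from one build_table call, they share the same column structure by construction.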

@superliqianqian1

superliqianqian1 commented Nov 12, 2024

Following up on this question: if I still need to use two separate alt_counts_df inside one table, how can I solve the rbind issue? nest = FALSE does not work properly in my table. My table contains two parts, and each part needs a different denominator, supplied through alt_counts_df in build_table(df, alt_counts_df = col_n). How can I solve this problem? Please indicate how to modify the parameters of these two tables so that rbind works properly. Below is an example:

DM_modify <- DM %>%
  dplyr::mutate(
    label_STRATA_B = (STRATA1 == 'B'),
    label_STRATA_C = (STRATA1 == 'C')
  ) %>%
  formatters::var_relabel(
    label_STRATA_B = "Participants in STRAT: B",
    label_STRATA_C = "Participants in STRAT: C"
  )

N_1 <- DM_modify %>%
  dplyr::filter(ARM == 'A: Drug X') %>%
  dplyr::select(ARM, RACE, ID, AGE) %>%
  dplyr::distinct()

N_2 <- DM_modify %>%
  dplyr::filter(RACE != "WHITE") %>%
  dplyr::select(ARM, RACE, ID, AGE) %>%
  dplyr::distinct()

tbl_1 <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  tern::count_patients_with_flags(
    var = "ID",
    flag_variables = "label_STRATA_B",
    denom = c("N_col"),
    .stats = c("count_fraction")
  ) %>%
  build_table(DM_modify, alt_counts_df = N_1)

tbl_2 <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  tern::count_patients_with_flags(
    var = "ID",
    flag_variables = "label_STRATA_C",
    denom = c("N_col"),
    .stats = c("count_fraction")
  ) %>%
  build_table(DM_modify, alt_counts_df = N_2)

tbl_1
tbl_2

tbl <- rbind(tbl_1, tbl_2)

The error shown is:
Error in chk_compat_cinfos(x[[1]], xi) :
Column structures not compatible: 2nd column structure has non-matching, non-null column counts
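
One possible workaround, offered as a sketch rather than a confirmed fix: since the error message points at the column counts stored on each table, aligning those counts before binding should let rbind go through. The example uses mtcars instead of DM_modify, and the layouts, variables and subset are assumptions made for illustration. Cell values are computed when build_table runs, so overwriting the counts afterwards only changes the stored header counts, which then no longer reflect the second table's denominators.

```r
library(rtables)

# Illustrative data: mtcars with am as a factor for the column split
cars <- transform(mtcars, am = factor(am))

lyt_a <- basic_table() |>
  split_cols_by("am") |>
  analyze("mpg", afun = mean, format = "xx.x")

lyt_b <- basic_table() |>
  split_cols_by("am") |>
  analyze("hp", afun = mean, format = "xx.x")

tbl_a <- build_table(lyt_a, cars)
# Hypothetical alternative denominator: only cars with more than 4 cylinders
tbl_b <- build_table(lyt_b, cars, alt_counts_df = cars[cars$cyl > 4, ])

# The stored column counts differ between the two tables
col_counts(tbl_a)
col_counts(tbl_b)

# Overwrite the second table's counts with the first table's, then bind
col_counts(tbl_b) <- col_counts(tbl_a)
tbl <- rbind(tbl_a, tbl_b)
tbl
```

If the header counts matter for the final output, this approach is probably not suitable, because only one set of counts can be kept on the combined table.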

@shajoezhu
Collaborator

Hi @superliqianqian1, please consider the following code:

> basic_table() %>%
+     split_cols_by(var = "ARM") %>%
+     split_rows_by(var = "STRATA1") %>%
+     tern::summarize_num_patients(
+         var = "ID", .stats = ("unique")
+     ) %>%
+     build_table( DM_modify)
                                             A: Drug X    B: Placebo   C: Combination
—————————————————————————————————————————————————————————————————————————————————————
Number of patients with at least one event   36 (29.8%)   33 (31.1%)     45 (34.9%)  
Number of patients with at least one event   41 (33.9%)   40 (37.7%)     38 (29.5%)  
Number of patients with at least one event   44 (36.4%)   33 (31.1%)     46 (35.7%)  

@superliqianqian1

superliqianqian1 commented Nov 12, 2024

Thanks for the reply, but the percentages from .stats = ("unique") differ from my final result. I only provided a simple example for you to check. In the real case, my table needs more flexible denominators than those of tern::summarize_num_patients(), which is the main reason I need to combine results from different build_table(df, alt_counts_df) calls. But rbind does not work. I think some information is kept in the final tables, so rbind refuses to bind them, but I do not know how to remove that extra information.

@lanlanlaura

lanlanlaura commented Nov 12, 2024

Hi @Melkiades, @shajoezhu,
Thank you very much for the prompt reply. The table I need to create is quite special: I create it in two separate parts, because each part needs a different denominator. I think that if the structures of the two table trees are the same, the rbind function should work. Please find my dummy code below; I think it describes my problem in more detail.
As you can see, "tbl_recipe1" and "tbl_recipe2" both use N_col as the denominator and their table tree structures are the same, but N_col is different.

When I combine those two tables, there is an error.

I know there may be some limitations when using the rbind function to combine two tables. I am not very familiar with the tern and rtables packages. Do you have any suggestions for generating this special table? Thank you very much.

library(random.cdisc.data)
library(dplyr)
library(rtables)
library(haven)
library(purrr)
library(stringr)
library(tern)
library(forcats)
library(formatters)

adsl_cdisc1 <- random.cdisc.data::cadsl
adex_cdisc1 <- random.cdisc.data::cadex

adsl_cdisc2 <- dplyr::filter(adsl_cdisc1, SAFFL == "Y") |> 
  dplyr::select(USUBJID, TRT01A) |> 
  dplyr::mutate(
    TRT01A = forcats::fct_inorder(TRT01A)
  )

adex_cdisc2 <- dplyr::select(
  adex_cdisc1, USUBJID, SAFFL, STRATA1, AVISIT) |>
  dplyr::left_join(adsl_cdisc2, by = "USUBJID") |> 
  dplyr::mutate(
    EXP = dplyr::case_when(
      !is.na(AVISIT) ~ "Number of exposed participants",
      TRUE ~ NA_character_
    ),
    STRA = dplyr::case_when(
      STRATA1 == "B" ~ "Number of records with STRATA1 = B by visit",
      TRUE ~ NA_character_
    ),
    SORT = paste(USUBJID, AVISIT)
  ) |> 
  dplyr::filter(SAFFL == "Y", !is.na(AVISIT))

adex_cdisc3 <- dplyr::filter(adex_cdisc1, SAFFL == "Y") |>
  dplyr::select(USUBJID, AVISIT) |>
  dplyr::right_join(adsl_cdisc2, by = "USUBJID") |> 
  dplyr::mutate(
    SORT = paste(USUBJID, AVISIT)
  ) |> 
  distinct(SORT, TRT01A)

tbl_recipe1 <- basic_table() |> 
  add_colcounts() |>
  split_cols_by("TRT01A") |> 
  count_occurrences("EXP",
                    id = "USUBJID",
                    .stats = "count_fraction"
                    ) |>
  rtables::build_table(adex_cdisc2, alt_counts_df = adsl_cdisc2)

tbl_recipe2 <- basic_table() |>
  add_colcounts() |>
  split_cols_by("TRT01A") |>
  count_occurrences("STRA",
                    id = "SORT",
                    .stats = "count_fraction"
                    ) |>
  rtables::build_table(adex_cdisc2, alt_counts_df = adex_cdisc3) 

tbl <- rbind(
  tbl_recipe1,
  rtables::rrow(""),
  tbl_recipe2
) 

@shajoezhu
Collaborator

Hi @lanlanlaura, can I suggest we move the discussion of this topic to StackOverflow, https://stackoverflow.com/questions/tagged/nest-rtables? You can create a new topic there, and we can follow up there. We are trying to build a user community whose members can support each other in the future. Thank you.

@lanlanlaura

Hi @shajoezhu, I posted this topic on StackOverflow with the tag [nest-rtables], and it is pending review. Thank you.
