consider adding option to include missing report for all variables #23

Open
ericpgreen opened this issue Jan 6, 2020 · 2 comments
@ericpgreen

The `nhanes_2010` dataset has 1417 obs, but the summary table will indicate a total of 940 obs (complete cases) unless we specify `na.rm = FALSE`. The documentation says this about the `na.rm` parameter:

> when set to FALSE it also shows how many missing values are in the data for each categorical variable being summarized

And that's what this does...

```r
library(furniture)
library(tidyverse)
data("nhanes_2010")

nhanes_2010 %>%
  furniture::table1("Age Mean (SD)" = age,
                    "Health" = gen_health,
                    "Sex" = gender,
                    "Cancer" = cancer,
                    "Asthma" = asthma,
                    test = TRUE,
                    output = "html",
                    na.rm = FALSE,
                    total = TRUE,
                    type = "condense")
```

We see that 155 obs are missing on `gen_health`. But I'm wondering if there could be an option to also show missing counts for all variables. This table gives me the impression that we're only missing data for the `gen_health` variable, but that's not the case.

`table(nhanes_2010$cancer, useNA = "always")` returns 344 missing

`table(nhanes_2010$asthma, useNA = "always")` returns 2 missing

There are no missing values on the numeric variable `age`, but that would be useful to know too.
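In the meantime, a per-variable missing count can be put together in base R. This is only a minimal sketch: `missing_report()` is a hypothetical helper, not part of furniture, and the `toy` data frame below just stands in for `nhanes_2010` so the example runs without the package.

```r
# Hypothetical helper (not a furniture function): count NAs for each column.
missing_report <- function(data, vars = names(data)) {
  # is.na() on a data frame gives a logical matrix; colSums() counts TRUEs
  n_missing <- colSums(is.na(data[, vars, drop = FALSE]))
  data.frame(
    variable    = names(n_missing),
    n_missing   = as.integer(n_missing),
    pct_missing = round(100 * as.numeric(n_missing) / nrow(data), 1),
    row.names   = NULL
  )
}

# Toy data standing in for nhanes_2010 (values are made up for illustration)
toy <- data.frame(
  age    = c(30, 41, 25, 60),
  asthma = factor(c("Yes", NA, "No", "No")),
  cancer = factor(c(NA, NA, "No", "Yes"))
)

report <- missing_report(toy)
print(report)
```

With the real data, something like `missing_report(nhanes_2010, c("age", "gen_health", "gender", "cancer", "asthma"))` should give the counts for the variables in the table above, including the numeric ones.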

@TysonStanley
Owner

Yes, this is something I've been wanting to work on but haven't had the time yet. Your post will help push it toward the top of the priority list. For a short-term fix (because it would be more straightforward with formatting), I was thinking of adding the missing count for the continuous variables next to the name of the variable. Something like:

----------------------------------
                      Mean/SD
Var1 (missing = 34)   2.5 (3.2)
----------------------------------

What do you think?
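To make the idea concrete, the label itself could be built along these lines. This is a rough sketch only: `label_with_missing()` is a hypothetical helper, not an existing furniture function, and it does not show how `table1()` would consume the label internally.

```r
# Hypothetical helper: append the NA count of a variable to its display name.
label_with_missing <- function(x, name) {
  sprintf("%s (missing = %d)", name, sum(is.na(x)))
}

# Made-up numeric vector with two missing values
var1 <- c(2.1, NA, 3.4, NA, 1.9)

lbl <- label_with_missing(var1, "Var1")
print(lbl)
```

Keeping the count in the row label means the body of the table (the Mean/SD column) would not need any layout changes.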

@ericpgreen
Author

ericpgreen commented Jan 6, 2020 via email
