Skip to content

Using the story maker R script to create tables

Suzanne Childress edited this page Apr 9, 2020 · 17 revisions

tldr version

  1. Clone or the code here: https://github.com/psrc/travel-studies/tree/master/2019/story_maker
  2. Create a file like this template and fill in the variables you want: https://github.com/psrc/travel-studies/blob/master/2019/story_maker/telework.R

What is the Story Maker Script?

The story maker script allows you to create one, two, or three dimensional tables using fields from the household travel survey. It is built for people with at least an introductory to intermediate knowledge of both travel survey and R. The code connects to data of your choosing, but currently is set to the 2017/2019 data on Elmer. It takes care of any weighting or joining for you. You simply feed in where you want the tables to output, where the code is running, and which data items you want to summarize. It uses similar functions from the Shiny App Data Explorer, but you don't need to click through so much to get what you want; you can just type.

Where is the code?

The code is part of this repository: https://github.com/psrc/travel-studies To get started, clone or download the repository. The code for story making specifically is in this directory: https://github.com/psrc/travel-studies/tree/master/2019/story_maker

What software do I need to run this script?

You need R installed, and it is recommended to use RStudio to help you navigate the code. The code uses the following libraries: data.table, tidyverse, DT, openxlsx, odbc, DBI.

How do I use the script?

  1. Create an R project in the local directory where you have downloaded the code that originages here: https://github.com/psrc/travel-studies/tree/master/2019/story_maker
  2. Make a new .R file to do your work. At the top of the file, source these two .R files:

source('global.R') source('travel_crosstab.R')

  1. In this .R file, next set the directories where you are running the code, for example:

#where you are running your R code

wrkdir<- "C:/Users/SChildress/Documents/GitHub/travel-studies/2019/story_maker"

#where you want to output tables

file_loc<- 'C:/Users/SChildress/Documents/HHSurvey/race_story'

  1. Start summarizing!

I've created a simple template here you can use if you'd like. For your ease of use, I have checked in the codebook here: https://github.com/psrc/travel-studies/blob/master/2019/story_maker/Combined_Codebook_022020.xlsx

There are two functions that you can use: summarize_simple_tables and summarize_cross_tables

summarize_simple_tables

function input: a list of variable names from the survey

function output: a set of single dimension tables in csv form to your output directory that includes, sample counts, weighted estimates, shares, and margin of error

example use:

race_data <-c('race_category')

summarize_simple_tables(race_data)

example output (partial because the table is big):

How often telecommutes sample_count Region estimate Region share Region hhid Region in Region MOE Region N_HH Region total Region estMOE Region sample_count Seattle estimate Seattle share Seattle hhid Seattle in Seattle MOE Seattle N_HH Seattle total Seattle estMOE Seattle
1 1 day a week 520 113374.259314765 0.0582333464715612 495 0.000110792371374312 0.0173149336339523 495 1946895.8283229 33710.3720596296 292 28606.0886711888 0.102371825076877 278 0.000330546167292468 0.0299076276616451 278 279433.219537766 8357.18468623026
2 2 days a week 236 67848.6152580758 0.0348496382143477 228 0.000147522547941565 0.0199799825523841 228 1946895.8283229 38898.9446812009 115 8782.51204210112 0.0314297350065572 109 0.000279283548293347 0.0274908760095146 109 279433.219537766 7681.8639912522
3 3 days a week 111 28528.3593085256 0.0146532541153476 109 0.000132463635405308 0.0189327734100857 109 1946895.8283229 36860.1375706787 68 7924.19879756404 0.0283581129354345 66 0.000417483793426904 0.0336113311564424 66 279433.219537766 9392.12247799474
4 4 days a week 65 22304.1203120341 0.0114562474209251 65 0.000174230797168549 0.0217134265584228 65 1946895.8283229 42273.7795851891 29 6648.64185701527 0.0237933122912635 29 0.000800937606257721 0.0465548835888734 29 279433.219537766 13008.9810064448
5 5 days a week 266 93152.9039995621 0.0478468866409797 234 0.000194690436238226 0.0229529341854488 234 1946895.8283229 44686.9718134205 114 9444.4572308484 0.0337986201013296 103 0.000317051197869666 0.0292907573769485 103 279433.219537766 8184.8106365403
6 6-7 days a week 70 26982.7025698519 0.0138593458249358 59 0.000231648548444756 0.0250369080220626 59 1946895.8283229 48744.2517822578 34 5384.97526435472 0.0192710633090169 30 0.00062998964759856 0.0412888330683115 30 279433.219537766 11537.4715552357
7 A few times per month 1037 237312.962533719 0.121892994520485 964 0.000111032253534558 0.0173336682173986 964 1946895.8283229 33746.8463419865 551 39983.3936634056 0.143087474458282 506 0.000242319069390182 0.0256070587875017 506 279433.219537766 7155.46287988445
8 Less than monthly 1331 289308.64950506 0.148599963745506 1194 0.00010596148619794 0.0169332345016769 1194 1946895.8283229 32967.2436113282 704 52982.684407784 0.18960767977203 629 0.000244287134409375 0.0257108360986206 629 279433.219537766 7184.46170804536
9 Never 2647 920634.582270353 0.47287305713907 2146 0.000116152902596002 0.0177288651144777 2146 1946895.8283229 34516.2535322761 1207 106547.209081957 0.381297575349865 1011 0.000233342961802353 0.0251283085027865 1011 279433.219537766 7021.68414647186
10 Not applicable 423 147448.673250956 0.075735265906842 347 0.000201727479552916 0.023364066488032 347 1946895.8283229 45487.4035782084 216 13129.0585215476 0.0469846016993449 178 0.00025155645450842 0.0260905740605903 178 279433.219537766 7290.57310933928

summarize_cross_tables

function input: two lists of variable names from the survey, and optionally, a third list of variable names, with one variable value (for creating three-dimensional data summaries)

function output: a set of two dimension tables in csv form to your output directory that includes, sample counts, weighted estimates, shares, and margin of error

example use:

2- dimensional summary

telecommute_data <-c('telecommute_freq') age_cat <-c('age_category') summarize_cross_tables(age_cat, telecommute_data)

3- dimensional summary

#Person Race by Trip Mode by Household Income

race_data <-c('race_category') mode<-c('mode_simple') inc<-c('hhincome_broad')

incomes<-values.lu[variable=='hhincome_broad']'value_text' for(this_income in incomes){ summarize_cross_tables(race_data,mode,inc, this_income) }

example output:

How often telecommutes sample_count_18-64 years Region sample_count_65 years+ Region estimate_18-64 years Region estimate_65 years+ Region estMOE_18-64 years Region estMOE_65 years+ Region share_18-64 years Region share_65 years+ Region MOE_18-64 years Region MOE_65 years+ Region N_HH_18-64 years Region N_HH_65 years+ Region sample_count_18-64 years Seattle sample_count_65 years+ Seattle estimate_18-64 years Seattle estimate_65 years+ Seattle estMOE_18-64 years Seattle estMOE_65 years+ Seattle share_18-64 years Seattle share_65 years+ Seattle MOE_18-64 years Seattle MOE_65 years+ Seattle N_HH_18-64 years Seattle N_HH_65 years+ Seattle
1 1 day a week 501 19 102942.473518851 10431.7857959136 2422.9247869864 2422.9247869864 0.907988057792276 0.0920119422077238 0.0213710307933263 0.0213710307933263 495 495 283 9 28135.1437275104 470.944943678394 359.131204885348 359.131204885348 0.983536898417269 0.0164631015827311 0.0125543624300883 0.0125543624300883 278 278
2 2 days a week 227 9 67237.3971637205 611.218094355364 698.396455494415 698.396455494411 0.990991443347363 0.00900855665263724 0.0102934518683679 0.0102934518683678 228 228 109 6 8506.42191278489 276.090129316232 241.46353435663 241.463534356631 0.968563649216451 0.0314363507835488 0.0274936752946214 0.0274936752946214 109 109
3 3 days a week 104 7 26412.658883157 2115.70042536855 1177.83864989767 1177.83864989767 0.925838692562447 0.0741613074375534 0.0412865891501052 0.0412865891501052 109 109 66 2 7876.53634896627 47.6624485977687 124.065134368841 124.06513436884 0.993985202817928 0.00601479718207227 0.0156564893862808 0.0156564893862808 66 66
4 4 days a week 61 4 18806.5006452907 3497.61966674349 1654.81523938094 1654.81523938094 0.843185043040843 0.156814956959157 0.0741932529160582 0.0741932529160582 65 65 27 2 3635.60404925739 3013.03780775788 1011.01473022206 1011.01473022206 0.546819053792364 0.453180946207636 0.152063346464555 0.152063346464555 29 29
5 5 days a week 262 4 90446.8743096509 2706.02968991126 1682.36813393908 1682.36813393908 0.970950667411034 0.0290493325889655 0.0180602864935589 0.0180602864935589 234 234 112 2 9355.43326920797 89.0239616404351 147.921976466957 147.921976466958 0.990573946234871 0.00942605376512865 0.0156623057155471 0.0156623057155472 103 103
6 6-7 days a week 65 5 24060.403248464 2922.29932138781 1795.782338474 1795.782338474 0.891697308161676 0.108302691838324 0.066553093924715 0.066553093924715 59 59 33 1 5365.4161086914 19.5591556633224 97.2930976608127 97.2930976608125 0.996367828132324 0.00363217186767638 0.0180675105983929 0.0180675105983929 30 30
7 A few times per month 1018 19 222581.253880809 14731.7086529095 3033.87887725373 3033.87887725373 0.937922865672303 0.0620771343276973 0.0127842948183779 0.0127842948183779 964 964 542 9 38402.6013562446 1580.79230716093 569.781537500883 569.781537500883 0.96046377852594 0.0395362214740601 0.0142504546336788 0.0142504546336788 506 506
8 Less than monthly 1295 36 275236.015880044 14072.633625016 2962.81176693688 2962.81176693688 0.951357715543275 0.0486422844567247 0.0102410065236749 0.0102410065236749 1194 1194 687 17 45071.8079804677 7910.87642731634 1238.52668886997 1238.52668886997 0.850689399456814 0.149310600543186 0.0233760652695055 0.0233760652695055 629 629
9 Never 2511 136 854257.32438008 66377.2578902727 8455.80682183013 8455.80682183013 0.927900538206395 0.0720994617936049 0.00918475906149157 0.00918475906149157 2146 2146 1162 45 100200.639898613 6346.56918334403 1304.6531924368 1304.6531924368 0.940434205287703 0.059565794712297 0.0122448368538049 0.0122448368538049 1011 1011
10 Not applicable 392 31 137774.99975972 9673.67349123615 3223.9038938171 3223.9038938171 0.934392943130986 0.065607056869014 0.0218645839446113 0.0218645839446113 347 347 199 17 12670.8684654835 458.190056064084 297.085819281298 297.085819281298 0.965101072913027 0.0348989270869726 0.0226281129597921 0.022628112959792 178 178