An R package to standardise methods used in Public Health Scotland

Tina815/phsmethods

 
 

phsmethods

phsmethods contains functions for commonly undertaken analytical tasks in Public Health Scotland (PHS):

  • age_group() categorises ages into groups
  • chi_check() assesses the validity of a CHI number
  • chi_pad() adds a leading zero to nine-digit CHI numbers
  • sex_from_chi() extracts the sex of a person from a CHI number
  • file_size() returns the names and sizes of files in a directory
  • fin_year() assigns a date to a financial year in the format YYYY/YY
  • match_area() converts geography codes into area names
  • postcode() formats improperly recorded postcodes
  • qtr(), qtr_end(), qtr_next() and qtr_prev() assign a date to a quarter

phsmethods can be used on both the server and desktop versions of RStudio.

Installation

Installing phsmethods requires the remotes package, which can be installed with install.packages("remotes").

You can then install phsmethods on RStudio server from GitHub with:

remotes::install_github("Public-Health-Scotland/phsmethods",
  upgrade = "never"
)

Network security settings may prevent remotes::install_github() from working on RStudio desktop. If this is the case, phsmethods can be installed by downloading the zip of the repository and running the following code (replace the section marked <>, including the angle brackets themselves, with the location of the downloaded file):

remotes::install_local("<FILEPATH OF ZIPPED FILE>/phsmethods-master.zip",
  upgrade = "never"
)

Using phsmethods

Load phsmethods using library():

library(phsmethods)

To access the help file for any of phsmethods' functions, type ?function_name into the RStudio console after loading the package:

?fin_year
?postcode

age_group

a <- c(54, 7, 77, 1, 26, 101)

# By default age_group goes in 5 year increments from 0 to 90+
age_group(a)
#> [1] "50-54" "5-9"   "75-79" "0-4"   "25-29" "90+"

# But these settings can be changed
age_group(a, from = 0, to = 80, by = 10)
#> [1] "50-59" "0-9"   "70-79" "0-9"   "20-29" "80+"
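
The banding logic can be illustrated in base R. The following is a minimal sketch, not the phsmethods implementation; age_band_sketch is a hypothetical name, and the sketch assumes every age is at or above `from` and does no input checking.

```r
# Hypothetical helper for illustration only; not part of phsmethods
age_band_sketch <- function(x, from = 0, to = 90, by = 5) {
  # Lower bound of the band each age falls into
  lower <- from + by * floor((x - from) / by)
  # Ages at or above `to` share a single open-ended band
  lower <- pmin(lower, to)
  ifelse(lower >= to,
    paste0(to, "+"),
    paste0(lower, "-", lower + by - 1)
  )
}

age_band_sketch(c(54, 7, 77, 1, 26, 101))
#> [1] "50-54" "5-9"   "75-79" "0-4"   "25-29" "90+"
```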

chi_check

# chi_check returns specific feedback on why a CHI number might be invalid
library(dplyr)
b <- tibble(chi = c("0101011237", "0101336489", "3213201234", "123456789", "12345678900", "010120123?", NA))
b %>% mutate(validity = chi_check(chi))
#> # A tibble: 7 x 2
#>   chi         validity                    
#>   <chr>       <chr>                       
#> 1 0101011237  Valid CHI                   
#> 2 0101336489  Valid CHI                   
#> 3 3213201234  Invalid date                
#> 4 123456789   Too few characters          
#> 5 12345678900 Too many characters         
#> 6 010120123?  Invalid character(s) present
#> 7 <NA>        Missing
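
To show the kind of checks involved, here is a rough base R sketch. It is not the package's code: chi_check_sketch is a hypothetical name, the order of the checks is an assumption, and so is the final modulus-11 check digit test with its "Invalid checksum" label, which does not appear in the output above. The date check also relies on R's default century handling for two-digit years, which is a simplification.

```r
# Hypothetical helper for illustration only; not part of phsmethods
chi_check_sketch <- function(x) {
  check_one <- function(chi) {
    if (is.na(chi)) return("Missing")
    if (grepl("[^0-9]", chi)) return("Invalid character(s) present")
    if (nchar(chi) > 10) return("Too many characters")
    if (nchar(chi) < 10) return("Too few characters")

    # The first six digits are read as a date of birth (DDMMYY)
    dob <- as.Date(substr(chi, 1, 6), format = "%d%m%y")
    if (is.na(dob)) return("Invalid date")

    # Assumed check digit rule: weight the first nine digits 10 to 2
    digits <- as.integer(strsplit(chi, "")[[1]])
    remainder <- sum(digits[1:9] * 10:2) %% 11
    expected <- if (remainder == 0) 0 else 11 - remainder
    if (expected != digits[10]) return("Invalid checksum")

    "Valid CHI"
  }
  vapply(x, check_one, character(1), USE.NAMES = FALSE)
}

chi_check_sketch(c("0101011237", "3213201234", "123456789", NA))
#> [1] "Valid CHI"          "Invalid date"       "Too few characters"
#> [4] "Missing"
```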

chi_pad

# Only nine-character strings composed exclusively of numeric digits are prefixed with a zero
chi_pad(c("101011237", "101201234", "123223", "abcdefghi", "12345tuvw"))
#> [1] "0101011237" "0101201234" "123223"     "abcdefghi"  "12345tuvw"
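
The rule in the comment above can be expressed as a single regular expression test. This is a sketch under that assumption; chi_pad_sketch is a hypothetical name, not the package's implementation.

```r
# Hypothetical helper for illustration only; not part of phsmethods
chi_pad_sketch <- function(x) {
  # Pad only strings made up of exactly nine digits
  ifelse(grepl("^[0-9]{9}$", x), paste0("0", x), x)
}

chi_pad_sketch(c("101011237", "123223", "12345tuvw"))
#> [1] "0101011237" "123223"     "12345tuvw"
```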

sex_from_chi

# By default it will check that the CHI is valid before extracting the sex.
library(dplyr)
b <- tibble(chi = c("0101011237", "0101336489", "123456789", "12345678900", "010120123?", NA))

b %>% mutate(chi_sex = sex_from_chi(chi))
#> # A tibble: 6 x 2
#>   chi         chi_sex
#>   <chr>         <int>
#> 1 0101011237        1
#> 2 0101336489        2
#> 3 123456789        NA
#> 4 12345678900      NA
#> 5 010120123?       NA
#> 6 <NA>             NA

# Use custom values for male and female
b %>% mutate(chi_sex = sex_from_chi(chi, male_value = "M", female_value = "F"))
#> Using custom values: Male = M Female = F.
#> The return variable will be character.
#> # A tibble: 6 x 2
#>   chi         chi_sex
#>   <chr>       <chr>  
#> 1 0101011237  M      
#> 2 0101336489  F      
#> 3 123456789   <NA>   
#> 4 12345678900 <NA>   
#> 5 010120123?  <NA>   
#> 6 <NA>        <NA>

# Alternatively return the result as a factor (with labels 'Male' and 'Female')
b %>% mutate(chi_sex = sex_from_chi(chi, as_factor = TRUE))
#> # A tibble: 6 x 2
#>   chi         chi_sex
#>   <chr>       <fct>  
#> 1 0101011237  Male   
#> 2 0101336489  Female 
#> 3 123456789   <NA>   
#> 4 12345678900 <NA>   
#> 5 010120123?  <NA>   
#> 6 <NA>        <NA>
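
The sex is encoded in the ninth digit of a CHI number: odd for male, even for female. The sketch below illustrates that rule in base R. It is not the package's code: sex_from_chi_sketch is a hypothetical name, and it only checks that the input is ten digits long rather than running a full validity check.

```r
# Hypothetical helper for illustration only; not part of phsmethods
sex_from_chi_sketch <- function(chi) {
  out <- rep(NA_integer_, length(chi))
  # Simplified screening: ten digits only (NA fails this test)
  ok <- grepl("^[0-9]{10}$", chi)
  ninth <- as.integer(substr(chi[ok], 9, 9))
  out[ok] <- ifelse(ninth %% 2L == 1L, 1L, 2L)
  out
}

sex_from_chi_sketch(c("0101011237", "0101336489", "123456789", NA))
#> [1]  1  2 NA NA
```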

file_size

# Names and sizes of all files in the tests/testthat/files folder
file_size(testthat::test_path("files"))
#> # A tibble: 8 x 2
#>   name             size       
#>   <chr>            <chr>      
#> 1 airquality.xls   Excel 26 KB
#> 2 bod.xlsx         Excel 5 KB 
#> 3 iris.csv         CSV 4 KB   
#> 4 mtcars.sav       SPSS 4 KB  
#> 5 plant-growth.rds RDS 316 B  
#> 6 puromycin.txt    Text 418 B 
#> 7 stackloss.fst    FST 897 B  
#> 8 swiss.tsv        TSV 1 KB

# Names and sizes of Excel files only in the tests/testthat/files folder
file_size(testthat::test_path("files"), pattern = "\\.xlsx?$")
#> # A tibble: 2 x 2
#>   name           size       
#>   <chr>          <chr>      
#> 1 airquality.xls Excel 26 KB
#> 2 bod.xlsx       Excel 5 KB

fin_year

c <- lubridate::dmy(c(21012017, 04042017, 17112017))
fin_year(c)
#> [1] "2016/17" "2017/18" "2017/18"
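
The output above reflects a financial year running from 1 April to 31 March. A minimal base R sketch of that rule follows; fin_year_sketch is a hypothetical name, and missing dates are not handled.

```r
# Hypothetical helper for illustration only; not part of phsmethods
fin_year_sketch <- function(date) {
  year <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  # January to March belong to the financial year that began the previous April
  start <- ifelse(month >= 4L, year, year - 1L)
  sprintf("%d/%02d", start, (start + 1L) %% 100L)
}

fin_year_sketch(as.Date(c("2017-01-21", "2017-04-04", "2017-11-17")))
#> [1] "2016/17" "2017/18" "2017/18"
```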

match_area

match_area("S13002781")
#> [1] "Ayr North"

d <- tibble(code = c("S02000656", "S02001042", "S08000020", "S12000013", "S13002605"))
d %>% mutate(name = match_area(code))
#> # A tibble: 5 x 2
#>   code      name               
#>   <chr>     <chr>              
#> 1 S02000656 Govan and Linthouse
#> 2 S02001042 Peebles North      
#> 3 S08000020 Grampian           
#> 4 S12000013 Na h-Eileanan Siar 
#> 5 S13002605 Steòrnabhagh a Deas
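
Conceptually this is a lookup from code to name. The sketch below shows the idea with a named vector holding three of the codes from the examples above; match_area_sketch and area_lookup are hypothetical names, and the sketch returns NA for any code missing from its small table.

```r
# Tiny illustrative lookup using codes and names from the examples above
area_lookup <- c(
  S08000020 = "Grampian",
  S12000013 = "Na h-Eileanan Siar",
  S13002781 = "Ayr North"
)

# Hypothetical helper for illustration only; not part of phsmethods
match_area_sketch <- function(code) {
  unname(area_lookup[code])
}

match_area_sketch(c("S12000013", "S99999999"))
#> [1] "Na h-Eileanan Siar" NA
```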

postcode

# The default is pc7 format
postcode("G26QE")
#> [1] "G2  6QE"

# But pc8 format can also be applied
postcode(c("KA89NB", "PA152TY"), format = "pc8")
#> [1] "KA8 9NB"  "PA15 2TY"

# postcode accounts for irregular spacing and lower case letters
e <- tibble(pc = c("G 4 2 9 B A", "g207al", "Dg98bS", "DD37J    y"))
e %>% mutate(pc = postcode(pc))
#> # A tibble: 4 x 1
#>   pc     
#>   <chr>  
#> 1 G42 9BA
#> 2 G20 7AL
#> 3 DG9 8BS
#> 4 DD3 7JY
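
Judging by the output above, pc7 pads the outward part of the postcode to four characters before the three-character inward part, while pc8 separates the two parts with a single space. A minimal base R sketch under that reading follows; postcode_sketch is a hypothetical name, and it does not validate the postcode or handle missing values.

```r
# Hypothetical helper for illustration only; not part of phsmethods
postcode_sketch <- function(x, format = c("pc7", "pc8")) {
  format <- match.arg(format)
  # Strip all whitespace and standardise the case
  cleaned <- toupper(gsub("\\s", "", x))
  n <- nchar(cleaned)
  outward <- substr(cleaned, 1, n - 3)
  inward <- substr(cleaned, n - 2, n)
  if (format == "pc8") {
    paste(outward, inward)
  } else {
    # Left-justify the outward part in a field of width four
    sprintf("%-4s%s", outward, inward)
  }
}

postcode_sketch(c("G26QE", "DD37J    y"))
#> [1] "G2  6QE" "DD3 7JY"
postcode_sketch(c("KA89NB", "PA152TY"), format = "pc8")
#> [1] "KA8 9NB"  "PA15 2TY"
```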

qtr, qtr_end, qtr_next and qtr_prev

f <- lubridate::dmy(c(26032012, 04052012, 23092012))

# qtr returns the current quarter and year
# The default is long format
qtr(f)
#> [1] "January to March 2012"  "April to June 2012"     "July to September 2012"

# But short format can also be applied
qtr(f, format = "short")
#> [1] "Jan-Mar 2012" "Apr-Jun 2012" "Jul-Sep 2012"


# qtr_end returns the last month in the quarter
qtr_end(f)
#> [1] "March 2012"     "June 2012"      "September 2012"
qtr_end(f, format = "short")
#> [1] "Mar 2012" "Jun 2012" "Sep 2012"


# qtr_next returns the next quarter
qtr_next(f)
#> [1] "April to June 2012"       "July to September 2012"  
#> [3] "October to December 2012"
qtr_next(f, format = "short")
#> [1] "Apr-Jun 2012" "Jul-Sep 2012" "Oct-Dec 2012"


# qtr_prev returns the previous quarter
qtr_prev(f)
#> [1] "October to December 2011" "January to March 2012"   
#> [3] "April to June 2012"
qtr_prev(f, format = "short")
#> [1] "Oct-Dec 2011" "Jan-Mar 2012" "Apr-Jun 2012"
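
The quarter for a date follows from its month. The sketch below shows one way to derive the labels in base R using the built-in month.name and month.abb constants; qtr_sketch is a hypothetical name and covers only the behaviour of qtr() shown above.

```r
# Hypothetical helper for illustration only; not part of phsmethods
qtr_sketch <- function(date, format = c("long", "short")) {
  format <- match.arg(format)
  parts <- as.POSIXlt(date)
  year <- parts$year + 1900
  # parts$mon runs from 0 to 11, so this gives 1, 4, 7 or 10
  first_month <- 3 * (parts$mon %/% 3) + 1
  months <- if (format == "long") month.name else month.abb
  sep <- if (format == "long") " to " else "-"
  paste0(months[first_month], sep, months[first_month + 2], " ", year)
}

qtr_sketch(as.Date(c("2012-03-26", "2012-05-04", "2012-09-23")), "short")
#> [1] "Jan-Mar 2012" "Apr-Jun 2012" "Jul-Sep 2012"
```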

Contributing to phsmethods

At present, the maintainers of this package are David Caldwell and Lucinda Lawrie.

This package is intended to be in continuous development and contributions may be made by anyone within PHS. If you would like to make a contribution, please first create an issue on GitHub and assign both of the package maintainers to it. This avoids duplicated effort when several people have the same idea. The package maintainers will discuss the issue and get back to you as soon as possible.

While the most obvious and eye-catching (as well as intimidating) way of contributing is by writing a function, this isn't the only way to make a useful contribution. Fixing typos in documentation, for example, isn't the most glamorous way to contribute, but is of great help to the package maintainers. Please see this blog post by Jim Hester for more information on getting started with contributing to open source software.

When contributing, please create a branch in this repository and carry out all work on it. Please ensure you have linked RStudio to your GitHub account using usethis::edit_git_config() prior to making your contribution. When you are ready for a review, please create a pull request and assign both of the package maintainers as reviewers. One or both of them will conduct a review, provide feedback and, if necessary, request changes prior to merging your branch.

Please be mindful of information governance when contributing to this package. No data files (aside from publicly available and downloadable datasets or unless explicitly approved), server connection details, passwords or person identifiable or otherwise confidential information should be included anywhere within this package or any other repository (whether public or private) used within PHS. This includes within code and code commentary. For more information on security when using git and GitHub, and on using git and GitHub for version control more generally, please see the Transforming Publishing Programme's Git guide and GitHub guidance.

Please feel free to add yourself to the 'Authors' section of the DESCRIPTION file when contributing. As a rule of thumb, assign your role as author ("aut") if you write an exported function, and as contributor ("ctb") for anything else.
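
Assuming the authors are listed in an Authors@R field built from person() calls (check the file itself to confirm), a new entry might look like the sketch below. The name and email address are placeholders.

```r
# Placeholder details; replace with your own
new_contributor <- utils::person(
  given = "Jane",
  family = "Doe",
  email = "jane.doe@example.org",
  role = "ctb" # use "aut" if you wrote an exported function
)
```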

phsmethods will, as much as possible, adhere to the tidyverse style guide and the rOpenSci package development guide. The most pertinent points to take from these are:

  • All function names should be in lower case, with words separated by an underscore
  • Put a space after a comma, never before
  • Put a space before and after infix operators such as <-, == and +
  • Limit code to 80 characters per line
  • Function documentation should be generated using roxygen2
  • All functions should be tested using testthat
  • The package should always pass devtools::check()
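
As a rough illustration of these conventions, here is a hypothetical helper with roxygen2 documentation and a matching testthat test. percent_change is not part of phsmethods; it only shows the naming, spacing and documentation style described above. The test is skipped if testthat is not installed.

```r
#' Calculate percentage change
#'
#' @description Hypothetical example only; not a phsmethods function.
#'
#' @param old A numeric vector of baseline values.
#' @param new A numeric vector of updated values, the same length as `old`.
#'
#' @return A numeric vector of percentage changes.
#' @export
percent_change <- function(old, new) {
  if (!is.numeric(old) || !is.numeric(new)) {
    stop("`old` and `new` must both be numeric")
  }

  (new - old) / old * 100
}

# A matching test, which would normally live in tests/testthat/
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("percent_change returns expected values", {
    testthat::expect_equal(percent_change(50, 75), 50)
    testthat::expect_error(percent_change("50", 75))
  })
}
```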

It's not necessary to have experience with GitHub or of building an R package to contribute to phsmethods. If you wish to contribute code then, as long as you can write an R function, the package maintainers can assist with error handling, writing documentation, testing and other aspects of package development. It is advised, however, to consult Hadley Wickham's R Packages book prior to making a contribution. It may also be useful to consult the documentation and tests of existing functions within this package as a point of reference.

Please note that this README may fail to 'Knit' at times as a result of network security settings. This will likely be due to the badges for the package's release version, continuous integration status and test coverage at the top of the document. If you are editing the README.Rmd document and are unable to successfully get it to 'Knit', please contact the package maintainers for assistance.
