Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace globalvariables use with maintainable approach #165

Open
bailliem opened this issue May 7, 2021 · 7 comments
Open

Replace globalvariables use with maintainable approach #165

bailliem opened this issue May 7, 2021 · 7 comments
Assignees
Labels
discussion Discussion around design choices and architecture issue Used for project filtering

Comments

@bailliem
Copy link
Collaborator

bailliem commented May 7, 2021

Work through the list of variables referenced in visR globalvariable to set those variables to NULL within the function call, where possible. At the moment we are using this to suppress note in R CMD check "no visible binding". Ideally, we resolve this at the function level.

See here for more details on the alternative approach STAT545-UBC/Discussion#451

@bailliem bailliem added the issue Used for project filtering label May 7, 2021
@bailliem bailliem self-assigned this May 7, 2021
@timtreis
Copy link
Collaborator

timtreis commented Nov 2, 2021

In the spirit of closing down issues:

Apparently the tidyr approved way is the usage of the .data pronoun as described here. Shall I integrate it?

@SHAESEN2
Copy link
Collaborator

SHAESEN2 commented Nov 2, 2021

Can you give an example of how this looks like for us?

@timtreis
Copy link
Collaborator

timtreis commented Nov 2, 2021

Didn't run any tests with it yet, but I assume we'd have to replace all implicit piping by explicit .data piping. Yucky

So instead of
adtte %>% dplyr::mutate(TRTDUR = TRTDUR*2)
we should then use
adtte %>% dplyr::mutate(TRTDUR = .data$TRTDUR*2)

If we don't want to take this step, I'd be in favour of closing this issue

@SHAESEN2
Copy link
Collaborator

SHAESEN2 commented Nov 2, 2021

If we want to do that I would be very explicit as '$' does partial matching.

adtte %>% dplyr::mutate(TRTDUR = .data[["TRTDUR"]]*2)

@timtreis
Copy link
Collaborator

timtreis commented Nov 2, 2021

But do we want to do that? Doesn't feel like it adds value, is very tedious work and will probably pop up again every time one of us writes a piping statement. At least I'm fairly sure I'd forget it.

Also, the issues says "maintainable" which the .data approach very much is not 👀

@SHAESEN2 SHAESEN2 added the discussion Discussion around design choices and architecture label Nov 3, 2021
@SHAESEN2
Copy link
Collaborator

As proposed by Daniel, we should use .data to be very explicit inside all our functions which increases traceability but also ensure that the correct environment is passed throughout functions

am <- blah

mtcars %>% select(am) = mtcars %>% select(.data$am)
mtcars %>% select(.env(am))

@ddsjoberg
Copy link
Collaborator

+1 for using the .data and .env prefixes. The tidyverse functions are super easy to use with the unquoted variables names, but that simplicity comes with some ambiguity. Building on the example @SHAESEN2 listed above.

am <- "mpg"
mtcars |> dplyr::select(.data$am, .env$am) |> head()
#>                   am  mpg
#> Mazda RX4          1 21.0
#> Mazda RX4 Wag      1 21.0

Without the prefixes, the selecting is not 100% precise. This is a bit of a contrived example, but I do often find myself running into issues while writing functions with data frames having column names that match R object names outside the data frame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion around design choices and architecture issue Used for project filtering
Projects
None yet
Development

No branches or pull requests

4 participants