A small comment about new variables in data frames #42
Comments
I've always wondered why the result of within() has to be assigned back to the data frame. This should / could have been enough, IMO:
within(mtcars, hwratio <- hp/wt)
EDIT: I've actually put together a quick function in the development version of package admisc, and it seems to work:
mt <- mtcars
inside(mt, hwratio <- hp/wt)
dim(mtcars) # 32 11
dim(mt) # 32 12
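For readers curious how a helper like this could work, here is a minimal base R sketch. It is an illustration only and not the admisc implementation: the name inside_sketch and its internals are hypothetical. It rebuilds the within() call from the unevaluated arguments, evaluates it in the caller's frame, and rebinds the result to the original name there.

```r
# Hypothetical helper for illustration; NOT the code used in admisc::inside().
inside_sketch <- function(data, expr) {
  target <- substitute(data)            # the symbol passed in, e.g. `mt`
  stopifnot(is.name(target))            # only plain variable names are supported
  caller <- parent.frame()

  # Build the call within(<data symbol>, <expr>) from the unevaluated arguments
  call <- substitute(within(data, expr))
  updated <- eval(call, caller)

  # Rebind the caller's variable to the updated data frame
  assign(as.character(target), updated, envir = caller)
  invisible(updated)
}

mt <- mtcars
inside_sketch(mt, hwratio <- hp / wt)
dim(mtcars)  # 32 11
dim(mt)      # 32 12
```

Note that the visible `<-` inside the call only creates the column; the rebinding of `mt` itself happens through assign(), which is exactly the kind of hidden side effect discussed further down the thread.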
Within the tidyverse, I would go for the shorthand:
However, there is no reason to do so, because a very simple assignment in base R does the trick neatly. A pipe over multiple lines, though, may be something where I would go for a "tidy" version: I can add a single line or comment one out, and still have linearly legible code. 🤷
I think, strictly speaking, tidy proponents don't even favour the
One of the easiest ways to make a program difficult to understand is to modify an object without using an assignment operator. That is deliberately difficult in R.
I agree, but really, what is the purpose of creating a new variable without overwriting the object? These are all equivalent, in my mind:
mtcars$hwratio <- mtcars$hp / mtcars$wt
# two assignment operators are definitely more difficult to understand for beginners
mtcars <- within(mtcars, hwratio <- hp/wt)
# perhaps this is more comprehensible
mtcars$hwratio <- with(mtcars, hp/wt)
# or even better
inside(mtcars, hwratio <- hp/wt)
Note the latter does have an assignment operator that signals (or should signal) creating a new variable. Anyway, if such a function is clearly documented, users should be aware and decide accordingly.
Oh my...
No, it doesn't modify in place. I was just referring to the tidyverse discourse. 👍 I think R's semantics do not provide an advantage to such modifications since, internally, every data frame is copied once it's modified.
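A small base R check of the copy-on-modify behaviour mentioned above. The variable names are made up for the example, and it only shows the visible effect (the original is untouched), not what R does internally.

```r
original <- mtcars
modified <- original                             # both names refer to the same data so far
modified$hwratio <- modified$hp / modified$wt    # changing `modified` leaves `original` alone

ncol(original)  # 11
ncol(modified)  # 12
"hwratio" %in% names(original)  # FALSE
```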
I really don't think use of within() is standard base R. I've never seen an R book or tutorial use it. And I certainly would not recommend teaching it, for exactly the same reason. Sorry for the long delay in replying.
Some examples give this code for creating new variables in a data frame:
mtcars$hwratio <- mtcars$hp / mtcars$wt
The corresponding Tidy version is this:
mtcars %>% mutate(hwratio=hp/wt) -> mtcars
Maybe it's not a "beginner" topic, but I think the typical base R way is this:
mtcars <- within(mtcars, hwratio <- hp/wt)
There's really no difference between that and the Tidy version, since as far as I can tell, mutate and within do the same thing. Tidy insists on the %>% operator though. Without that, it's nearly identical (mtcars <- mutate(mtcars, hwratio=hp/wt)), and one wonders why you would add a slew of dependencies simply to rename "within" to "mutate".
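To make the comparison concrete, here is a quick base R check that the `$` assignment and within() produce the same new column. The object names via_dollar and via_within are made up for the example, and the dplyr variant is left out so the snippet runs without extra packages.

```r
# Approach 1: explicit column assignment
via_dollar <- mtcars
via_dollar$hwratio <- via_dollar$hp / via_dollar$wt

# Approach 2: within(), assigning the result to a new name
via_within <- within(mtcars, hwratio <- hp / wt)

identical(via_dollar$hwratio, via_within$hwratio)  # TRUE
dim(via_within)                                    # 32 12
```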