remapping categorical variables in base and tidy #38
Comments
I still think that the nested ifelse() calls are preferable over the Teaching The point is that the R-base documentation is not as sexy, and does not appeal to newcomers.
@markvanderloo that's a very useful trick! I think we need a function in base R that does this efficiently.
Another solution in base R:
library(admisc)
recode(mtcars$gear, "3 = three; 4 = four; 5 = five")
# [1] "four" "four" "four" "three" "three" "three" "three" "four" "four" "four" "four"
# [12] "three" "three" "three" "three" "three" "three" "four" "four" "four" "three" "three"
# [23] "three" "three" "three" "four" "five" "five" "five" "five" "five" "four"
Or, since this is a categorical variable, why not properly declare it as categorical:
library(declared)
gear <- declared(mtcars$gear, label = "Number of gears", labels = c("Three" = 3, "Four" = 4, "Five" = 5))
gear
# <declared<integer>[32]> Number of gears
# [1] 4 4 4 3 3 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 3 3 3 3 3 4 5 5 5 5 5 4
#
# Labels:
# value label
# 3 Three
# 4 Four
# 5 Five
w_table(gear)
# fre rel per cpd
# ----------------------
# Three 15 0.469 46.9 46.9
# Four 12 0.375 37.5 84.4
# Five 5 0.156 15.6 100.0
# ----------------------
#         32 1.000 100.0
This is another nice approach, but I think this is so basic that we ought to have it natively in the language. Or does anyone know of such a solution? I remember trying to get this done with the
@BroVic: that requires a question about what "base" R means: is it anything which is not using tidy, or is it the R package I believe this is about the difference between the "normal" R language (including contributed R packages that use the normal R language), and the "tidy" R language / dialect. It would be pointless to expect solving anything using just the
@dusadrian Yes, I agree with you. But if you ask me, an operation as basic as this should be available by default.
Once again, keep it simple. I think use of named vectors falls into the realm, yes.
Hi Norm,
I just read your base-vs-tidy document. Very insightful, thank you! I'm sharing it with our internal R users.
Regarding the last section on relabeling classes: have you considered a lookup with a named vector?
Using named vectors as a way to map values onto each other has been an extremely useful trick for me in many cases, although I admit that it may not be very intuitive for beginning users.
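The code that originally accompanied this post did not survive, so what follows is only a minimal sketch of the named-vector lookup, not necessarily the exact code that was posted. It assumes the `mtcars$gear` example used elsewhere in this thread; the names `gear_labels`, `gear_chr`, `fruit_colour`, and `mapped` are hypothetical.

```r
# Lookup table: the names are the original values, the elements the new labels.
gear_labels <- c("3" = "three", "4" = "four", "5" = "five")

# Index by name. as.character() matters here: mtcars$gear is numeric, and a
# numeric index would select by position instead of by name.
gear_chr <- unname(gear_labels[as.character(mtcars$gear)])
head(gear_chr)
# [1] "four"  "four"  "four"  "three" "three" "three"

# The same lookup with character keys (hypothetical data); a value that is
# missing from the lookup table comes back as NA.
fruit_colour <- c(apple = "red", banana = "yellow")
mapped <- unname(fruit_colour[c("apple", "banana", "cherry")])
mapped
# [1] "red"    "yellow" NA
```

`unname()` only drops the names carried over from the lookup table; leave it out if keeping them is useful.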
A more common use case, it seems to me, is mapping character to character, in which case one simply indexes the named vector with the character values. You even get NA for unmapped values, which seems to me desired behavior in many cases.
Thanks again for writing all this up!
Best,
Mark