error in tapply() example #39

Open
john-d-fox opened this issue Sep 17, 2022 · 6 comments

Comments

@john-d-fox

Dear Norm,

I've enjoyed the various versions of your tidyverse critique and largely agree with it. I noticed the following error in the current version, which I don't believe has been flagged before:

Your tapply() example doesn't handle NAs consistently.

> aggregate(airquality[, "Ozone"], 
+             list(Month = airquality[, "Month"]), 
+             mean, na.rm = TRUE)
  Month        x
1     5 23.61538
2     6 29.44444
3     7 59.11538
4     8 59.96154
5     9 31.44828

> aq <- na.omit(airquality)
> tapply(aq$Ozone, aq$Month, mean)
       5        6        7        8        9 
24.12500 29.44444 59.11538 60.00000 31.44828 

The following would be consistent with the tidyverse solution and aggregate():

> tapply(airquality$Ozone, airquality$Month, mean, na.rm=TRUE)
       5        6        7        8        9 
23.61538 29.44444 59.11538 59.96154 31.44828 
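The discrepancy can be traced to `na.omit()` dropping every row that has an `NA` in any column, not just in `Ozone`. The following is a minimal sketch of that check using only base R and the built-in `airquality` data; the variable names (`extra_drop`, `aq2`, `by_omit`, `by_narm`) are illustrative and not from the original post.

```r
# Rows where Ozone is observed but some other column is missing:
# na.omit(airquality) discards these even though their Ozone values are usable.
ozone_ok   <- !is.na(airquality$Ozone)
complete   <- complete.cases(airquality)
extra_drop <- ozone_ok & !complete

# Count the extra dropped rows per month
table(airquality$Month[extra_drop])

# Applying na.omit() only to the columns actually used restores agreement
aq2     <- na.omit(airquality[, c("Ozone", "Month")])
by_omit <- tapply(aq2$Ozone, aq2$Month, mean)
by_narm <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)
all.equal(by_omit, by_narm)
```

Only the months that contain such rows should differ between the two original results, which matches the differences seen for months 5 and 8.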


Actually, I'd prefer

> with(airquality, tapply(Ozone, Month, mean, na.rm=TRUE))
       5        6        7        8        9 
23.61538 29.44444 59.11538 59.96154 31.44828 


Though it requires more explanation, it encourages what I believe to be a better habit.

Best,
 John
@dusadrian

I planned to write almost exactly the same thing.
Although very efficient, tapply() can be quite cryptic for many users, especially when splitting by more than one factor, in which case the split argument has to be a list.
There is an alternative in the most recent version of package admisc (0.30), which I find a lot more intuitive and easier to remember:

using(airquality, mean(Ozone, na.rm = TRUE), split.by = Month)

   mean 
5 23.615
6 29.444
7 59.115
8 59.962
9 31.448
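For comparison, here is a rough base-R sketch of the multi-factor case mentioned above, where `tapply()` needs its grouping argument wrapped in a list. It uses the built-in `mtcars` data; the choice of `cyl` and `gear` as grouping variables is only an illustration.

```r
# Mean mpg for every combination of cylinder count and gear count.
# With more than one grouping factor, INDEX must be a list;
# naming the list elements labels the dimensions of the result.
mpg_by_cyl_gear <- with(mtcars,
                        tapply(mpg, list(cyl = cyl, gear = gear), mean))
mpg_by_cyl_gear
```

The result is a matrix with one row per `cyl` value and one column per `gear` value; combinations that do not occur in the data come back as `NA`.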

Additionally, instead of:

mtcars$gear_char <-
 ifelse(mtcars$gear == 3,
   "three",
   ifelse(mtcars$gear == 4,
   "four",
   "five")
 )

this is arguably also more intuitive:

mtcars$gear_char <- recode(mtcars$gear, "3 = three; 4 = four; 5 = five")
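If the no-extra-packages constraint applies, one possible base-R alternative to the nested `ifelse()` is a named lookup vector. This is a sketch, not something proposed in the thread; `gear_labels` and `gear_char` are illustrative names.

```r
# Map each gear value to its label through a named character vector.
gear_labels <- c("3" = "three", "4" = "four", "5" = "five")

# Index by the gear value converted to character; unname() drops the
# lookup names so the result is a plain character vector.
gear_char <- unname(gear_labels[as.character(mtcars$gear)])
head(gear_char)
```

Adding another level then means adding one entry to `gear_labels` rather than another layer of nesting.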

@john-d-fox
Author

I think the object was to do this without loading non-base-R packages. If that requirement is relaxed, there's also the Tapply() function in the car package, which provides a formula interface to tapply().
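For readers who want a formula interface without installing anything, the formula method of `aggregate()` in the stats package (attached by default) is one option. The sketch below assumes its default `na.action = na.omit`, which is applied only to the variables named in the formula; `ozone_by_month` is an illustrative name.

```r
# Formula interface in the default R distribution.
# Missing values are dropped only for Ozone and Month, so the result
# agrees with tapply(..., na.rm = TRUE).
ozone_by_month <- aggregate(Ozone ~ Month, data = airquality, FUN = mean)
ozone_by_month
```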

@dusadrian

Indeed.
To me, "base R" is anything not related to the tidyverse dialect, that is, anything written in classic, traditional R code. The base package surely cannot do everything, and comparing it (alone) with the whole tidyverse is more than unfair.

@john-d-fox
Author

It's my impression that 'base R' typically refers not just to the base package but to the R packages loaded by default at start-up or the packages in the standard R distribution.

@dusadrian

You are correct, that should be the interpretation of 'base R'.
But even so, the tidyverse dialect is orders of magnitude bigger, so I believe the comparison is still unfair unless it also includes contributed packages written in standard R code.
If my understanding is correct, the point of the TidyverseSkeptic is to make a fair comparison between 'traditional' R and the tidyverse dialect.

@matloff
Owner

matloff commented Mar 11, 2023

Great discussions, and again, sorry I'm late to it. I just today looked at the Issues posts.

Once again, though, my overriding goal is to make things easy for beginners. That excludes using other packages, for instance.

As to tapply(), I'm not offering it as a panacea, just something I think is easier for noncoders to learn and use.

If tapply() doesn't quite work, I recommend that beginners--the horror!--write a loop.
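As a sketch of what such a loop might look like for the example in this issue, the following computes the monthly Ozone means with an explicit `for` loop in base R. The variable names (`months`, `means`, `in_month`) are illustrative.

```r
# Monthly mean Ozone computed with an explicit loop.
months <- sort(unique(airquality$Month))
means  <- numeric(length(months))
names(means) <- months

for (i in seq_along(months)) {
  # Logical selector for the rows belonging to the current month
  in_month <- airquality$Month == months[i]
  # na.rm = TRUE drops only the missing Ozone values
  means[i] <- mean(airquality$Ozone[in_month], na.rm = TRUE)
}
means
```

Because missing values are handled inside `mean()`, this should give the same numbers as the `tapply(..., na.rm = TRUE)` call discussed earlier in the thread.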
