inaccurate discussion of dplyr #9
Comments
Sorry, I disagree, based on long experience teaching programming and even English. |
You did not address the main issue that you misrepresent dplyr and purrr. I don't want to veer into wild speculation, as your blog post does, but this description seems willfully off-base, especially since you cite a large number to presumably shock/scare readers rather than giving the actual details. In reality, dplyr relies on six verbs and the teaching materials always start with those six verbs. This is far less complex than base R. People can move on to more complex variants of those verbs, which naturally provides a scaffolded learning experience. Furthermore, you are giving the "you're wrong because I think you're wrong and I have some supposed credentials" argument. I too have been an educator (high school ELA, undergrad stats, got awards for both) and though I have had experience with teaching I also know that pedagogy research is more reliable than my experience of one. |
Not sure what to say here. The "sifting through" a large number of functions actually represents what happened to me personally recently in a discussion about pipes. No matter what the function count is, in the end it's more than in base-R, where one need only know how [,] works. Hence my point about "teach a person to fish." An essay by definition is one's own opinion, informed by one's own experiences. I hope we can at least agree on that. |
How can you say that all you need to know with base R is how [,] works when you just told me I should be using tapply? Of course an essay is opinion-based (this is why I have not opened any issues on your, imo, overblown opinions about the impact of the tidyverse on the future of R), but that does not give one carte blanche to misrepresent facts. At the end of the day, dplyr is six core functions that are easy to learn. You're right, if you only teach people base R, they will be fishing more - fishing for the right solution, that is. |
The comment on brackets pertained only to dplyr.
…On Mon, Jul 15, 2019, 4:12 PM Ludmila Janda ***@***.***> wrote:
How can you say that all you need to know with base R is how [,] works
when you just told me I should be using tapply?
Of course an essay is opinion-based (this is why I have not opened any
issues on your, imo, overblown opinions about the impact of the tidyverse
on the future of R), but that does not give one carte blanche to
misrepresent facts. At the end of the day, dplyr is six core functions that
are easy to learn. You're right, if you only teach people base R, they will
be fishing more - fishing for the right solution, that is.
|
You write that dplyr "consists of 263 functions." Though you do state that "a user initially need not use more than a small fraction of them," you then say "the high complexity is clear." This is not an accurate or responsible discussion. dplyr has six core functions - mutate, select, filter, summarise, arrange, and group_by - that are by far the most commonly needed. You then state that "every time a user needs some variant of an operation, she must sift through those hundreds of functions for one suited to her current need," which is also inaccurate, since the majority of the added functions, e.g. mutate_if(), mutate_all(), and mutate_at(), are simply clear variants of a core verb, e.g. mutate(), that can be easily referenced within autofill or the help documentation.
I would suggest you at least add a discussion of the six core dplyr verbs or rewrite this section as such:
Tidyverse students are being asked to learn a [smaller] volume of material, which is [potentially good] pedagogy. See "The Tidyverse Curse" [a post that covers two concerns with Tidyverse that are not related to what is listed here], in which the author says inter alia that he uses "only" 60 Tidyverse functions -- 60! The "star" of the Tidyverse, dplyr, consists of 263 functions. While a user initially need not use more than a small fraction of them, [since there are six core verbs/functions - mutate, select, filter, summarise, arrange, and group_by] the high complexity is [limited]. Every time a user needs some variant of an operation, she [has no need to] sift through those [functions that can be easily referenced within autofill or the help documentation and are usefully named] for one suited to her current need. [Furthermore, many of the added functions, e.g. mutate_if(), mutate_all(), and mutate_at(), are simply clear variants of a core verb, e.g. mutate().]
Also, you cite a raw function count in the same way for purrr, which once again has a small core of functions (most people use some variant of map()). It is not good practice to give only the numbers rather than the actual details.
Furthermore, in terms of pedagogy, there is a lot of evidence that humans learn things more easily through narrative devices, and it is reasonable to argue that the core dplyr verbs are narrative-driven and memorable, thus making them easier to learn than the base R or data.table syntax (especially for the many R users who are researchers and don't have a CS background or exposure to other programming languages, but arguably easy for most people).
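For readers who have not seen the six core verbs side by side, here is a minimal sketch of them chained in one pipeline, followed by a base R route to the same group means using `[` and tapply(). It assumes dplyr is installed and uses the built-in mtcars data; the column choices, the weight cutoff, and the approximate unit conversion are illustrative assumptions, not anything taken from the essay under discussion.

```r
library(dplyr)

# mtcars ships with R; the columns and cutoff below are illustrative only.
by_cyl <- mtcars %>%
  select(mpg, cyl, wt) %>%              # keep three columns
  filter(wt < 4) %>%                    # drop the heaviest cars
  mutate(kpl = mpg * 0.425) %>%         # miles per gallon to km per litre (approx.)
  group_by(cyl) %>%                     # one group per cylinder count
  summarise(mean_kpl = mean(kpl)) %>%   # collapse each group to one row
  arrange(desc(mean_kpl))               # most efficient group first

# A base R route to the same group means, using `[` and tapply().
light <- mtcars[mtcars$wt < 4, ]
base_means <- tapply(light$mpg * 0.425, light$cyl, mean)
```

Both routes give the same per-group means; which one reads more easily to a beginner is exactly the point under debate in this thread.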