dat does not work independently since dplyr 0.7.5 #6

Open
phainom opened this issue Jun 4, 2018 · 4 comments

Comments


phainom commented Jun 4, 2018

With dplyr 0.7.5, helper functions used inside `mutate`, `select` or `filter` (for example `lead`, `lag`, `starts_with`) have to be imported explicitly.

When using dat, I now have to import these functions from dplyr to be able to use `mutar` properly. This seems like strange behavior to me, since I like to use `mutar` without explicitly loading dplyr.
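
A minimal sketch of the behavior described above, using plain dplyr rather than dat. It assumes dplyr 0.7.5 or later is installed but not attached; the data frame `df` is made up for illustration.

```r
# dplyr is installed but deliberately not attached with library(dplyr).
df <- data.frame(x = 1:3)

# Qualifying the helper with its namespace works without attaching dplyr.
res <- dplyr::mutate(df, y = dplyr::lead(x))

# An unqualified helper is resolved by ordinary R scoping, so the call below
# is expected to fail unless lead() is visible from the calling environment.
# dplyr::mutate(df, y = lead(x))
```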

Owner

wahani commented Jun 5, 2018

I see this problem too. However, I am not sure how to deal with it.

Prior to 0.7.5, dplyr basically implemented its own scoping rules, so scoping inside dplyr functions was different from scoping outside of them. Now that they have fixed it, we realize that we relied on this feature more than we should have. What really troubles me is that, because of the work-around for non-standard evaluation using formulas (in dplyr or dat), R CMD check cannot warn us properly about namespace problems. If we want to have these tools, we are paying a price. I am not sure that I am happy with it now that I see the bill.

Anyway, what we can do inside of dat is to reexport certain functions, e.g. lead, lag, starts_with etc. This would also not conflict with the case where we attach dplyr. Of course we could also implement the old scoping rules of dplyr, but I am not sure we want that (maybe you have a different take on that?). However, dat was never really meant to be a drop-in replacement for everything dplyr has to offer. It solves 90% of the awkward NSE mechanism when using dplyr in packages.

My original intent with the package was to make my code independent of the concrete framework in the background and eventually switch to data.table. I now realize that this was a good idea but is impossible without breaking a lot of code. The way mutar is implemented is tightly coupled to the way dplyr works: otherwise you could never have used lead and friends the way we used to. Since this is no longer a goal, because it is not feasible, we can also change the package into a layer on top of dplyr. This may be the way it is perceived anyway.

How would you like dat to work?
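
The re-export idea above could look roughly like this in dat's package source. This is only a sketch: the file name is hypothetical, it assumes roxygen2 generates the NAMESPACE file and that dplyr is listed under Imports in DESCRIPTION, and the selection of functions is illustrative rather than complete.

```r
# Hypothetical file R/reexports.R; assumes roxygen2 writes NAMESPACE
# and dplyr is listed under Imports in DESCRIPTION.

#' @importFrom dplyr lead
#' @export
dplyr::lead

#' @importFrom dplyr lag
#' @export
dplyr::lag

#' @importFrom dplyr starts_with
#' @export
dplyr::starts_with
```

Because the re-exported objects are the same functions that dplyr exports, attaching dplyr alongside dat should not change which definition is used.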

Author

phainom commented Jun 5, 2018

I don't have a very strong opinion on this.

It's nice that dplyr changed its strange scoping rules. I guess I would expect from dat that the mentioned functions work the same way as they do with dplyr. That could easily be achieved by reexporting the relevant functions. I don't think that dat should reimplement the old, dysfunctional principles from dplyr, as there are reasons why they were thrown away.

I am not sure how I feel about the idea of exchanging the underlying technology of the package, since it seems like quite an effort to me. And if data.table is not only faster but you also like its philosophy more, why not change to data.table entirely?

I really like the functionality of mutar, especially the way one can work with |, and hope there is an easy and consistent solution for this.

Owner

wahani commented Jun 5, 2018

Do you have an overview of how many or which functions we need to export? I see lag, lead, n but would assume we need more.

By the way: starts_with can be replaced with: dat::mutar(data.frame(xx = 1, yx = 2), "^x")

Author

phainom commented Jun 10, 2018 via email
