Handle class #6
Comments
Hi David,
Very good question! The answer is yes and no. The object returned by diff_data keeps the class changes, so patching within the same R session works:
> y <- iris[1:3,]
> x <- y
> x$Sepal.Width <- as.character(x$Sepal.Width)
> x$Sepal.Length <- as.factor(x$Sepal.Length)
> str(patch_data(y, diff_data(y, x)))
'data.frame': 3 obs. of 5 variables:
$ Sepal.Length: Factor w/ 3 levels "4.7","4.9","5.1": 3 2 1
$ Sepal.Width : chr "3.5" "3" "3.2"
$ Petal.Length: num 1.4 1.4 1.3
$ Petal.Width : num 0.2 0.2 0.2
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1
However, the Coopy Diff Format does not know about type changes, so storing a diff and loading it again won't give the correct patch:
x <- y <- iris[1:3,]
x$Sepal.Width <- as.character(x$Sepal.Width)
x$Sepal.Length <- as.factor(x$Sepal.Length)
write_diff(diff_data(y, x), file = "diff.csv")
d_yx <- read_diff("diff.csv")
str(patch_data(y, d_yx))
The author of the Coopy Diff Format welcomes the addition of type information, which is a good thing. Note, however, that this probably will not handle factor/character switches, since this is very R specific.
I will try to add class changes to the diff format, but this will not land in the master branch before April (due to time constraints).
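To illustrate why a stored diff loses class information, here is a minimal base-R sketch. It does not use daff or the Coopy Diff Format. It only assumes that a diff written to a CSV file behaves like any other CSV, which stores values as text without R classes, so write.csv and read.csv serve as stand-ins for write_diff and read_diff.

```r
# Base R only: a CSV file stores values as text, not R classes.
y <- iris[1:3, ]
x <- y
x$Sepal.Width <- as.character(x$Sepal.Width)
x$Sepal.Length <- as.factor(x$Sepal.Length)

# Round trip through a temporary CSV file (stand-in for a stored diff).
tmp <- tempfile(fileext = ".csv")
write.csv(x, tmp, row.names = FALSE)
x_back <- read.csv(tmp, stringsAsFactors = FALSE)

# Compare the first class of every column before and after the round trip.
classes_before <- vapply(x, function(col) class(col)[1], character(1))
classes_after <- vapply(x_back, function(col) class(col)[1], character(1))
print(rbind(before = classes_before, after = classes_after))
```

After the round trip, the character and factor columns that hold number-like text come back as numeric, which is the same kind of loss described above for a diff that is written and read again.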
One way to do this with the existing diff format would be to generate two diffs: the existing one comparing content, and a second one that compares metadata. So for a table like this:
There would also be another (imaginary) table of type information like this:
Where the columns are a flattened version of whatever parameters it takes to fully describe types in R. Diffs of the second table would then give a clear record of type changes.
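A rough sketch of what such a table of type information might look like in R. The helper column_meta and its columns (column, class, typeof) are hypothetical names chosen for illustration; they are not part of daff, and a full description of R types would likely need more columns.

```r
# Hypothetical helper (not part of daff): one row of type information per column.
column_meta <- function(df) {
  data.frame(
    column = names(df),
    class = unname(vapply(df, function(col) paste(class(col), collapse = "/"),
                          character(1))),
    typeof = unname(vapply(df, typeof, character(1))),
    stringsAsFactors = FALSE
  )
}

y <- iris[1:3, ]
x <- y
x$Sepal.Width <- as.character(x$Sepal.Width)
x$Sepal.Length <- as.factor(x$Sepal.Length)

meta_y <- column_meta(y)
meta_x <- column_meta(x)

# Rows whose class description differs between the two versions.
changed <- meta_x[meta_y$class != meta_x$class, ]
print(changed)
```

Because meta_y and meta_x are ordinary data frames, they could presumably be compared with diff_data(meta_y, meta_x) in the same way as the content tables, although that is not tested here.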
@paulfitz I like your idea of creating a metadiff, which is very flexible. However I see the following issues:
Maybe we should have both: encode simple type information in the coopy highlighter diff format and extended type information in a meta diff.
Shall we move this discussion to http://dataprotocols.org/tabular-diff-format?
Perhaps it would be easiest to implement a second function ('diff_meta'?) to handle metadata changes. A simplistic way to handle (changes in) factor levels is to concatenate them into a single string with some separator, e.g.
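One possible reading of this suggestion, sketched in base R. The helper levels_string is a hypothetical name, and the separator "|" is an arbitrary choice that assumes no factor level contains that character.

```r
# Hypothetical helper: flatten factor levels into one string per column.
# Non-factor columns get NA because they have no levels.
levels_string <- function(col, sep = "|") {
  if (is.factor(col)) paste(levels(col), collapse = sep) else NA_character_
}

y <- iris[1:3, ]
x <- y
x$Sepal.Length <- as.factor(x$Sepal.Length)

levels_y <- vapply(y, levels_string, character(1))
levels_x <- vapply(x, levels_string, character(1))

print(levels_x[["Sepal.Length"]])
print(levels_x[["Species"]])
```

Each column then carries a single string, so a change in levels shows up as a change in one cell of a metadata table.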
Hi,
This is not really a bug, but a feature request: is it able to detect class changes?
Thanks!
David