Extend Coopy Highlighter Diff format with column type changes #3
Comments
From @paulfitz on February 26, 2015 22:46 Thanks @edwindj. There's also work on refining the types in json table schema in #159. What do you think about leaving types in a separate optional row, like:
I'm thinking that the spec could leave space for meta data associated with columns via a series of The advantage of the separate rows is that the cells can behave exactly as in ordinary rows and be parsed in just the same way.
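To illustrate the point about separate rows being parsed like ordinary rows, here is a minimal Python sketch. The row syntax proposed in this comment is not shown above, so the `@meta` tag and the `split_meta_rows` helper are hypothetical placeholders and not part of the highlighter spec or of daff. The idea is only that a consumer can read every row with the same CSV parser and set aside rows whose first cell carries a tag it does not need.

```python
import csv
import io


def split_meta_rows(text, meta_tag="@meta"):
    """Split a CSV diff into ordinary rows and column meta data rows.

    `meta_tag` is a hypothetical marker in the first cell; the real
    syntax would be whatever the spec ends up defining.
    """
    data_rows, meta_rows = [], []
    for row in csv.reader(io.StringIO(text)):
        if row and row[0] == meta_tag:
            # Same cell layout as any other row, minus the leading tag.
            meta_rows.append(row[1:])
        else:
            data_rows.append(row)
    return data_rows, meta_rows


sample = "@@,id,name\n@meta,integer,string\n+++,3,carol\n"
data_rows, meta_rows = split_meta_rows(sample)
```

A consumer that does not care about types can simply drop `meta_rows` and process `data_rows` as before.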
From @paulfitz on February 26, 2015 23:9 Also, I understand from edwindj/daff#6 that you like having a single file for expressing diffs, and that may be the way to go. But just as the Tabular Data Package spec proposes data in csv and schema in json, there may be something to be said for expressing schema differences in a hierarchical format like json rather than trying to flatten types out.
From @edwindj on February 27, 2015 7:8 @paulfitz I like your syntax for extra lines that may be ignored by consumers. Regarding type changes in one file or two: should we follow the diff paradigm of storing all changes in one text, or should we follow the json table schema paradigm of describing meta data (changes) in a json file? The latter option would force all users to use json table schema, which I find too strict. Maybe we should support both, with a preference for json table schema. When a schema is available it should be used, otherwise a less expressive form can be used with the Note that a solution in the spirit of datapackage probably would not calculate a diff, but just reference two resources: table remote and table local.
From @paulfitz on February 28, 2015 4:0 I agree it would make sense to stick the new syntax in. I could also take a shot at adding support for it in This feature should make diffs more useful within an environment with a single kind of data source, even if it wouldn't be very useful for interchange between different kinds of data sources.
From @edwindj on March 1, 2015 9:59 Great! I will follow your changes and implement them in daff for R.
From @paulfitz on October 10, 2015 15:50 I implemented a version of this some time back, and then got distracted working on a demo for it with sqlite. Suppose we have a
And we modify the type of a column, add another column, and add a row:
Then daff would report this diff:
To use this in R, you'd need to implement some code that reports the properties of each column that you care about. That is sufficient for diffing. For patching, you'd need to be able to accept a description of the changes in a particular format and make them happen. I'll need to document this better if you're still interested in pursuing this @edwindj.
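As a rough illustration of the diffing half described here (report the properties of each column, then compare), the following Python sketch compares two column-to-type mappings. It is not daff's API: `diff_column_types` and the tuple layout it returns are hypothetical, and the type names are just example strings.

```python
def diff_column_types(old, new):
    """Compare two {column: type} mappings and list what changed.

    Returns (column, kind, old_type, new_type) tuples, where kind is
    "changed", "removed" or "added". This layout is an assumption made
    for illustration, not a format defined by daff.
    """
    changes = []
    for name, old_type in old.items():
        if name not in new:
            changes.append((name, "removed", old_type, None))
        elif new[name] != old_type:
            changes.append((name, "changed", old_type, new[name]))
    for name, new_type in new.items():
        if name not in old:
            changes.append((name, "added", None, new_type))
    return changes


before = {"id": "integer", "name": "string"}
after = {"id": "string", "name": "string", "age": "integer"}
changes = diff_column_types(before, after)
```

Patching would need the reverse step: code that takes such a list and applies each change to the underlying table, which depends on the data source.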
From @edwindj on February 25, 2015 14:10
A useful addition to the coopy highlighter diff format would be column type changes.
For example:
and
The Coopy Diff is:
A typed version of the format could be:
In which the schema row can contain a column type change. IMHO type information is not obligatory, but should be interpreted by an implementation as a type suggestion, since types differ across programming languages. The types of json table schema seem like a good candidate for denoting common types.
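A small Python sketch of how a schema row carrying type changes might be built. The typed example above did not survive in this copy, so the cell notation here is an assumption: it reuses the `old->new` convention that the highlighter format uses for modified cells, and the `!` tag for the schema row. The `type_row` helper is hypothetical.

```python
def type_row(columns, old_types, new_types, tag="!"):
    """Build one row whose cells describe per-column type changes.

    A cell is empty when the type is unchanged or unknown on either
    side, and "old->new" otherwise. Both the tag and the cell notation
    are assumptions made for illustration.
    """
    cells = [tag]
    for col in columns:
        before = old_types.get(col)
        after = new_types.get(col)
        if before is None or after is None or before == after:
            cells.append("")
        else:
            cells.append(f"{before}->{after}")
    return cells


row = type_row(
    ["id", "name"],
    {"id": "integer", "name": "string"},
    {"id": "string", "name": "string"},
)
```

Because the cells are plain strings, an implementation that treats types only as suggestions can ignore the row or map the names onto its own types.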
Copied from original issue: frictionlessdata/datapackage#164