provide path for to_json/from_json of units for front-end #12

Open
bollwyvl opened this issue Oct 8, 2015 · 1 comment
bollwyvl commented Oct 8, 2015

One of the marquee uses of traitlets that users encounter is widgets, either by way of interact or directly, as they provide one of the best ways to access the interactive Jupyter/IPython magic.

As such, for many science and engineering applications, having the units persist all the way to the front-end is enormously useful.

To talk to the front end, the entire meaning of the traitlet value at a given time must be communicated by serializing it to JSON. JSON is actually poorly equipped, compared with other serialization formats such as XML, for representing rich types. The reference implementation of an identifier for a Widget itself, for example, uses a magic string notation pointing to an ephemeral id, IPY_MODEL_, which front ends can only manipulate very mechanically.

For units, serializing the number to a string would introduce a whole set of issues, as the pidgin language of every unit library is slightly different.

A solution to this would be to utilize JSON for Linking Data, and leverage extensive existing work into solving this non-trivial problem... or at least representing it in a way that is not overly-opinionated.

Consider:

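from astropy import units as u
from ipywidgets import Widget
from numtraits import NumericalTrait

class Sphere(Widget):
    # radius must be expressible in meters
    radius = NumericalTrait(convertible_to=u.m)

s = Sphere(radius=1.21)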
print(s.radius)
>>> 1.21 meters

In JSON, and in JSON-LD, a number is a number:

{"radius": 1.21}

But in JSON-LD, a number can also be an object with an @value:

{"radius": {"@value": 1.21}}

This level of indirection gives us a place to put other metadata about the value.

The simplest possible approach would be to continue to treat the value as a literal, and only introduce @type:

{"radius": {"@value": 1.21, "@type": "meter"}}
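A minimal sketch of that wrapping step in Python, using only the standard library. The helper name `to_value_object` is hypothetical; it is not part of traitlets, numtraits, or any JSON-LD library.

```python
import json


def to_value_object(value, type_name=None):
    """Wrap a plain number in a JSON-LD style value object.

    Hypothetical helper for illustration only.
    """
    obj = {"@value": value}
    if type_name is not None:
        # "@type" is the slot the extra level of indirection gives us
        obj["@type"] = type_name
    return obj


payload = {"radius": to_value_object(1.21, "meter")}
print(json.dumps(payload))
```

The bare number and the wrapped number carry the same value; only the wrapped form has room for a type.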

Hooray, we've picked a type. But we've made up our own name for it. Where do we look the values up? What about derived units, domain, preferred display units, etc.?

Here's what it could look like by utilizing the UN/CEFACT codes, which have been notionally adopted by schema.org, a large driver of linked data adoption:

{
  "radius": {
    "@context": "http://schema.org/",
    "@type": "QuantitativeValue",
    "value": 1.21,
    "unitCode": "MTR"
  }
}
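As a rough sketch, producing this shape from Python could come down to a lookup table from unit symbols to UN/CEFACT codes. The table `UNECE_CODES` and the function `to_quantitative_value` are assumptions made for illustration; a real table would presumably be generated from the published code list rather than typed by hand.

```python
import json

# Hypothetical, hand-written subset of the UN/CEFACT common codes.
UNECE_CODES = {
    "m": "MTR",   # metre
    "kg": "KGM",  # kilogram
    "s": "SEC",   # second
}


def to_quantitative_value(value, unit_symbol):
    """Build a schema.org QuantitativeValue dict for a value and unit symbol."""
    try:
        code = UNECE_CODES[unit_symbol]
    except KeyError:
        # Units missing from the table cannot be expressed this way.
        raise ValueError("no UN/CEFACT code known for unit %r" % unit_symbol)
    return {
        "@context": "http://schema.org/",
        "@type": "QuantitativeValue",
        "value": value,
        "unitCode": code,
    }


print(json.dumps({"radius": to_quantitative_value(1.21, "m")}, indent=2))
```

Any unit that is not in the table simply raises, so a unit composed on the fly has no representation in this scheme.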

While we have said more explicitly (i.e. not unilaterally) that this value is of a type, and are playing by the rules of a standards body, this is somewhat unsatisfying:

  • MTR is basically a random code from an Excel spreadsheet
  • derived units that are not already among the several thousand defined would be difficult to describe, including some of the more interesting ones, as a user may have created them on the fly through a series of manipulations.

To solve some of these issues, adoption of the QUDT vocabularies would provide a more robust conceptual model:

{
  "radius": {
    "@context": [
      "http://schema.org/",
      {
        "ex": "http://example.com#",
        "radius": "ex:radius",
        "unit": "http://qudt.org/1.1/vocab/unit#"
      }
    ],
    "@type": "QuantitativeValue",
    "value": 1.21,
    "unitCode": "unit:Meter"
  }
}
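To make the prefix mechanics concrete, here is a simplified Python sketch of how a consumer might turn `unit:Meter` into a full IRI using the inline context. It is a deliberate simplification of JSON-LD's real expansion algorithm (which also depends on how the `unitCode` term is typed in the context), and `expand_compact_iri` is a hypothetical name.

```python
def expand_compact_iri(value, context):
    """Expand 'prefix:suffix' using string-valued prefixes from a context dict.

    Simplified illustration; not a conforming JSON-LD processor.
    """
    prefix, sep, suffix = value.partition(":")
    if sep and not suffix.startswith("//") and isinstance(context.get(prefix), str):
        return context[prefix] + suffix
    # No colon, an absolute IRI, or an unknown prefix: leave unchanged
    return value


context = {
    "ex": "http://example.com#",
    "radius": "ex:radius",
    "unit": "http://qudt.org/1.1/vocab/unit#",
}

print(expand_compact_iri("unit:Meter", context))
```

The expanded IRI is what lets a front end look the unit up, rather than guess at the meaning of a bare string.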

This adds a "thing, not a string" to the unitCode, which can itself be traced back to a robust set of models.
QUDT can also support vectors of exponentiated dimension types, etc., and comes with a very large library, written by and used within an organization with a seriously multi-scale perspective (NASA).

Data Shapes

A whole other story. JSON-LD is pretty bad at labeling columns of arrays, and indeed URIs can't start with numerals. Some approach for listing columns and their types would be necessary.

Implementation

TBD... probably something like ipywidgets, i.e. numtraits.widget_serialization, which would expose to_json and from_json functions that consult the canonical data format and call the appropriate things in the upstream unit library (e.g. astropy, pint).
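A very rough sketch of what such a pair of functions might look like. Everything here is an assumption: the `Quantity` namedtuple stands in for an astropy or pint quantity, the lookup table covers a single unit, and the real ipywidgets serializer signatures differ between versions.

```python
from collections import namedtuple

# Stand-in for a quantity object from an upstream unit library.
Quantity = namedtuple("Quantity", ["value", "unit"])

# Hypothetical mapping between library unit symbols and canonical unit codes.
UNIT_TO_CODE = {"m": "unit:Meter"}
CODE_TO_UNIT = {code: unit for unit, code in UNIT_TO_CODE.items()}


def to_json(quantity):
    """Serialize a quantity into the canonical dict form."""
    return {
        "@type": "QuantitativeValue",
        "value": quantity.value,
        "unitCode": UNIT_TO_CODE[quantity.unit],
    }


def from_json(obj):
    """Rebuild a quantity from the canonical dict form."""
    return Quantity(obj["value"], CODE_TO_UNIT[obj["unitCode"]])


# Shaped like a dict of serializers; purely illustrative.
widget_serialization = {"to_json": to_json, "from_json": from_json}

radius = Quantity(1.21, "m")
wire = widget_serialization["to_json"](radius)
print(wire)
print(widget_serialization["from_json"](wire))
```

Keeping both directions driven by one table means the Python side and the front end only need to agree on the canonical codes, not on each unit library's own spelling.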

Dependencies

Generating and interpreting JSON-LD requires no additional libraries. A JSON Schema library (which already ships with Jupyter) would be enough to make serialization sufficiently robust, even if it couldn't do full type-checking of the resources.

The canonical lists are available for download as XML or Turtle, and these could be converted to canonical JSON.

Front-end

Out of scope for this issue, but... in the near term, a set of base widgets (sliders, text boxes, etc.) which didn't simply fall over would be a good start.

As to serious implementations on the front-end parsing side, several quantity libraries exist, including math.js and quantities.js. There is nothing as flexible as any of the Python implementations, but this could be an excellent driver for the creation of such a library, driven by a canonical representation format.

Related:

  • This proposal is mainly concerned with the DataFrame representation.
@westurner

See:

Data Shapes

  • CSVW (csvw:) (tabular data)
  • W3C Data Cubes (qb:) (pivot tables)
