JSON-array support, fractional seconds in strptime/strftime, and other minor features
This is a relatively minor release of Miller, containing feature requests and bugfixes made while I've been working on the Windows port (which is nearly complete).
Features:
- JSON arrays: as described here, Miller, being a tabular data processor, isn't well-positioned to handle arbitrary JSON. (See jq for that.) But as of 5.1.0, arrays are converted to maps with integer keys, which are then at least processable using Miller. Details are here. The short of it is that you now have three options for the main mlr executable:
--json-map-arrays-on-input Convert JSON array indices to Miller map keys. (This is the default.)
--json-skip-arrays-on-input Disregard JSON arrays.
--json-fatal-arrays-on-input Raise a fatal error when JSON arrays are encountered in the input.
This resolves #133.
- The new mlr fraction verb makes possible in a few keystrokes what was only possible before using two-pass DSL logic: here you can turn numerical values down a column into their fractional/percentage contribution to column totals, optionally grouped by other key columns.
- The DSL functions strptime and strftime now handle fractional seconds. For parsing, use %S format as always; for formatting, there are now %1S through %9S which allow you to configure a specified number of decimal places. The return value from strptime is now floating-point, not integer, which is a minor backward incompatibility not worth labeling this release as 6.0.0. (You can work around this using int(strptime(...)).) The DSL functions gmt2sec and sec2gmt, which are keystroke-savers for strptime and strftime, are similarly modified, as is the sec2gmt verb. This resolves #125.
- A few nearly-standalone programs -- which do not have anything to do with record streams -- are packaged with Miller. (For example, hex-dump, unhex, and show-line-endings commands.) These are described here.
- The stats1 and merge-fields verbs now support an antimode aggregator, in addition to the existing mode aggregator.
- The join verb now by default does not require sorted input, which is the more common use case. (Memory-parsimonious joins which require sorted input, while no longer the default, are available using -s.) This is another minor backward incompatibility not worth making a 6.0.0 over. This resolves #134.
- mlr nest has a keystroke-saving --evar option for a common use case, namely, exploding a field by value across records.
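To make a few of the features above concrete, here is a rough Python sketch of what the array-to-map conversion, the fraction verb, and nest --evar do to records. This is an illustration only, not Miller's implementation: the function names are hypothetical, the 0-based string keys for array indices are an assumption, and so is the `<field>_fraction` output field name.

```python
from collections import defaultdict


def arrays_to_maps(value):
    """Recursively replace JSON arrays with maps keyed by element index.

    Assumption: indices start at 0 and are stored as string keys.
    """
    if isinstance(value, list):
        return {str(i): arrays_to_maps(v) for i, v in enumerate(value)}
    if isinstance(value, dict):
        return {k: arrays_to_maps(v) for k, v in value.items()}
    return value


def fraction(records, field, group_by=()):
    """Add `<field>_fraction`: each value divided by its group's column total.

    Two passes are needed because the totals are unknown until every
    record has been seen. A group whose total is zero would raise
    ZeroDivisionError in this sketch.
    """
    records = list(records)
    totals = defaultdict(float)
    for rec in records:  # pass 1: accumulate per-group sums
        totals[tuple(rec[g] for g in group_by)] += rec[field]
    out = []
    for rec in records:  # pass 2: divide each value by its group's sum
        key = tuple(rec[g] for g in group_by)
        out.append({**rec, f"{field}_fraction": rec[field] / totals[key]})
    return out


def explode_values(records, field, sep=";"):
    """Emit one output record per separator-delimited value in `field`."""
    for rec in records:
        for piece in str(rec[field]).split(sep):
            yield {**rec, field: piece}
```

For instance, under these assumptions a record with x=a;b;c passed through explode_values becomes three records with x=a, x=b, and x=c, with all other fields copied unchanged.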
Documentation:
- The DSL reference now has per-function descriptions.
- There is a new feature-counting example in the cookbook.
Bugfixes: