Motivation
Right now we're using NumPy saved files that store structured arrays for the results, but this might change in the future (see #14), especially to accommodate some visualization utilities that would benefit from having all the results for an image in one container that is indexed intelligently.
Another annoyance is that each CLI utility uses duplicated code to open, inspect, read, and write the result files. Ideally this should be refactored into a common set of functions.
Proposal
Implement a "drivers" for each format (so just NumPy for now) that contains the logic for inspecting/reading/writing to/etc. each format. Eventually this will necessitate updating the configuration files to specify what result storage driver should be used.
The currently implemented iter_records, for example, would still iterate over result records, but would do so in a way that makes sense for the format. For the current NumPy saved files, we'd yield one row's worth of records at a time. If we used something that stores results in blocks, maybe it would yield chunks of data regardless of the row:
driver = drivers.register(result_format)
for rec in driver.iter_records(config):
    # do stuff
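For the NumPy saved files, iter_records could be as simple as loading each per-row result file and yielding its structured array. The glob pattern, the 'record' key, and the 'output_dir' config entry below are assumptions made only to sketch the idea:

import os
from glob import glob

import numpy as np


class NumPyDriver(object):
    """Hypothetical driver for the current NumPy saved-file format."""

    def iter_records(self, config):
        # Assumes one result file per row, saved with np.savez and holding
        # its structured array under a 'record' key.
        pattern = os.path.join(config['output_dir'], '*.npz')
        for filename in sorted(glob(pattern)):
            yield np.load(filename)['record']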
We usually want to perform a query on the records based on the segment dates, so there could be some higher level API access that would perform a query optimized for the format (NumPy files would just use a simple np.where against them, but we could use in-kernel searches if using pytables):
driver = drivers.register(result_format)
for matching_rec in driver.query_records(config, start='2000-01-01', end='2001-01-01'):
    # do more stuff
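For the NumPy format that query could simply layer np.where over iter_records. Shown here as a standalone helper for brevity; in the proposal it would be a method on the NumPy driver, and the 'start'/'end' field names and ordinal-date storage are assumptions:

from datetime import datetime as dt

import numpy as np


def query_records(driver, config, start=None, end=None):
    """Yield only the records whose segment dates fall within [start, end]."""
    start = dt.strptime(start, '%Y-%m-%d').toordinal() if start else -np.inf
    end = dt.strptime(end, '%Y-%m-%d').toordinal() if end else np.inf

    for rec in driver.iter_records(config):
        idx = np.where((rec['start'] >= start) & (rec['end'] <= end))[0]
        if idx.size:
            yield rec[idx]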
Justification
If we refactor out all of the result IO from the CLI scripts, we'll make testing much easier and probably reduce the overall amount of code. Refactoring out just the NumPy format probably won't take too much time and would set us up to easily transition to a better file format.