v0.7.0
This release will focus on incorporating new technology that will greatly expand what is possible within YATSM. These improvements fall into three general categories:
- Increase dataset IO flexibility by utilizing `xarray` for labeled nd-arrays (a sketch follows this list)
  - Define one or more datasets that provide a set of labeled dataset bands (e.g., Landsat provides red, nir, and swir while PRISM provides meteorological data)
  - An `xarray` dataset will allow multiple datasets to be analyzed in one object, easing attempts at fusing data from multiple sensors
  - Labeling the band dimension of our datasets will simplify how we refer to these time series data
  - `xarray` datasets are easy to serialize and will resolve many of the sticking points we currently have with "caching" our time series
- Enhance result storage capabilities by using an indexed, hierarchical data storage format (likely `pytables`; second sketch below)
  - Indexing our results storage format will greatly increase the speed with which we can extract information from these results (see #69)
  - Hierarchical data storage will keep science model results separate from one another while still allowing these results to nest within one another (e.g., temporal segmentation at the top of the hierarchy with long term phenology estimates and classification labels nested within the segmentation results)
  - A more robust serialization format will allow "picking up" or "resuming" of model runs if a user wants to add another step to their analysis
- Allow users to easily chain together science models in a data analysis pipeline (likely with `luigi`; third sketch below)
  - Right now we have something of a pipeline -- running CCDCesque, fixing change results with the "commission test", re-estimating time series model attributes using the refitting steps, estimating phenology attributes for each segment, and then classifying the land cover condition of each segment
  - Unfortunately, this existing pipeline is very much hard coded and is a poor framework for adding additional steps
  - Leverage existing pipeline technology (`luigi`) if possible to allow users to define pipeline step "requirements" and "outputs"
    - Requirements and outputs may be either "data" (time series observations) or "record" (time series segment model outputs) information
  - Once a new storage format is implemented, the pipeline will be able to resume from existing model results by checking whether any given step has its requirements satisfied in these stored results
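To make the `xarray` item more concrete, below is a minimal sketch of labeled, multi-sensor time series handling. The band names, grid, and output file are hypothetical and only illustrate merging labeled bands from two sources into one object and serializing it:

```python
# Hypothetical sketch only: band names, grid, and file name are illustrative.
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", periods=100, freq="16D")
coords = {"time": times, "y": np.arange(5), "x": np.arange(5)}

# Landsat-like reflectance bands as labeled variables
landsat = xr.Dataset(
    {band: (("time", "y", "x"), np.random.rand(100, 5, 5))
     for band in ("red", "nir", "swir1")},
    coords=coords,
)

# PRISM-like meteorological data on the same time/space grid
prism = xr.Dataset(
    {"ppt": (("time", "y", "x"), np.random.rand(100, 5, 5))},
    coords=coords,
)

# One object holding both sensors; bands are referenced by label, not index
ds = xr.merge([landsat, prism])
ndvi = (ds["nir"] - ds["red"]) / (ds["nir"] + ds["red"])

# Serializing the whole dataset could replace the current ad hoc time series cache
ds.to_netcdf("timeseries_cache.nc")  # requires a netCDF backend (e.g., netCDF4)
```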
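A minimal sketch of the indexed, hierarchical result storage idea, assuming PyTables as the backend. The group names and column layout are made up for illustration and are not a proposed schema; the point is that nested groups keep results separate while column indexes speed up queries (see #69):

```python
# Hypothetical sketch only: groups, tables, and columns are illustrative.
import tables as tb

class Segment(tb.IsDescription):
    px = tb.Int32Col()        # pixel column
    py = tb.Int32Col()        # pixel row
    start = tb.Int64Col()     # segment start (ordinal date)
    end = tb.Int64Col()       # segment end (ordinal date)

class Phenology(tb.IsDescription):
    px = tb.Int32Col()
    py = tb.Int32Col()
    peak_doy = tb.Int16Col()  # long term phenology estimate

with tb.open_file("results.h5", mode="w") as h5:
    # Temporal segmentation results at the top of the hierarchy...
    seg_group = h5.create_group("/", "segments", "CCDCesque segments")
    seg = h5.create_table(seg_group, "record", Segment)

    # ...with phenology results nested beneath the segmentation group
    phen_group = h5.create_group(seg_group, "phenology", "Phenology per segment")
    h5.create_table(phen_group, "record", Phenology)

    # Indexing pixel coordinates lets us pull all results for a row/pixel
    # without scanning the whole table
    seg.cols.py.create_index()
    seg.cols.px.create_index()

    # Example indexed query (empty here, since no rows were appended)
    rows = seg.read_where("(py == 0) & (px == 10)")
```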
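Finally, a minimal sketch of chaining steps with `luigi`, assuming each task declares its requirements and outputs. Task names and target files are hypothetical; the point is that `luigi` skips tasks whose outputs already exist, which is how a run could resume from stored results:

```python
# Hypothetical sketch only: task names and targets are illustrative.
import luigi

class RunCCDCesque(luigi.Task):
    """Temporal segmentation step ("record" output)."""
    def output(self):
        return luigi.LocalTarget("segments.txt")

    def run(self):
        with self.output().open("w") as dst:
            dst.write("segment records\n")  # stand-in for real results

class EstimatePhenology(luigi.Task):
    """Phenology step; requires the segmentation records as input."""
    def requires(self):
        return RunCCDCesque()

    def output(self):
        return luigi.LocalTarget("phenology.txt")

    def run(self):
        with self.input().open("r") as src, self.output().open("w") as dst:
            dst.write("phenology estimated from: " + src.read())

if __name__ == "__main__":
    # Tasks whose output() already exists are considered complete and skipped,
    # so re-running the pipeline resumes from whatever results are on disk
    luigi.build([EstimatePhenology()], local_scheduler=True)
```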
Each of these general tasks will be discussed in further detail as "YATSM Enhancement Proposals" (YEPs) issues.