Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeseries/other DSN access to STATE/specl #60

Open
rburghol opened this issue Mar 27, 2023 · 3 comments
Open

Add timeseries/other DSN access to STATE/specl #60

rburghol opened this issue Mar 27, 2023 · 3 comments

Comments

@rburghol
Copy link

rburghol commented Mar 27, 2023

  • Code:
  • timeseries in hdf5 need to be accessible
  • hsp2 already grabs timeseries inputs and outputs (from EXT SOURCES and EXT TARGETS), for the reach in question.
    • see: read_ts()
    • in main.py: get_flows() pulls in upstream inflowshttps://github.com/respec/HSPsquared/blob/65cdd8a0315546a0d71dd855a890728a6efae2bf/HSP2IO/io.py#L75
    • Code that performs read is in HSP2IO/hdf.py : read_ts()
      • Expects HS category, operations and segment as inputs
    • For state/ops desire a path reader since those are already stashed in state_paths
      • Solution - add new method read_path_ts() in hdf.py:
	def read_path_ts(self, 
			path=None) -> pd.DataFrame:
		try:
			return read_hdf(self._store, path)
		except KeyError:
			return pd.DataFrame()
  • Need to provide wider access to include DSNs that are NOT included anywhere
    • like DSN10 for subshed runoff)
    • And for cross domain data access
    • Need to be able to know what the root of a given domain, i.e., RCHRES_R001 or RCHRES_R002, there is currently no data that contains that information passed into hydr!
  • Function get_timeseries() requires information akin to that in the EXT SOURCES block lines. Specifically, it uses:
    • MFACTOR - multiplier (default is 1.0, which means no conversion)
    • TMEMN - member name (see UCI EXT SOURCES) - only used in transform to guess conversion if how is False
    • TRAN - the transformation method (passed to transform as how)
    • Calls transform(ts, name, how, siminfo)
  • Does the hdf5 already have everything in at the right timestep? I don't think so... otehrwise, there would be no conversion needed in get_timeseries()

1st Draft

  • in utilities_specl.py

def load_sim_ts(f, state_ix, ts_ix):
  tsq = f['TIMESERIES/'] # f is the hdf5 file 
  # this code replicates a piece of the function of get_timeseries, and loads the full timeseries into a Dict
  # the dict will be keyed by a simple integer, and have values at the time step only.  The actual time 
  # at that step will be contained in ts_ts 
  ts_tables = list(tsq.keys())
  for i in ts_tables:
      if i == 'SUMMARY': continue # skip this non-numeric table
      var_path = '/TIMESERIES/' + i
      ix = set_state(state_ix, state_paths, var_path, 0.0)
      ts_ix[ix] = np.asarray(tsq[i]['table']['values'], dtype="float32")

Standalone

import os
import h5py
import numpy as np
import pandas as pd
from pandas.io.pytables import read_hdf
from collections import defaultdict
from pandas import DataFrame, read_hdf, HDFStore
from numba.typed import Dict
# now load HSP2 stuff
import HSP2.utilities
import HSP2IO
from HSP2IO.hdf import HDF5
from HSP2IO.io import IOManager

os.chdir('C:/usr/local/home/git/HSPsquared/')
fpath = 'C:/Workspace/modeling/cbp/hsp2/river/PM7_4581_4580.h5' 
# close the already opened file first if we are re-running
# f.close()
f = h5py.File(fpath,'a') # use mode 'a' which allows read, write, modify
dset = f['TIMESERIES/TS011/table']

# Try this, transform(ts, name, how, siminfo)
# name = 'NA' since this should only be used if we don't give a `how`
# how = 'SAME', means just resample - I think this is good for a Quantity, maybe not rate?
hdf5_instance = HDF5(fpath)
io_manager = IOManager(hdf5_instance)
uci_obj = io_manager.read_uci()
siminfo = uci_obj.siminfo
# note, siminfo delt is specific to each module being run (or is potentially varying)
# so it is NOT set at first read of UCI, rather, set by "operation"
# delt = time step in minutes
siminfo['delt'] = 60
# hsp2 adds tindex and computes steps based on start/stop and delt for each sequence
siminfo['tindex'] = date_range(siminfo['start'], siminfo['stop'], freq=Minute(siminfo['delt']))[1:]
siminfo['steps'] = len(siminfo['tindex'])

# dset from direct grab of the table path does not work (see above)
# using method from HSP2IO/hdf.py 
dstore = pd.HDFStore(fpath)
dset = read_hdf(dstore, '/TIMESERIES/TS011/')
t = transform(dset, 'NA', 'SAME', siminfo)

# the same can be achieved directly using the hdf5_instance: (note Category is set somewhere in the loaded hsp2 modules)
ts10 = hdf5_instance.read_ts(Category.INPUTS, None, "TS010")
# same can be achieved with a direct call to the io_manager:
ts10 = io_manager.read_ts(Category.INPUTS, None, "TS010")

@rburghol
Copy link
Author

rburghol commented Mar 28, 2023

Import a ts and transform into step indexed, timestep uniform data frame.

import os
os.chdir('C:/usr/local/home/git/HSPsquared/')

import h5py
import numpy as np
import pandas as pd
from collections import defaultdict
from pandas import DataFrame, read_hdf, HDFStore
from numba.typed import Dict
# now load HSP2 stuff
#import HSP2.utilities # note: we need to import this first, otherwise we get errors with HSP2IO.io and HSPFIO.hdf 
from HSP2.utilities import * # need this to access transform, tho maybe more pointed imports would suffice
import HSP2IO
from HSP2IO.hdf import HDF5
from HSP2IO.io import IOManager, SupportsReadTS, Category

fpath = 'C:/Workspace/modeling/cbp/hsp2/river/PM7_4581_4580.h5' 
# Load simulation info 
hdf5_instance = HDF5(fpath)
io_manager = IOManager(hdf5_instance)
uci_obj = io_manager.read_uci()
siminfo = uci_obj.siminfo
# note, siminfo delt is specific to each module being run (or is potentially varying)
# so it is NOT set at first read of UCI, rather, set by "operation"
# delt = time step in minutes
siminfo['delt'] = 60
siminfo['tindex'] = date_range(siminfo['start'], siminfo['stop'], freq=Minute(siminfo['delt']))[1:]
siminfo['steps'] = len(siminfo['tindex'])

# close the already opened file first if we are re-running
# f.close()
f = h5py.File(fpath,'a') # use mode 'a' which allows read, write, modify
dset = f['TIMESERIES/TS011/table']

# Try this, transform(ts, name, how, siminfo)
# name = 'NA' since this should only be used if we don't give a `how`
# how = 'SAME', which means just resample - I think this is good for a Quantity, maybe not rate?
ts10 = io_manager.read_ts(Category.INPUTS, None, "TS010")
t = transform(ts10, 'NA', 'SAME', siminfo)

@rburghol
Copy link
Author

rburghol commented Dec 26, 2023

Work session 12/26/2023

  • trying to access SPECL data and understand how to handle the dictionary that hsp2 has during runtime
# use above code to load libs etc
fpath = 'C:/Workspace/modeling/hspf/hsp2/testcbp/PL3_5250_0001.h5'
#f = h5py.File(fpath,'a') # use mode 'a' which allows read, write, modify
hdf5_instance = HDF5(fpath)
io_manager = IOManager(hdf5_instance)
uci_obj = io_manager.read_uci()
siminfo = uci_obj.siminfo
state = {}
state['model_data'] = {}

dc = uci_obj.specactions['ACTIONS']
for ix in dc.index:
    # add the items to the state['model_data'] dict 
    speca = dc[ix:(ix+1)]
    print("row", ix, "keys:", speca.keys())
    # need to add a name attribute
    opname = 'SPEC' + 'ACTION' + str(ix)
    state['model_data'][opname] = {}
    opname = 'SPEC' + 'ACTION' + str(ix)
    state['model_data'][opname]['name'] = opname
    for ik in speca.keys(): 
        # this extracts the data from the data frame to the model_data dict
        state['model_data'][opname][ik] = speca.to_dict()[ik][ix]

@rburghol rburghol changed the title Add timeseries/other DSN access to specl Add timeseries/other DSN access to STATE/specl Jan 24, 2024
@rburghol
Copy link
Author

Time series writers, i.e, used for logging, may not be included in the domain dependency list, so we need a way of saying "also, run anything that depends on the STATE value of things that are running in the current domain". Example, O2 = withdrawal in cfs, and we may wish to log the original value, before HSP2 decides whether or not there is sufficient water available.

# get the deps for our reach that influence O2 and O3 (withdrawals in our CBP template)
dep_list = model_domain_dependencies(state, '/STATE/JL1_6562_6560/RCHRES_R001', ['O2', 'O3'])
# returns: [44, 45, 48, 49, 43, 47, 4]
# now, find anything that is linked to state for anything WITHIN this dep_list
all_deps = model_input_dependencies(state, dep_list)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant