Error pulling a 'column' directly from a table with h5pyd #106

Open
MRossol opened this issue Sep 29, 2021 · 2 comments
MRossol (Contributor) commented Sep 29, 2021

h5pyd is unable to pull a "column" (a single named field) directly from a recarray/table dataset.

Example code using h5py:

In [14]: with h5py.File(path, mode='r') as f:
    ...:     sector = f['enumerations']['sector']['id']
    ...:
    ...:     print(sector)
    ...:
[b'com' b'res' b'trans' b'ind']

Same attempt in h5pyd:

In [12]: with h5pyd.File(hsds_path, mode='r') as f:
    ...:     sector = f['enumerations']['sector']['id']
    ...:
    ...:     print(sector)
    ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-ecf8d2f2a8ec> in <module>
      1 with h5pyd.File(hsds_path, mode='r') as f:
----> 2     sector = f['enumerations']['sector']['id']
      3
      4

~/miniconda3/lib/python3.9/site-packages/h5pyd/_hl/dataset.py in __getitem__(self, args)
    862                         self.log.info("binary response, {} bytes".format(len(rsp)))
    863                         #arr1d = numpy.frombuffer(rsp, dtype=mtype)
--> 864                         arr1d = bytesToArray(rsp, mtype, page_mshape)
    865                         page_arr = numpy.reshape(arr1d, page_mshape)
    866                     else:

~/miniconda3/lib/python3.9/site-packages/h5pyd/_hl/base.py in bytesToArray(data, dt, shape)
    497         for index in range(nelements):
    498             offset = readElement(data, offset, arr, index, dt)
--> 499     arr = arr.reshape(shape)
    500     return arr
    501

ValueError: cannot reshape array of size 12 into shape (4,)

For reference, the source .h5 file is here:
s3://oedi-data-lake/dsgrid-2018-efs/state_hourly_residuals/eia_annual_energy_by_sector.dsg
The HSDS domain is in the s3://nrel-pds-hsds/ bucket here:
'/nrel/dsgrid-2018-efs/state_hourly_residuals/eia_annual_energy_by_sector.dsg'

jreadey (Member) commented Sep 29, 2021

That feature is not yet supported in h5pyd. As a workaround, you can read the desired selection into a NumPy array and then extract the column from that.
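
A minimal sketch of that workaround, not an official h5pyd recipe: index the dataset with a plain row selection first, then pick the field from the NumPy array that comes back. The helper name `read_field` is hypothetical, and the stand-in table below (including its `name` field and the string widths) is made up so the snippet runs without an HSDS server; only the `id` values come from the output shown earlier in this issue.

```python
import numpy as np


def read_field(dataset, field, selection=Ellipsis):
    """Read `selection` from a compound dataset, then extract one field client-side.

    `dataset` can be anything that returns a structured NumPy array when
    indexed with a row selection (the assumption here is that an h5pyd or
    h5py compound dataset does so for `dataset[...]`).
    """
    rows = np.asarray(dataset[selection])  # whole rows; no field name in the read
    return rows[field]                     # field access happens on the NumPy array


# Stand-in for f['enumerations']['sector']; the 'name' field is hypothetical.
sector_table = np.array(
    [
        (b'com', b'Commercial'),
        (b'res', b'Residential'),
        (b'trans', b'Transportation'),
        (b'ind', b'Industrial'),
    ],
    dtype=[('id', 'S5'), ('name', 'S16')],
)

sector = read_field(sector_table, 'id')
print(sector)

# With a live HSDS domain the call would presumably look like:
#   with h5pyd.File(hsds_path, mode='r') as f:
#       sector = read_field(f['enumerations']['sector'], 'id')
```

The trade-off is that every field of the selected rows is transferred before the column is extracted, so for wide tables it may be worth narrowing `selection` to the rows actually needed.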

MRossol (Contributor, Author) commented Sep 29, 2021

Thanks @jreadey, that was the workaround I suggested!
