Possible problems with Vattrinfo2 #340

Open
JohnLCaron opened this issue Mar 16, 2023 · 15 comments

@JohnLCaron

Hello friends:

I'm looking at what may be a bug in HDF4.2r0-Beta, downloaded 2/15/2023 from https://github.com/HDFGroup/hdf4.git and built locally on Ubuntu 22.04.

On the attached file, ./hdp dumpvg gives:

    attr12: name=start_latlon type=5 count=1 size=4
0.041719
    attr13: name=end_latlon type=5 count=1 size=4
-0.004751

but the correct answer is:

   start_latlon = 0.04171858f, -172.38489f ;
   end_latlon = -0.00475051f, 162.88487f ;

which I found using an independent library I am working on at https://github.com/JohnLCaron/cdm-kotlin (Note: work in progress!)

It's likely I'm missing something. For example, what is the meaning of nfields in this interface:

intn Vattrinfo2(int32 vgroup_id, intn attr_index, char *attr_name, int32 *data_type, int32 *count, int32
  *size, int32 *nfields, uint16 *refnum)

   
I was guessing that there's only one field in an attribute, so count should = vh.nelems * fld[0].nelems.
It looks like a bug where Vattrinfo2 is returning count = fld[0].nelems?

If you really are supporting more than one field, it seems likely that there are other calls that need to be made, but I haven't found them using the Vattr* API calls.

Maybe there's a workaround (that hdp is not using either)? It would be great to have a working example of how to
handle vgroup attributes in a general way.

To reproduce:

./hdp dumpvg -r 401 2006166131201_00702_CS_2B-GEOPROF_GRANULE_P_R03_E00.hdf

on this file:

2006166131201_00702_CS_2B-GEOPROF_GRANULE_P_R03_E00.hdf.tar.gz

Thanks for your help!

John

@JohnLCaron
Author

Downloaded version has readme with "HDF version 4.2.15-post0 currently under development"

@JohnLCaron
Author

Tried it again with version "HDF version 4.2.17-1 currently under development" from github on 3/19/23, same problem.

There's a possibly related problem where sometimes

Vattrinfo2(int32 vgroup_id, intn attr_index, char *attr_name, int32 *data_type, int32 *count, int32 *size, int32 *nfields, uint16 *refnum)

returns the same value for size and count (looks to me like size is incorrectly set to count). I can send a file that shows this if you want.
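One way to flag this case automatically is to compare the reported size against count times the element width. The following is a minimal sketch, assuming size is the total byte length of a single-field attribute (consistent with the `type=5 count=1 size=4` output above). `nt_width` and `attr_size_consistent` are hypothetical helpers, not HDF4 calls, and the hand-written width table stands in for the library's own lookup (real code would more likely ask the library, for example via `DFKNTsize`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte width of a few HDF4 number types.
 * The codes mirror the DFNT_* constants (5 is the float32 type seen
 * in the hdp output). Any code not listed returns 0 for "unknown". */
static int32_t nt_width(int32_t data_type)
{
    switch (data_type) {
    case 3: case 4:   return 1; /* uchar8, char8 */
    case 5:           return 4; /* float32 */
    case 6:           return 8; /* float64 */
    case 20: case 21: return 1; /* int8, uint8 */
    case 22: case 23: return 2; /* int16, uint16 */
    case 24: case 25: return 4; /* int32, uint32 */
    default:          return 0;
    }
}

/* Hypothetical check on the values returned by Vattrinfo2.
 * Returns 1 if size == count * width, 0 if it does not,
 * and -1 if the type code is not in the table above. */
static int attr_size_consistent(int32_t data_type, int32_t count, int32_t size)
{
    assert(count >= 0 && size >= 0);
    int32_t width = nt_width(data_type);
    if (width == 0)
        return -1;
    return (int64_t)size == (int64_t)count * width ? 1 : 0;
}
```

A result of 0 for a multi-byte type would point at the "size set to count" symptom described above. For one-byte types the two values legitimately coincide, so the check cannot distinguish them there.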

My real problem is that memory is getting corrupted somewhere, although that may be my error, not yours.

Thanks for any insight into this.

@derobins derobins self-assigned this May 3, 2023
@derobins derobins added Component - C Library, Priority - 2. Medium, Type - Bug, UNCONFIRMED labels May 3, 2023
@derobins derobins moved this to Todo in HDF4 4.3.0 May 3, 2023
@derobins
Member

derobins commented May 8, 2023

So HDF4.2r0-Beta is from 1993. When we look at this with 4.2.16, we see:

    attr10: name=start_latlon type=5 count=2 size=8
    0.041719 -172.384888
    attr11: name=end_latlon type=5 count=2 size=8
    -0.004751 162.884872

Can you try this with the most recent version of HDF4?

@JohnLCaron
Author

Hi Dana:

I downloaded the 4.2.16 release from https://support.hdfgroup.org/ftp/HDF/releases/HDF4.2.16/bin/unix/hdf-4.2.16-ubuntu2204_64.tar.gz

then ran:

~/Downloads/hdf/HDF-4.2.16-Linux/HDF_Group/HDF/4.2.16/bin:$ ./hdp --version
./hdp, HDF Version 4.2 Release 16, March 3, 2023

~/Downloads/hdf/HDF-4.2.16-Linux/HDF_Group/HDF/4.2.16/bin:$ ./hdp dumpvg -r 401 /home/all/testdata/hdf4/ssec/2006166131201_00702_CS_2B-GEOPROF_GRANULE_P_R03_E00.hdf > temp16.out

which gives:

    attr12: name=start_latlon type=5 count=1 size=4
	0.041719 
    attr13: name=end_latlon type=5 count=1 size=4
	-0.004751 

(full output attached).

Maybe an error in the Linux version?
I'm not sure about the discrepancy between the attr10/11 and attr12/13 names.

temp16.txt

-John

@bmribler
Collaborator

bmribler commented May 9, 2023

I'm checking it out, John.
Binh-Minh

@JohnLCaron
Author

Although I'm reporting this as a bug in hdp, I'm really trying to read the file myself using the C library API, where I saw the same problem in my initial attempt to do so.

Specifically, I am trying to match up the info I get from

intn Vattrinfo2(int32 vgroup_id, intn attr_index, char *attr_name, int32 *data_type, int32 *count, int32 *size, int32 *nfields, uint16 *refnum)

intn VSinquire(int32 vdata_id, int32 *n_records, int32 *interlace_mode, char *field_name_list, int32
*vdata_size, char *vdata_name)

int32 VFfieldorder(int32 vdata_id, int32 field_index)

with the info I get from the DFTAG_VH message, when I read that directly.

In particular, the DFTAG_VH message has a field

order_n     Order of the nth field of the Vdata (16-bit integer)

which may be misnamed, and should be named "nelems_n number of elements for the nth field". If I interpret it that way, I seem to get the correct number of values for the start_latlon example above when I read the DFTAG_VH message directly.

But really I'm just guessing, as it doesn't seem to work using the C API call VFfieldorder(vdata_id, idx). I don't really have the tooling to understand the C library code.
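To make the "read the DFTAG_VH message directly" step concrete, here is a rough sketch of decoding the numeric head of such a message from a byte buffer. Only nvert, nfields and order_n are named in this thread. The rest of the layout (big-endian; interlace, nvert, ivsize, nfields, then one array each of type, isize, offset and order) is an assumption based on my reading of the HDF specification and should be checked against it. `vh_header`, `vh_parse` and `VH_MAX_FIELDS` are hypothetical names, and the field-name strings that follow the arrays are not parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VH_MAX_FIELDS 16 /* arbitrary cap chosen for this sketch */

/* Hypothetical in-memory view of the numeric head of a DFTAG_VH message. */
typedef struct {
    int16_t  interlace;
    int32_t  nvert;                 /* number of entries in the vdata */
    uint16_t ivsize;
    int16_t  nfields;               /* number of fields per entry */
    int16_t  type[VH_MAX_FIELDS];
    uint16_t isize[VH_MAX_FIELDS];
    uint16_t offset[VH_MAX_FIELDS];
    uint16_t order[VH_MAX_FIELDS];  /* order_n for each field */
} vh_header;

/* Read big-endian 16-bit and 32-bit unsigned values. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or nfields is
 * negative or larger than VH_MAX_FIELDS. */
static int vh_parse(const uint8_t *buf, size_t len, vh_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 10)
        return -1;
    out->interlace = (int16_t)be16(buf);
    out->nvert     = (int32_t)be32(buf + 2);
    out->ivsize    = be16(buf + 6);
    out->nfields   = (int16_t)be16(buf + 8);
    if (out->nfields < 0 || out->nfields > VH_MAX_FIELDS)
        return -1;

    size_t n = (size_t)out->nfields;
    if (len < 10 + 8 * n)           /* four 16-bit arrays of n entries */
        return -1;

    const uint8_t *p = buf + 10;
    for (size_t i = 0; i < n; i++) {
        out->type[i]   = (int16_t)be16(p + 2 * i);
        out->isize[i]  = be16(p + 2 * (n + i));
        out->offset[i] = be16(p + 2 * (2 * n + i));
        out->order[i]  = be16(p + 2 * (3 * n + i));
    }
    return 0;
}
```

Comparing `nvert` and `order[0]` from a parse like this against what VSinquire and VFfieldorder return for the same vdata would show which of the two numbers the C API is reporting as count.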

It would be great to have a non-trivial example of how one reads attributes in a general way through the C API, that is, if one wants to process any hdf4 file and not a specific one, what does the code look like?

Would be glad to help untangle this if I can be helpful.

-- John

@bmribler
Collaborator

bmribler commented May 9, 2023

Hi John,
There are two vgroups that have attributes start_latlon and end_latlon:
Vgroup:3
name = Swath Attributes; class = SWATH Vgroup;
which has
attr10: name=start_latlon type=5 count=2 size=8
0.041719 -172.384888
attr11: name=end_latlon type=5 count=2 size=8
-0.004751 162.884872

Vgroup:10
name = C:\DPS\Data\Products\2006166131201_00702_CS_2B-GEOPROF_GRANULE_; class = CDF0.0;
which has
attr12: name=start_latlon type=5 count=1 size=4
0.041719
attr13: name=end_latlon type=5 count=1 size=4
-0.004751

Do you know how the file was generated? Could you tell me the name of the program you're working on at https://github.com/JohnLCaron/cdm-kotlin?
Thanks,
Binh-Minh

@JohnLCaron
Author

JohnLCaron commented May 10, 2023

OK, that explains the confusion over the discrepancy between attr10/11 and attr12/13. It also means that there's not a Linux-specific problem.

So we agree on the "start_latlon" attribute in vgroup 3 ("Swath Vgroup"), whose DFTAG_VH message (refno 42) has nvert = 1, nfields = 1, and fld[0].nelems = 2

We disagree on the "start_latlon" attribute in vgroup 10 ("C:\DPS\Data\Products\2006166131201_00702_CS_2B-GEOPROF_GRANULE_", class = CDF0.0), whose DFTAG_VH message (refno 401) has nvert = 2, nfields = 1, and fld[0].nelems = 1

where, from p 9-141 (June 2017 reference manual):

  • nvert = Number of entries in Vdata (32-bit integer)
  • nfields = Number of fields per entry in the Vdata
  • fld_nelems = order_n = Order of the nth field, which should actually be "number of elements for field n"

In my code I use nelems = nvert * order_n, and I would guess that the C code uses just order_n?
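As a small worked version of that arithmetic, here is a sketch of the nvert * order_n reading for a single-field attribute vdata. `attr_nelems` is a hypothetical helper written for illustration. It is not part of the HDF4 API and says nothing about what the library actually computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of values in a single-field attribute
 * vdata under the "nvert * order_n" interpretation. */
static int64_t attr_nelems(int32_t nvert, uint16_t order_n)
{
    assert(nvert >= 0);
    return (int64_t)nvert * (int64_t)order_n;
}
```

With the numbers above, refno 42 gives 1 * 2 = 2 and refno 401 gives 2 * 1 = 2, whereas order_n alone gives 2 and 1, which matches the counts hdp prints for the two vgroups.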

I don't know where this file comes from other than what's in the metadata:

:HDFEOSVersion = "HDFEOS_V2.5" ;
:ID_CENTER = "CloudSat Data Processing Center" ;
:ID_CENTER_URL = "http://cloudsat.cira.colostate.edu" ;
:ID_CREATED = "Thu Nov 16 17:26:48 2006" ;
:ID_MACHINE = "SKINKB" ;
:ID_SITE = "Cooperative Institute for Research in the Atmosphere" ;

and the version message:

Version= 4.1.2 (NCSA HDF post Version 4.1 Release 2, NT DLL port November 1998)

It was part of my test suite for netcdf-java.

@JohnLCaron
Author

In https://github.com/JohnLCaron/cdm-kotlin, the code that reads the messages directly is in:

core/src/main/kotlin/com/sunya/netchdf/hdf4/H4builder.kt

the code that uses the C API is

clibs/src/main/kotlin/com/sunya/netchdf/hdf4Clib/H4Cbuilder.kt

The comparison unit test is

clibs/src/test/kotlin/com/sunya/netchdf/hdf4/HCcompare.kt

You won't be able to run the C API code without some modifications. Sorry, that code is still specific to my setup.

@JohnLCaron
Author

In reviewing my code, I see that I've done a workaround in H4Cbuilder.kt where I don't use Vgetattr2() except to get the VH message refno, and then read the VH message directly using VSinquire() and the VF*() calls. This makes H4Cbuilder (which calls the HDF4 C API) agree with the code in H4builder (which doesn't call the C API, but reads the data file directly).

I've checked in some minor edits to make that clearer to follow. In case you're looking at that code, you should fetch the latest.

@bmribler
Collaborator

I will as soon as I can, likely in a few days. I have some ideas to try too.

@derobins derobins added this to the 4.3.0 milestone Aug 30, 2023
@bmribler
Collaborator

bmribler commented Feb 18, 2024

Hi @JohnLCaron, I'm trying to get a grip on what the issue is at this time, but my head kept going all over the place...
OK, can we go back to where I found that there were two vgroups, 3 and 10, both of which have attributes named start_latlon/end_latlon? One of those pairs of attributes has 2 values each and the other pair has only one value each.
Previously, I thought you said the attributes weren't displayed with both values, but the two vgroups explained that. After that, what other issues remain?

@bmribler
Collaborator

bmribler commented Feb 19, 2024

@JohnLCaron, regarding your question about the nfields argument, please refer to Section 4.2.1 and Figure 4b in the User's Guide: https://portal.hdfgroup.org/documentation/hdf4-docs/HDF4_Users_Guide.pdf.
The section defines the terms fields, components, and order in the HDF4 vdata model.
FYI, here is an updated RM: https://portal.hdfgroup.org/documentation/hdf4-docs/HDF4_Reference_Manual.pdf

@derobins derobins moved this from Todo to In Progress in HDF4 4.3.0 Feb 22, 2024
@JohnLCaron
Author

JohnLCaron commented Feb 27, 2024

@bmribler it's been a while, so details are hazy. Here's what I can surmise from the above conversations:

If I look at your May 9 post above:

Vgroup:10
name = C:\DPS\Data\Products\2006166131201_00702_CS_2B-GEOPROF_GRANULE_; class = CDF0.0;
which has
attr12: name=start_latlon type=5 count=1 size=4
0.041719
attr13: name=end_latlon type=5 count=1 size=4
-0.004751

I think this is wrong: there are two values, not 1. From my code, nvert = 2, nfields = 1, but your code returns fld[0].nelems = 1

I'm not sure where the problem is, and it may be that this file was written with a version of the library that had a bug in it. I mean, why are there two vgroups that have attributes start_latlon and end_latlon?

  1. What does "order_n = Order of the nth field" mean? What is order_n? I'm guessing it actually means "number of elements for field n", but I'm not sure.

  2. I'm trying to write a program that shows all the available metadata in an HDF4 file. But none of your examples do that in a general way. Maybe that's just the way it is, no general reader for HDF4 and HDF-EOS?

If that's not enough, we can take it up again if/when I get back onto this project. Thanks.

@derobins derobins removed this from HDF4 4.3.0 Mar 3, 2024
@derobins derobins modified the milestones: 4.3.0, 4.4.0 Mar 3, 2024
@bmribler
Collaborator

@JohnLCaron

  1. In HDF4, two vgroups can have attributes of the same name. An attribute's name is only required to be unique within its scope, which is a vgroup in this case. BTW, as you look through your data, this User's Guide might come in handy: https://portal.hdfgroup.org/documentation/hdf4-docs/HDF4_Users_Guide.pdf
  2. Could you send me the link to the Reference Manual you referred to?
  3. You can use the HDF4 tool hdp to display the contents of the file. If you just type "hdp", it will tell you the commands that you can use to display different types of information.
