Skip to content

GVRS C Jump Start 01_Reading GVRS files

Gary Lucas edited this page Aug 20, 2024 · 4 revisions

Introduction

We begin this series of articles on GVRS techniques by describing how to use the GVRS C-language API to read data from a GVRS file. For the examples that follow, we will use a GVRS file created from the ETOPO1 data set. ETOPO1 is a whole-Earth elevation and ocean-depth product that was created by the U.S. government's National Oceanographic and Atmospheric Administration (NOAA). It is available in the public domain and can be used free of charge.

For this article, we used the original Java-based GVRS API to make a faithful transcription of the data from ETOPO1. To support the GVRS C-language API, we packaged the file using the Huffman-coding data compression algorithm. You can obtain the GVRS-formatted version of ETOPO1, by downloading it from our companion Gridfour (Java) project at ETOPO1_v1.0.4_comp_huffman.gvrs

Here are a few statistics about the ETOPO1 data source:

  1. Grid size 10800 rows by 21600 columns
  2. 233 million data cells
  3. Geographic (latitude, longitude) coordinates
  4. Cell spacing 1 minute of arc (1/60th of a degree)
  5. Elevation and depth given in meters
  6. Download size for GVRS compressed version: 127 MB.

The picture below was derived using the GVRS API to process the elevation and bathymetry information in the ETOPO1 data set. To produce the image, the source data was down sampled by a factor of 30-to-1.

ETOPO1 global elevation and bathymetry data

Note: A program based on the code snippets shown below is available in the
GvrsC "examples" folder under the name ExampleLocation.c

Opening a file

The code snippet below illustrates the steps in opening a GVRS file.

Gvrs* gvrs;
int status = GvrsOpen(&gvrs, "ETOPO1_v1.0.4_comp_huffman.gvrs", "r");
if (!gvrs) {
    printf("Unable to open GVRS file, error code %d\n", status);
    exit(-1);
}
GvrsSummarize(gvrs, stdout);
status = GvrsClose(gvrs);   // GvrsClose cleans up all resources associated with its input

The GvrsOpen() function is analogous to the fopen() function in the C standard library. It opens a GVRS file, then allocates and initializes memory needed to support file access. In this case, the file ETOPO1_v1.0.4_comp_huffman.gvrs was opened on a readonly basis.

Most functions in the GVRS C API return a status of zero if they are successful and a non-zero error code if they are unsuccessful. If the GvrsOpen function encountered an error in attempting to open the file, it would set the local variable gvrs (a pointer) to null and would return a non-zero status code.

Once the file is successfully opened, the code snippet prints a summary of the file organization to standard output and then closes the file.

The text below gives an abridged view of the summary

GVRS file:       ETOPO1_v1.0.4_comp_huffman.gvrs
UUID:            74e6d47-c224-4d8e-8076-cfacbcb71889
Identification:  ETOPO1_Ice_c_gmt4.grd
Last modified:   2024-06-15 02:48:24 (UTC)

Rows in raster:            10800
Columns in raster:         21600
Rows in tile:                 90
Columns in tile:             120
Rows of tiles:               120
Columns of tiles:            180
Cells in raster:       233280000
Cells in tile:             10800
Tiles in raster:           21600

Coordinate system: Geographic
Range of Values, Full Domain
   x values:      -180.000000,  180.000000, (360.000000)
   y values:       -90.000000,   90.000000, (180.000000)

Elements ----------------------------------------
    Name:   z
    Type:   Short
    Label:  Elevation
    Description: Elevation (positive values) or depth (negative), in meters
    Units:       m
    Values
        Minimum:    -11000
        Maximum:      8848
        Fill:       -32768

Data compression:  Enabled
Identification            Read Int    Write Int     Read Float     Write Flt
    GvrsHuffman           Yes         No            No             No

Identification data

The first thing we see in the summary text are identification strings.

All GVRS files are automatically assigned a Universally Unique Identifier (UUID). For practical purposes, the UUID assigns a unique identifier to every GVRS file. This feature is intended to support tracking files in inventory control systems, databases, and similar applications.

Every GVRS file can carry an optional, arbitrary identification string. The Gridfour software distribution includes a demonstration application called PackageData.java which uses the root-name of the source-file as its identification.

The identification strings are followed by the dimensions of the grid. These value match the statistics I cited above.

GVRS elements

The most interesting item in the summary block may be the table of elements .

GVRS elements provide the primary access point for reading data from a GVRS file. A GVRS file can contain one or more elements. ETOPO1 contains only a single value per grid cell, so its GVRS file specifies just one element. When I built ETOPO1_v1.0.4_comp_huffman.gvrs, I assigned this element the name "z". Names serve as a kind of key-string for referring to elements in a GVRS file.

The following code snippet shows an example of how we might read the value stored at grid row 1500 and grid column 1000. When GVRS functions accept grid coordinates, they are always given in the order row first and column second. In the example that follow, the GvrsElement structure we obtained for the name "z" is used to access the underlying data and obtain a value at the specified coordinates.

Gvrs* gvrs;
int status     = GvrsOpen(&gvrs, "ETOPO1_v1.0.4_comp_huffman.gvrs", "r");
GvrsElement* e = GvrsGetElementByName(gvrs, "z");

GvrsFloat fValue;
int status = GvrsElementReadFloat(e, iRow, iCol, &fValue);

When a program opens a GVRS file, the GVRS API allocates a single GvrsElement instance for each element. Since ETOPO1 contains only one data value (elevation), the Gvrs structure initialized from ETOPO1_v1.0.4_comp_huffman.gvrs contains only a single GvrsElement instance. When required, however, the GVRS API supports multiple elements per file.

One thing that might be useful to note here is the way the GVRS API names its functions. Functions named Get access data elements or structurs that exist within the Gvrs instance, but they do not change its data. Get methods do not allocate new memory. Functions named Read access the content of the GVRS raster and may potentially trigger read operations for the associated data file. Some Read functions calls allocate memory that must be managed by the calling application, though the "element read float" operations do not. The design of the Gvrs API attemps to make it clear to the developer when a program must manage memory. For this wiki article, we do not need to be concerned about that issue.

Reading data

So far, the example code snippets haven't done anything particularly interesting. So let's look at an example that actually used the GVRS ETOPO1 file to retrieve elevation data for inhabited locations on the face of the Earth.

Look-up operations using geographic coordinates

For the example below, I created a simple structure named Place that stores a name and a set of geographic coordinates (latitude and longitude). To create some sample data, I populated a set of Place elements with some arbitrarily chosen locations:

struct Place {
    char* name;
    double lat;
    double lon;
};

static struct Place places[] = {
    {"Auckland, NZ",                 -36.84,     174.74},
    {"Coachella, California, US",     33.6811,  -116.1744},
    {"Danbury, Connecticut, US",      41.386,    -73.482},
    {"Dayton, Ohio, US",              39.784,    -84.110},
    {"Deming, New Mexico, US",        32.268,   -107.757},
    {"Denver, Colorado, US",          39.7392,  -104.985},
    {"La Ciudad de Mexico, MX",       19.4450,   -99.1335},
    {"La Paz, Bolivia",              -16.4945,   -68.1389},
    {"Mauna Kea, US",                 19.82093, -155.46814},
    {"McMurdo Station, Antarctica", -77.85033,  166.69187},
    {"Nantes, France",                47.218,     -1.5528},
    {"Pontypridd, Wales",             51.59406,   -3.32126},
    {"Quebec, QC, Canada",            46.81224,  -71.20520},
    {"Sioux Falls, South Dakota, US", 43.56753,  -96.7245},
    {"Suzhou, CN",                    31.3347,   120.629},
    {"Zurich, CH",                    47.38,       8.543},
};

GVRS is not a Geographic Information System (GIS), but the API is useful for processing geophysical data sources. The GVRS API implements support for geographic coordinates as a convenience for developers working with geospatial information.

In the code snippet below, the geographic coordinates for the places are converted to integer grid coordinates and used to query elevations. Note that GVRS always gives geographic coordinates in the order latitude followed by longitude.

int iPlace;
int status;
GvrsFloat fvalue;
GvrsDouble row, col;
for (iPlace = 0; iPlace < nPlaces; iPlace++) {
    struct Place p = places[iPlace];
    GvrsMapGeoToGrid(gvrs, p.lat, p.lon, &row, &col);
    int iRow, iCol;
    iRow = (int)(row + 0.5);
    iCol = (int)(col + 0.5);
    status = GvrsElementReadFloat(e, iRow, iCol, &fvalue);
    if (status) {
        printf("Lookup failed for %s, grid coordinates (%d,%d), error %d\n", 
            p.name, iRow, iCol, status);
        break;
    }
    printf("%-30s %7.2f,  %7.2f, %12.1f%n",
          place.name, place.latitude, place.longitude, fValue);
}

The results are shown below (and, yes, the the town of Coachella really is below sea level).

Name                              Lat         Lon       Elevation (m)
Auckland, NZ                    -36.840,    174.740,         15.0
Coachella, California, US        33.681,   -116.174,        -22.0
Danbury, Connecticut, US         41.386,    -73.482,        154.0
Dayton, Ohio, US                 39.784,    -84.110,        238.0
Deming, New Mexico, US           32.268,   -107.757,       1324.0
Denver, Colorado, US             39.739,   -104.985,       1602.0
La Ciudad de México, MX          19.445,    -99.134,       2240.0
La Paz, Bolivia                 -16.495,    -68.139,       3749.0
Mauna Kea, US                    19.821,   -155.468,       4044.0
McMurdo Station, Antarctica     -77.850,    166.692,         31.0
Nantes, France                   47.218,     -1.553,         22.0
Pontypridd, Wales                51.594,     -3.321,        129.0
Québec, QC, Canada               46.812,    -71.205,         28.0
Sioux Falls, South Dakota, US    43.568,    -96.725,        428.0
Suzhou, CN                       31.335,    120.629,          5.0
Zürich, CH                       47.380,      8.543,        435.0

Look-up operations using interpolation

The elevations shown in the table represent the value at the center of the grid cell that contains the specified coordinates. Since the distance across a grid cell is approximately 1.85 kilometers (1.15 miles), the distance between the specified location and the elevation "point" could be more than a kilometer. To adjust for changes in elevation over distance, we use an interpolation technique.

There are lots of ways to perform an interpolation. Each has their particular merits. For the GVRS project, we implemented an interpolation based on the B-Spline technique. The code below shows an example. One thing to note is that the interpolator accepts GVRS "model coordinates" rather than latitudes and longitudes. Model coordinates are essentially Cartesian coordinates. So the interpolator accepts a value for longitude (e.g. the x coordinate) first and latitude (e.g. the y coordinate) second. In most of the GVRS API, we try to give the specifications for geographic coordinates in the same order they are usually given in spoken languages, "latitude and longutude". The call to the B-Spline interpolator is an exception to that rule.

GvrsInterpolationResult result;
//status = GvrsElementReadFloat(e, iRow, iCol, &fvalue);
status = GvrsInterpolateBspline(e, p.lon, p.lat, 1, &result);
printf("%-30s %9.2f, %9.2f, %8.1f\n", p.name, row, col, result.z);

The table below shows the results. Rather than showing latitude and longitude for the place names, it presents the computed grid coordinates. As we might expect, none of resulting row and column values are exact integral coordinates.

Name                              Row      Column      Elevation (m)
Auckland, NZ                     3189.10,   21283.90,         15.2
Coachella, California, US        7420.37,    3829.04,        -20.9
Danbury, Connecticut, US         7882.66,    6390.58,        162.0
Dayton, Ohio, US                 7786.54,    5752.90,        242.5
Deming, New Mexico, US           7335.56,    4334.08,       1322.1
Denver, Colorado, US             7783.85,    4500.40,       1607.2
La Ciudad de México, MX          6566.20,    4851.49,       2238.9
La Paz, Bolivia                  4409.83,    6711.17,       3762.1
Mauna Kea, US                    6588.76,    1471.41,       3946.8
McMurdo Station, Antarctica       728.48,   20801.01,         83.5
Nantes, France                   8232.58,   10706.33,         15.4
Pontypridd, Wales                8495.14,   10600.22,        142.3
Québec, QC, Canada               8208.23,    6527.19,         23.9
Sioux Falls, South Dakota, US    8013.55,    4996.03,        428.3
Suzhou, CN                       7279.58,   18037.24,          6.1
Zürich, CH                       8242.30,   11312.08,        459.1

The value of interpolation

While the use of an interpolator such as B-Spline adds an extra layer of interpretation to the data, it often represents a better way of modeling a surface. The difference between the two approaches – simple look up, and interpolation – can be summarized as follows:

  1. A simple lookup obtains the data value nearest to the query point.
  2. An interpolation provides an estimate of the actual value at the point of interest.

To get a sense of how surface modeling is affected by interpolation, consider the figure below. The figure shows the elevation profile for the route of the Boston Marathon with elevations in feet and distance traveled in miles. The Boston Marathon follows a set of paved roads starting in Hopkinton, MA and ending on Boylston Street in Boston, MA. The first frame shows the result using a simple lookup (nearest neighbor) without interpolation. The second frame shows the result using interpolation. Clearly, the interpolated profile is a closer match to how we would expect a road surface to behave.

Boston Marathon Elevation Profile in feet and miles

Conclusion

This article introduced techniques used to access the data in a GVRS file. It is the first in a planned series of articles. In future articles I will provide more information about the GVRS API and offer suggestions on how to get the best results when using it.

As always, user feedback is a key to improving both the content of these wiki pages and the GVRS API itself. Got something to say? I'd be interested in hearing from you.

References

National Oceanographic and Atmospheric Administration [NOAA], 2019. ETOPO1 Global Relief Model. Accessed December 2019 from https://www.ngdc.noaa.gov/mgg/global/

University Corporation for Atmospheric Research [UCAR], 2019. NetCDF-Java Library Accessed December 2019 from https://www.unidata.ucar.edu/software/netcdf-java/current/