DOC: Example of data output for archival #204

Merged
merged 10 commits into from
Sep 8, 2023
Changes from 6 commits
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,8 @@ All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](https://semver.org/).

## [0.X.X] - 2023-XX-XX
* Documentation
  * Added example of how to export data for archival
* Maintenance
  * Implemented unit tests for cleaning warnings
  * Use pip install for readthedocs
66 changes: 66 additions & 0 deletions docs/archival.rst
@@ -0,0 +1,66 @@
Building data files for archival at NASA SPDF
=============================================

The code and routines in pysatNASA are designed for end users of NASA data
products. However, pysat has also been used to build operational instruments
that generate archival data for upload to the Space Physics Data Facility
(SPDF) at NASA.

Such operational instruments should use their own naming conventions, separate
from the end-user instruments in pysatNASA. For example, netCDF4 files for the
REACH data are generated for archival by the ``ops_reach`` package, while end
users access the archived data through pysatNASA.

A ``pysat.Instrument`` object can be constructed for any dataset. Full
instructions and conventions can be found in the
`pysat documentation <https://pysat.readthedocs.io/en/latest/new_instrument.html>`_.
In the case of the REACH data, the operational code reads in a series of csv
files and updates the metadata according to user specifications. Once the data
are loaded, they can be exported to a netCDF4 file via pysat. In the simplest
case, this is

::

    import pysat
    # aero_reach (the REACH instrument module) and inst_id must already be defined
    reach = pysat.Instrument(inst_module=aero_reach, tag='l1b', inst_id=inst_id)
    pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch')


However, additional options are available for translating pysat metadata to
SPDF-preferred formats. An example of this is

::

    # Use meta translation table to include SPDF preferred format.
    # Note that multiple names are output for compliance with pysat.
    # Using the most generalized form for labels for future compatibility.
    meta_dict = {reach.meta.labels.min_val: ['VALIDMIN'],
                 reach.meta.labels.max_val: ['VALIDMAX'],
                 reach.meta.labels.units: ['UNITS'],
                 reach.meta.labels.name: ['CATDESC', 'LABLAXIS', 'FIELDNAM'],
                 reach.meta.labels.notes: ['VAR_NOTES'],
                 reach.meta.labels.fill_val: ['_FillValue'],
                 'Depend_0': ['DEPEND_0'],
                 'Format': ['FORMAT'],
                 'Monoton': ['MONOTON'],
                 'Var_Type': ['VAR_TYPE']}

    pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch',
                                  meta_translation=meta_dict,
                                  export_pysat_info=False)


In this case, note that the pysat 'name' label is output to three different
metadata values required by the ISTP standards. Additionally, the
``export_pysat_info`` option is set to ``False`` here. This drops several
internal pysat metadata values before writing to file.
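
The one-to-many mapping described above can be illustrated without pysat. The
following is a minimal sketch, not pysat's implementation: ``translate_meta`` is
a hypothetical helper, the label names stand in for the values of
``reach.meta.labels``, and the variable metadata is made up. It also assumes
that labels without an entry in the table are passed through unchanged.

```python
def translate_meta(var_meta, translation):
    """Copy each metadata value to every output name listed for its label.

    Hypothetical helper for illustration only; not part of pysat.
    Labels without a table entry are passed through unchanged (an assumption).
    """
    out = {}
    for label, value in var_meta.items():
        # Fall back to the original label when no translation is given
        for out_name in translation.get(label, [label]):
            out[out_name] = value
    return out


# Stand-in label names; in pysat these come from ``reach.meta.labels``.
translation = {'long_name': ['CATDESC', 'LABLAXIS', 'FIELDNAM'],
               'units': ['UNITS'],
               'value_min': ['VALIDMIN'],
               'value_max': ['VALIDMAX']}

# Made-up metadata for a single variable.
var_meta = {'long_name': 'Dose rate', 'units': 'rad/s',
            'value_min': 0.0, 'value_max': 100.0, 'Var_Type': 'data'}

translated = translate_meta(var_meta, translation)
print(sorted(translated))
```

Here the single ``long_name`` entry ends up under ``CATDESC``, ``LABLAXIS``, and
``FIELDNAM``, which mirrors how the 'name' label is written three times in the
translation table above.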

Another best practice for archival is to add the operational software version
to the metadata header before writing. The pysat version is written to the
metadata automatically.

::

    reach.meta.header.Software_version = ops_reach.__version__


A full example script to generate output files can be found at
https://github.com/jklenzing/ops_reach/blob/main/scripts/netcdf_gen.py
1 change: 1 addition & 0 deletions docs/examples.rst
@@ -6,3 +6,4 @@ tools

.. toctree::
examples/ex_init.rst

1 change: 1 addition & 0 deletions docs/index.rst
@@ -18,6 +18,7 @@ CDAWeb interface.
supported_constellations.rst
examples.rst
develop_guide.rst
archival.rst
migration_guide.rst
history.rst
