diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3f24602a..6eb4a034 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,8 @@ All notable changes to this project will be documented in this file.
 This project adheres to [Semantic Versioning](https://semver.org/).
 
 ## [0.X.X] - 2023-XX-XX
+* Documentation
+  * Added example of how to export data for archival
 * Maintenance
   * Implemented unit tests for cleaning warnings
   * Use pip install for readthedocs
diff --git a/docs/archival.rst b/docs/archival.rst
new file mode 100644
index 00000000..ad1b1d5e
--- /dev/null
+++ b/docs/archival.rst
@@ -0,0 +1,69 @@
+Building data files for archival at NASA SPDF
+=============================================
+
+The code and routines in :py:mod:`pysatNASA` are designed for end users of NASA data
+products. However, pysat has also been used to build operational
+instruments for generating archival data to be uploaded to the Space Physics
+Data Facility (SPDF) at NASA.
+
+Such instruments should use separate naming conventions. An
+example of this is the REACH data, where netCDF4 files are generated for
+archival purposes as part of the :py:mod:`ops_reach` package, but can be accessed by
+the end user through :py:mod:`pysatNASA`.
+
+A :py:class:`pysat.Instrument` object can be constructed for any
+dataset. Full instructions and conventions can be found
+`here `_. In the
+case of the REACH data, the operational code reads in a series of CSV files and
+updates the metadata according to user specifications. Once the file is loaded,
+it can be exported to a netCDF4 file via pysat. In the simplest case, this is
+
+::
+
+  reach = pysat.Instrument(inst_module=aero_reach, tag='l1b', inst_id=inst_id)
+  pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch')
+
+
+However, there are additional options when translating pysat metadata to SPDF
+preferred formats. An example of this is
+
+::
+
+  # Use meta translation table to include SPDF preferred format.
+  # Note that multiple names are output for compliance with pysat.
+  # Using the most generalized form for labels for future compatibility.
+  meta_dict = {reach.meta.labels.min_val: ['VALIDMIN'],
+               reach.meta.labels.max_val: ['VALIDMAX'],
+               reach.meta.labels.units: ['UNITS'],
+               reach.meta.labels.name: ['CATDESC', 'LABLAXIS', 'FIELDNAM'],
+               reach.meta.labels.notes: ['VAR_NOTES'],
+               reach.meta.labels.fill_val: ['_FillValue'],
+               'Depend_0': ['DEPEND_0'],
+               'Format': ['FORMAT'],
+               'Monoton': ['MONOTON'],
+               'Var_Type': ['VAR_TYPE']}
+
+  pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch',
+                                meta_translation=meta_dict,
+                                export_pysat_info=False)
+
+
+In this case, note that the pysat 'name' label is output to three different
+metadata values required by the ISTP standards. Additionally, the
+:py:attr:`export_pysat_info` option is set to ``False`` here. This drops several
+internal :py:mod:`pysat` metadata values before writing to file.
+
+A full guide to SPDF metadata standards can be found
+`here `_.
+
+Other best practices for archival include adding the operational software version
+to the metadata header before writing. The pysat version will be automatically
+written to the metadata.
+
+::
+
+  reach.meta.header.Software_version = ops_reach.__version__
+
+
+A full example script to generate output files can be found at
+https://github.com/jklenzing/ops_reach/blob/main/scripts/netcdf_gen.py
diff --git a/docs/index.rst b/docs/index.rst
index cda7cf4f..926c4f76 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -18,6 +18,7 @@ CDAWeb interface.
    supported_constellations.rst
    examples.rst
    develop_guide.rst
+   archival.rst
    migration_guide.rst
    history.rst
 