Specialized Attributes in CV files #15

mauzey1 · 2023-06-10T01:25:17Z

I'm thinking about ways to make the next version of CMOR/PrePARE able to parse CV files in a more general way to prevent some of the hard-coding issues we have with the current CMOR. However, I've noticed in generic_CV.json and CMIP6Plus_CV.json that there are still attributes that would have to be treated uniquely in the software. Below are some examples.

required_global_attributes
- This is good to have since it allows us to find attributes in files that don't have values listed in the CV, such as creation_date. It will also help with skipping entries in the CV that we don't look for in files, such as Header and version_metadata. However, I noticed that this attribute isn't present in generic_CV.json. I think this should be a standard part of all CV files.
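A minimal sketch of how such a check might look, assuming the CV has already been loaded into a dict and that required_global_attributes is a plain list of attribute names. The function name is hypothetical and not part of CMOR or PrePARE.

```python
def missing_required_attributes(cv, file_attrs):
    """Return the required attribute names that are absent from a file.

    `cv` is assumed to be the already-loaded CV dict and `file_attrs` a
    dict of the file's global attributes. CV entries that are not listed
    in required_global_attributes (e.g. Header) are simply never checked.
    """
    required = cv.get("required_global_attributes", [])
    return [name for name in required if name not in file_attrs]
```

Because only presence is tested here, attributes such as creation_date need no value list in the CV.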
license and license_info
- In CMIP6Plus_CV.json, we have a license attribute that is a regex string. In the current CMOR, this string is used to match the license string provided by the user input or in a NetCDF file. In each source_id entry there is a license_info attribute containing all of the license info of the model that should be in the license string. I think we should replace the regex with a template that gets populated with the info in license_info and then match the resulting string with the license found in the user input/file.
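A rough sketch of the template idea, using Python's string.Template. The template text and the placeholder names (id, url) are made up for illustration; the real license_info keys and license wording would come from the CV.

```python
from string import Template

# Hypothetical template; the wording and placeholder names are illustrative only.
LICENSE_TEMPLATE = Template("Licensed under $id ($url).")


def expected_license(template, license_info):
    """Fill the template from a source_id's license_info dict.

    Template.substitute raises KeyError if license_info lacks a placeholder,
    which surfaces incomplete CV entries early.
    """
    return template.substitute(license_info)


def license_matches(found, template, license_info):
    """Compare the license string from user input or a file with the expected one.

    Whitespace is normalized on both sides so that line wrapping in a
    NetCDF attribute does not cause a false mismatch.
    """
    expected = expected_license(template, license_info)
    return " ".join(found.split()) == " ".join(expected.split())
```

An exact comparison like this is stricter than the current regex match, so the template would need to cover every license variant that is allowed.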
forcing_index, initialization_index, physics_index, realization_index, and variant_label
- The index attributes are integers that are used to form the variant label (e.g. "r1i1p1f1"). When writing the attributes to a file, CMOR will just need to recognize that the index values are integers to compose the variant label. However, PrePARE will need to parse the variant label to compare the values in it with the indexes.
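Both directions could be sketched as below, assuming the CMIP6 ordering of realization, initialization, physics, forcing (r, i, p, f) in the label. The function names are hypothetical.

```python
import re

# Assumes the CMIP6 ordering: realization, initialization, physics, forcing.
VARIANT_LABEL_RE = re.compile(r"r(\d+)i(\d+)p(\d+)f(\d+)")
INDEX_NAMES = (
    "realization_index",
    "initialization_index",
    "physics_index",
    "forcing_index",
)


def compose_variant_label(attrs):
    """CMOR side: build the label from the four integer index attributes."""
    return "r{}i{}p{}f{}".format(*(int(attrs[name]) for name in INDEX_NAMES))


def parse_variant_label(label):
    """PrePARE side: split the label back into integer index values."""
    match = VARIANT_LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"malformed variant_label: {label!r}")
    return dict(zip(INDEX_NAMES, map(int, match.groups())))


def variant_label_consistent(attrs):
    """Check that variant_label agrees with the index attributes in a file."""
    try:
        parsed = parse_variant_label(attrs["variant_label"])
    except ValueError:
        return False
    return all(int(attrs[name]) == parsed[name] for name in INDEX_NAMES)
```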
other attributes that have a regex string for a value
- CMOR/PrePARE would have to recognize that some attributes are meant to be treated as regex values rather than plain strings. These attributes include tracking_id, further_info_url, and data_specs_version. Shouldn't the data_specs_version and Conventions values be checked against those in the header of the MIP variable table being used? Are tracking_id and further_info_url still relevant?
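One way to picture the special-casing this would require: a hard-coded set of attribute names whose CV values are treated as regular expressions, with everything else compared as plain strings. This is only a sketch; the set, the function name, and the assumption that a CV entry is either a string or a list of strings are mine.

```python
import re

# Hypothetical list of attributes whose CV values are regex patterns.
REGEX_ATTRIBUTES = {"tracking_id", "further_info_url", "data_specs_version"}


def value_allowed(name, value, cv_entry):
    """Check a file attribute against its CV entry.

    `cv_entry` is assumed to be a single string or a list of strings.
    For names in REGEX_ATTRIBUTES each string is a pattern that must
    match the whole value; otherwise the value must equal one of them.
    """
    candidates = cv_entry if isinstance(cv_entry, list) else [cv_entry]
    if name in REGEX_ATTRIBUTES:
        return any(re.fullmatch(pattern, value) for pattern in candidates)
    return value in candidates
```

Having the CV itself flag which entries are patterns would avoid keeping a set like REGEX_ATTRIBUTES in the software.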
source_type, required_model_components, and additional_allowed_model_components
- Both CVs have a source_type attribute with a list of values. CMIP6Plus_CV.json also has experiment_id entries that have the attributes required_model_components and additional_allowed_model_components, which are used for finding required and additional values in the source_type attribute (e.g. "source_type": "AOGCM ISM AER"). Are we supposed to check source_type only when we have required_model_components and additional_allowed_model_components present?
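For reference, the check against an experiment_id entry could be sketched like this, assuming source_type is a space-separated string and both component attributes are lists of strings. The function name is hypothetical, and the sketch does not answer what to do when the two attributes are absent.

```python
def check_source_type(source_type, required, additional_allowed):
    """Compare a file's source_type with an experiment's component lists.

    Returns (missing, unexpected): required components that are absent
    from source_type, and components in source_type that are neither
    required nor additionally allowed.
    """
    found = set(source_type.split())
    required_set = set(required)
    allowed = required_set | set(additional_allowed)
    missing = sorted(required_set - found)
    unexpected = sorted(found - allowed)
    return missing, unexpected
```

With component lists chosen purely for illustration, `check_source_type("AOGCM ISM AER", ["AOGCM"], ["AER", "CHEM", "BGC"])` reports no missing components and flags "ISM" as unexpected.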
