Update catalogs for xscen #68

aulemahal · 2023-02-03T20:15:14Z

Original issue:
intake-esm uses cat.esmcat.aggregation_control.variable_column_name to identify the column that lists the variables contained in each entry. When it is set, a search such as cat.search(variable=['tas']).to_dataset_dict() returns only tas, not the other variables stored in the same files. It also enables the DerivedVariableRegistry, with which we can convert data on the fly (e.g. dtr from tasmin and tasmax).
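For reference, here is a minimal sketch of what the aggregation_control block of a catalog description could look like once the field is set. The key names follow the ESM collection specification as I understand it; the column names "variable" and "id" are assumptions for illustration and may not match the actual PAVICS columns.

```python
import json

# Sketch of an "aggregation_control" block for an ESM catalog description.
# The column names "variable" and "id" are hypothetical.
aggregation_control = {
    # Column that lists the variables stored in each entry.
    "variable_column_name": "variable",
    # Columns whose values are joined to build the dataset keys.
    "groupby_attrs": ["id"],
    # How assets sharing a key are combined: here, merged across variables.
    "aggregations": [
        {"type": "union", "attribute_name": "variable"},
    ],
}

# Print the fragment as it would appear in the catalog's JSON description.
print(json.dumps({"aggregation_control": aggregation_control}, indent=2))
```

With a block like this in the catalog JSON, the variable search described above should be able to restrict the returned datasets to the requested variables.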

The current intake-esm (even on master) breaks if aggregation_control is not given, and a fix is also needed to support OpenDAP links. But even if we fix those (see my PR on intake-esm), we are excluding the PAVICS catalog from useful features by not setting this field.

EDIT: Two PRs on intake-esm have been made:

Aggregation control is now optional
format == "opendap" is supported

However, the current catalogs have a few other caveats.

Dataset "ID"

intake-esm won't build the dataset "keys" from columns that mix NaNs and values. When aggregation_control is not given, keys are built by concatenating all columns. Thus, to work with intake-esm, the PAVICS catalogs must have a value for every entry in every column. This isn't true for the "biasadjusted" catalog, for example, where driving_institution is empty for some datasets. We need to either fill those columns or have another column acting as a complete dataset id (like xscen does). The current dataset_id does not contain the driving_institution information, so if it is used as a key, intake will receive multiple assets without knowing how to merge them.
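To illustrate the key problem described above, here is a small stand-alone sketch. It is not intake-esm's actual implementation, only a simplified model of grouping assets under a key built from chosen columns. The rows, the values, and the group_assets helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog rows; the values are made up for illustration only.
rows = [
    {"dataset_id": "sim_A", "driving_institution": "inst_1", "path": "a1.ncml"},
    {"dataset_id": "sim_A", "driving_institution": "inst_2", "path": "a2.ncml"},
    {"dataset_id": "sim_B", "driving_institution": None, "path": "b.ncml"},
]


def group_assets(rows, key_columns, fill="unknown"):
    """Group asset paths under a key joined from key_columns.

    Missing values are replaced by `fill` so that every row gets a complete key.
    """
    groups = defaultdict(list)
    for row in rows:
        parts = [row[col] if row[col] is not None else fill for col in key_columns]
        groups[".".join(parts)].append(row["path"])
    return dict(groups)


# Keyed on dataset_id alone, two distinct assets end up under "sim_A".
by_dataset_id = group_assets(rows, ["dataset_id"])
# Keyed on a complete id, each asset gets its own key.
by_full_id = group_assets(rows, ["dataset_id", "driving_institution"])

print(by_dataset_id)
print(by_full_id)
```

In this toy model, filling the empty driving_institution with a placeholder and including it in the key is enough to make the keys unique, which is roughly what a complete id column would do.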

xscen

Overall, the catalogs are not easy to work with in xscen. AFAIU, the current catalogs are not used by anyone? If so, I suggest we adopt the xscen vocabulary (column names), which would make interaction easier without losing human-readable information. This might require some complex attribute parsing though, as the NcMLs on PAVICS might not carry those attributes as-is.


aulemahal added the enhancement and question labels Feb 3, 2023

aulemahal changed the title ~~No variable subsetting possible~~ No variable subsetting, conversion possible Feb 3, 2023

huard self-assigned this Feb 3, 2023

huard added this to WP4 - Climate services Feb 6, 2023

huard moved this to Todo in WP4 - Climate services Feb 6, 2023

aulemahal changed the title ~~No variable subsetting, conversion possible~~ Update catalogs for xscen Jun 19, 2023
