source_id: what info is required for registration? #4

Closed
larsbuntemeyer opened this issue Oct 16, 2022 · 10 comments

larsbuntemeyer commented Oct 16, 2022

Inherited from CMIP6, we have, e.g.,

{
    "source_id": {
        "REMO2020": {
            "activity_participation": [
                "CORDEX"
            ],
            "cohort": [
                "Registered"
            ],
            "institution_id": [
                "GERICS"
            ],
            "label":"REMO2020",
            "license":"REMO2020 data produced by GERICS is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing input4MIPs output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file). The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.",
            "model_component":{
                "aerosol":{
                    "description":"CLASSIC (v1.0)",
                    "native_nominal_resolution":"250 km"
                },
                "atmos":{
                    "description":"",
                    "native_nominal_resolution":"250 km"
                },
                "atmosChem":{
                    "description":"none",
                    "native_nominal_resolution":"none"
                },
                "land":{
                    "description":"",
                    "native_nominal_resolution":"250 km"
                },
                "landIce":{
                    "description":"",
                    "native_nominal_resolution":"none"
                },
                "ocean":{
                    "description":"prescribed",
                    "native_nominal_resolution":"100 km"
                },
                "ocnBgchem":{
                    "description":"",
                    "native_nominal_resolution":"100 km"
                },
                "seaIce":{
                    "description":"prescribed",
                    "native_nominal_resolution":"100 km"
                }
            },
            "release_year":"2022",
            "source_id": "REMO2020"
        }
    }
}

larsbuntemeyer changed the title from "source_id" to "what info is required for registration: source_id" on Oct 16, 2022

@larsbuntemeyer

Should we keep model_component? This information is condensed during creation of the CV file, e.g., into:

 "source_id":{
            "REMO2020":{
                "activity_participation":[
                    "CORDEX"
                ],  
                "cohort":[
                    "Registered"
                ],  
                "institution_id":[
                    "GERICS"
                ],  
                "license":"REMO2020 data produced by GERICS is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing input4MIPs output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file). The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.",
                "source_id":"REMO2020",
                "source":"REMO2020 (2022): \naerosol: CLASSIC (v1.0)\natmos: HadGAM2 (r1.1, N96; 192 x 145 longitude/latitude; 38 levels; top level 39255 m)\natmosChem: none\nland: CABLE2.4\nlandIce: none\nocean: ACCESS-OM2 (MOM5, tripolar primarily 1deg; 360 x 300 longitude/latitude; 50 levels; top grid cell 0-10 m)\nocnBgchem: WOMBAT (same grid as ocean)\nseaIce: CICE4.1 (same grid as ocean)"
            }   
        },  
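
The condensation step described above can be sketched roughly as follows. This is a minimal illustration, not the actual CV build script: the function name `condense_source` is hypothetical, the alphabetical ordering of components and the replacement of empty descriptions by "none" are assumptions, and the output format simply mirrors the `source` string shown above.

```python
def condense_source(entry):
    """Condense one registration entry into a single 'source' string.

    Assumption: the header is "<label> (<release_year>): " followed by one
    "<component>: <description>" line per model component.
    """
    header = f"{entry['label']} ({entry['release_year']}): "
    lines = []
    for name in sorted(entry["model_component"]):
        # Assumption: an empty description is reported as "none".
        description = entry["model_component"][name]["description"] or "none"
        lines.append(f"{name}: {description}")
    return "\n".join([header] + lines)


# Small example entry, reduced to two components for brevity.
example_entry = {
    "label": "REMO2020",
    "release_year": "2022",
    "model_component": {
        "atmos": {"description": "", "native_nominal_resolution": "250 km"},
        "aerosol": {"description": "CLASSIC (v1.0)", "native_nominal_resolution": "250 km"},
    },
}
condensed = condense_source(example_entry)
```

Note that `native_nominal_resolution` is not used in this sketch, so dropping it from the registration template would not affect a string built this way.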

larsbuntemeyer changed the title from "what info is required for registration: source_id" to "source_id: what info is required for registration?" on Oct 18, 2022

gnikulin commented Nov 7, 2022

In general, we can keep the CMIP6 template with some updates (e.g. lake model) and perhaps we don't need "native_nominal_resolution" as it's not a constant and depends on resolution of a domain (e.g. EUR-44 or EUR-11).
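
To illustrate why the resolution belongs to the domain rather than to the model, here is a rough sketch that derives an approximate grid spacing from a domain identifier. It assumes the numeric suffix gives the grid spacing in hundredths of a degree (EUR-44 is 0.44°, EUR-11 is 0.11°); the function name and the 111 km per degree conversion are illustrative simplifications only.

```python
def approx_grid_spacing(domain_id):
    """Return (degrees, km) for a domain identifier such as 'EUR-11'.

    Assumption: the suffix after the last hyphen is the grid spacing in
    hundredths of a degree.
    """
    suffix = domain_id.rsplit("-", 1)[1]
    degrees = int(suffix) / 100
    km = degrees * 111.0  # rough length of one degree of latitude in km
    return degrees, round(km, 1)


# The same model run on two domains gets two different spacings.
coarse = approx_grid_spacing("EUR-44")
fine = approx_grid_spacing("EUR-11")
```

The same source_id therefore maps to different resolutions depending on the domain, which is the argument for not recording it per model.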

@larsbuntemeyer

This might become helpful for gathering model docs: https://github.com/ES-DOC.

@larsbuntemeyer larsbuntemeyer changed the title source_id: what info is required for registration? source_id: what info is required for registration? Jan 27, 2023
@larsbuntemeyer larsbuntemeyer transferred this issue from WCRP-CORDEX/cordex-cmip6-cmor-tables Apr 4, 2023
jesusff commented Oct 23, 2023

I am following up here on a comment from @sethmcg on #19, since it seems more on topic in this issue:

Does the source_id identify the model / method used to perform the downscaling? If so, I'm not sure that release_year and institution_id are well-defined for methods that aren't RCMs. For example, what would they be for the (simplistic but still widely-used) ESD method of interpolation + bias-correction?

The source_id should identify the method used. In this sense, I think institution_id would be perfectly defined for simple ESD methods. It must reflect the groups using exactly that method and keep a consistent source_id among them as long as the method is exactly the same. It does not mean that a given institution developed the method.

Regarding the release year, the group could provide the first registered use of the particular method (e.g. the year of the oldest paper one can find using this method). I guess this is also the spirit of collecting this info in GCMs/RCMs; to have an idea of the latest update of a method.

We may already have one such example from CAM-11. At UCR, they have already applied an ESD method to CMIP6 models. They call it BCSD and cite Wood et al., 2004 as the reference. It could therefore be registered as:

{
    "source_id": {
        "BCSD": {
            "activity_participation": [
                "CORDEX-ESD"
            ],
            "cohort": [
                "Registered"
            ],
            "institution_id": [
                "UCR"
            ],
            "label":"Bias Correction and Spatial Disaggregation",
            "source_type":"ESD",
            "license":"Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/).",
            "model_description":"https://doi.org/10.1023/B:CLIM.0000013685.99609.9e",
            "release_year":"2004",
            "source_id": "BCSD"
        }
    }
}

Here, I'm using some changes proposed in #19 and #20 that are not yet in place: CORDEX-ESD as activity_participation (instead of just CORDEX), a source_type, and a model_description URL instead of the model_component.
@larsbuntemeyer, must source be built out of the model_component, or could we include it explicitly in the JSON entry? In that case, I would say that "source"="Bias Correction and Spatial Disaggregation" and the label would be the same as the source_id: "label"="BCSD".
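
A registration entry like the one proposed above could be checked with a small script along these lines. This is only a sketch: the set of required keys is an assumption copied from the example entry, not an agreed rule, and the function name `check_registration` is hypothetical.

```python
import json

# Assumption: the required keys are those used in the proposed BCSD entry.
REQUIRED_KEYS = {
    "activity_participation",
    "cohort",
    "institution_id",
    "label",
    "license",
    "release_year",
    "source_id",
    "source_type",
}


def check_registration(cv_text):
    """Return a dict mapping each source_id to a list of problems found.

    json.loads raises json.JSONDecodeError for malformed input, e.g. a
    missing comma between two keys.
    """
    cv = json.loads(cv_text)
    problems = {}
    for name, entry in cv["source_id"].items():
        found = sorted(REQUIRED_KEYS - set(entry))
        issues = [f"missing key: {key}" for key in found]
        if "source_id" in entry and entry["source_id"] != name:
            issues.append("source_id does not match the entry name")
        if issues:
            problems[name] = issues
    return problems


# Deliberately incomplete entry to show what the check reports.
sample = '{"source_id": {"BCSD": {"label": "BCSD", "source_id": "BCSD", "source_type": "ESD"}}}'
report = check_registration(sample)
```

If release_year ends up being optional for ESD methods, it would simply be removed from the required set.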

sethmcg commented Oct 23, 2023

That makes sense, but I think it needs to be spelled out explicitly somewhere. The CMIP documents are very much written from the GCM developer's perspective, and it's not always obvious how to adapt things to downscaling activities. If CORDEX CVs are inheriting a lot of metadata architecture from CMIP, I think there should be a document in this repo (or at least pointed to in the README) that references the "CMIP6 Global Attributes, DRS, Filenames, Directory Structure, and CV’s" doc and details what has been added / updated / changed / expanded for CORDEX.

jesusff commented Oct 23, 2023

Well, this document should be the CORDEX-CMIP6 Archiving Specifications we are writing in parallel to the development of this repo. In this repo, most files are simple lists of values corresponding to a given CV element. For those which are not simple lists (source_id, experiment_id, ...) we could include here a companion markdown file (CORDEX-CMIP6_source_id.md) with the explanation of their structure. Much like https://github.com/WCRP-CMIP/CMIP6_CVs/blob/master/.github/Model_registration_template.md , which can then be used in the registration to provide instructions to correctly fill the registration issue template.

gnikulin commented Nov 6, 2023

Regarding release_year, which originated in CMIP6: the rule can simply be "provide when relevant".

gnikulin commented Nov 6, 2023

"activity_participation" should be simply ESD in this case

{
    "source_id": {
        "BCSD": {
            "activity_participation": [
                "CORDEX-ESD"
            ],
            "cohort": [
                "Registered"
            ],
            "institution_id": [
                "UCR"
            ],
            "label":"Bias Correction and Spatial Disaggregation",
            "source_type":"ESD",
            "license":"Creative Commons Attribution 4.0 International License (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/).",
            "model_description":"https://doi.org/10.1023/B:CLIM.0000013685.99609.9e",
            "release_year":"2004",
            "source_id": "BCSD"
        }
    }
}

jesusff commented Dec 7, 2023

We will need a new building rule for the source global attribute. It used to be a text with all model components pasted together. Should we now just take the label_extended as source, or something more elaborate? For example:

source = f"{label_extended}. See {further_info_url} for further configuration details."
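
Wrapped in a function, the rule above might look like the sketch below. The function name `source_from_label` and the fallback for a missing further_info_url are assumptions added for illustration, and the URL in the example is a placeholder.

```python
def source_from_label(label_extended, further_info_url=None):
    """Build the 'source' global attribute from label_extended.

    Assumption: when no further_info_url is available, the attribute is
    just label_extended without the trailing pointer sentence.
    """
    if further_info_url:
        return (
            f"{label_extended}. See {further_info_url} "
            "for further configuration details."
        )
    return label_extended


with_url = source_from_label(
    "Bias Correction and Spatial Disaggregation",
    "https://example.org/BCSD",  # placeholder URL, not a real record
)
without_url = source_from_label("Bias Correction and Spatial Disaggregation")
```

Making the URL optional keeps the rule usable for entries that have no documentation page to point to.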

@gnikulin

further_info_url in CMIP6 leads to ES-DOC, which we are missing in CORDEX-CMIP6. Can we skip label_extended and simply use source (full model name/version)? Or do we need to use label_extended for consistency with CMIP6?
