This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

Add tutorial module for sample datasets #142

Closed
andersy005 wants to merge 3 commits


andersy005
Member

@andersy005 andersy005 commented Aug 4, 2022

In [1]: import datatree

In [3]: dt = datatree.tutorial.open_datatree('cesm2-lens')

In [4]: dt
Out[4]: 
DataTree('None', parent=None)
├── DataTree('ocn')
│   ├── DataTree('historical')
│   │   └── DataTree('monthly')
│   │       ├── DataTree('cmip6')
│   │       │       Dimensions:     (member_id: 1, time: 6, z_t: 1, nlat: 384, nlon: 320, d2: 2)
│   │       │       Coordinates:
│   │       │         * member_id   (member_id) object 'r10i1181p1f1'
│   │       │         * time        (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│   │       │           time_bound  (time, d2) object ...
│   │       │         * z_t         (z_t) float32 500.0
│   │       │       Dimensions without coordinates: nlat, nlon, d2
│   │       │       Data variables:
│   │       │           O2          (member_id, time, z_t, nlat, nlon) float32 ...
│   │       │       Attributes:
│   │       │           Conventions:             CF-1.0; http://www.cgd.ucar.edu/cms/eaton/netcdf...
│   │       │           calendar:                All years have exactly  365 days.
│   │       │           cell_methods:            cell_methods = time: mean ==> the variable value...
│   │       │           contents:                Diagnostic and Prognostic Variables
│   │       │           model_doi_url:           https://doi.org/10.5065/D67H1H0V
│   │       │           revision:                $Id$
│   │       │           source:                  CCSM POP2, the CCSM Ocean Component
│   │       │           start_time:              This dataset was created on 2020-07-18 at 07:26:...
│   │       │           time_period_freq:        month_1
│   │       │           intake_esm_dataset_key:  ocn/historical/monthly/cmip6
│   │       └── DataTree('smbb')
│   │               Dimensions:     (member_id: 1, time: 6, z_t: 1, nlat: 384, nlon: 320, d2: 2)
│   │               Coordinates:
│   │                 * member_id   (member_id) object 'r11i1231p1f2'
│   │                 * time        (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│   │                   time_bound  (time, d2) object ...
│   │                 * z_t         (z_t) float32 500.0
│   │               Dimensions without coordinates: nlat, nlon, d2
│   │               Data variables:
│   │                   O2          (member_id, time, z_t, nlat, nlon) float32 ...
│   │               Attributes:
│   │                   Conventions:             CF-1.0; http://www.cgd.ucar.edu/cms/eaton/netcdf...
│   │                   calendar:                All years have exactly  365 days.
│   │                   cell_methods:            cell_methods = time: mean ==> the variable value...
│   │                   contents:                Diagnostic and Prognostic Variables
│   │                   model_doi_url:           https://doi.org/10.5065/D67H1H0V
│   │                   revision:                $Id$
│   │                   source:                  CCSM POP2, the CCSM Ocean Component
│   │                   time_period_freq:        month_1
│   │                   intake_esm_dataset_key:  ocn/historical/monthly/smbb
│   └── DataTree('ssp370')
│       └── DataTree('monthly')
│           ├── DataTree('cmip6')
│           │       Dimensions:     (member_id: 1, time: 6, z_t: 1, nlat: 384, nlon: 320, d2: 2)
│           │       Coordinates:
│           │         * member_id   (member_id) object 'r10i1181p1f1'
│           │         * time        (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
│           │           time_bound  (time, d2) object ...
│           │         * z_t         (z_t) float32 500.0
│           │       Dimensions without coordinates: nlat, nlon, d2
│           │       Data variables:
│           │           O2          (member_id, time, z_t, nlat, nlon) float32 ...
│           │       Attributes:
│           │           Conventions:             CF-1.0; http://www.cgd.ucar.edu/cms/eaton/netcdf...
│           │           calendar:                All years have exactly  365 days.
│           │           cell_methods:            cell_methods = time: mean ==> the variable value...
│           │           contents:                Diagnostic and Prognostic Variables
│           │           model_doi_url:           https://doi.org/10.5065/D67H1H0V
│           │           revision:                $Id$
│           │           source:                  CCSM POP2, the CCSM Ocean Component
│           │           time_period_freq:        month_1
│           │           intake_esm_dataset_key:  ocn/ssp370/monthly/cmip6
│           └── DataTree('smbb')
│                   Dimensions:     (member_id: 1, time: 6, z_t: 1, nlat: 384, nlon: 320, d2: 2)
│                   Coordinates:
│                     * member_id   (member_id) object 'r11i1231p1f2'
│                     * time        (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
│                       time_bound  (time, d2) object ...
│                     * z_t         (z_t) float32 500.0
│                   Dimensions without coordinates: nlat, nlon, d2
│                   Data variables:
│                       O2          (member_id, time, z_t, nlat, nlon) float32 ...
│                   Attributes:
│                       Conventions:             CF-1.0; http://www.cgd.ucar.edu/cms/eaton/netcdf...
│                       calendar:                All years have exactly  365 days.
│                       cell_methods:            cell_methods = time: mean ==> the variable value...
│                       contents:                Diagnostic and Prognostic Variables
│                       model_doi_url:           https://doi.org/10.5065/D67H1H0V
│                       revision:                $Id$
│                       source:                  CCSM POP2, the CCSM Ocean Component
│                       time_period_freq:        month_1
│                       intake_esm_dataset_key:  ocn/ssp370/monthly/smbb
└── DataTree('atm')
    ├── DataTree('ssp370')
    │   └── DataTree('monthly')
    │       ├── DataTree('cmip6')
    │       │       Dimensions:    (member_id: 1, time: 6, lat: 192, lon: 288, nbnd: 2)
    │       │       Coordinates:
    │       │         * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    │       │         * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    │       │         * member_id  (member_id) object 'r10i1181p1f1'
    │       │         * time       (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
    │       │           time_bnds  (time, nbnd) object ...
    │       │       Dimensions without coordinates: nbnd
    │       │       Data variables:
    │       │           PRECC      (member_id, time, lat, lon) float32 ...
    │       │           TREFHT     (member_id, time, lat, lon) float32 ...
    │       │       Attributes:
    │       │           source:                  CAM
    │       │           logname:                 sunseon
    │       │           Conventions:             CF-1.0
    │       │           time_period_freq:        month_1
    │       │           host:                    mom1
    │       │           topography_file:         /mnt/lustre/share/CESM/cesm_input/atm/cam/topo/f...
    │       │           model_doi_url:           https://doi.org/10.5065/D67H1H0V
    │       │           intake_esm_dataset_key:  atm/ssp370/monthly/cmip6
    │       └── DataTree('smbb')
    │               Dimensions:    (member_id: 1, time: 6, lat: 192, lon: 288, nbnd: 2)
    │               Coordinates:
    │                 * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    │                 * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    │                 * member_id  (member_id) object 'r10i1191p1f2'
    │                 * time       (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
    │                   time_bnds  (time, nbnd) object ...
    │               Dimensions without coordinates: nbnd
    │               Data variables:
    │                   PRECC      (member_id, time, lat, lon) float32 ...
    │                   TREFHT     (member_id, time, lat, lon) float32 ...
    │               Attributes:
    │                   source:                  CAM
    │                   logname:                 sunseon
    │                   Conventions:             CF-1.0
    │                   time_period_freq:        month_1
    │                   topography_file:         /mnt/lustre/share/CESM/cesm_input/atm/cam/topo/f...
    │                   model_doi_url:           https://doi.org/10.5065/D67H1H0V
    │                   intake_esm_dataset_key:  atm/ssp370/monthly/smbb
    └── DataTree('historical')
        └── DataTree('monthly')
            ├── DataTree('cmip6')
            │       Dimensions:    (member_id: 1, time: 6, lat: 192, lon: 288, nbnd: 2)
            │       Coordinates:
            │         * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
            │         * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
            │         * member_id  (member_id) object 'r10i1181p1f1'
            │         * time       (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
            │           time_bnds  (time, nbnd) object ...
            │       Dimensions without coordinates: nbnd
            │       Data variables:
            │           PRECC      (member_id, time, lat, lon) float32 ...
            │           TREFHT     (member_id, time, lat, lon) float32 ...
            │       Attributes:
            │           source:                  CAM
            │           logname:                 sunseon
            │           Conventions:             CF-1.0
            │           time_period_freq:        month_1
            │           NCO:                     netCDF Operators version 4.9.4 (Homepage = http:...
            │           topography_file:         /mnt/lustre/share/CESM/cesm_input/atm/cam/topo/f...
            │           model_doi_url:           https://doi.org/10.5065/D67H1H0V
            │           intake_esm_dataset_key:  atm/historical/monthly/cmip6
            └── DataTree('smbb')
                    Dimensions:    (member_id: 1, time: 6, lat: 192, lon: 288, nbnd: 2)
                    Coordinates:
                      * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
                      * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
                      * member_id  (member_id) object 'r10i1191p1f2'
                      * time       (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
                        time_bnds  (time, nbnd) object ...
                    Dimensions without coordinates: nbnd
                    Data variables:
                        PRECC      (member_id, time, lat, lon) float32 ...
                        TREFHT     (member_id, time, lat, lon) float32 ...
                    Attributes:
                        source:                  CAM
                        logname:                 sunseon
                        Conventions:             CF-1.0
                        time_period_freq:        month_1
                        topography_file:         /mnt/lustre/share/CESM/cesm_input/atm/cam/topo/f...
                        model_doi_url:           https://doi.org/10.5065/D67H1H0V
                        intake_esm_dataset_key:  atm/historical/monthly/smbb
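
The node layout above appears to mirror each leaf's `intake_esm_dataset_key` attribute: every `/`-separated component becomes one level of the tree. A minimal sketch of that mapping in plain Python, using nested dictionaries as a stand-in for `DataTree` nodes (the helper name `build_tree` is hypothetical and is not part of datatree or this PR):

```python
# Dataset keys copied from the `intake_esm_dataset_key` attributes shown above.
KEYS = [
    "ocn/historical/monthly/cmip6",
    "ocn/historical/monthly/smbb",
    "ocn/ssp370/monthly/cmip6",
    "ocn/ssp370/monthly/smbb",
    "atm/ssp370/monthly/cmip6",
    "atm/ssp370/monthly/smbb",
    "atm/historical/monthly/cmip6",
    "atm/historical/monthly/smbb",
]


def build_tree(keys):
    """Nest '/'-separated keys into dictionaries, one level per component."""
    root = {}
    for key in keys:
        node = root
        for part in key.split("/"):
            # Create the child the first time it is seen, then descend into it.
            node = node.setdefault(part, {})
    return root


tree = build_tree(KEYS)
print(sorted(tree))                                   # top-level groups
print(sorted(tree["ocn"]["historical"]["monthly"]))   # leaves under one branch
```

In this sketch the leaves are empty dictionaries; in the real tree each leaf holds the dataset whose repr is printed above.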

@TomNicholas
Member

This looks great, @andersy005!

Code-wise I see no issues, and would be happy to merge.

The only thing I might want to change is the data itself: can we simplify it slightly? The raw data has obscure variable names (PRECC?), unneeded dimensions (e.g. member_id), and extra nesting (monthly only has one entry). For tutorial data we might instead want to clean it up a bit and re-upload it. If I clean it myself locally, is there an easy way to replace what's in the carbonplan bucket? (Maybe we could also just put it straight in https://github.com/pydata/xarray-data too...)
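
One of the simplifications suggested here, removing nesting levels that only ever have one entry, can be sketched on the dataset keys alone. This is an illustration under stated assumptions, not code from this PR: `drop_constant_levels` is a hypothetical helper, it assumes all keys have the same depth, and it treats a level as redundant when every key has the same component at that position.

```python
def drop_constant_levels(keys):
    """Remove path positions whose component is identical across all keys."""
    if not keys:
        return []
    split = [key.split("/") for key in keys]
    depth = len(split[0])
    # Assumption of this sketch: every key has the same number of components.
    if any(len(parts) != depth for parts in split):
        raise ValueError("all keys must have the same depth")
    # Keep only the positions where at least two keys differ.
    keep = [i for i in range(depth) if len({parts[i] for parts in split}) > 1]
    return ["/".join(parts[i] for i in keep) for parts in split]


keys = [
    "ocn/historical/monthly/cmip6",
    "ocn/ssp370/monthly/smbb",
    "atm/historical/monthly/cmip6",
    "atm/ssp370/monthly/smbb",
]
simplified = drop_constant_levels(keys)
print(simplified)
# ['ocn/historical/cmip6', 'ocn/ssp370/smbb', 'atm/historical/cmip6', 'atm/ssp370/smbb']
```

Only the `monthly` level is dropped, because it is the one component shared by every key; the other levels each distinguish at least two datasets.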

@andersy005
Member Author

The raw data has obscure variable names (PRECC?)

That's just the CESM naming convention, which isn't CF-compliant. We can exclude this dataset. The CMIP6 version, which includes multiple models and multiple experiments, should suffice.

unneeded dimensions (e.g. member_id), extra nesting (monthly only has one entry)

I was trying to maintain the dimensionality of the original dataset, but I can easily get rid of those.

@andersy005
Member Author

andersy005 commented Aug 8, 2022

I'm going to trim down the CMIP6 sample and add it to the pydata/xarray-data repository.

@andersy005
Member Author

andersy005 commented Aug 8, 2022

@TomNicholas, here's what the CMIP6 sample looks like:

CMIP6 Sample
DataTree('None', parent=None)
├── DataTree('CMIP')
│   ├── DataTree('CCCma')
│   │   └── DataTree('CanESM5')
│   │       └── DataTree('historical')
│   │           ├── DataTree('Amon')
│   │           │   └── DataTree('gn')
│   │           │           Dimensions:    (lat: 64, bnds: 2, lon: 128, time: 6)
│   │           │           Coordinates:
│   │           │             * lat        (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│   │           │               lat_bnds   (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│   │           │             * lon        (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│   │           │               lon_bnds   (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│   │           │             * time       (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│   │           │               time_bnds  (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│   │           │               member_id  <U8 'r1i1p1f1'
│   │           │           Dimensions without coordinates: bnds
│   │           │           Data variables:
│   │           │               pr         (time, lat, lon) float32 7.221e-07 8.962e-07 ... 1.108e-05
│   │           │           Attributes: (12/57)
│   │           │               CCCma_model_hash:            3dedf95315d603326fde4f5340dc0519d80d10c0
│   │           │               CCCma_parent_runid:          rc3-pictrl
│   │           │               CCCma_pycmor_hash:           33c30511acc319a98240633965a04ca99c26427e
│   │           │               CCCma_runid:                 rc3.1-his01
│   │           │               Conventions:                 CF-1.7 CMIP-6.2
│   │           │               YMDH_branch_time_in_child:   1850:01:01:00
│   │           │               ...                          ...
│   │           │               variant_label:               r1i1p1f1
│   │           │               version:                     v20190429
│   │           │               status:                      2019-10-25;created;by nhn2@columbia.edu
│   │           │               netcdf_tracking_ids:         hdl:21.14100/363e1ebe-46e7-43dc-9feb-a7a4a0c...
│   │           │               version_id:                  v20190429
│   │           │               intake_esm_dataset_key:      CMIP/CCCma/CanESM5/historical/Amon/gn
│   │           ├── DataTree('Lmon')
│   │           │   └── DataTree('gn')
│   │           │           Dimensions:    (time: 6, lat: 64, lon: 128, bnds: 2)
│   │           │           Coordinates:
│   │           │             * lat        (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│   │           │               lat_bnds   (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│   │           │             * lon        (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│   │           │               lon_bnds   (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│   │           │             * time       (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│   │           │               time_bnds  (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│   │           │               member_id  <U8 'r1i1p1f1'
│   │           │           Dimensions without coordinates: bnds
│   │           │           Data variables:
│   │           │               gpp        (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│   │           │               mrso       (time, lat, lon) float32 3.76e+03 3.76e+03 3.76e+03 ... 0.0 0.0
│   │           │           Attributes: (12/53)
│   │           │               variant_label:               r1i1p1f1
│   │           │               mip_era:                     CMIP6
│   │           │               license:                     CMIP6 model data produced by The Government ...
│   │           │               contact:                     ec.cccma.info-info.ccmac.ec@canada.ca
│   │           │               parent_variant_label:        r1i1p1f1
│   │           │               source_type:                 AOGCM
│   │           │               ...                          ...
│   │           │               realm:                       land
│   │           │               branch_time_in_child:        0.0
│   │           │               source:                      CanESM5 (2019): \naerosol: interactive\natmo...
│   │           │               initialization_index:        1
│   │           │               further_info_url:            https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
│   │           │               intake_esm_dataset_key:      CMIP/CCCma/CanESM5/historical/Lmon/gn
│   │           └── DataTree('Omon')
│   │               └── DataTree('gn')
│   │                       Dimensions:             (i: 360, j: 291, bnds: 2, time: 6, vertices: 4)
│   │                       Coordinates:
│   │                         * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
│   │                         * j                   (j) int32 0 1 2 3 4 5 6 ... 284 285 286 287 288 289 290
│   │                           latitude            (j, i) float64 -78.39 -78.39 -78.39 ... 50.23 50.01
│   │                           lev                 float64 3.047
│   │                           lev_bnds            (bnds) float64 0.0 6.194
│   │                           longitude           (j, i) float64 73.5 74.5 75.5 76.5 ... 72.95 72.96 72.99
│   │                         * time                (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:0...
│   │                           time_bnds           (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-0...
│   │                           member_id           <U8 'r1i1p1f1'
│   │                       Dimensions without coordinates: bnds, vertices
│   │                       Data variables:
│   │                           no3                 (time, j, i) float32 nan nan nan nan ... nan nan nan nan
│   │                           vertices_latitude   (j, i, vertices) float64 -78.29 -78.49 ... 50.11 50.11
│   │                           vertices_longitude  (j, i, vertices) float64 74.0 74.0 73.0 ... 72.95 73.0
│   │                           thetao              (time, j, i) float32 nan nan nan nan ... nan nan nan nan
│   │                       Attributes: (12/52)
│   │                           variant_label:               r1i1p1f1
│   │                           mip_era:                     CMIP6
│   │                           license:                     CMIP6 model data produced by The Government ...
│   │                           contact:                     ec.cccma.info-info.ccmac.ec@canada.ca
│   │                           parent_variant_label:        r1i1p1f1
│   │                           source_type:                 AOGCM
│   │                           ...                          ...
│   │                           physics_index:               1
│   │                           branch_time_in_child:        0.0
│   │                           source:                      CanESM5 (2019): \naerosol: interactive\natmo...
│   │                           initialization_index:        1
│   │                           further_info_url:            https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
│   │                           intake_esm_dataset_key:      CMIP/CCCma/CanESM5/historical/Omon/gn
│   ├── DataTree('MIROC')
│   │   └── DataTree('MIROC6')
│   │       └── DataTree('historical')
│   │           ├── DataTree('Lmon')
│   │           │   └── DataTree('gn')
│   │           │           Dimensions:    (lat: 128, bnds: 2, lon: 256, time: 6)
│   │           │           Coordinates:
│   │           │             * lat        (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│   │           │               lat_bnds   (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│   │           │             * lon        (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│   │           │               lon_bnds   (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│   │           │             * time       (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│   │           │               time_bnds  (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 1850-07-01
│   │           │               member_id  <U8 'r1i1p1f1'
│   │           │           Dimensions without coordinates: bnds
│   │           │           Data variables:
│   │           │               mrso       (time, lat, lon) float32 4.2e+03 4.2e+03 4.2e+03 ... nan nan nan
│   │           │           Attributes: (12/48)
│   │           │               Conventions:             CF-1.7 CMIP-6.2
│   │           │               activity_id:             CMIP
│   │           │               branch_method:           standard
│   │           │               branch_time_in_child:    0.0
│   │           │               branch_time_in_parent:   0.0
│   │           │               cmor_version:            3.3.2
│   │           │               ...                      ...
│   │           │               variable_id:             mrso
│   │           │               variant_label:           r1i1p1f1
│   │           │               status:                  2019-10-25;created;by nhn2@columbia.edu
│   │           │               netcdf_tracking_ids:     hdl:21.14100/a702781b-b6d9-4f90-a65d-c649d59a224...
│   │           │               version_id:              v20190311
│   │           │               intake_esm_dataset_key:  CMIP/MIROC/MIROC6/historical/Lmon/gn
│   │           ├── DataTree('Amon')
│   │           │   └── DataTree('gn')
│   │           │           Dimensions:    (lat: 128, bnds: 2, lon: 256, time: 6)
│   │           │           Coordinates:
│   │           │             * lat        (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│   │           │               lat_bnds   (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│   │           │             * lon        (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│   │           │               lon_bnds   (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│   │           │             * time       (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│   │           │               time_bnds  (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 1850-07-01
│   │           │               member_id  <U8 'r1i1p1f1'
│   │           │           Dimensions without coordinates: bnds
│   │           │           Data variables:
│   │           │               pr         (time, lat, lon) float32 2.144e-06 2.169e-06 ... 8.586e-06
│   │           │           Attributes: (12/48)
│   │           │               Conventions:             CF-1.7 CMIP-6.2
│   │           │               activity_id:             CMIP
│   │           │               branch_method:           standard
│   │           │               branch_time_in_child:    0.0
│   │           │               branch_time_in_parent:   0.0
│   │           │               cmor_version:            3.3.2
│   │           │               ...                      ...
│   │           │               variable_id:             pr
│   │           │               variant_label:           r1i1p1f1
│   │           │               status:                  2019-10-25;created;by nhn2@columbia.edu
│   │           │               netcdf_tracking_ids:     hdl:21.14100/61fa8b6b-e74c-4e86-9344-8ba946ee8a8...
│   │           │               version_id:              v20181212
│   │           │               intake_esm_dataset_key:  CMIP/MIROC/MIROC6/historical/Amon/gn
│   │           └── DataTree('Omon')
│   │               └── DataTree('gn')
│   │                       Dimensions:             (y: 256, x: 360, time: 6, bnds: 2, vertices: 4)
│   │                       Coordinates: (12/13)
│   │                           latitude            (y, x) float32 -88.0 -88.0 -88.0 ... 64.43 64.0 63.56
│   │                           lev                 float64 1.0
│   │                           lev_bnds            (bnds) float64 0.0 2.0
│   │                           longitude           (y, x) float32 60.5 61.5 62.5 63.5 ... 59.96 59.98 59.99
│   │                           sigma_bnds          (bnds) float64 -0.0 -0.04
│   │                         * time                (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│   │                           ...                  ...
│   │                         * x                   (x) float64 0.5 1.5 2.5 3.5 ... 356.5 357.5 358.5 359.5
│   │                           x_bnds              (x, bnds) float64 0.0 1.0 1.0 2.0 ... 359.0 359.0 360.0
│   │                         * y                   (y) float64 -88.0 -85.75 -85.25 ... 148.6 150.5 152.4
│   │                           y_bnds              (y, bnds) float64 -90.0 -86.0 -86.0 ... 151.5 153.3
│   │                           zlev_bnds           (bnds) float64 -0.0 -2.0
│   │                           member_id           <U8 'r1i1p1f1'
│   │                       Dimensions without coordinates: bnds, vertices
│   │                       Data variables:
│   │                           depth               (y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│   │                           depth_c             float64 50.0
│   │                           eta                 (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│   │                           nsigma              int32 10
│   │                           sigma               float64 -0.02
│   │                           thetao              (time, y, x) float32 nan nan nan nan ... nan nan nan nan
│   │                           vertices_latitude   (y, x, vertices) float32 -90.0 -90.0 ... 63.33 63.78
│   │                           vertices_longitude  (y, x, vertices) float32 60.0 61.0 61.0 ... 60.0 60.0
│   │                           zlev                float64 -1.0
│   │                       Attributes: (12/48)
│   │                           Conventions:             CF-1.7 CMIP-6.2
│   │                           activity_id:             CMIP
│   │                           branch_method:           standard
│   │                           branch_time_in_child:    0.0
│   │                           branch_time_in_parent:   0.0
│   │                           cmor_version:            3.3.2
│   │                           ...                      ...
│   │                           variable_id:             thetao
│   │                           variant_label:           r1i1p1f1
│   │                           status:                  2019-11-08;created;by nhn2@columbia.edu
│   │                           netcdf_tracking_ids:     hdl:21.14100/16598b35-19b4-49e3-98de-27b9e9444ad...
│   │                           version_id:              v20190311
│   │                           intake_esm_dataset_key:  CMIP/MIROC/MIROC6/historical/Omon/gn
│   └── DataTree('NCAR')
│       └── DataTree('CESM2-WACCM')
│           └── DataTree('historical')
│               ├── DataTree('Omon')
│               │   ├── DataTree('gr')
│               │   │       Dimensions:    (lat: 180, d2: 2, lon: 360, time: 6)
│               │   │       Coordinates:
│               │   │         * lat        (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
│               │   │           lat_bnds   (lat, d2) float64 -90.0 -89.0 -89.0 -88.0 ... 88.0 89.0 89.0 90.0
│               │   │           lev        float64 0.0
│               │   │           lev_bnds   (d2) float64 0.0 5.0
│               │   │         * lon        (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
│               │   │           lon_bnds   (lon, d2) float64 0.0 1.0 1.0 2.0 2.0 ... 358.0 359.0 359.0 360.0
│               │   │         * time       (time) object 1850-01-15 12:59:59.999997 ... 1850-06-15 00:00:00
│               │   │           time_bnds  (time, d2) object 1850-01-01 02:00:00.000003 ... 1850-07-01 00...
│               │   │           member_id  <U8 'r1i1p1f1'
│               │   │       Dimensions without coordinates: d2
│               │   │       Data variables:
│               │   │           no3        (time, lat, lon) float32 nan nan nan ... 0.006828 0.006827
│               │   │           thetao     (time, lat, lon) float32 nan nan nan nan ... -1.763 -1.763 -1.762
│               │   │       Attributes: (12/45)
│               │   │           variant_label:           r1i1p1f1
│               │   │           mip_era:                 CMIP6
│               │   │           license:                 CMIP6 model data produced by <The National Cente...
│               │   │           contact:                 cesm_cmip6@ucar.edu
│               │   │           parent_variant_label:    r1i1p1f1
│               │   │           source_type:             AOGCM BGC CHEM AER
│               │   │           ...                      ...
│               │   │           case_id:                 4
│               │   │           branch_time_in_child:    674885.0
│               │   │           source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│               │   │           initialization_index:    1
│               │   │           further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│               │   │           intake_esm_dataset_key:  CMIP/NCAR/CESM2-WACCM/historical/Omon/gr
│               │   └── DataTree('gn')
│               │           Dimensions:    (nlat: 384, nlon: 320, vertices: 4, d2: 2, time: 6)
│               │           Coordinates:
│               │               lat        (nlat, nlon) float64 -79.22 -79.22 -79.22 ... 72.2 72.19 72.19
│               │               lat_bnds   (nlat, nlon, vertices) float32 -79.49 -79.49 ... 72.41 72.41
│               │               lev        float64 500.0
│               │               lev_bnds   (d2) float32 0.0 10.0
│               │               lon        (nlat, nlon) float64 320.6 321.7 322.8 ... 318.9 319.4 319.8
│               │               lon_bnds   (nlat, nlon, vertices) float32 320.0 321.1 321.1 ... 320.0 319.6
│               │             * nlat       (nlat) int32 1 2 3 4 5 6 7 8 ... 377 378 379 380 381 382 383 384
│               │             * nlon       (nlon) int32 1 2 3 4 5 6 7 8 ... 313 314 315 316 317 318 319 320
│               │             * time       (time) object 1850-01-15 13:00:00 ... 1850-06-15 00:00:00
│               │               time_bnds  (time, d2) object 1850-01-01 02:00:00.000003 ... 1850-07-01 00...
│               │               member_id  <U8 'r1i1p1f1'
│               │           Dimensions without coordinates: vertices, d2
│               │           Data variables:
│               │               no3        (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│               │               thetao     (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│               │           Attributes: (12/45)
│               │               variant_label:           r1i1p1f1
│               │               mip_era:                 CMIP6
│               │               license:                 CMIP6 model data produced by <The National Cente...
│               │               contact:                 cesm_cmip6@ucar.edu
│               │               parent_variant_label:    r1i1p1f1
│               │               source_type:             AOGCM BGC CHEM AER
│               │               ...                      ...
│               │               case_id:                 4
│               │               branch_time_in_child:    674885.0
│               │               source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│               │               initialization_index:    1
│               │               further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│               │               intake_esm_dataset_key:  CMIP/NCAR/CESM2-WACCM/historical/Omon/gn
│               ├── DataTree('Amon')
│               │   └── DataTree('gn')
│               │           Dimensions:    (time: 6, lat: 192, lon: 288, nbnd: 2)
│               │           Coordinates:
│               │             * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│               │               lat_bnds   (lat, nbnd) float64 -90.0 -89.53 -89.53 ... 89.53 89.53 90.0
│               │             * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│               │               lon_bnds   (lon, nbnd) float64 -0.625 0.625 0.625 ... 358.1 358.1 359.4
│               │               plev       float64 1e+05
│               │             * time       (time) object 1850-01-15 12:00:00 ... 1850-06-15 00:00:00
│               │               time_bnds  (time, nbnd) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│               │               member_id  <U8 'r1i1p1f1'
│               │           Dimensions without coordinates: nbnd
│               │           Data variables:
│               │               co2        (time, lat, lon) float32 nan nan nan ... 0.0002868 0.0002868
│               │               pr         (time, lat, lon) float32 2.706e-06 2.706e-06 ... 4.324e-06
│               │           Attributes: (12/46)
│               │               variant_label:           r1i1p1f1
│               │               mip_era:                 CMIP6
│               │               license:                 CMIP6 model data produced by <The National Cente...
│               │               contact:                 cesm_cmip6@ucar.edu
│               │               parent_variant_label:    r1i1p1f1
│               │               source_type:             AOGCM BGC CHEM AER
│               │               ...                      ...
│               │               case_id:                 4
│               │               branch_time_in_child:    674885.0
│               │               source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│               │               initialization_index:    1
│               │               further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│               │               intake_esm_dataset_key:  CMIP/NCAR/CESM2-WACCM/historical/Amon/gn
│               └── DataTree('Lmon')
│                   └── DataTree('gn')
│                           Dimensions:    (time: 6, lat: 192, lon: 288, hist_interval: 2)
│                           Coordinates:
│                             * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│                               lat_bnds   (lat, hist_interval) float32 -90.0 -89.53 -89.53 ... 89.53 90.0
│                             * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│                               lon_bnds   (lon, hist_interval) float32 -0.625 0.625 0.625 ... 358.1 359.4
│                             * time       (time) object 1850-01-15 11:45:00.000013 ... 1850-06-15 00:00:00
│                               time_bnds  (time, hist_interval) object 1849-12-31 23:29:59.999987 ... 18...
│                               member_id  <U8 'r1i1p1f1'
│                           Dimensions without coordinates: hist_interval
│                           Data variables:
│                               gpp        (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... nan nan nan nan
│                               mrso       (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
│                           Attributes: (12/46)
│                               variant_label:           r1i1p1f1
│                               mip_era:                 CMIP6
│                               license:                 CMIP6 model data produced by <The National Cente...
│                               contact:                 cesm_cmip6@ucar.edu
│                               parent_variant_label:    r1i1p1f1
│                               source_type:             AOGCM BGC CHEM AER
│                               ...                      ...
│                               case_id:                 4
│                               branch_time_in_child:    674885.0
│                               source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│                               initialization_index:    1
│                               further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│                               intake_esm_dataset_key:  CMIP/NCAR/CESM2-WACCM/historical/Lmon/gn
└── DataTree('ScenarioMIP')
    ├── DataTree('MIROC')
    │   └── DataTree('MIROC6')
    │       └── DataTree('ssp370')
    │           ├── DataTree('Omon')
    │           │   └── DataTree('gn')
    │           │           Dimensions:             (y: 256, x: 360, time: 6, bnds: 2, vertices: 4)
    │           │           Coordinates: (12/13)
    │           │               latitude            (y, x) float32 -88.0 -88.0 -88.0 ... 64.43 64.0 63.56
    │           │               lev                 float64 1.0
    │           │               lev_bnds            (bnds) float64 0.0 2.0
    │           │               longitude           (y, x) float32 60.5 61.5 62.5 63.5 ... 59.96 59.98 59.99
    │           │               sigma_bnds          (bnds) float64 -0.0 -0.04
    │           │             * time                (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
    │           │               ...                  ...
    │           │             * x                   (x) float64 0.5 1.5 2.5 3.5 ... 356.5 357.5 358.5 359.5
    │           │               x_bnds              (x, bnds) float64 0.0 1.0 1.0 2.0 ... 359.0 359.0 360.0
    │           │             * y                   (y) float64 -88.0 -85.75 -85.25 ... 148.6 150.5 152.4
    │           │               y_bnds              (y, bnds) float64 -90.0 -86.0 -86.0 ... 151.5 153.3
    │           │               zlev_bnds           (bnds) float64 -0.0 -2.0
    │           │               member_id           <U8 'r1i1p1f1'
    │           │           Dimensions without coordinates: bnds, vertices
    │           │           Data variables:
    │           │               depth               (y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    │           │               depth_c             float64 50.0
    │           │               eta                 (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    │           │               nsigma              int32 10
    │           │               sigma               float64 -0.02
    │           │               thetao              (time, y, x) float32 nan nan nan nan ... nan nan nan nan
    │           │               vertices_latitude   (y, x, vertices) float32 -90.0 -90.0 ... 63.33 63.78
    │           │               vertices_longitude  (y, x, vertices) float32 60.0 61.0 61.0 ... 60.0 60.0
    │           │               zlev                float64 -1.0
    │           │           Attributes: (12/48)
    │           │               Conventions:             CF-1.7 CMIP-6.2
    │           │               activity_id:             ScenarioMIP AerChemMIP
    │           │               branch_method:           standard
    │           │               branch_time_in_child:    60265.0
    │           │               branch_time_in_parent:   60265.0
    │           │               cmor_version:            3.4.0
    │           │               ...                      ...
    │           │               variable_id:             thetao
    │           │               variant_label:           r1i1p1f1
    │           │               status:                  2019-11-18;created;by nhn2@columbia.edu
    │           │               netcdf_tracking_ids:     hdl:21.14100/99dda520-c9e9-4617-b4ca-0de0a2b9398...
    │           │               version_id:              v20190627
    │           │               intake_esm_dataset_key:  ScenarioMIP/MIROC/MIROC6/ssp370/Omon/gn
    │           ├── DataTree('Amon')
    │           │   └── DataTree('gn')
    │           │           Dimensions:    (lat: 128, bnds: 2, lon: 256, time: 6)
    │           │           Coordinates:
    │           │             * lat        (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
    │           │               lat_bnds   (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
    │           │             * lon        (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
    │           │               lon_bnds   (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
    │           │             * time       (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
    │           │               time_bnds  (time, bnds) datetime64[ns] 2015-01-01 2015-02-01 ... 2015-07-01
    │           │               member_id  <U8 'r1i1p1f1'
    │           │           Dimensions without coordinates: bnds
    │           │           Data variables:
    │           │               pr         (time, lat, lon) float32 1.137e-06 1.131e-06 ... 7.446e-06
    │           │           Attributes: (12/48)
    │           │               Conventions:             CF-1.7 CMIP-6.2
    │           │               activity_id:             ScenarioMIP AerChemMIP
    │           │               branch_method:           standard
    │           │               branch_time_in_child:    60265.0
    │           │               branch_time_in_parent:   60265.0
    │           │               cmor_version:            3.4.0
    │           │               ...                      ...
    │           │               variable_id:             pr
    │           │               variant_label:           r1i1p1f1
    │           │               status:                  2019-10-25;created;by nhn2@columbia.edu
    │           │               netcdf_tracking_ids:     hdl:21.14100/c23c415d-adca-4e01-8e7c-11617bcfa2bb
    │           │               version_id:              v20190627
    │           │               intake_esm_dataset_key:  ScenarioMIP/MIROC/MIROC6/ssp370/Amon/gn
    │           └── DataTree('Lmon')
    │               └── DataTree('gn')
    │                       Dimensions:    (lat: 128, bnds: 2, lon: 256, time: 6)
    │                       Coordinates:
    │                         * lat        (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
    │                           lat_bnds   (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
    │                         * lon        (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
    │                           lon_bnds   (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
    │                         * time       (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
    │                           time_bnds  (time, bnds) datetime64[ns] 2015-01-01 2015-02-01 ... 2015-07-01
    │                           member_id  <U8 'r1i1p1f1'
    │                       Dimensions without coordinates: bnds
    │                       Data variables:
    │                           mrso       (time, lat, lon) float32 4.2e+03 4.2e+03 4.2e+03 ... nan nan nan
    │                       Attributes: (12/48)
    │                           Conventions:             CF-1.7 CMIP-6.2
    │                           activity_id:             ScenarioMIP AerChemMIP
    │                           branch_method:           standard
    │                           branch_time_in_child:    60265.0
    │                           branch_time_in_parent:   60265.0
    │                           cmor_version:            3.4.0
    │                           ...                      ...
    │                           variable_id:             mrso
    │                           variant_label:           r1i1p1f1
    │                           status:                  2019-10-29;created;by nhn2@columbia.edu
    │                           netcdf_tracking_ids:     hdl:21.14100/3ba01dc3-ab7e-45d0-882a-66ed2768a642
    │                           version_id:              v20190627
    │                           intake_esm_dataset_key:  ScenarioMIP/MIROC/MIROC6/ssp370/Lmon/gn
    ├── DataTree('NCAR')
    │   └── DataTree('CESM2-WACCM')
    │       └── DataTree('ssp370')
    │           ├── DataTree('Amon')
    │           │   └── DataTree('gn')
    │           │           Dimensions:    (time: 6, lat: 192, lon: 288, nbnd: 2)
    │           │           Coordinates:
    │           │             * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    │           │               lat_bnds   (lat, nbnd) float64 -90.0 -89.53 -89.53 ... 89.53 89.53 90.0
    │           │             * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    │           │               lon_bnds   (lon, nbnd) float64 -0.625 0.625 0.625 ... 358.1 358.1 359.4
    │           │               plev       float64 1e+05
    │           │             * time       (time) object 2015-01-15 12:00:00 ... 2015-06-15 00:00:00
    │           │               time_bnds  (time, nbnd) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
    │           │               member_id  <U8 'r1i1p1f1'
    │           │           Dimensions without coordinates: nbnd
    │           │           Data variables:
    │           │               co2        (time, lat, lon) float32 nan nan nan ... 0.0004034 0.0004034
    │           │               pr         (time, lat, lon) float32 1.919e-06 1.919e-06 ... 1.043e-05
    │           │           Attributes: (12/45)
    │           │               variant_label:           r1i1p1f1
    │           │               mip_era:                 CMIP6
    │           │               license:                 CMIP6 model data produced by <The National Cente...
    │           │               contact:                 cesm_cmip6@ucar.edu
    │           │               parent_variant_label:    r1i1p1f1
    │           │               source_type:             AOGCM BGC CHEM AER
    │           │               ...                      ...
    │           │               case_id:                 969
    │           │               branch_time_in_child:    735110.0
    │           │               source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
    │           │               initialization_index:    1
    │           │               further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
    │           │               intake_esm_dataset_key:  ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Amon/gn
    │           ├── DataTree('Omon')
    │           │   ├── DataTree('gr')
    │           │   │       Dimensions:    (lat: 180, d2: 2, lon: 360, time: 6)
    │           │   │       Coordinates:
    │           │   │         * lat        (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
    │           │   │           lat_bnds   (lat, d2) float64 -90.0 -89.0 -89.0 -88.0 ... 88.0 89.0 89.0 90.0
    │           │   │           lev        float64 0.0
    │           │   │           lev_bnds   (d2) float64 0.0 5.0
    │           │   │         * lon        (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
    │           │   │           lon_bnds   (lon, d2) float64 0.0 1.0 1.0 2.0 2.0 ... 358.0 359.0 359.0 360.0
    │           │   │         * time       (time) object 2015-01-15 13:00:00.000007 ... 2015-06-15 00:00:00
    │           │   │           time_bnds  (time, d2) object 2015-01-01 02:00:00.000003 ... 2015-07-01 00...
    │           │   │           member_id  <U8 'r1i1p1f1'
    │           │   │       Dimensions without coordinates: d2
    │           │   │       Data variables:
    │           │   │           no3        (time, lat, lon) float32 nan nan nan ... 0.004002 0.004001
    │           │   │           thetao     (time, lat, lon) float32 nan nan nan nan ... -1.68 -1.68 -1.68
    │           │   │       Attributes: (12/44)
    │           │   │           variant_label:           r1i1p1f1
    │           │   │           mip_era:                 CMIP6
    │           │   │           license:                 CMIP6 model data produced by <The National Cente...
    │           │   │           contact:                 cesm_cmip6@ucar.edu
    │           │   │           parent_variant_label:    r1i1p1f1
    │           │   │           source_type:             AOGCM BGC CHEM AER
    │           │   │           ...                      ...
    │           │   │           case_id:                 969
    │           │   │           branch_time_in_child:    735110.0
    │           │   │           source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
    │           │   │           initialization_index:    1
    │           │   │           further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
    │           │   │           intake_esm_dataset_key:  ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gr
    │           │   └── DataTree('gn')
    │           │           Dimensions:    (nlat: 384, nlon: 320, vertices: 4, d2: 2, time: 6)
    │           │           Coordinates:
    │           │               lat        (nlat, nlon) float64 -79.22 -79.22 -79.22 ... 72.2 72.19 72.19
    │           │               lat_bnds   (nlat, nlon, vertices) float32 -79.49 -79.49 ... 72.41 72.41
    │           │               lev        float64 500.0
    │           │               lev_bnds   (d2) float32 0.0 10.0
    │           │               lon        (nlat, nlon) float64 320.6 321.7 322.8 ... 318.9 319.4 319.8
    │           │               lon_bnds   (nlat, nlon, vertices) float32 320.0 321.1 321.1 ... 320.0 319.6
    │           │             * nlat       (nlat) int32 1 2 3 4 5 6 7 8 ... 377 378 379 380 381 382 383 384
    │           │             * nlon       (nlon) int32 1 2 3 4 5 6 7 8 ... 313 314 315 316 317 318 319 320
    │           │             * time       (time) object 2015-01-15 13:00:00.000007 ... 2015-06-15 00:00:00
    │           │               time_bnds  (time, d2) object 2015-01-01 02:00:00.000003 ... 2015-07-01 00...
    │           │               member_id  <U8 'r1i1p1f1'
    │           │           Dimensions without coordinates: vertices, d2
    │           │           Data variables:
    │           │               no3        (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
    │           │               thetao     (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
    │           │           Attributes: (12/44)
    │           │               variant_label:           r1i1p1f1
    │           │               mip_era:                 CMIP6
    │           │               license:                 CMIP6 model data produced by <The National Cente...
    │           │               contact:                 cesm_cmip6@ucar.edu
    │           │               parent_variant_label:    r1i1p1f1
    │           │               source_type:             AOGCM BGC CHEM AER
    │           │               ...                      ...
    │           │               case_id:                 969
    │           │               branch_time_in_child:    735110.0
    │           │               source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
    │           │               initialization_index:    1
    │           │               further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
    │           │               intake_esm_dataset_key:  ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gn
    │           └── DataTree('Lmon')
    │               └── DataTree('gn')
    │                       Dimensions:    (lat: 192, lon: 288, time: 6, hist_interval: 2)
    │                       Coordinates:
    │                         * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    │                         * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    │                         * time       (time) object 2015-01-15 11:45:00 ... 2015-05-15 12:00:00
    │                           member_id  <U8 'r1i1p1f1'
    │                           lat_bnds   (lat, hist_interval) float32 -90.0 -89.53 -89.53 ... 89.53 90.0
    │                           lon_bnds   (lon, hist_interval) float32 -0.625 0.625 0.625 ... 358.1 359.4
    │                           time_bnds  (time, hist_interval) object 2014-12-31 23:29:59.999997 ... 20...
    │                       Dimensions without coordinates: hist_interval
    │                       Data variables:
    │                           gpp        (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    │                           mrso       (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    │                       Attributes: (12/45)
    │                           variant_label:           r1i1p1f1
    │                           mip_era:                 CMIP6
    │                           license:                 CMIP6 model data produced by <The National Cente...
    │                           contact:                 cesm_cmip6@ucar.edu
    │                           parent_variant_label:    r1i1p1f1
    │                           source_type:             AOGCM BGC CHEM AER
    │                           ...                      ...
    │                           case_id:                 969
    │                           branch_time_in_child:    735110.0
    │                           source:                  CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
    │                           initialization_index:    1
    │                           further_info_url:        https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
    │                           intake_esm_dataset_key:  ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Lmon/gn
    └── DataTree('CCCma')
        └── DataTree('CanESM5')
            └── DataTree('ssp370')
                ├── DataTree('Amon')
                │   └── DataTree('gn')
                │           Dimensions:    (lat: 64, bnds: 2, lon: 128, time: 6)
                │           Coordinates:
                │             * lat        (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
                │               lat_bnds   (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
                │             * lon        (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
                │               lon_bnds   (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
                │             * time       (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
                │               time_bnds  (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
                │               member_id  <U8 'r1i1p1f1'
                │           Dimensions without coordinates: bnds
                │           Data variables:
                │               pr         (time, lat, lon) float32 2.504e-06 2.678e-06 ... 6.46e-06
                │           Attributes: (12/57)
                │               CCCma_model_hash:            1f91f92cb6d607391f44831504025d32fc44faa1
                │               CCCma_parent_runid:          rc3.1-his01
                │               CCCma_pycmor_hash:           33c30511acc319a98240633965a04ca99c26427e
                │               CCCma_runid:                 rc3.1-s7001
                │               Conventions:                 CF-1.7 CMIP-6.2
                │               YMDH_branch_time_in_child:   2015:01:01:00
                │               ...                          ...
                │               tracking_id:                 hdl:21.14100/8c4a1496-f308-493e-8ecc-a2e253e...
                │               variable_id:                 pr
                │               variant_label:               r1i1p1f1
                │               version:                     v20190429
                │               version_id:                  v20190429
                │               intake_esm_dataset_key:      ScenarioMIP/CCCma/CanESM5/ssp370/Amon/gn
                ├── DataTree('Lmon')
                │   └── DataTree('gn')
                │           Dimensions:    (time: 6, lat: 64, lon: 128, bnds: 2)
                │           Coordinates:
                │             * lat        (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
                │               lat_bnds   (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
                │             * lon        (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
                │               lon_bnds   (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
                │             * time       (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
                │               time_bnds  (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
                │               member_id  <U8 'r1i1p1f1'
                │           Dimensions without coordinates: bnds
                │           Data variables:
                │               gpp        (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
                │               mrso       (time, lat, lon) float32 3.76e+03 3.76e+03 3.76e+03 ... 0.0 0.0
                │           Attributes: (12/53)
                │               variant_label:               r1i1p1f1
                │               mip_era:                     CMIP6
                │               license:                     CMIP6 model data produced by The Government ...
                │               contact:                     ec.cccma.info-info.ccmac.ec@canada.ca
                │               parent_variant_label:        r1i1p1f1
                │               source_type:                 AOGCM
                │               ...                          ...
                │               realm:                       land
                │               branch_time_in_child:        60225.0
                │               source:                      CanESM5 (2019): \naerosol: interactive\natmo...
                │               initialization_index:        1
                │               further_info_url:            https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
                │               intake_esm_dataset_key:      ScenarioMIP/CCCma/CanESM5/ssp370/Lmon/gn
                └── DataTree('Omon')
                    └── DataTree('gn')
                            Dimensions:             (i: 360, j: 291, bnds: 2, time: 6, vertices: 4)
                            Coordinates:
                              * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
                              * j                   (j) int32 0 1 2 3 4 5 6 ... 284 285 286 287 288 289 290
                                latitude            (j, i) float64 -78.39 -78.39 -78.39 ... 50.23 50.01
                                lev                 float64 3.047
                                lev_bnds            (bnds) float64 0.0 6.194
                                longitude           (j, i) float64 73.5 74.5 75.5 76.5 ... 72.95 72.96 72.99
                              * time                (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:0...
                                time_bnds           (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-0...
                                member_id           <U8 'r1i1p1f1'
                            Dimensions without coordinates: bnds, vertices
                            Data variables:
                                no3                 (time, j, i) float32 nan nan nan nan ... nan nan nan nan
                                vertices_latitude   (j, i, vertices) float64 -78.29 -78.49 ... 50.11 50.11
                                vertices_longitude  (j, i, vertices) float64 74.0 74.0 73.0 ... 72.95 73.0
                                thetao              (time, j, i) float32 nan nan nan nan ... nan nan nan nan
                            Attributes: (12/52)
                                variant_label:               r1i1p1f1
                                mip_era:                     CMIP6
                                license:                     CMIP6 model data produced by The Government ...
                                contact:                     ec.cccma.info-info.ccmac.ec@canada.ca
                                parent_variant_label:        r1i1p1f1
                                source_type:                 AOGCM
                                ...                          ...
                                physics_index:               1
                                branch_time_in_child:        60225.0
                                source:                      CanESM5 (2019): \naerosol: interactive\natmo...
                                initialization_index:        1
                                further_info_url:            https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
                                intake_esm_dataset_key:      ScenarioMIP/CCCma/CanESM5/ssp370/Omon/gn
let me know if everything looks good

@TomNicholas
Member

Thank you so much for doing this @andersy005, but I think we might be on slightly different pages about what I'm looking for. 😅

What I ideally want is the simplest possible datatree that I can still do non-trivial operations on, but which still has some obvious physical interpretation that doesn't require extra thought for the person reading the documentation (who may not work in geoscience!).

If you look at the existing airtemps tutorial dataset we use in xarray, you can see it's fairly minimal and understandable.

<xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

It's obvious what lat, lon, and time are, and that it contains air temperature data (it would be even clearer if air was renamed to temp but I digress...). There are no other dimensions or coordinates, and the list of attributes isn't too excessive.

The first dataset you showed is perhaps closest to this - it has two distinct types of data that lie on different grids for a good reason (i.e. ocean and atmosphere data). It also has historical data vs a projection, and at least some of the variable names are clear (i.e. O2).

What does smbb mean vs cmip6 in this context?

> The CMIP6 version which includes multi models, multi experiments should suffice

However I do also like this, because it gives a motivation for cross-node operations (such as comparing the results of two models).
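
As a rough illustration of what a cross-node operation means here, the sketch below uses plain Python rather than datatree: a dictionary keyed by node path stands in for the tree, and short lists stand in for the `pr` fields. The numbers and the helper `compare_nodes` are hypothetical, not taken from the datasets. The two models sit on different grids (128 x 256 versus 192 x 288 in the output above), so the comparison is made on a per-node summary, the mean, rather than element by element.

```python
from statistics import mean

# Toy stand-in for a tree: node path -> {variable name: values}.
# The numbers are invented for illustration only; note the differing lengths,
# mimicking nodes that live on different grids.
toy_tree = {
    "ScenarioMIP/MIROC/MIROC6/ssp370/Amon/gn": {"pr": [1.0, 2.0, 3.0]},
    "ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Amon/gn": {"pr": [2.0, 4.0, 6.0, 8.0]},
}


def compare_nodes(tree, path_a, path_b, variable):
    """Hypothetical helper: difference of the per-node means of one variable."""
    return mean(tree[path_a][variable]) - mean(tree[path_b][variable])


diff = compare_nodes(
    toy_tree,
    "ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Amon/gn",
    "ScenarioMIP/MIROC/MIROC6/ssp370/Amon/gn",
    "pr",
)
print(diff)
```

With a real tree the same idea would presumably be expressed by selecting the two nodes and reducing each dataset before subtracting.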

> that's just CESM naming convention which isn't CF-compliant. we can exclude this dataset...

Being CF-compliant isn't really the problem, it's that we want names that actually mean something to datatree users who are from unrelated fields of science. In fact we want to ensure that none of the documentation examples rely on cf-xarray for interpreting anything.

Thank you for sharing the notebook you used to create the data. I think instead of a back-and-forth the easiest way to proceed might be for me to mess with what you've already given me (which is great - I wouldn't even have known where to look!), then I'll put it in xarray-data. At that point we can merge this PR but just point it to that data. How does that sound?

@andersy005
Member Author

andersy005 commented Aug 9, 2022

> Being CF-compliant isn't really the problem, it's that we want names that actually mean something to datatree users who are from unrelated fields of science. In fact we want to ensure that none of the documentation examples rely on cf-xarray for interpreting anything.

I concur that understanding what some of these characteristics mean would require being familiar with the sample dataset in question. However, in my opinion, in addition to domain-agnostic datasets, domain-specific datasets are valuable because:

  • some of the resulting datatree hierarchy is influenced by vocabulary and other domain-specific characteristics
  • it might make it easier for folks to map their use cases to the datatree model.
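
To illustrate the first bullet: the `intake_esm_dataset_key` attributes in the output above are slash-separated paths that mirror each node's position in the tree, so the catalog vocabulary (activity, institution, model, experiment, table, grid) directly becomes the hierarchy. A minimal stdlib-only sketch, assuming nothing about how the tree was actually built; the helpers `nest_paths` and `render` are hypothetical and only the keys are copied from the output:

```python
# Keys copied from the `intake_esm_dataset_key` attributes in the output above.
keys = [
    "CMIP/NCAR/CESM2-WACCM/historical/Omon/gn",
    "ScenarioMIP/MIROC/MIROC6/ssp370/Amon/gn",
    "ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gr",
    "ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gn",
]


def nest_paths(paths):
    """Hypothetical helper: turn slash-separated keys into nested dicts."""
    root = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            # Reuse an existing child or create an empty one, then descend.
            node = node.setdefault(part, {})
    return root


def render(tree, indent=0):
    """Return one indented line per node, depth-first."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * indent + name)
        lines.extend(render(children, indent + 1))
    return lines


tree = nest_paths(keys)
print("\n".join(render(tree)))
```

In datatree itself, a dictionary mapping such paths to `xarray.Dataset` objects can, as far as I know, be passed to `DataTree.from_dict` to get the corresponding tree.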

Perhaps the more the merrier? The documentation doesn't have to use all these sample datasets (having an archive of different/diverse datasets could come in handy).

> What does smbb mean vs cmip6 in this context?

these are the forcing variants used in the CESM Large Ensemble simulations (e.g. smbb: Smoothed Biomass Burning). there's more explanation here: https://ncar.github.io/cesm2-le-aws/model_documentation.html

> Thank you for sharing the notebook you used to create the data. I think instead of a back-and-forth the easiest way to proceed might be for me to mess with what you've already given me (which is great - I wouldn't even have known where to look!), then I'll put it in xarray-data. At that point we can merge this PR but just point it to that data.

You bet. This sounds good to me. Ping me if you need my input

@TomNicholas
Member

> Perhaps the more the merrier? The documentation doesn't have to use all these sample datasets (having an archive of different/diverse datasets could come in handy).

Oh yes definitely! That's a good point - we could just merge these two datasets into xarray-data and have them as options for tutorial.open_datatree even if I later use a simplified version for some documentation examples.

@TomNicholas TomNicholas mentioned this pull request Jan 5, 2023
14 tasks
@eni-awowale eni-awowale linked an issue Sep 7, 2024 that may be closed by this pull request
6 tasks
@TomNicholas
Member

Closing in favour of linked upstream issue

@TomNicholas TomNicholas closed this Oct 8, 2024
Development

Successfully merging this pull request may close these issues.

Example datatree for use in tutorial documentation
2 participants