Add tutorial module for sample datasets #142
This looks great @andersy005 ! Code-wise I see no issues, and would be happy to merge. The only thing I might want to change is the data itself: can we simplify it slightly? The raw data has obscure variable names ( |
That's just the CESM naming convention, which isn't CF-compliant. We can exclude this dataset... The CMIP6 version, which includes multiple models and multiple experiments, should suffice.
I was trying to maintain the dimensionality of the original dataset, but I can easily get rid of those.
I'm going to trim down the CMIP6 sample and add it to the pydata/xarray-data repository.
@TomNicholas, here's what the CMIP sample looks like:
CMIP6 Sample
DataTree('None', parent=None)
├── DataTree('CMIP')
│ ├── DataTree('CCCma')
│ │ └── DataTree('CanESM5')
│ │ └── DataTree('historical')
│ │ ├── DataTree('Amon')
│ │ │ └── DataTree('gn')
│ │ │ Dimensions: (lat: 64, bnds: 2, lon: 128, time: 6)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│ │ │ lat_bnds (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│ │ │ * lon (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│ │ │ lon_bnds (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│ │ │ * time (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│ │ │ time_bnds (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: bnds
│ │ │ Data variables:
│ │ │ pr (time, lat, lon) float32 7.221e-07 8.962e-07 ... 1.108e-05
│ │ │ Attributes: (12/57)
│ │ │ CCCma_model_hash: 3dedf95315d603326fde4f5340dc0519d80d10c0
│ │ │ CCCma_parent_runid: rc3-pictrl
│ │ │ CCCma_pycmor_hash: 33c30511acc319a98240633965a04ca99c26427e
│ │ │ CCCma_runid: rc3.1-his01
│ │ │ Conventions: CF-1.7 CMIP-6.2
│ │ │ YMDH_branch_time_in_child: 1850:01:01:00
│ │ │ ... ...
│ │ │ variant_label: r1i1p1f1
│ │ │ version: v20190429
│ │ │ status: 2019-10-25;created;by nhn2@columbia.edu
│ │ │ netcdf_tracking_ids: hdl:21.14100/363e1ebe-46e7-43dc-9feb-a7a4a0c...
│ │ │ version_id: v20190429
│ │ │ intake_esm_dataset_key: CMIP/CCCma/CanESM5/historical/Amon/gn
│ │ ├── DataTree('Lmon')
│ │ │ └── DataTree('gn')
│ │ │ Dimensions: (time: 6, lat: 64, lon: 128, bnds: 2)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│ │ │ lat_bnds (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│ │ │ * lon (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│ │ │ lon_bnds (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│ │ │ * time (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:00:00
│ │ │ time_bnds (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: bnds
│ │ │ Data variables:
│ │ │ gpp (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ │ │ mrso (time, lat, lon) float32 3.76e+03 3.76e+03 3.76e+03 ... 0.0 0.0
│ │ │ Attributes: (12/53)
│ │ │ variant_label: r1i1p1f1
│ │ │ mip_era: CMIP6
│ │ │ license: CMIP6 model data produced by The Government ...
│ │ │ contact: ec.cccma.info-info.ccmac.ec@canada.ca
│ │ │ parent_variant_label: r1i1p1f1
│ │ │ source_type: AOGCM
│ │ │ ... ...
│ │ │ realm: land
│ │ │ branch_time_in_child: 0.0
│ │ │ source: CanESM5 (2019): \naerosol: interactive\natmo...
│ │ │ initialization_index: 1
│ │ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
│ │ │ intake_esm_dataset_key: CMIP/CCCma/CanESM5/historical/Lmon/gn
│ │ └── DataTree('Omon')
│ │ └── DataTree('gn')
│ │ Dimensions: (i: 360, j: 291, bnds: 2, time: 6, vertices: 4)
│ │ Coordinates:
│ │ * i (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
│ │ * j (j) int32 0 1 2 3 4 5 6 ... 284 285 286 287 288 289 290
│ │ latitude (j, i) float64 -78.39 -78.39 -78.39 ... 50.23 50.01
│ │ lev float64 3.047
│ │ lev_bnds (bnds) float64 0.0 6.194
│ │ longitude (j, i) float64 73.5 74.5 75.5 76.5 ... 72.95 72.96 72.99
│ │ * time (time) object 1850-01-16 12:00:00 ... 1850-06-16 00:0...
│ │ time_bnds (time, bnds) object 1850-01-01 00:00:00 ... 1850-07-0...
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: bnds, vertices
│ │ Data variables:
│ │ no3 (time, j, i) float32 nan nan nan nan ... nan nan nan nan
│ │ vertices_latitude (j, i, vertices) float64 -78.29 -78.49 ... 50.11 50.11
│ │ vertices_longitude (j, i, vertices) float64 74.0 74.0 73.0 ... 72.95 73.0
│ │ thetao (time, j, i) float32 nan nan nan nan ... nan nan nan nan
│ │ Attributes: (12/52)
│ │ variant_label: r1i1p1f1
│ │ mip_era: CMIP6
│ │ license: CMIP6 model data produced by The Government ...
│ │ contact: ec.cccma.info-info.ccmac.ec@canada.ca
│ │ parent_variant_label: r1i1p1f1
│ │ source_type: AOGCM
│ │ ... ...
│ │ physics_index: 1
│ │ branch_time_in_child: 0.0
│ │ source: CanESM5 (2019): \naerosol: interactive\natmo...
│ │ initialization_index: 1
│ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
│ │ intake_esm_dataset_key: CMIP/CCCma/CanESM5/historical/Omon/gn
│ ├── DataTree('MIROC')
│ │ └── DataTree('MIROC6')
│ │ └── DataTree('historical')
│ │ ├── DataTree('Lmon')
│ │ │ └── DataTree('gn')
│ │ │ Dimensions: (lat: 128, bnds: 2, lon: 256, time: 6)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│ │ │ lat_bnds (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│ │ │ * lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│ │ │ lon_bnds (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│ │ │ * time (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│ │ │ time_bnds (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 1850-07-01
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: bnds
│ │ │ Data variables:
│ │ │ mrso (time, lat, lon) float32 4.2e+03 4.2e+03 4.2e+03 ... nan nan nan
│ │ │ Attributes: (12/48)
│ │ │ Conventions: CF-1.7 CMIP-6.2
│ │ │ activity_id: CMIP
│ │ │ branch_method: standard
│ │ │ branch_time_in_child: 0.0
│ │ │ branch_time_in_parent: 0.0
│ │ │ cmor_version: 3.3.2
│ │ │ ... ...
│ │ │ variable_id: mrso
│ │ │ variant_label: r1i1p1f1
│ │ │ status: 2019-10-25;created;by nhn2@columbia.edu
│ │ │ netcdf_tracking_ids: hdl:21.14100/a702781b-b6d9-4f90-a65d-c649d59a224...
│ │ │ version_id: v20190311
│ │ │ intake_esm_dataset_key: CMIP/MIROC/MIROC6/historical/Lmon/gn
│ │ ├── DataTree('Amon')
│ │ │ └── DataTree('gn')
│ │ │ Dimensions: (lat: 128, bnds: 2, lon: 256, time: 6)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│ │ │ lat_bnds (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│ │ │ * lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│ │ │ lon_bnds (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│ │ │ * time (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│ │ │ time_bnds (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 1850-07-01
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: bnds
│ │ │ Data variables:
│ │ │ pr (time, lat, lon) float32 2.144e-06 2.169e-06 ... 8.586e-06
│ │ │ Attributes: (12/48)
│ │ │ Conventions: CF-1.7 CMIP-6.2
│ │ │ activity_id: CMIP
│ │ │ branch_method: standard
│ │ │ branch_time_in_child: 0.0
│ │ │ branch_time_in_parent: 0.0
│ │ │ cmor_version: 3.3.2
│ │ │ ... ...
│ │ │ variable_id: pr
│ │ │ variant_label: r1i1p1f1
│ │ │ status: 2019-10-25;created;by nhn2@columbia.edu
│ │ │ netcdf_tracking_ids: hdl:21.14100/61fa8b6b-e74c-4e86-9344-8ba946ee8a8...
│ │ │ version_id: v20181212
│ │ │ intake_esm_dataset_key: CMIP/MIROC/MIROC6/historical/Amon/gn
│ │ └── DataTree('Omon')
│ │ └── DataTree('gn')
│ │ Dimensions: (y: 256, x: 360, time: 6, bnds: 2, vertices: 4)
│ │ Coordinates: (12/13)
│ │ latitude (y, x) float32 -88.0 -88.0 -88.0 ... 64.43 64.0 63.56
│ │ lev float64 1.0
│ │ lev_bnds (bnds) float64 0.0 2.0
│ │ longitude (y, x) float32 60.5 61.5 62.5 63.5 ... 59.96 59.98 59.99
│ │ sigma_bnds (bnds) float64 -0.0 -0.04
│ │ * time (time) datetime64[ns] 1850-01-16T12:00:00 ... 1850-06-16
│ │ ... ...
│ │ * x (x) float64 0.5 1.5 2.5 3.5 ... 356.5 357.5 358.5 359.5
│ │ x_bnds (x, bnds) float64 0.0 1.0 1.0 2.0 ... 359.0 359.0 360.0
│ │ * y (y) float64 -88.0 -85.75 -85.25 ... 148.6 150.5 152.4
│ │ y_bnds (y, bnds) float64 -90.0 -86.0 -86.0 ... 151.5 153.3
│ │ zlev_bnds (bnds) float64 -0.0 -2.0
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: bnds, vertices
│ │ Data variables:
│ │ depth (y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ │ depth_c float64 50.0
│ │ eta (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ │ nsigma int32 10
│ │ sigma float64 -0.02
│ │ thetao (time, y, x) float32 nan nan nan nan ... nan nan nan nan
│ │ vertices_latitude (y, x, vertices) float32 -90.0 -90.0 ... 63.33 63.78
│ │ vertices_longitude (y, x, vertices) float32 60.0 61.0 61.0 ... 60.0 60.0
│ │ zlev float64 -1.0
│ │ Attributes: (12/48)
│ │ Conventions: CF-1.7 CMIP-6.2
│ │ activity_id: CMIP
│ │ branch_method: standard
│ │ branch_time_in_child: 0.0
│ │ branch_time_in_parent: 0.0
│ │ cmor_version: 3.3.2
│ │ ... ...
│ │ variable_id: thetao
│ │ variant_label: r1i1p1f1
│ │ status: 2019-11-08;created;by nhn2@columbia.edu
│ │ netcdf_tracking_ids: hdl:21.14100/16598b35-19b4-49e3-98de-27b9e9444ad...
│ │ version_id: v20190311
│ │ intake_esm_dataset_key: CMIP/MIROC/MIROC6/historical/Omon/gn
│ └── DataTree('NCAR')
│ └── DataTree('CESM2-WACCM')
│ └── DataTree('historical')
│ ├── DataTree('Omon')
│ │ ├── DataTree('gr')
│ │ │ Dimensions: (lat: 180, d2: 2, lon: 360, time: 6)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
│ │ │ lat_bnds (lat, d2) float64 -90.0 -89.0 -89.0 -88.0 ... 88.0 89.0 89.0 90.0
│ │ │ lev float64 0.0
│ │ │ lev_bnds (d2) float64 0.0 5.0
│ │ │ * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
│ │ │ lon_bnds (lon, d2) float64 0.0 1.0 1.0 2.0 2.0 ... 358.0 359.0 359.0 360.0
│ │ │ * time (time) object 1850-01-15 12:59:59.999997 ... 1850-06-15 00:00:00
│ │ │ time_bnds (time, d2) object 1850-01-01 02:00:00.000003 ... 1850-07-01 00...
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: d2
│ │ │ Data variables:
│ │ │ no3 (time, lat, lon) float32 nan nan nan ... 0.006828 0.006827
│ │ │ thetao (time, lat, lon) float32 nan nan nan nan ... -1.763 -1.763 -1.762
│ │ │ Attributes: (12/45)
│ │ │ variant_label: r1i1p1f1
│ │ │ mip_era: CMIP6
│ │ │ license: CMIP6 model data produced by <The National Cente...
│ │ │ contact: cesm_cmip6@ucar.edu
│ │ │ parent_variant_label: r1i1p1f1
│ │ │ source_type: AOGCM BGC CHEM AER
│ │ │ ... ...
│ │ │ case_id: 4
│ │ │ branch_time_in_child: 674885.0
│ │ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ │ initialization_index: 1
│ │ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ │ intake_esm_dataset_key: CMIP/NCAR/CESM2-WACCM/historical/Omon/gr
│ │ └── DataTree('gn')
│ │ Dimensions: (nlat: 384, nlon: 320, vertices: 4, d2: 2, time: 6)
│ │ Coordinates:
│ │ lat (nlat, nlon) float64 -79.22 -79.22 -79.22 ... 72.2 72.19 72.19
│ │ lat_bnds (nlat, nlon, vertices) float32 -79.49 -79.49 ... 72.41 72.41
│ │ lev float64 500.0
│ │ lev_bnds (d2) float32 0.0 10.0
│ │ lon (nlat, nlon) float64 320.6 321.7 322.8 ... 318.9 319.4 319.8
│ │ lon_bnds (nlat, nlon, vertices) float32 320.0 321.1 321.1 ... 320.0 319.6
│ │ * nlat (nlat) int32 1 2 3 4 5 6 7 8 ... 377 378 379 380 381 382 383 384
│ │ * nlon (nlon) int32 1 2 3 4 5 6 7 8 ... 313 314 315 316 317 318 319 320
│ │ * time (time) object 1850-01-15 13:00:00 ... 1850-06-15 00:00:00
│ │ time_bnds (time, d2) object 1850-01-01 02:00:00.000003 ... 1850-07-01 00...
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: vertices, d2
│ │ Data variables:
│ │ no3 (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│ │ thetao (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│ │ Attributes: (12/45)
│ │ variant_label: r1i1p1f1
│ │ mip_era: CMIP6
│ │ license: CMIP6 model data produced by <The National Cente...
│ │ contact: cesm_cmip6@ucar.edu
│ │ parent_variant_label: r1i1p1f1
│ │ source_type: AOGCM BGC CHEM AER
│ │ ... ...
│ │ case_id: 4
│ │ branch_time_in_child: 674885.0
│ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ initialization_index: 1
│ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ intake_esm_dataset_key: CMIP/NCAR/CESM2-WACCM/historical/Omon/gn
│ ├── DataTree('Amon')
│ │ └── DataTree('gn')
│ │ Dimensions: (time: 6, lat: 192, lon: 288, nbnd: 2)
│ │ Coordinates:
│ │ * lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│ │ lat_bnds (lat, nbnd) float64 -90.0 -89.53 -89.53 ... 89.53 89.53 90.0
│ │ * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│ │ lon_bnds (lon, nbnd) float64 -0.625 0.625 0.625 ... 358.1 358.1 359.4
│ │ plev float64 1e+05
│ │ * time (time) object 1850-01-15 12:00:00 ... 1850-06-15 00:00:00
│ │ time_bnds (time, nbnd) object 1850-01-01 00:00:00 ... 1850-07-01 00:00:00
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: nbnd
│ │ Data variables:
│ │ co2 (time, lat, lon) float32 nan nan nan ... 0.0002868 0.0002868
│ │ pr (time, lat, lon) float32 2.706e-06 2.706e-06 ... 4.324e-06
│ │ Attributes: (12/46)
│ │ variant_label: r1i1p1f1
│ │ mip_era: CMIP6
│ │ license: CMIP6 model data produced by <The National Cente...
│ │ contact: cesm_cmip6@ucar.edu
│ │ parent_variant_label: r1i1p1f1
│ │ source_type: AOGCM BGC CHEM AER
│ │ ... ...
│ │ case_id: 4
│ │ branch_time_in_child: 674885.0
│ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ initialization_index: 1
│ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ intake_esm_dataset_key: CMIP/NCAR/CESM2-WACCM/historical/Amon/gn
│ └── DataTree('Lmon')
│ └── DataTree('gn')
│ Dimensions: (time: 6, lat: 192, lon: 288, hist_interval: 2)
│ Coordinates:
│ * lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│ lat_bnds (lat, hist_interval) float32 -90.0 -89.53 -89.53 ... 89.53 90.0
│ * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│ lon_bnds (lon, hist_interval) float32 -0.625 0.625 0.625 ... 358.1 359.4
│ * time (time) object 1850-01-15 11:45:00.000013 ... 1850-06-15 00:00:00
│ time_bnds (time, hist_interval) object 1849-12-31 23:29:59.999987 ... 18...
│ member_id <U8 'r1i1p1f1'
│ Dimensions without coordinates: hist_interval
│ Data variables:
│ gpp (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... nan nan nan nan
│ mrso (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
│ Attributes: (12/46)
│ variant_label: r1i1p1f1
│ mip_era: CMIP6
│ license: CMIP6 model data produced by <The National Cente...
│ contact: cesm_cmip6@ucar.edu
│ parent_variant_label: r1i1p1f1
│ source_type: AOGCM BGC CHEM AER
│ ... ...
│ case_id: 4
│ branch_time_in_child: 674885.0
│ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ initialization_index: 1
│ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ intake_esm_dataset_key: CMIP/NCAR/CESM2-WACCM/historical/Lmon/gn
└── DataTree('ScenarioMIP')
├── DataTree('MIROC')
│ └── DataTree('MIROC6')
│ └── DataTree('ssp370')
│ ├── DataTree('Omon')
│ │ └── DataTree('gn')
│ │ Dimensions: (y: 256, x: 360, time: 6, bnds: 2, vertices: 4)
│ │ Coordinates: (12/13)
│ │ latitude (y, x) float32 -88.0 -88.0 -88.0 ... 64.43 64.0 63.56
│ │ lev float64 1.0
│ │ lev_bnds (bnds) float64 0.0 2.0
│ │ longitude (y, x) float32 60.5 61.5 62.5 63.5 ... 59.96 59.98 59.99
│ │ sigma_bnds (bnds) float64 -0.0 -0.04
│ │ * time (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
│ │ ... ...
│ │ * x (x) float64 0.5 1.5 2.5 3.5 ... 356.5 357.5 358.5 359.5
│ │ x_bnds (x, bnds) float64 0.0 1.0 1.0 2.0 ... 359.0 359.0 360.0
│ │ * y (y) float64 -88.0 -85.75 -85.25 ... 148.6 150.5 152.4
│ │ y_bnds (y, bnds) float64 -90.0 -86.0 -86.0 ... 151.5 153.3
│ │ zlev_bnds (bnds) float64 -0.0 -2.0
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: bnds, vertices
│ │ Data variables:
│ │ depth (y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ │ depth_c float64 50.0
│ │ eta (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ │ nsigma int32 10
│ │ sigma float64 -0.02
│ │ thetao (time, y, x) float32 nan nan nan nan ... nan nan nan nan
│ │ vertices_latitude (y, x, vertices) float32 -90.0 -90.0 ... 63.33 63.78
│ │ vertices_longitude (y, x, vertices) float32 60.0 61.0 61.0 ... 60.0 60.0
│ │ zlev float64 -1.0
│ │ Attributes: (12/48)
│ │ Conventions: CF-1.7 CMIP-6.2
│ │ activity_id: ScenarioMIP AerChemMIP
│ │ branch_method: standard
│ │ branch_time_in_child: 60265.0
│ │ branch_time_in_parent: 60265.0
│ │ cmor_version: 3.4.0
│ │ ... ...
│ │ variable_id: thetao
│ │ variant_label: r1i1p1f1
│ │ status: 2019-11-18;created;by nhn2@columbia.edu
│ │ netcdf_tracking_ids: hdl:21.14100/99dda520-c9e9-4617-b4ca-0de0a2b9398...
│ │ version_id: v20190627
│ │ intake_esm_dataset_key: ScenarioMIP/MIROC/MIROC6/ssp370/Omon/gn
│ ├── DataTree('Amon')
│ │ └── DataTree('gn')
│ │ Dimensions: (lat: 128, bnds: 2, lon: 256, time: 6)
│ │ Coordinates:
│ │ * lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│ │ lat_bnds (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│ │ * lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│ │ lon_bnds (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│ │ * time (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
│ │ time_bnds (time, bnds) datetime64[ns] 2015-01-01 2015-02-01 ... 2015-07-01
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: bnds
│ │ Data variables:
│ │ pr (time, lat, lon) float32 1.137e-06 1.131e-06 ... 7.446e-06
│ │ Attributes: (12/48)
│ │ Conventions: CF-1.7 CMIP-6.2
│ │ activity_id: ScenarioMIP AerChemMIP
│ │ branch_method: standard
│ │ branch_time_in_child: 60265.0
│ │ branch_time_in_parent: 60265.0
│ │ cmor_version: 3.4.0
│ │ ... ...
│ │ variable_id: pr
│ │ variant_label: r1i1p1f1
│ │ status: 2019-10-25;created;by nhn2@columbia.edu
│ │ netcdf_tracking_ids: hdl:21.14100/c23c415d-adca-4e01-8e7c-11617bcfa2bb
│ │ version_id: v20190627
│ │ intake_esm_dataset_key: ScenarioMIP/MIROC/MIROC6/ssp370/Amon/gn
│ └── DataTree('Lmon')
│ └── DataTree('gn')
│ Dimensions: (lat: 128, bnds: 2, lon: 256, time: 6)
│ Coordinates:
│ * lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
│ lat_bnds (lat, bnds) float64 -90.0 -88.28 -88.28 ... 88.28 88.28 90.0
│ * lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
│ lon_bnds (lon, bnds) float64 -0.7031 0.7031 0.7031 ... 357.9 357.9 359.3
│ * time (time) datetime64[ns] 2015-01-16T12:00:00 ... 2015-06-16
│ time_bnds (time, bnds) datetime64[ns] 2015-01-01 2015-02-01 ... 2015-07-01
│ member_id <U8 'r1i1p1f1'
│ Dimensions without coordinates: bnds
│ Data variables:
│ mrso (time, lat, lon) float32 4.2e+03 4.2e+03 4.2e+03 ... nan nan nan
│ Attributes: (12/48)
│ Conventions: CF-1.7 CMIP-6.2
│ activity_id: ScenarioMIP AerChemMIP
│ branch_method: standard
│ branch_time_in_child: 60265.0
│ branch_time_in_parent: 60265.0
│ cmor_version: 3.4.0
│ ... ...
│ variable_id: mrso
│ variant_label: r1i1p1f1
│ status: 2019-10-29;created;by nhn2@columbia.edu
│ netcdf_tracking_ids: hdl:21.14100/3ba01dc3-ab7e-45d0-882a-66ed2768a642
│ version_id: v20190627
│ intake_esm_dataset_key: ScenarioMIP/MIROC/MIROC6/ssp370/Lmon/gn
├── DataTree('NCAR')
│ └── DataTree('CESM2-WACCM')
│ └── DataTree('ssp370')
│ ├── DataTree('Amon')
│ │ └── DataTree('gn')
│ │ Dimensions: (time: 6, lat: 192, lon: 288, nbnd: 2)
│ │ Coordinates:
│ │ * lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│ │ lat_bnds (lat, nbnd) float64 -90.0 -89.53 -89.53 ... 89.53 89.53 90.0
│ │ * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│ │ lon_bnds (lon, nbnd) float64 -0.625 0.625 0.625 ... 358.1 358.1 359.4
│ │ plev float64 1e+05
│ │ * time (time) object 2015-01-15 12:00:00 ... 2015-06-15 00:00:00
│ │ time_bnds (time, nbnd) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: nbnd
│ │ Data variables:
│ │ co2 (time, lat, lon) float32 nan nan nan ... 0.0004034 0.0004034
│ │ pr (time, lat, lon) float32 1.919e-06 1.919e-06 ... 1.043e-05
│ │ Attributes: (12/45)
│ │ variant_label: r1i1p1f1
│ │ mip_era: CMIP6
│ │ license: CMIP6 model data produced by <The National Cente...
│ │ contact: cesm_cmip6@ucar.edu
│ │ parent_variant_label: r1i1p1f1
│ │ source_type: AOGCM BGC CHEM AER
│ │ ... ...
│ │ case_id: 969
│ │ branch_time_in_child: 735110.0
│ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ initialization_index: 1
│ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ intake_esm_dataset_key: ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Amon/gn
│ ├── DataTree('Omon')
│ │ ├── DataTree('gr')
│ │ │ Dimensions: (lat: 180, d2: 2, lon: 360, time: 6)
│ │ │ Coordinates:
│ │ │ * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5
│ │ │ lat_bnds (lat, d2) float64 -90.0 -89.0 -89.0 -88.0 ... 88.0 89.0 89.0 90.0
│ │ │ lev float64 0.0
│ │ │ lev_bnds (d2) float64 0.0 5.0
│ │ │ * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
│ │ │ lon_bnds (lon, d2) float64 0.0 1.0 1.0 2.0 2.0 ... 358.0 359.0 359.0 360.0
│ │ │ * time (time) object 2015-01-15 13:00:00.000007 ... 2015-06-15 00:00:00
│ │ │ time_bnds (time, d2) object 2015-01-01 02:00:00.000003 ... 2015-07-01 00...
│ │ │ member_id <U8 'r1i1p1f1'
│ │ │ Dimensions without coordinates: d2
│ │ │ Data variables:
│ │ │ no3 (time, lat, lon) float32 nan nan nan ... 0.004002 0.004001
│ │ │ thetao (time, lat, lon) float32 nan nan nan nan ... -1.68 -1.68 -1.68
│ │ │ Attributes: (12/44)
│ │ │ variant_label: r1i1p1f1
│ │ │ mip_era: CMIP6
│ │ │ license: CMIP6 model data produced by <The National Cente...
│ │ │ contact: cesm_cmip6@ucar.edu
│ │ │ parent_variant_label: r1i1p1f1
│ │ │ source_type: AOGCM BGC CHEM AER
│ │ │ ... ...
│ │ │ case_id: 969
│ │ │ branch_time_in_child: 735110.0
│ │ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ │ initialization_index: 1
│ │ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ │ intake_esm_dataset_key: ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gr
│ │ └── DataTree('gn')
│ │ Dimensions: (nlat: 384, nlon: 320, vertices: 4, d2: 2, time: 6)
│ │ Coordinates:
│ │ lat (nlat, nlon) float64 -79.22 -79.22 -79.22 ... 72.2 72.19 72.19
│ │ lat_bnds (nlat, nlon, vertices) float32 -79.49 -79.49 ... 72.41 72.41
│ │ lev float64 500.0
│ │ lev_bnds (d2) float32 0.0 10.0
│ │ lon (nlat, nlon) float64 320.6 321.7 322.8 ... 318.9 319.4 319.8
│ │ lon_bnds (nlat, nlon, vertices) float32 320.0 321.1 321.1 ... 320.0 319.6
│ │ * nlat (nlat) int32 1 2 3 4 5 6 7 8 ... 377 378 379 380 381 382 383 384
│ │ * nlon (nlon) int32 1 2 3 4 5 6 7 8 ... 313 314 315 316 317 318 319 320
│ │ * time (time) object 2015-01-15 13:00:00.000007 ... 2015-06-15 00:00:00
│ │ time_bnds (time, d2) object 2015-01-01 02:00:00.000003 ... 2015-07-01 00...
│ │ member_id <U8 'r1i1p1f1'
│ │ Dimensions without coordinates: vertices, d2
│ │ Data variables:
│ │ no3 (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│ │ thetao (time, nlat, nlon) float32 nan nan nan nan ... nan nan nan nan
│ │ Attributes: (12/44)
│ │ variant_label: r1i1p1f1
│ │ mip_era: CMIP6
│ │ license: CMIP6 model data produced by <The National Cente...
│ │ contact: cesm_cmip6@ucar.edu
│ │ parent_variant_label: r1i1p1f1
│ │ source_type: AOGCM BGC CHEM AER
│ │ ... ...
│ │ case_id: 969
│ │ branch_time_in_child: 735110.0
│ │ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ │ initialization_index: 1
│ │ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ │ intake_esm_dataset_key: ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Omon/gn
│ └── DataTree('Lmon')
│ └── DataTree('gn')
│ Dimensions: (lat: 192, lon: 288, time: 6, hist_interval: 2)
│ Coordinates:
│ * lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
│ * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
│ * time (time) object 2015-01-15 11:45:00 ... 2015-05-15 12:00:00
│ member_id <U8 'r1i1p1f1'
│ lat_bnds (lat, hist_interval) float32 -90.0 -89.53 -89.53 ... 89.53 90.0
│ lon_bnds (lon, hist_interval) float32 -0.625 0.625 0.625 ... 358.1 359.4
│ time_bnds (time, hist_interval) object 2014-12-31 23:29:59.999997 ... 20...
│ Dimensions without coordinates: hist_interval
│ Data variables:
│ gpp (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
│ mrso (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
│ Attributes: (12/45)
│ variant_label: r1i1p1f1
│ mip_era: CMIP6
│ license: CMIP6 model data produced by <The National Cente...
│ contact: cesm_cmip6@ucar.edu
│ parent_variant_label: r1i1p1f1
│ source_type: AOGCM BGC CHEM AER
│ ... ...
│ case_id: 969
│ branch_time_in_child: 735110.0
│ source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite ...
│ initialization_index: 1
│ further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2-...
│ intake_esm_dataset_key: ScenarioMIP/NCAR/CESM2-WACCM/ssp370/Lmon/gn
└── DataTree('CCCma')
└── DataTree('CanESM5')
└── DataTree('ssp370')
├── DataTree('Amon')
│ └── DataTree('gn')
│ Dimensions: (lat: 64, bnds: 2, lon: 128, time: 6)
│ Coordinates:
│ * lat (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│ lat_bnds (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│ * lon (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│ lon_bnds (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│ * time (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
│ time_bnds (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
│ member_id <U8 'r1i1p1f1'
│ Dimensions without coordinates: bnds
│ Data variables:
│ pr (time, lat, lon) float32 2.504e-06 2.678e-06 ... 6.46e-06
│ Attributes: (12/57)
│ CCCma_model_hash: 1f91f92cb6d607391f44831504025d32fc44faa1
│ CCCma_parent_runid: rc3.1-his01
│ CCCma_pycmor_hash: 33c30511acc319a98240633965a04ca99c26427e
│ CCCma_runid: rc3.1-s7001
│ Conventions: CF-1.7 CMIP-6.2
│ YMDH_branch_time_in_child: 2015:01:01:00
│ ... ...
│ tracking_id: hdl:21.14100/8c4a1496-f308-493e-8ecc-a2e253e...
│ variable_id: pr
│ variant_label: r1i1p1f1
│ version: v20190429
│ version_id: v20190429
│ intake_esm_dataset_key: ScenarioMIP/CCCma/CanESM5/ssp370/Amon/gn
├── DataTree('Lmon')
│ └── DataTree('gn')
│ Dimensions: (time: 6, lat: 64, lon: 128, bnds: 2)
│ Coordinates:
│ * lat (lat) float64 -87.86 -85.1 -82.31 -79.53 ... 82.31 85.1 87.86
│ lat_bnds (lat, bnds) float64 -90.0 -86.58 -86.58 ... 86.58 86.58 90.0
│ * lon (lon) float64 0.0 2.812 5.625 8.438 ... 348.8 351.6 354.4 357.2
│ lon_bnds (lon, bnds) float64 -1.406 1.406 1.406 ... 355.8 355.8 358.6
│ * time (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:00:00
│ time_bnds (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-01 00:00:00
│ member_id <U8 'r1i1p1f1'
│ Dimensions without coordinates: bnds
│ Data variables:
│ gpp (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
│ mrso (time, lat, lon) float32 3.76e+03 3.76e+03 3.76e+03 ... 0.0 0.0
│ Attributes: (12/53)
│ variant_label: r1i1p1f1
│ mip_era: CMIP6
│ license: CMIP6 model data produced by The Government ...
│ contact: ec.cccma.info-info.ccmac.ec@canada.ca
│ parent_variant_label: r1i1p1f1
│ source_type: AOGCM
│ ... ...
│ realm: land
│ branch_time_in_child: 60225.0
│ source: CanESM5 (2019): \naerosol: interactive\natmo...
│ initialization_index: 1
│ further_info_url: https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
│ intake_esm_dataset_key: ScenarioMIP/CCCma/CanESM5/ssp370/Lmon/gn
└── DataTree('Omon')
└── DataTree('gn')
Dimensions: (i: 360, j: 291, bnds: 2, time: 6, vertices: 4)
Coordinates:
* i (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
* j (j) int32 0 1 2 3 4 5 6 ... 284 285 286 287 288 289 290
latitude (j, i) float64 -78.39 -78.39 -78.39 ... 50.23 50.01
lev float64 3.047
lev_bnds (bnds) float64 0.0 6.194
longitude (j, i) float64 73.5 74.5 75.5 76.5 ... 72.95 72.96 72.99
* time (time) object 2015-01-16 12:00:00 ... 2015-06-16 00:0...
time_bnds (time, bnds) object 2015-01-01 00:00:00 ... 2015-07-0...
member_id <U8 'r1i1p1f1'
Dimensions without coordinates: bnds, vertices
Data variables:
no3 (time, j, i) float32 nan nan nan nan ... nan nan nan nan
vertices_latitude (j, i, vertices) float64 -78.29 -78.49 ... 50.11 50.11
vertices_longitude (j, i, vertices) float64 74.0 74.0 73.0 ... 72.95 73.0
thetao (time, j, i) float32 nan nan nan nan ... nan nan nan nan
Attributes: (12/52)
variant_label: r1i1p1f1
mip_era: CMIP6
license: CMIP6 model data produced by The Government ...
contact: ec.cccma.info-info.ccmac.ec@canada.ca
parent_variant_label: r1i1p1f1
source_type: AOGCM
... ...
physics_index: 1
branch_time_in_child: 60225.0
source: CanESM5 (2019): \naerosol: interactive\natmo...
initialization_index: 1
further_info_url: https://furtherinfo.es-doc.org/CMIP6.CCCma.C...
intake_esm_dataset_key: ScenarioMIP/CCCma/CanESM5/ssp370/Omon/gn |
Thank you so much for doing this @andersy005 , but I think we might be on slightly different pages with what I'm looking for. 😅 What I ideally want is the simplest possible datatree that I can still do non-trivial operations on, but which still has some obvious physical interpretation that doesn't require extra thought for the person reading the documentation (who may not work in geoscience!). If you look at the existing <xarray.Dataset>
Dimensions: (lat: 25, time: 2920, lon: 53)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 ...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
It's obvious what
The first dataset you showed is perhaps closest to this - it has two distinct types of data that lie on different grids for a good reason (i.e. ocean and atmosphere data). It also has historical data vs a projection, and at least some of the variable names are clear (i.e. What does
However I do also like this, because it gives a motivation for cross-node operations (such as comparing the results of two models).
Being CF-compliant isn't really the problem, it's that we want names that actually mean something to datatree users who are from unrelated fields of science. In fact we want to ensure that none of the documentation examples rely on cf-xarray for interpreting anything. Thank you for sharing the notebook you used to create the data. I think instead of a back-and-forth the easiest way to proceed might be for me to mess with what you've already given me (which is great - I wouldn't even have known where to look!), then I'll put it in xarray-data. At that point we can merge this PR but just point it to that data. How does that sound? |
I concur that understanding what some of these characteristics mean would require being familiar with the sample dataset in question. However, in my opinion, in addition to domain-agnostic datasets, domain-specific datasets are valuable because
Perhaps the more the merrier? The documentation doesn't have to use all these sample datasets (having an archive of different/diverse datasets could come in handy).
these are the forcing variants used in the CESM Large Ensemble simulations (e.g.
You bet. This sounds good to me. Ping me if you need my input |
Oh yes definitely! That's a good point - we could just merge these two datasets into xarray-data and have them as options for |
Closing in favour of linked upstream issue |