problem writing GRIB output when initialized with ERA5 via CDS #21

Open
russ-schumacher opened this issue Sep 27, 2024 · 7 comments

@russ-schumacher

When running graphcast using CDS, using a command like this:

ai-models --input cds --date 20221220 --time 0000 --path "output-graphcast.grib" graphcast

the model runs successfully, but then fails to write the grib file on forecast hour 6. The issue appears to be with writing the precipitation field, throwing this error:

2024-09-27 16:33:26,419 INFO Converting output xarray to GRIB and saving
ECCODES ERROR : concept: no match for paramId=228
ECCODES ERROR : concept: input handle edition=2, centre=ecmf
ECCODES ERROR : concept: input handle dataset=era
ECCODES ERROR : Please check the Parameter Database 'https://codes.ecmwf.int/grib/param-db/?id=228'
2024-09-27 16:33:34,333 ERROR Error setting edition=2
2024-09-27 16:33:34,333 ERROR Concept no match

I used the debug option and wrote output.nc, which looks fine, so the issue simply appears to be writing out to grib.

When initializing using opendata, this issue does not occur and everything looks fine.

A possible hint at the cause is from wgrib2...for the opendata-initialized run that works, the grib record looks like this:
7:5174768:d=2024092600:var discipline=0 center=98 local_table=1 parmcat=1 parm=193:surface:0-0 day acc fcst:

while in the cds-initialized run that fails, the grib record looks like this in the hour zero file (nothing is written after that):
7:14536536:d=2022122000:TPRATE:surface:0-0 day acc fcst:

Not sure why a different precipitation parameter is being written depending on the initialization source, though. Thanks for any insights!
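
For anyone comparing the two inventories, here is a minimal sketch that splits wgrib2 short-inventory lines into fields so the variable column can be compared directly. It assumes the colon-separated layout seen in the two records above (record number, byte offset, date, variable, level, forecast time); `InventoryRecord` and `parse_inventory_line` are hypothetical helper names, not part of wgrib2 or ai-models.

```python
from typing import NamedTuple


class InventoryRecord(NamedTuple):
    """One line of a wgrib2 short inventory (layout assumed from the records above)."""
    number: int
    offset: int
    date: str
    variable: str
    level: str
    forecast: str


def parse_inventory_line(line: str) -> InventoryRecord:
    """Split a colon-separated inventory line into its first six fields."""
    parts = line.strip().split(":")
    if len(parts) < 6:
        raise ValueError(f"unexpected inventory line: {line!r}")
    number, offset, date, variable, level, forecast = parts[:6]
    # The date field is written as d=YYYYMMDDHH; drop the "d=" prefix if present.
    if date.startswith("d="):
        date = date[2:]
    return InventoryRecord(int(number), int(offset), date, variable, level, forecast)


# The two records quoted in this issue.
opendata = parse_inventory_line(
    "7:5174768:d=2024092600:var discipline=0 center=98 local_table=1 "
    "parmcat=1 parm=193:surface:0-0 day acc fcst:"
)
cds = parse_inventory_line("7:14536536:d=2022122000:TPRATE:surface:0-0 day acc fcst:")

print("opendata variable:", opendata.variable)
print("cds variable:     ", cds.variable)
print("same variable?    ", opendata.variable == cds.variable)
```

This only confirms that record 7 is labelled differently in the two files; it does not explain why the label depends on the input source.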

@jovanovski

jovanovski commented Oct 1, 2024

Having the same issue here!

@russ-schumacher which command did you use to generate the NetCDF output?

@Zappandy

Zappandy commented Oct 3, 2024

@russ-schumacher and @jovanovski I'm also having the same issue. On our end, we're downloading the era5 data using climetlab with initial conditions being generated every 5 days in 2023.

I've added the code we use to download the data to reproduce our issue.

#!/usr/bin/python3
try:
    from functools import lru_cache
except ImportError:
    from backports.functools_lru_cache import lru_cache
import climetlab as cml
from datetime import datetime, timedelta

# Create a list of dates in 2023 with steps of 5 days
start_date = datetime(2023, 1, 1)
end_date = datetime(2023, 12, 31)
delta = timedelta(days=5)

dates = []
current_date = start_date

while current_date <= end_date:
    dates.append(current_date.strftime('%Y-%m-%d'))
    current_date += delta

# Surface fields on a global 0.25 degree grid at 12:00 for each date.
sfc_data = cml.load_source(
    "cds",
    "reanalysis-era5-single-levels",
    variable=["lsm", "2t", "msl", "10u", "10v", "tp", "z"],
    product_type="reanalysis",
    area=[90, 0, -90, 360],
    grid=[0.25, 0.25],
    date=dates,
    time="12:00",
    format="grib",
)

# Pressure-level fields on the same grid, dates and time.
atm_data = cml.load_source(
    "cds",
    "reanalysis-era5-pressure-levels",
    variable=["t", "z", "u", "v", "w", "q"],
    level=[50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000],
    product_type="reanalysis",
    area=[90, 0, -90, 360],
    grid=[0.25, 0.25],
    date=dates,
    time="12:00",
    format="grib",
)

Note that before this problem, we were actually having issues with NaN values in the 2t variable because of the gribapi. I commented some lines out to bypass this, as I understand some NaN values in 2t are fine as long as they correspond to the ocean. However, after "fixing" this, we started seeing the same issue you are seeing with total precipitation.

Just for the sake of providing a more detailed diagnosis, I've added the traceback error that we had with 2t before commenting some lines out from the gribapi code. Note that debugging the gribapi is a bit of a pain because there's a circular import between eccodes and the gribapi...

2024-10-02 04:41:16,975 INFO Doing full rollout prediction in JAX: 1 minute 4 seconds.
2024-10-02 04:41:16,975 INFO Converting output xarray to GRIB and saving
ECCODES ERROR   :  Minimum value out of range: nan
ECCODES ERROR   :  GRIB2 simple packing: unable to set values (Encoding invalid)
ECCODES ERROR   :  Unable to set double array 'codedValues' (Encoding invalid)
2024-10-02 04:41:17,883 ERROR Error setting values
2024-10-02 04:41:17,883 ERROR Encoding invalid
Traceback (most recent call last):
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/codes.py", line 221, in set_values
    eccodes.codes_set_values(self._handle, values.flatten())
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 2126, in grib_set_values
    grib_set_double_array(gribid, "values", values)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 1200, in grib_set_double_array
    GRIB_CHECK(lib.grib_set_double_array(h, key.encode(ENC), a, length))
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 232, in GRIB_CHECK
    errors.raise_grib_error(errid)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/errors.py", line 381, in raise_grib_error
    raise ERROR_MAP[errid](errid)
gribapi.errors.EncodingError: Encoding invalid
2024-10-02 04:41:17,886 INFO Saving output data: 0.9 second.
2024-10-02 04:41:17,886 INFO Total time: 1 minute 26 seconds.
Traceback (most recent call last):
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/outputs/__init__.py", line 62, in write
    handle, path = self.output.write(data, *args, **kwargs)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/output.py", line 390, in write
    handle = self._coder.encode(
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/output.py", line 132, in encode
    handle.set_values(values)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/codes.py", line 221, in set_values
    eccodes.codes_set_values(self._handle, values.flatten())
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 2126, in grib_set_values
    grib_set_double_array(gribid, "values", values)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 1200, in grib_set_double_array
    GRIB_CHECK(lib.grib_set_double_array(h, key.encode(ENC), a, length))
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 232, in GRIB_CHECK
    errors.raise_grib_error(errid)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/errors.py", line 381, in raise_grib_error
    raise ERROR_MAP[errid](errid)
gribapi.errors.EncodingError: Encoding invalid

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/bin/ai-models", line 8, in <module>
    sys.exit(main())
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 362, in main
    _main(sys.argv[1:])
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 310, in _main
    run(vars(args), unknownargs)
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 335, in run
    model.run()
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models_graphcast/model.py", line 232, in run
    save_output_xarray(
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models_graphcast/output.py", line 68, in save_output_xarray
    write(
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/model.py", line 120, in write
    self.output.write(*args, **kwargs, **self.grib_extra_metadata),
  File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/outputs/__init__.py", line 67, in write
    raise ValueError(f"NaN values found in field. args={args} kwargs={kwargs}")
ValueError: NaN values found in field. args=() kwargs={'template': GribField(2t,None,20230101,1200,0,0), 'step': 6}
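
To find out how many values in a field are NaN before the GRIB encoder rejects it, a minimal standard-library sketch could look like the following. `nan_summary` is a hypothetical helper, not part of ai-models or earthkit-data, and it assumes the field has already been flattened to a one-dimensional sequence of floats.

```python
import math
from typing import Dict, Iterable


def nan_summary(values: Iterable[float]) -> Dict[str, float]:
    """Count the NaN entries in a flat sequence of field values."""
    total = 0
    nan_count = 0
    for value in values:
        total += 1
        if math.isnan(value):
            nan_count += 1
    fraction = nan_count / total if total else 0.0
    return {"total": total, "nan": nan_count, "fraction": fraction}


# Made-up sample values standing in for a flattened 2t field.
sample = [271.3, float("nan"), 280.1, float("nan")]
print(nan_summary(sample))  # {'total': 4, 'nan': 2, 'fraction': 0.5}
```

Running this on each output field at each step would show whether the NaNs are confined to 2t or spread across variables, which the traceback alone does not tell you.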

@russ-schumacher
Author

> Having the same issue here!
>
> @russ-schumacher which command did you use to generate the NetCDF output?

If you add the "--debug" flag when running, it will write a NetCDF file of the model output, as well as the input training matrix and a couple of others.

@owainkenwayucl

owainkenwayucl commented Nov 21, 2024

Also having the same issue, also downloading from CDS.

> ai-models --input cds --date 20230110 --time 0000 --debug graphcast

<snip>

2024-11-21 14:51:21,053 DEBUG CALLING func [] {'startStep': 0, 'endStep': 6, 'self': <earthkit.data.readers.grib.output.GribCoder object at 0x2b7b0040d1d0>}
2024-11-21 14:51:21,055 DEBUG GribOutput.metadata {'edition': 2, 'generatingProcessIdentifier': 1, 'stream': 'oper', 'expver': 'dmgc', 'class': 'ml', 'startStep': 0, 'endStep': 6}
ECCODES ERROR   :  concept: no match for paramId=228
ECCODES ERROR   :  concept: input handle edition=2, centre=ecmf
ECCODES ERROR   :  concept: input handle dataset=era
ECCODES ERROR   :  Please check the Parameter Database 'https://codes.ecmwf.int/grib/param-db/?id=228'
2024-11-21 14:51:21,057 ERROR Error setting edition=2
2024-11-21 14:51:21,057 ERROR Concept no match

<snip more errors as things fail as a result>

@owainkenwayucl

Very upsettingly, I created a different virtual env with the same (?) packages installed, possibly in a different order and it now works.

Here is the difference in pip list output between the broken (left) and working (right) environments:

5a6
> array_api_compat          1.9.1
8,9c9
< cachetools                5.5.0
< cads-api-client           1.5.2
---
> cads-api-client           1.5.4
22c22
< contourpy                 1.3.0
---
> contourpy                 1.3.1
24c24
< dask                      2024.10.0
---
> dask                      2024.11.2
27c27
< earthkit-data             0.10.9
---
> earthkit-data             0.11.1
38c38
< fonttools                 4.54.1
---
> fonttools                 4.55.0
60c60
< multiurl                  0.3.2
---
> multiurl                  0.3.3
72d71
< nvidia-ml-py              12.535.161
74d72
< nvitop                    1.3.2
82d79
< psutil                    6.1.0
96c93
< setuptools                75.1.0
---
> setuptools                75.2.0
109c106
< zipp                      3.20.2
---
> zipp                      3.21.0
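
A comparison like the one above can also be produced by package name rather than by line, which is easier to read when packages are added or removed. This is a minimal sketch that assumes the usual two-column `pip list` text output; `parse_pip_list` and `compare_environments` are hypothetical helper names, and the sample text only repeats a few versions from the diff above.

```python
from typing import Dict, Optional, Tuple


def parse_pip_list(text: str) -> Dict[str, str]:
    """Map package name to version from 'name version' lines of pip list output."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header and the dashed separator row.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0].lower()] = parts[1]
    return packages


def compare_environments(
    broken: Dict[str, str], working: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (broken_version, working_version)}; None marks a missing package."""
    names = sorted(set(broken) | set(working))
    return {
        name: (broken.get(name), working.get(name))
        for name in names
        if broken.get(name) != working.get(name)
    }


# A few entries taken from the diff in this comment.
broken_text = """Package Version
--------------- -------
cads-api-client 1.5.2
earthkit-data 0.10.9
psutil 6.1.0
"""
working_text = """Package Version
--------------- -------
array_api_compat 1.9.1
cads-api-client 1.5.4
earthkit-data 0.11.1
"""

differences = compare_environments(parse_pip_list(broken_text), parse_pip_list(working_text))
for name, (old, new) in differences.items():
    print(f"{name}: broken={old} working={new}")
```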

@owainkenwayucl

owainkenwayucl commented Nov 21, 2024

Got it! The problem is the version of earthkit-data (and possibly the missing array_api_compat which is installed as a dependency when you install 0.11.1 of earthkit-data).

Installing version 0.11.1 fixed it.

pip install earthkit-data==0.11.1
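
To check which version an existing environment has before rerunning the model, a minimal sketch using only the standard library might look like this. The thread only establishes that 0.10.9 failed and 0.11.1 worked, so treating anything at or above 0.11.1 as fine is an assumption; `version_tuple`, `meets_minimum` and `check_earthkit_data` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def version_tuple(text: str) -> Tuple[int, ...]:
    """Turn '0.11.1' into (0, 11, 1), ignoring any non-numeric suffix such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.11.1") -> bool:
    """Assumption: versions at or above the one reported to work are acceptable."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_earthkit_data(minimum: str = "0.11.1") -> str:
    """Describe the installed earthkit-data version relative to the minimum."""
    try:
        installed = version("earthkit-data")
    except PackageNotFoundError:
        return "earthkit-data is not installed in this environment"
    if meets_minimum(installed, minimum):
        return f"earthkit-data {installed} is at or above {minimum}"
    return f"earthkit-data {installed} is older than {minimum}"


print(check_earthkit_data())
```

The simple tuple comparison ignores pre-release suffixes, so a proper version parser would be a better choice if release candidates matter.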

@owainkenwayucl

Looking at the changelog for earthkit-data 0.11.x, there are a lot of changes that look potentially relevant: https://earthkit-data.readthedocs.io/en/latest/release_notes/version_0.11_updates.html
