problem writing GRIB output when initialized with ERA5 via CDS #21
Comments
Having the same issue here! @russ-schumacher which command did you use to generate the NetCDF output?
@russ-schumacher and @jovanovski I'm also having the same issue. On our end, we're downloading the ERA5 data using climetlab, with initial conditions generated every 5 days in 2023. I've added the code we use to download the data to reproduce our issue.
#!/usr/bin/python3
try:
    from functools import lru_cache
except ImportError:
    from backports.functools_lru_cache import lru_cache
import climetlab as cml
from datetime import datetime, timedelta
# Create a list of dates in 2023 with steps of 5 days
start_date = datetime(2023, 1, 1)
end_date = datetime(2023, 12, 31)
delta = timedelta(days=5)
dates = []
current_date = start_date
while current_date <= end_date:
    dates.append(current_date.strftime('%Y-%m-%d'))
    current_date += delta
sfc_data = cml.load_source(
    "cds",
    "reanalysis-era5-single-levels",
    variable = ["lsm", "2t", "msl", "10u", "10v", "tp", "z"],
    product_type = "reanalysis",
    area = [90, 0, -90, 360],
    grid = [0.25, 0.25],
    date = dates,
    time = "12:00",
    format = "grib"
)
atm_data = cml.load_source(
    "cds",
    "reanalysis-era5-pressure-levels",
    variable = ["t", "z", "u", "v", "w", "q"],
    level = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000],
    product_type = "reanalysis",
    area = [90, 0, -90, 360],
    grid = [0.25, 0.25],
    date = dates,
    time = "12:00",
    format = "grib"
)
Note that before this problem, we were having issues with NaN values in the 2t variable because of gribapi. I commented some lines out to bypass this, as I understand some NaN values in 2t are fine as long as they correspond to the ocean. However, after "fixing" this, we started seeing the same issue with total precipitation that you are dealing with. To give a more detailed diagnosis, I've added the traceback we got with 2t before commenting those lines out of the gribapi code. Note that debugging gribapi is a bit of a pain because there's a circular import between eccodes and gribapi...
2024-10-02 04:41:16,975 INFO Doing full rollout prediction in JAX: 1 minute 4 seconds.
2024-10-02 04:41:16,975 INFO Converting output xarray to GRIB and saving
ECCODES ERROR : Minimum value out of range: nan
ECCODES ERROR : GRIB2 simple packing: unable to set values (Encoding invalid)
ECCODES ERROR : Unable to set double array 'codedValues' (Encoding invalid)
2024-10-02 04:41:17,883 ERROR Error setting values
2024-10-02 04:41:17,883 ERROR Encoding invalid
Traceback (most recent call last):
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/codes.py", line 221, in set_values
eccodes.codes_set_values(self._handle, values.flatten())
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 2126, in grib_set_values
grib_set_double_array(gribid, "values", values)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 1200, in grib_set_double_array
GRIB_CHECK(lib.grib_set_double_array(h, key.encode(ENC), a, length))
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 232, in GRIB_CHECK
errors.raise_grib_error(errid)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/errors.py", line 381, in raise_grib_error
raise ERROR_MAP[errid](errid)
gribapi.errors.EncodingError: Encoding invalid
2024-10-02 04:41:17,886 INFO Saving output data: 0.9 second.
2024-10-02 04:41:17,886 INFO Total time: 1 minute 26 seconds.
Traceback (most recent call last):
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/outputs/__init__.py", line 62, in write
handle, path = self.output.write(data, *args, **kwargs)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/output.py", line 390, in write
handle = self._coder.encode(
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/output.py", line 132, in encode
handle.set_values(values)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/earthkit/data/readers/grib/codes.py", line 221, in set_values
eccodes.codes_set_values(self._handle, values.flatten())
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 2126, in grib_set_values
grib_set_double_array(gribid, "values", values)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 1200, in grib_set_double_array
GRIB_CHECK(lib.grib_set_double_array(h, key.encode(ENC), a, length))
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/gribapi.py", line 232, in GRIB_CHECK
errors.raise_grib_error(errid)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/gribapi/errors.py", line 381, in raise_grib_error
raise ERROR_MAP[errid](errid)
gribapi.errors.EncodingError: Encoding invalid
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/gpfs/home/bsc/bsc927078/graphcast_snake/bin/ai-models", line 8, in <module>
sys.exit(main())
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 362, in main
_main(sys.argv[1:])
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 310, in _main
run(vars(args), unknownargs)
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/__main__.py", line 335, in run
model.run()
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models_graphcast/model.py", line 232, in run
save_output_xarray(
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models_graphcast/output.py", line 68, in save_output_xarray
write(
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/model.py", line 120, in write
self.output.write(*args, **kwargs, **self.grib_extra_metadata),
File "/gpfs/home/bsc/bsc927078/graphcast_snake/lib/python3.10/site-packages/ai_models/outputs/__init__.py", line 67, in write
raise ValueError(f"NaN values found in field. args={args} kwargs={kwargs}")
ValueError: NaN values found in field. args=() kwargs={'template': GribField(2t,None,20230101,1200,0,0), 'step': 6}
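Since the final ValueError only names one field (2t at step 6), it can help to check every output field for NaN before attempting the GRIB write. The following is a minimal sketch, not ai-models code: `count_nans` and the `fields` mapping are hypothetical names, and the toy values stand in for real model output.

```python
import math


def count_nans(fields):
    """Return {field_name: NaN count} for every field that contains NaN.

    `fields` is a hypothetical mapping of field name -> flat iterable of
    floats; it stands in for the model output and is not an ai-models type.
    """
    report = {}
    for name, values in fields.items():
        n_bad = sum(1 for v in values if math.isnan(v))
        if n_bad:
            report[name] = n_bad
    return report


# Toy data for illustration only: "2t" has one NaN, "msl" is clean.
sample = {
    "2t": [280.1, float("nan"), 275.3],
    "msl": [101325.0, 100980.0, 101100.0],
}
print(count_nans(sample))  # {'2t': 1}
```

With real output, the same loop could be run over the flattened values of each variable to see whether NaN is confined to one field or spread across several.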
If you add the "--debug" flag when running, it will output a NetCDF file of the model output, as well as the input training matrix and a couple of others.
Also having the same issue, also downloading from CDS.
Very upsettingly, I created a different virtual env with the same (?) packages installed, possibly in a different order, and it now works. Here is the difference between
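For anyone wanting to run the same kind of comparison between a failing and a working environment, here is a rough sketch that diffs two `pip freeze` style listings. The helper names (`parse_freeze`, `diff_envs`) and the package pins are made up for illustration and are not taken from this thread.

```python
def parse_freeze(text):
    """Parse `name==version` lines (pip freeze style) into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a plain pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_envs(freeze_a, freeze_b):
    """Return {package: (version_in_a, version_in_b)} where the two differ."""
    a, b = parse_freeze(freeze_a), parse_freeze(freeze_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical pins, purely illustrative.
broken = "alpha==1.0\nbeta==2.3\n"
working = "alpha==1.0\nbeta==2.4\ngamma==0.1\n"
print(diff_envs(broken, working))
# {'beta': ('2.3', '2.4'), 'gamma': (None, '0.1')}
```

A package missing from one environment shows up with `None` on that side, which makes it easy to spot both version changes and extra installs.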
Got it! The problem is the version of Installing version 0.11.1 fixed it.
Looking at the changelog for Earthkit 0.11.x there are a lot of changes which look potentially relevant: https://earthkit-data.readthedocs.io/en/latest/release_notes/version_0.11_updates.html
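A quick way to see which version an environment actually has is to query the installed distribution metadata. The sketch below assumes the package in question is `earthkit-data` (the release-notes link points to it, but the thread does not name it explicitly) and uses 0.11.1 as the threshold reported above. The helper names are hypothetical, and the version parsing is deliberately simple: it ignores suffixes such as `rc1`.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.11.1' into (0, 11, 1); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed, required):
    """Compare two dotted version strings numerically."""
    return version_tuple(installed) >= version_tuple(required)


def check_package(dist_name, required):
    """Describe whether `dist_name` is installed at `required` or newer."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return f"{dist_name} is not installed"
    status = "ok" if is_at_least(installed, required) else "older than required"
    return f"{dist_name} {installed}: {status} (need >= {required})"


# Assumption: "earthkit-data" is the distribution meant in this thread.
print(check_package("earthkit-data", "0.11.1"))
```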
When running graphcast using CDS, using a command like this:
ai-models --input cds --date 20221220 --time 0000 --path "output-graphcast.grib" graphcast
the model runs successfully, but then fails to write the grib file on forecast hour 6. The issue appears to be with writing the precipitation field, throwing this error:
I used the debug option and wrote output.nc, which looks fine, so the issue simply appears to be writing out to grib.
When initializing using opendata, this issue does not occur and everything looks fine.
A possible hint at the cause comes from wgrib2. For the opendata-initialized run that works, the GRIB record looks like this:
7:5174768:d=2024092600:var discipline=0 center=98 local_table=1 parmcat=1 parm=193:surface:0-0 day acc fcst:
while in the cds-initialized run that fails, the grib record looks like this in the hour zero file (nothing is written after that):
7:14536536:d=2022122000:TPRATE:surface:0-0 day acc fcst:
Not sure why a different precipitation parameter is being written depending on the initialization source, though. Thanks for any insights!
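To compare the two records more systematically, the colon-separated wgrib2 inventory lines can be split into named parts. This is only a sketch based on the two lines quoted above; `parse_inventory_line` and its field names are my own labels, not wgrib2 terminology.

```python
def parse_inventory_line(line):
    """Split one colon-separated wgrib2 inventory line into named parts."""
    parts = line.rstrip(":").split(":")
    date = parts[2][2:] if parts[2].startswith("d=") else parts[2]
    return {
        "record": int(parts[0]),
        "offset": int(parts[1]),
        "date": date,
        "variable": parts[3],
        "level": parts[4],
        "time_range": parts[5],
    }


# The two records quoted in this issue.
opendata = parse_inventory_line(
    "7:5174768:d=2024092600:var discipline=0 center=98 local_table=1 "
    "parmcat=1 parm=193:surface:0-0 day acc fcst:"
)
cds = parse_inventory_line(
    "7:14536536:d=2022122000:TPRATE:surface:0-0 day acc fcst:"
)

# Report which parts differ, ignoring the expected offset/date differences.
for key in ("variable", "level", "time_range"):
    if opendata[key] != cds[key]:
        print(f"{key}: opendata={opendata[key]!r} cds={cds[key]!r}")
```

For these two lines only `variable` differs: the opendata run carries a parameter wgrib2 does not name (`parmcat=1 parm=193` from local table 1), while the CDS run carries `TPRATE`.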