Add WW3 recipe #57

Closed
wants to merge 18 commits into from
5 changes: 3 additions & 2 deletions .github/workflows/deploy.yaml
@@ -56,7 +56,7 @@ jobs:
f.write(f"max_num_workers={max_num_workers}")

- name: "Deploy recipes"
-uses: "pangeo-forge/deploy-recipe-action@v0.1"
+uses: "pangeo-forge/deploy-recipe-action@v0.2"
with:
select_recipe_by_label: true
pangeo_forge_runner_config: >
@@ -70,7 +70,8 @@ jobs:
"service_account_email": "[email protected]",
"project_id": "leap-pangeo",
"temp_gcs_location": "gs://leap-scratch/data-library/temp",
-"max_num_workers": ${{ env.max_num_workers }}
+"max_num_workers": ${{ env.max_num_workers }},
+"machine_type": "n2-highmem-4"
},
"TargetStorage": {
"fsspec_class": "gcsfs.GCSFileSystem",
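Review note: the comma added after `max_num_workers` is what keeps the embedded JSON valid once `machine_type` follows it. A minimal sketch of that check, assuming a simplified stand-in for the rendered config (the reduced key set and the value `10` for `max_num_workers` are illustrative, not taken from the workflow):

```python
import json

# Hypothetical stand-in for the JSON fragment after the workflow has
# substituted ${{ env.max_num_workers }}; 10 is an assumed value.
rendered = """
{
  "project_id": "leap-pangeo",
  "temp_gcs_location": "gs://leap-scratch/data-library/temp",
  "max_num_workers": 10,
  "machine_type": "n2-highmem-4"
}
"""

# With the comma in place the fragment parses cleanly.
config = json.loads(rendered)

# Dropping the comma after max_num_workers reproduces the failure mode
# that adding a new key without it would have caused.
missing_comma = rendered.replace("10,", "10")
try:
    json.loads(missing_comma)
    parses_without_comma = True
except json.JSONDecodeError:
    parses_without_comma = False
```

Running a check like this on the rendered string before deployment would surface a malformed config earlier than the runner does.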
11 changes: 11 additions & 0 deletions feedstock/meta.yaml
@@ -84,6 +84,17 @@ recipes:
- licensor
url: https://arxiv.org/abs/2306.08754
license: "Apache-2.0"
- id: WW3
object: "ww3:WW3"
description: "20 year global ocean surface wave hindcast provided by Ifremer (French Research Institute for Exploitation of the Sea)"
provenance:
providers:
- name: "Ifremer"
description: "Ifremer"
roles:
- host
url: https://www.ifremer.fr/en
license: "TBW"
maintainers:
- name: "Julius Busecke"
orcid: 0000-0001-8571-865X
1 change: 1 addition & 0 deletions feedstock/requirements.txt
@@ -4,3 +4,4 @@
# once a `0.10.1` release of `pangeo-forge-recipes` is available that includes this fix, we should
# install from that release instead.
git+https://github.com/pangeo-forge/pangeo-forge-recipes.git@2739f2264ec385eadd3c73226fb85f5cdbf32a1a
gcsfs
59 changes: 59 additions & 0 deletions feedstock/ww3.py
@@ -0,0 +1,59 @@
"""
Wave Watch 3
"""
import apache_beam as beam
from pangeo_forge_recipes.patterns import pattern_from_file_sequence
from pangeo_forge_recipes.transforms import (
    Indexed,
    OpenURLWithFSSpec,
    OpenWithXarray,
    StoreToZarr,
    T,
)

years = range(1993, 2023)
months = range(1, 13)
dates = []
for y in years:
    for m in months:
        dates.append((y, m))


def make_full_path(date: tuple[int, int]):
    year, month = date
    return f'https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M/{year}/FIELD_NC/LOPS_WW3-GLOB-30M_{year}{month:02d}.nc'


input_urls = [make_full_path(date) for date in dates]
pattern = pattern_from_file_sequence(input_urls, concat_dim='time')
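Review note: the nested loop and `make_full_path` above could be folded into one helper. This is a minimal sketch, not the recipe's actual code; `BASE` and `monthly_urls` are names introduced here for illustration, while the URL layout is copied from `make_full_path`.

```python
from itertools import product

# Base URL copied from make_full_path; BASE is a hypothetical constant name.
BASE = "https://data-dataref.ifremer.fr/ww3/GLOBMULTI_ERA5_GLOBCUR_01/GLOB-30M"


def monthly_urls(first_year: int, last_year: int) -> list[str]:
    """Return one URL per month for first_year..last_year (inclusive).

    product() varies the month fastest, so the list is in chronological
    order, which is what a concatenation along 'time' expects.
    """
    return [
        f"{BASE}/{year}/FIELD_NC/LOPS_WW3-GLOB-30M_{year}{month:02d}.nc"
        for year, month in product(range(first_year, last_year + 1), range(1, 13))
    ]


# Same span as range(1993, 2023) in the recipe.
urls = monthly_urls(1993, 2022)
```

With these bounds the list holds 360 files, i.e. 30 calendar years, whereas the `meta.yaml` description says "20 year"; one of the two may need adjusting.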


# does this succeed with all coords stripped?
class StripCoords(beam.PTransform):
    @staticmethod
    def _strip_all_coords(item: Indexed[T]) -> Indexed[T]:
        """
        Debugging aid: drop every non-index coordinate from the dataset
        and print the dataset before and after.
        """
        index, ds = item
        print(f'Preprocessing before {ds =}')
        # reset_coords(drop=True) removes non-index coordinates entirely
        ds = ds.reset_coords(drop=True)
        print(f'Preprocessing after {ds =}')
        return index, ds

    def expand(self, pcoll: beam.PCollection) -> beam.PCollection:
        return pcoll | 'Debug: Remove coordinates' >> beam.Map(self._strip_all_coords)


WW3 = (
    beam.Create(pattern.items())
    | OpenURLWithFSSpec()
    | OpenWithXarray(xarray_open_kwargs={'preprocess': lambda ds: ds.set_coords('MAPSTA')})
    | StripCoords()
    | StoreToZarr(
        store_name='WW3.zarr',
        combine_dims=pattern.combine_dim_keys,
        target_chunks={'time': 200},
    )
)
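Review note: the pipeline intends to promote `MAPSTA` to a coordinate with `set_coords` and then runs `reset_coords(drop=True)` in `StripCoords`. A minimal sketch of what that combination does in xarray, using a toy dataset (the variable `hs`, the dimension names, and all values are assumptions for illustration; only `MAPSTA` comes from the recipe):

```python
import numpy as np
import xarray as xr

# Toy dataset: "hs" is a hypothetical data variable, MAPSTA a static mask.
ds = xr.Dataset(
    {
        "hs": (("time", "latitude"), np.zeros((2, 3))),
        "MAPSTA": (("latitude",), np.ones(3)),
    },
    coords={"time": [0, 1], "latitude": [10.0, 20.0, 30.0]},
)

# set_coords moves MAPSTA from the data variables to the coordinates.
promoted = ds.set_coords("MAPSTA")

# reset_coords(drop=True) then removes every non-index coordinate,
# so MAPSTA disappears while the dimension coordinates stay.
stripped = promoted.reset_coords(drop=True)
```

If the toy case carries over to the real files, `MAPSTA` would not reach the Zarr store at all, which seems worth confirming against the intent of the debug step.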