Model data prep #6

Merged
merged 80 commits into from
Jul 10, 2024
2ca9c37
Work out a bunch of stuff
collijk May 15, 2024
83d5136
Add all boilerplate for command
collijk May 15, 2024
dc3f823
Fix command line stuff
collijk May 15, 2024
ee93188
Add option when there are not paths
collijk May 15, 2024
faaaa54
Add lcz
collijk May 16, 2024
f0d160b
Bugfixes to get lcz working
collijk May 16, 2024
37f40cb
Prep training data
collijk May 16, 2024
a74bbb5
lcz extraction task
collijk May 16, 2024
06cd5d4
Prep training data script and clean up option usage
collijk May 16, 2024
5846c7f
use lcz v3
collijk May 16, 2024
4b43fb7
Add prep training data
collijk May 16, 2024
79849a0
Bugfixes and get prep training data running
collijk May 16, 2024
1ec16f7
Add station id and remove experimental data
collijk May 24, 2024
454ce1a
Update rra tools and jobmon usage
collijk May 24, 2024
bad9727
Revamp era5 download script
collijk May 24, 2024
e8f41be
Add month
collijk May 24, 2024
61afa29
Fix some task names
collijk May 24, 2024
407bdc5
Need to request day
collijk May 24, 2024
ae32777
Add caching
collijk May 24, 2024
e95d050
compress files
collijk May 27, 2024
a07cbfc
Add infrastructure to download different filetypes and do compression…
collijk May 28, 2024
f8081d2
Expand variables and parallelize over users
collijk Jun 9, 2024
59d4932
Port in cmip pipeline
collijk Jun 9, 2024
aa22cda
Add notebook code for generating daily era5 estimates
collijk Jun 12, 2024
58824e9
Add cmip extraction
collijk Jun 12, 2024
790ea56
Add gcsfs and zarr dependencies
collijk Jun 12, 2024
5db69f3
CMIP6 extraction
collijk Jun 12, 2024
83d6030
CMIP6 extraction
collijk Jun 12, 2024
1c6d385
typo
collijk Jun 12, 2024
7ee36b3
some logging
collijk Jun 12, 2024
593a2a7
some logging
collijk Jun 12, 2024
0de6d2f
Change naming scheme
collijk Jun 12, 2024
1e8da84
Update runtime
collijk Jun 13, 2024
b84ef0d
Do some reorg
collijk Jun 14, 2024
c2a4520
Put together era5 daily script
collijk Jun 14, 2024
93dd769
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 14, 2024
6760a19
Fix runner
collijk Jun 14, 2024
da853c8
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 14, 2024
5272118
Add month specific logging and shorten range for testing
collijk Jun 14, 2024
c75e03b
Be lazier
collijk Jun 14, 2024
0b52a39
Add cmip daily
collijk Jun 14, 2024
51fd1bd
Change layout for era5 daily
collijk Jun 15, 2024
c61f2c6
Change era5_daily to historical_daily
collijk Jun 15, 2024
7f177c5
Add overwrite
collijk Jun 15, 2024
9309a64
Bump runtime
collijk Jun 15, 2024
d37be57
Add workflow to generate historical reference
collijk Jun 15, 2024
0188518
typo
collijk Jun 16, 2024
56b2471
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 16, 2024
ec7cf80
Add tasmin/tasmax, overwrite option, and some robustness
collijk Jun 16, 2024
fd58f3d
Add tasmin/tasmax, overwrite option, and some robustness. Fix bugs.
collijk Jun 16, 2024
19a7dd4
Formatting
collijk Jun 16, 2024
99f3872
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 16, 2024
3f43234
typo
collijk Jun 16, 2024
066e9db
merge in upstream
collijk Jun 16, 2024
d8ea998
thread through overwrite in extract cmip
collijk Jun 16, 2024
e5bd086
Better cmip logging
collijk Jun 16, 2024
8300d19
Delete some spurious historical variables, add runner for scenarios
collijk Jun 16, 2024
7a8edc9
Fix overwrite
collijk Jun 16, 2024
f010d20
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 16, 2024
164debe
Add logging, linear interp for anomaly, and multiplicative anomaly ap…
collijk Jun 16, 2024
6c2cb16
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jun 16, 2024
6770d88
Reorder load and shift longitude ops
collijk Jun 16, 2024
892d1ed
Infer variable from dataset
collijk Jun 16, 2024
5c97e12
Need call to interp calendar
collijk Jun 16, 2024
a5bef4f
Lots of fiddling to get things to work
collijk Jun 17, 2024
71a81b9
Add annual scenario
collijk Jun 18, 2024
ff75c62
Catch empty workflow error
collijk Jun 19, 2024
107e5c6
Get annual working
collijk Jun 19, 2024
b643ca3
make year usage more coherent
collijk Jun 19, 2024
cd67a38
Make scenario run by year
collijk Jun 19, 2024
f7e42fe
Use transform class everywhere
collijk Jun 19, 2024
1611af2
pullback with_target_variable
collijk Jun 19, 2024
6293566
Add script to generate derived daily variables
collijk Jun 19, 2024
b1f5f12
Add readme for pipeline stages
collijk Jul 7, 2024
e5a136a
Fix derived climate variables
collijk Jul 8, 2024
3b435d3
start scenario inclusion script
collijk Jul 8, 2024
e4b0e9e
Merge branch 'model-data-prep' of github.com:ihmeuw/climate-downscale…
collijk Jul 8, 2024
70f5abf
Add task to generate scenario inclusion metadata
collijk Jul 8, 2024
ae7b477
Remove extra path column
collijk Jul 8, 2024
cef92b3
So many fixes
collijk Jul 9, 2024
1,064 changes: 1,053 additions & 11 deletions poetry.lock

Large diffs are not rendered by default.

19 changes: 17 additions & 2 deletions pyproject.toml
@@ -34,17 +34,22 @@ python = ">=3.10, <3.13"
click = "*"
numpy = "^1.26.4"
pandas = "^2.2.2"
-rasterra = "^0.5.9"
+rasterra = "^0.5.11"
shapely = "^2.0.4"
geopandas = "^0.14.4"
xarray = "^2024.3.0"
cdsapi = "^0.7.0"
matplotlib = "^3.8.4"
scikit-learn = "^1.4.2"
-rra-tools = "^1.0.6"
+rra-tools = "^1.0.10"
netcdf4 = "^1.6.5"
pyarrow = "^16.0.0"
types-requests = "^2.31.0.20240406"
types-tqdm = "^4.66.0.20240417"
gcsfs = "^2024.6.0"
zarr = "^2.18.2"
types-pyyaml = "^6.0.12.20240311"
dask = "^2024.5.2"

[tool.poetry.group.dev.dependencies]
mkdocstrings = {version = ">=0.23", extras = ["python"]}
@@ -90,6 +95,14 @@ ignore = [
    "RUF007",  # zip is idiomatic, this is a dumb check
    "RET505",  # Else after return, makes a lot of false positives
    "E501",  # Line too long, this is autoformatted
    "PYI041",  # Use float instead of int | float; dumb rule
    "T201",  # print is fine for now.
    "RET504",  # Unnecessary assignment before return
    "PLR0913",  # Too many arguments in function call, hard with CLIs.
    "TRY201",  #
    "PD010",  # I like stack and unstack
    "FBT001",  # Boolean positional args are super common in clis
    "FBT002",  # Boolean positional args are super common in clis
]

[tool.ruff.lint.per-file-ignores]
@@ -142,6 +155,8 @@ exclude = [
[[tool.mypy.overrides]]
module = [
    "cdsapi.*",
    "affine.*",
    "gcsfs.*",
]
ignore_missing_imports = true

4 changes: 2 additions & 2 deletions src/climate_downscale/cli.py
@@ -1,6 +1,6 @@
import click

-from climate_downscale import extract
+from climate_downscale import downscale, extract, generate


@click.group()
@@ -13,7 +13,7 @@ def cdtask() -> None:
    """Entry point for running climate downscale tasks."""


-for module in [extract]:
+for module in [extract, downscale, generate]:
    runners = getattr(module, "RUNNERS", {})
    task_runners = getattr(module, "TASK_RUNNERS", {})
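
Note on the hunk above: the loop reads optional `RUNNERS` and `TASK_RUNNERS` attributes from each subpackage and falls back to an empty dict when a module defines neither. The rest of the loop body is not shown in this diff, so the following is only a sketch of one way such registries could be merged. The `SimpleNamespace` stand-ins, the runner names, and `collect_registries` are hypothetical and not part of the PR.

```python
from types import SimpleNamespace
from typing import Callable


def collect_registries(modules: list) -> tuple[dict[str, Callable], dict[str, Callable]]:
    """Merge the optional RUNNERS / TASK_RUNNERS dicts exposed by each module."""
    runners: dict[str, Callable] = {}
    task_runners: dict[str, Callable] = {}
    for module in modules:
        # getattr with a default means a module without a registry contributes nothing.
        runners.update(getattr(module, "RUNNERS", {}))
        task_runners.update(getattr(module, "TASK_RUNNERS", {}))
    return runners, task_runners


# Hypothetical stand-ins for the extract / downscale / generate subpackages.
extract = SimpleNamespace(RUNNERS={"era5": lambda: "extract era5"})
downscale = SimpleNamespace(TASK_RUNNERS={"prepare_predictors": lambda: "prep"})
generate = SimpleNamespace()  # defines neither registry

runners, task_runners = collect_registries([extract, downscale, generate])
print(sorted(runners), sorted(task_runners))
```

The default-to-empty-dict lookup is what lets a new subpackage be added to the module list before it exposes any runners.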

259 changes: 259 additions & 0 deletions src/climate_downscale/cli_options.py
@@ -0,0 +1,259 @@
from typing import ParamSpec, TypeVar

import click
from rra_tools.cli_tools import (
    RUN_ALL,
    ClickOption,
    with_choice,
    with_debugger,
    with_input_directory,
    with_num_cores,
    with_output_directory,
    with_progress_bar,
    with_queue,
    with_verbose,
)

_T = TypeVar("_T")
_P = ParamSpec("_P")


VALID_HISTORY_YEARS = [str(y) for y in range(1990, 2024)]
VALID_REFERENCE_YEARS = VALID_HISTORY_YEARS[-5:]
VALID_FORECAST_YEARS = [str(y) for y in range(2024, 2101)]
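
Note on the three year constants above: because `range()` excludes its stop value, the history window ends at 2023 and the forecast window ends at 2100, and the reference years are the last five history years. A quick check of what they evaluate to (the constants are copied verbatim; only the prints are added):

```python
VALID_HISTORY_YEARS = [str(y) for y in range(1990, 2024)]
VALID_REFERENCE_YEARS = VALID_HISTORY_YEARS[-5:]
VALID_FORECAST_YEARS = [str(y) for y in range(2024, 2101)]

# range() excludes its stop value, so history ends at 2023 and forecasts at 2100.
print(VALID_HISTORY_YEARS[0], VALID_HISTORY_YEARS[-1])      # 1990 2023
print(VALID_REFERENCE_YEARS)                                # ['2019', '2020', '2021', '2022', '2023']
print(VALID_FORECAST_YEARS[0], VALID_FORECAST_YEARS[-1])    # 2024 2100
print(len(VALID_HISTORY_YEARS), len(VALID_FORECAST_YEARS))  # 34 77
```

The history and forecast lists do not overlap, so a year string belongs to at most one of them.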


def with_year(
    *,
    years: list[str],
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "year",
        "y",
        allow_all=allow_all,
        choices=years,
        help="Year to extract data for.",
    )


VALID_MONTHS = [f"{i:02d}" for i in range(1, 13)]


def with_month(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "month",
        "m",
        allow_all=allow_all,
        choices=VALID_MONTHS,
        help="Month to extract data for.",
    )


VALID_ERA5_VARIABLES = [
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
    "2m_dewpoint_temperature",
    "2m_temperature",
    "surface_net_solar_radiation",
    "surface_net_thermal_radiation",
    "surface_pressure",
    "surface_solar_radiation_downwards",
    "surface_thermal_radiation_downwards",
    "total_precipitation",
    "total_sky_direct_solar_radiation_at_surface",
]


def with_era5_variable(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "era5-variable",
        "x",
        allow_all=allow_all,
        choices=VALID_ERA5_VARIABLES,
        help="Variable to extract.",
    )


VALID_ERA5_DATASETS = ["reanalysis-era5-land", "reanalysis-era5-single-levels"]


def with_era5_dataset(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "era5-dataset",
        "d",
        allow_all=allow_all,
        choices=VALID_ERA5_DATASETS,
        help="Dataset to extract.",
    )


VALID_CMIP6_SOURCES = [
    "CAMS-CSM1-0",
    "CanESM5",
    "CNRM-ESM2-1",
    "GFDL-ESM4",
    "GISS-E2-1-G",
    "MIROC-ES2L",
    "MIROC6",
    "MRI-ESM2-0",
]


def with_cmip6_source(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "cmip6-source",
        "s",
        allow_all=allow_all,
        choices=VALID_CMIP6_SOURCES,
        help="CMIP6 source to extract.",
    )


VALID_CMIP6_EXPERIMENTS = [
    "ssp119",
    "ssp126",
    "ssp245",
    "ssp370",
    "ssp585",
]


def with_cmip6_experiment(
    *,
    allow_all: bool = False,
    allow_historical: bool = False,
) -> ClickOption[_P, _T]:
    choices = VALID_CMIP6_EXPERIMENTS[:]
    if allow_historical:
        choices.append("historical")
    return with_choice(
        "cmip6-experiment",
        "e",
        allow_all=allow_all,
        choices=choices,
        help="CMIP6 experiment to extract.",
    )
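
Note on `with_cmip6_experiment`: the `[:]` slice copies the module-level list before `"historical"` is appended, so the shared constant is not mutated across calls. A minimal sketch of that behavior follows; `experiment_choices` is a hypothetical helper that isolates the choice-building step and is not part of the PR.

```python
VALID_CMIP6_EXPERIMENTS = ["ssp119", "ssp126", "ssp245", "ssp370", "ssp585"]


def experiment_choices(allow_historical: bool = False) -> list[str]:
    """Hypothetical helper mirroring the choice-building step of with_cmip6_experiment."""
    choices = VALID_CMIP6_EXPERIMENTS[:]  # shallow copy; the module constant stays untouched
    if allow_historical:
        choices.append("historical")
    return choices


with_hist = experiment_choices(allow_historical=True)
without_hist = experiment_choices()
print(with_hist[-1])                 # historical
print(len(without_hist))             # 5
print(len(VALID_CMIP6_EXPERIMENTS))  # 5, the constant is unchanged
```

Without the copy, every call with `allow_historical=True` would append another `"historical"` entry to the shared list.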


VALID_CMIP6_VARIABLES = [
    "uas",
    "vas",
    "hurs",
    "tas",
    "tasmin",
    "tasmax",
    "pr",
]


def with_cmip6_variable(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "cmip6-variable",
        "x",
        allow_all=allow_all,
        choices=VALID_CMIP6_VARIABLES,
        help="CMIP6 variable to extract.",
    )


def with_target_variable(
    *,
    variable_names: list[str],
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "target-variable",
        "t",
        allow_all=allow_all,
        choices=variable_names,
        help="Variable to generate.",
    )


STRIDE = 30
LATITUDES = [str(lat) for lat in range(-90, 90, STRIDE)]
LONGITUDES = [str(lon) for lon in range(-180, 180, STRIDE)]


def with_lat_start(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "lat-start",
        allow_all=allow_all,
        choices=LATITUDES,
        help="Latitude of the top-left corner of the tile.",
    )


def with_lon_start(
    *,
    allow_all: bool = False,
) -> ClickOption[_P, _T]:
    return with_choice(
        "lon-start",
        allow_all=allow_all,
        choices=LONGITUDES,
        help="Longitude of the top-left corner of the tile.",
    )
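
Note on the tiling constants: with `STRIDE = 30` the two ranges give 6 latitude starts and 12 longitude starts, so the globe is covered by 72 tiles. The sketch below enumerates them. `tile_bounds` is a hypothetical helper, and it assumes each start value is the lower bound of a tile spanning `[start, start + STRIDE)`. The help text describes the start as the top-left corner, so the actual latitude orientation in the pipeline may differ.

```python
import itertools

STRIDE = 30
LATITUDES = [str(lat) for lat in range(-90, 90, STRIDE)]
LONGITUDES = [str(lon) for lon in range(-180, 180, STRIDE)]

# Each (lat_start, lon_start) pair identifies one STRIDE x STRIDE degree tile.
tiles = list(itertools.product(LATITUDES, LONGITUDES))
print(len(LATITUDES), len(LONGITUDES), len(tiles))  # 6 12 72


def tile_bounds(lat_start: str, lon_start: str) -> tuple[int, int, int, int]:
    """Assumption: a tile spans [start, start + STRIDE) along both axes."""
    lat, lon = int(lat_start), int(lon_start)
    return lat, lat + STRIDE, lon, lon + STRIDE


print(tile_bounds("60", "150"))  # (60, 90, 150, 180)
```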


def with_overwrite() -> ClickOption[_P, _T]:
    return click.option(
        "--overwrite",
        is_flag=True,
        help="Overwrite existing files.",
    )
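
Note on `--overwrite`: the flag defaults to off, which suggests tasks skip outputs that already exist unless told otherwise. How each task consumes the flag is not shown in this diff. The sketch below shows one common guard pattern, with a hypothetical `write_output` helper and a made-up file name.

```python
import tempfile
from pathlib import Path


def write_output(path: Path, payload: str, *, overwrite: bool = False) -> bool:
    """Write payload to path; return False if the file exists and overwrite is off."""
    if path.exists() and not overwrite:
        return False
    path.write_text(payload)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "2020_01.nc"  # hypothetical output name
    first = write_output(target, "v1")
    second = write_output(target, "v2")                  # skipped, file already exists
    third = write_output(target, "v3", overwrite=True)   # replaced
    final = target.read_text()

print(first, second, third, final)  # True False True v3
```

Skipping existing outputs by default makes a partially failed workflow cheap to rerun, since only the missing files are regenerated.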


__all__ = [
    "VALID_HISTORY_YEARS",
    "VALID_REFERENCE_YEARS",
    "VALID_FORECAST_YEARS",
    "VALID_MONTHS",
    "VALID_ERA5_VARIABLES",
    "VALID_ERA5_DATASETS",
    "VALID_CMIP6_SOURCES",
    "VALID_CMIP6_EXPERIMENTS",
    "VALID_CMIP6_VARIABLES",
    "STRIDE",
    "LATITUDES",
    "LONGITUDES",
    "with_year",
    "with_month",
    "with_era5_variable",
    "with_era5_dataset",
    "with_cmip6_source",
    "with_cmip6_experiment",
    "with_cmip6_variable",
    "with_lat_start",
    "with_lon_start",
    "with_output_directory",
    "with_queue",
    "with_verbose",
    "with_debugger",
    "with_input_directory",
    "with_num_cores",
    "with_progress_bar",
    "RUN_ALL",
    "ClickOption",
    "with_choice",
    "with_overwrite",
]
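
Note on `allow_all` and `RUN_ALL`: most of the option helpers in this file accept `allow_all`, and `RUN_ALL` is re-exported from `rra_tools.cli_tools`, but the diff does not show how runners consume that sentinel. One plausible reading is that a runner expands the sentinel into the full choice list and then fans out one task per combination. The sketch below illustrates that idea only. The value `"ALL"`, the `expand` helper, and the shortened year list are assumptions, not the library's API.

```python
import itertools

RUN_ALL = "ALL"  # hypothetical stand-in; the real value lives in rra_tools.cli_tools
VALID_MONTHS = [f"{i:02d}" for i in range(1, 13)]
VALID_YEARS = [str(y) for y in range(2019, 2024)]  # shortened list for the example


def expand(value: str, choices: list[str]) -> list[str]:
    """Return every choice for the sentinel, otherwise the single validated value."""
    if value == RUN_ALL:
        return list(choices)
    if value not in choices:
        msg = f"{value!r} is not one of {choices}"
        raise ValueError(msg)
    return [value]


# One task per (year, month) combination.
tasks = list(itertools.product(expand("2020", VALID_YEARS), expand(RUN_ALL, VALID_MONTHS)))
print(len(tasks), tasks[0], tasks[-1])  # 12 ('2020', '01') ('2020', '12')
```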