From 3f564ef5a3d6f9bab4f675d328e40f49d3c8ac23 Mon Sep 17 00:00:00 2001 From: pciturri Date: Thu, 26 Sep 2024 02:53:53 +0200 Subject: [PATCH] ft: time configuration can now be instantiated by explicit time windows docs: completed Experiment Configuration section, --- docs/guide/config.rst | 4 - ...tests_config.rst => evaluation_config.rst} | 2 + docs/guide/experiment_config.rst | 269 ++++++++++++++++++ docs/guide/model_config.rst | 2 + docs/guide/postprocess_config.rst | 4 +- docs/guide/region_config.rst | 4 - docs/guide/time_config.rst | 4 - docs/index.rst | 13 +- floatcsep/utils/helpers.py | 27 +- 9 files changed, 297 insertions(+), 32 deletions(-) delete mode 100644 docs/guide/config.rst rename docs/guide/{tests_config.rst => evaluation_config.rst} (67%) create mode 100644 docs/guide/experiment_config.rst delete mode 100644 docs/guide/region_config.rst delete mode 100644 docs/guide/time_config.rst diff --git a/docs/guide/config.rst b/docs/guide/config.rst deleted file mode 100644 index 2a6bd75..0000000 --- a/docs/guide/config.rst +++ /dev/null @@ -1,4 +0,0 @@ -Experiment Configuration -======================== - -In progress \ No newline at end of file diff --git a/docs/guide/tests_config.rst b/docs/guide/evaluation_config.rst similarity index 67% rename from docs/guide/tests_config.rst rename to docs/guide/evaluation_config.rst index c83c7ff..e4939be 100644 --- a/docs/guide/tests_config.rst +++ b/docs/guide/evaluation_config.rst @@ -1,3 +1,5 @@ +.. _evaluation_config: + Evaluations Definition ====================== diff --git a/docs/guide/experiment_config.rst b/docs/guide/experiment_config.rst new file mode 100644 index 0000000..a62a3e5 --- /dev/null +++ b/docs/guide/experiment_config.rst @@ -0,0 +1,269 @@ +Experiment Configuration +======================== + +**floatCSEP** provides a standardized workflow for forecasting experiments, whose instructions can be set in a configuration file. Here, we need to define the Experiments' temporal settings, geographic region, seismic catalogs, forecasting models, evaluation tests and post-process. + + +Configuration Structure +----------------------- +Configuration files are written in ``YAML`` format and is divided into different aspects of the Experiment setup: + +1. **Experiment Metadata**: Experiment's information such as its ``name``, ``authors``, ``doi``, ``URL``, etc. +2. **Temporal Configuration** (``time_config``): Temporal characteristics of the experiment, such as the start and end dates, experiment class (time-independent or time-dependent), the testing intervals, etc. +3. **Spatial and Magnitude Configuration** (``region_config``): Describes testing region, such as its geographic bounds, magnitude ranges, and depth ranges. +4. **Seismic Catalog** (``catalog``): Defines the seismicity data source to test the models. It can either link to a seismic network API, or an existing file in the system. +5. **Models** (``models``): Configuration of forecasting models. It can direct to an additional configuration file with ``model_config`` for readability. See :ref:`model_config`. +6. **Evaluation Tests** (``tests``): Configuration of the statistical tests to evaluate the models. It can direct to an additional configuration file with ``test_config`` for readability. See :ref:`evaluation_config`. +7. **Postprocessing** (``postprocess``): Instructions on how to process and visualize the experiment's results, such as plotting forecasts or generating reports. See :ref:`postprocess`. + +.. note:: + + YAML (Yet Another Markup Language) is a human-readable format used for configuration files. It uses **key: value** pairs to define settings, and indentation to represent nested structures. Lists are denoted by hyphens (`-`). + + +**Example Basic Configuration** (``config.yml``): + +.. code-block:: yaml + + name: CSEP Experiment + time_config: + start_date: 2010-1-1T00:00:00 + end_date: 2020-1-1T00:00:00 + region_config: + region: region.txt + mag_min: 4.0 + mag_max: 9.0 + mag_bin: 0.1 + depth_min: 0 + depth_max: 70 + catalog: catalog.csv + models: + - Smoothed-Seismicity: + path: ssm.dat + - Uniform: + path: uniform.dat + tests: + - Poisson S-test: + func: poisson_evaluations.spatial_test + plot_func: plot_poisson_consistency_test + postprocess: + plot_forecasts: + cmap: magma + catalog: True + + + +Experiment Metadata +------------------- + +.. list-table:: + :widths: 20 80 + + * - **name** + - Maximum magnitude to be considered. + * - **authors** + - Authors of the experiment + * - **doi** + - identifier associated to the experiment + * - **URL** + - repository of the experiment (e.g., Github) + +Any non-parsed parameter (e.g., not specified in the documentation) will be stored also as metadata. + + +Temporal Definition +------------------- + +Configuring the temporal definition of the experiment is indicated with the ``time_config`` option, followed by an indented block of admissible parameters. The purpose of this configuration section is to set a testing **time-window**, or a sequence of **time-windows**. Each time-window is then defined by ``datetime`` strings representing its lower and upper edges. + +Time-windows can be defined from different combination of the following parameters: + +.. list-table:: + :widths: 20 80 + + * - **start_date** + - The start date of the experiment in UTC and ISO8601 format (``%Y-%m-%dT%H:%M:%S``) + * - **end_date** + - The end date of the experiment in UTC and ISO8601 format (``%Y-%m-%dT%H:%M:%S``) + * - **intervals** + - An integer amount of testing intervals (time windows). + * - **horizon** + - Indicates the time windows `length`. It is written as a number, followed by a hyphen (`-`) and a time unit (``days``, ``weeks``, ``months``, ``years``). e.g.: ``1-days``, ``2.5-years``. + * - **offset** + - Offset between consecutive time-windows. If none given or ``offset=horizon``, time-windows are non-overlapping. It is written as a number, followed by a hyphen (`-`) and a time unit (``days``, ``weeks``, ``months``, ``years``). e.g.: ``1-days``, ``2.5-years``. + * - **growth** + - How to discretize the time-windows between ``start_date`` and ``end_date``. Options are: **incremental** (The end of a time window matches the beginning of the next) or **cumulative** (All time-windows have a start at the experiment ``start_date``). + * - **exp_class** + - Experiment temporal class. Options are: + **ti** (default): Time-Independent; **td**: Time-Dependent. + +.. note:: + + For a Time-Independent (``ti``) experiment class, the following argument combinations are possible: + + - (``start_date``, ``end_date``) + - (``start_date``, ``end_date``, ``intervals``) + - (``start_date``, ``end_date``, ``horizon``) + - (``start_date``, ``intervals``, ``horizon``) + + For a Time-Dependent (``td``) experiment class, the following argument combinations are possible: + + - (``start_date``, ``end_date``, ``intervals``) + - (``start_date``, ``end_date``, ``horizon``) + - (``start_date``, ``intervals``, ``horizon``) + - (``start_date``, ``end_date``, ``horizon``, ``offset``) + - (``start_date``, ``intervals``, ``horizon``, ``offset``) + + +Some example of parameter combinations: + ++------------------------------------------------+----------------------------------------------------------+ +| .. code-block:: yaml | Two time-windows of equal size between 2010 and 2020 | +| | | +| time_config: | - ``2010-01-01T00:00:00`` - ``2015-01-01T00:00:00`` | +| start_date: 2010-01-01T00:00:00 | - ``2015-01-01T00:00:00`` - ``2020-01-01T00:00:00`` | +| end_date: 2020-01-01T00:00:00 | | +| intervals: 2 | | ++------------------------------------------------+----------------------------------------------------------+ +| .. code-block:: yaml | Two cummulative time-windows between 2010 and 2020 | +| | | +| time_config: | - ``2010-01-01T00:00:00`` - ``2015-01-01T00:00:00`` | +| start_date: 2010-01-01T00:00:00 | - ``2010-01-01T00:00:00`` - ``2020-01-01T00:00:00`` | +| end_date: 2020-01-01T00:00:00 | | +| intervals: 2 | | +| growth: cumulative | | ++------------------------------------------------+----------------------------------------------------------+ +| .. code-block:: yaml | Time-Dependent experiment with three ``1-day`` windows | +| | | +| time_config: | | +| start_date: 2010-01-01T00:00:00 | - ``2010-01-01T00:00:00`` - ``2010-01-02T00:00:00`` | +| intervals: 3 | - ``2010-01-02T00:00:00`` - ``2010-01-03T00:00:00`` | +| horizon: 1-days | - ``2010-01-03T00:00:00`` - ``2010-01-04T00:00:00`` | +| exp_class: td | | ++------------------------------------------------+----------------------------------------------------------+ +| .. code-block:: yaml | Two overlapping ``7-days`` time-windows | +| | | +| time_config: | - ``2010-01-01T00:00:00`` - ``2010-01-08T00:00:00`` | +| start_date: 2010-01-01T00:00:00 | - ``2010-01-02T00:00:00`` - ``2020-01-09T00:00:00`` | +| intervals: 2 | | +| horizon: 7-days | | +| offset: 1-day | | +| exp_class: td | | ++------------------------------------------------+----------------------------------------------------------+ + +Alternatively, time windows can be defined explicitly as a **list** of timewindow (which are a **list** of ``datetimes``): + +.. code-block:: yaml + + time_config: + timewindows: + - - 2010-01-01T00:00:00 + - 2011-01-01T00:00:00 + - - 2011-01-01T00:00:00 + - 2012-01-01T00:00:00 + +Spatial and Magnitude Definition +-------------------------------- + +Configuring the spatial and magnitude definitions is done through the ``region_config`` option, followed by an indented block of admissible parameters. Here, we need to define the spatial region (check the `Region `_ concept from **pyCSEP**), the magnitude `bins` (i.e., discretization) and the `depth` extent (used mostly to filter out seismicity outside this range): + +.. list-table:: + :widths: 20 80 + + * - **region** + - The spatial domain where forecasts will be tested. Either a file or a **CSEP** region. + * - **mag_min** + - The minimum magnitude of the experiment. + * - **mag_max** + - The maximum magnitude of the experiment. + * - **mag_bin** + - The size of a magnitude bin. + * - **depth_min** + - The minimum depth (in `km`) of the experiment. + * - **depth_max** + - The maximum depth (in `km`) of the experiment. + + +1. The ``region`` parameter can be defined from: + + + * A **CSEP** region: This correspond to pre-established testing regions for highly seismic areas. This parameter is linked to **pyCSEP** functions, and can be either ``italy_csep_region``, ``nz_csep_region``, ``california_relm_region`` or ``global_region``. + * A `txt` file with the spatial cells collection , where each cell is defined by its origin (e.g., the x (lon) and y (lat) of the lower-left corner). See the **pyCSEP** `documentation `_ on Regions, the class :class:`~csep.core.regions.CartesianGrid2D` and its method :meth:`~csep.core.regions.CartesianGrid2D.from_origins`. For example, for a region consisting of three cells, their origins can be written as: + + .. code-block:: + + 10.0 40.0 + 10.0 40.1 + 10.1 40.0 + +2. Magnitude definition: We need to define a magnitude discretization or `bins`. The parameters **mag_min**, **mag_max**, **mag_bin** allows to create an uniformly distributed set of bins. For example, the command: + + .. code-block:: yaml + + mag_min: 4.0 + mag_max: 5.0 + mag_bin: 0.5 + + would result in two magnitude bins with ranges ``[4.0, 4.5)`` and ``[4.5, 5.0)``. Alternatively, magnitudes can be written explicitly by their bin edges. For example: + + .. code-block:: yaml + + magnitudes: + - 4.0 + - 4.1 + - 4.2 + + resulting in the ``[4.0, 4.1)`` and ``[4.1, 4.2)`` + +3. Depths: The minimum and maximum depths are just required to filter out seismicity outside those ranges. + + +Some example of region configurations would be: + ++------------------------------------------------+---------------------------------------------------------------------+ +| .. code-block:: yaml | - Uses the **CSEP** Italy region, as defined by the function | +| | :func:`~csep.core.regions.italy_csep_region`. | +| region_config: | - Discretizes the magnitude range into 40 bins between 4.0 and 9.0 | +| region: italy_csep_region | - Test the models against `crustal` seismicity above 30 km. | +| mag_min: 5.0 | The -2 is meant in case of shallow events above sea level | +| mag_max: 9.0 | | +| mag_bin: 0.1 | | +| depth_min: -2 | | +| depth_max: 30 | | ++------------------------------------------------+---------------------------------------------------------------------+ +| .. code-block:: yaml | - Loads a file ``region_file.txt`` which contains the cells' | +| | originsof the region. | +| region_config: | - Contains two magnitude bins: ``[6.0, 7.0)``, ``[7.0, 8.0)`` and | +| region: region_file.txt | | +| depth_min: 70 | | +| depth_max: 150 | | +| magnitudes: | | +| - 6.0 | | +| - 7.0 | | +| - 8.0 | | ++------------------------------------------------+---------------------------------------------------------------------+ + + +Seismicity Catalog +------------------ + +The seismicity catalog can be defined with the ``catalog`` parameter. It represents the **main catalog** of the experiment, and will be used to test the forecasts against, or if required, as input catalog for time-dependent models. It can be obtained from: + +* **Authorative data source** + + **floatCSEP** can retrieve the catalog from a seismic network API. The possible options are: + + - ``query_gcmt``: Global Centroid Moment Tensor Catalog (https://www.globalcmt.org/) - via the API of the ISC (https://www.isc.ac.uk/) + - ``query_comcat``: ANSS Comprehensive Earthquake Catalog ComCat (https://earthquake.usgs.gov/data/comcat/) + - ``query_bsi``: Bollettino Sismico Italiano (https://bsi.ingv.it/) + - ``query_gns``: GNS GeoNet New Zealand Catalog (https://www.geonet.org.nz/) + +* **Catalog file in pyCSEP format** + + A file can be used as **main catalog**. It must be in a **pyCSEP** format, namely in the :meth:`~pycsep.utils.readers.csep_ascii` style (see :doc:`pycsep:concepts/catalogs`) or ``.json`` format. The latter is the default catalog used by **floatCSEP**, as it allows the storage of metadata. + + .. note:: + A catalog can be stored as ``.json`` with :meth:`CSEPCatalog.write_json() ` using **pyCSEP**. + +.. important:: + The main catalog will be stored, and consecutively filtered to the extent of each testing time-window, as well as to the experiment's spatial domain, and magnitude- and depth- ranges. diff --git a/docs/guide/model_config.rst b/docs/guide/model_config.rst index 18e29db..b3faccc 100644 --- a/docs/guide/model_config.rst +++ b/docs/guide/model_config.rst @@ -1,3 +1,5 @@ +.. _model_config: + Models Configuration ==================== diff --git a/docs/guide/postprocess_config.rst b/docs/guide/postprocess_config.rst index d682785..ff6c2e3 100644 --- a/docs/guide/postprocess_config.rst +++ b/docs/guide/postprocess_config.rst @@ -1,4 +1,6 @@ -Post-process Options +.. _postprocess: + +Post-Process Options ==================== TBI \ No newline at end of file diff --git a/docs/guide/region_config.rst b/docs/guide/region_config.rst deleted file mode 100644 index add2089..0000000 --- a/docs/guide/region_config.rst +++ /dev/null @@ -1,4 +0,0 @@ -Region Definition -================= - -TBI \ No newline at end of file diff --git a/docs/guide/time_config.rst b/docs/guide/time_config.rst deleted file mode 100644 index a96c689..0000000 --- a/docs/guide/time_config.rst +++ /dev/null @@ -1,4 +0,0 @@ -Temporal Definition -=================== - -TBI \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index fa8007c..8d36a03 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,7 +1,6 @@ =============================== floatCSEP: Floating Experiments =============================== - *Testing earthquake forecasts made simple.* .. image:: https://img.shields.io/badge/GitHub-Repository-blue?logo=github @@ -76,9 +75,9 @@ The `Collaboratory for the Study of Earthquake Predictability