
IBF-river-flood-pipeline

Forecast riverine flooding. Part of IBF-system.

Description

The pipeline roughly consists of three steps:

  • Extract data on river discharge from an external provider, both in specific locations (stations) and over pre-defined areas (administrative divisions).
  • Forecast floods by determining if the river discharge is higher than pre-defined thresholds; if so, calculate flood extent and impact (affected people).
  • Send this data to the IBF app.

The pipeline stores data in:

  • ibf-cosmos (Azure Cosmos DB): river discharge per station / administrative division, flood forecasts, and trigger thresholds
  • 510ibfsystem (Azure Storage Account): raw data from GloFAS and other geospatial data
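For example, the raw data in the storage account can be inspected with the azure-storage-blob SDK. This is a minimal sketch; the container name below is illustrative, not necessarily the one the pipeline uses:

from azure.storage.blob import BlobServiceClient

# Authenticate with the storage account's connection string
client = BlobServiceClient.from_connection_string("<connection-string>")

# List the raw GloFAS blobs ("glofas" is a hypothetical container name)
container = client.get_container_client("glofas")
for blob in container.list_blobs():
    print(blob.name)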

The pipeline depends on several external services; for an overview of these dependencies, see the functional architecture diagram.

Basic Usage

To run the pipeline locally:

  1. Fill in the secrets in .env.example and rename the file to .env; this way, they will be loaded as environment variables.
  2. Install the requirements:
pip install poetry
poetry install --no-interaction
  3. Run the pipeline with python flood_pipeline.py:
Usage: flood_pipeline.py [OPTIONS]

Options:
  --country TEXT        country ISO3
  --prepare             prepare discharge data
  --extract             extract discharge data
  --forecast            forecast floods
  --send                send to IBF
  --save                save to storage
  --datetimestart TEXT  datetime start ISO 8601
  --datetimeend TEXT    datetime end ISO 8601
  --debug               debug mode: process only one ensemble member from yesterday
  --help                show this message and exit.
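For example, a full end-to-end run for one country might look like this (the country code and dates are illustrative):

python flood_pipeline.py --country KEN --prepare --extract --forecast --send --save --datetimestart 2024-06-01 --datetimeend 2024-06-02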

Scenarios

You can run the pipeline with dummy data that simulate different scenarios using python run_scenario.py:

Usage: run_scenario.py [OPTIONS]

Options:
  -s, --events TEXT       list of events
  -c, --country TEXT      country
  -d, --upload_time TEXT  upload datetime [optional]
  --help                  show this message and exit

The scenario is entirely defined by the events parameter, which is a list of dictionaries. Each dictionary represents a flood event and has the following keys:

  • station-code: the unique ID of the GloFAS station
  • type: the type of the event, either trigger, low-alert, medium-alert, or no-trigger
  • lead-time: the number of days between the forecast and the event

Example of a scenario for Kenya with two events:

python run_scenario.py -c "KEN" -s '[{"station-code": "G5305", "type": "trigger", "lead-time": 5}, {"station-code": "G5142", "type": "medium-alert", "lead-time": 7}]'
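The same scenario can also be launched from Python, which avoids the shell quoting (a minimal sketch of the command above):

import json
import subprocess

# Two events for Kenya: a trigger in 5 days and a medium alert in 7 days
events = [
    {"station-code": "G5305", "type": "trigger", "lead-time": 5},
    {"station-code": "G5142", "type": "medium-alert", "lead-time": 7},
]
subprocess.run(
    ["python", "run_scenario.py", "-c", "KEN", "-s", json.dumps(events)],
    check=True,
)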

Tip

Scenarios can be run remotely and loaded to ibf-test through the Logic App river-flood-pipeline-scenario. Just make a POST request to the Logic App with the necessary payload¹.
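A minimal sketch of such a request, assuming the request body mirrors the run_scenario.py parameters (the trigger URL and the exact JSON schema are in the Logic App designer, see footnote 1):

import requests

url = "https://<logic-app-trigger-url>"  # copy from the Logic App designer
payload = {  # assumed schema, mirroring run_scenario.py's options
    "country": "KEN",
    "events": [{"station-code": "G5305", "type": "trigger", "lead-time": 5}],
}
response = requests.post(url, json=payload)
response.raise_for_status()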

Advanced Usage

Bug fixing

  1. Identify the error in the most recent runs of river-flood-pipeline-prod:
    • Check the run time; it should be roughly 30-90 min, so a much shorter or longer run is suspect
    • Check if any action doesn't have a green tick (failed action)
    • Check the logs in "Get logs from a container instance"
    • Find out which part of code fails based on traceback and error messages in the logs
  2. Fix the bug on dev and test:
    • Checkout the dev branch and pull the latest changes (git checkout dev && git pull)
    • Fix the bug
    • [OPTIONAL] Test locally, if you have time and disk space (a few GBs are needed)
    • Commit and push the changes to the dev branch (git add . && git commit -m "bug fix" && git push origin dev)
    • Test remotely with river-flood-pipeline-dev, which will upload the output to ibf-test; just make a POST request to the Logic App with the necessary payload¹. Example in Python:

import requests

# Trigger the dev pipeline for Kenya via the Logic App HTTP trigger
url = "https://prod-79.westeurope.logic.azure.com:443/workflows/..."
response = requests.post(url, json={"country": "KEN"})
  3. Deploy to prod:
    • Open a pull request (PR) from dev to main and inform the IBF team that it needs to be merged
    • Wait for the PR to be merged and the code deployed
  4. Re-submit the failed run of river-flood-pipeline-prod or wait for it to run again the next day.

How do I set up the pipeline for a new country?

  1. Check that the administrative boundaries are in the IBF system; if not, ask the IBF developers to add them
  2. Add the country-specific configuration in config/config.yaml
  3. Create historical flood extent maps:
python data_updates/add_flood_maps.py --country <country ISO3>
  4. Compute trigger and alert thresholds:
python data_updates/add_flood_thresholds.py --country <country ISO3>
  5. Update the Glofas data pipeline in IBF-data-factory so that it will trigger a pipeline run for the new country

How do I insert an exception for a specific country?

You don't. The pipeline is designed to work in the same way for all countries. If you need to change the pipeline's behavior for a specific country, please discuss your needs with your fellow data specialist; they will try their best to accommodate your request.

There is a new version of GloFAS, should I update the pipeline? How?

GloFAS should update river discharge data in a backward-compatible way, i.e. without changing the data model. If that is not the case, you need to have a look at floodpipeline/extract.py and change what's needed.

What will probably change with the new GloFAS version are the trigger/alert thresholds. To update the trigger/alert thresholds... [TBI].

Footnotes

  1. The URL and Request Body JSON Schema of the Logic App trigger are visible in Development Tools > Logic app designer > HTTP request trigger.
