From 5e0935a5943104aa42d5497fdecf1ce2424e1105 Mon Sep 17 00:00:00 2001
From: Gabriel Stefanini Vicente
-2. **Data Products**. These are analytical products derived from the Foundational Datasets, which can be further used to generate indicators and insights. All data products include documentation, links to original data sources (and/or information on how to access them), and a description of their limitations. Reference resources are also cited, where relevant. In the documentation, each Data Product has it's own "chapter", generated through use of a Jupyter notebook.
+2. **Data Products**. These are analytical products derived from the Datasets, which can be further used to generate indicators and insights. All data products include documentation, links to original data sources (and/or information on how to access them), and a description of their limitations. Reference resources are also cited, where relevant. In the documentation, each Data Product has its own "chapter", generated through use of a Jupyter notebook.
-3. **Insights and Indicators**. Each Data Goods package may also include additional analytical work, such as dynamic maps, data visualizaations, and/or sample indicators. Indicators can be derived from a combination of **Foundational Datasets** and **Data Products**. By combining these two inputs, teams are empowered to develop a large array of indicators to meet their project needs. Indicators can be presented side-by-side in an Excel workbook -- a format that is generally accessible to the widest audience. Because all indicators are based on the same underlying data, they are comparable with each other, across geographies and across time.
+3. **Insights and Indicators**. Each Data Goods package may also include additional analytical work, such as dynamic maps, data visualizations, and/or sample indicators. Indicators can be derived from a combination of **Datasets** and **Data Products**. By combining these two inputs, teams are empowered to develop a large array of indicators to meet their project needs. Indicators can be presented side-by-side in an Excel workbook -- a format that is generally accessible to the widest audience. Because all indicators are based on the same underlying data, they are comparable with each other, across geographies and across time.
4. **Data Lab Team**. For each project, the [World Bank Data Lab](https://wbdatalab.org/) recruits colleagues from throughout our organization, pooling our collective great talents in support of our lending and technical assistance operations. Data Goods documentation includes a list and contact information for the unique team that prepared the Goods.
diff --git a/docs/requirements.txt b/docs/requirements.txt
deleted file mode 100644
index 873d7b8b..00000000
--- a/docs/requirements.txt
+++ /dev/null
@@ -1,2 +0,0 @@
-docutils==0.17.1
-jupyter-book==1.0.0
\ No newline at end of file
diff --git a/notebooks/hsos-survey/README.md b/notebooks/hsos-survey/README.md
index d150d034..5847ad1f 100644
--- a/notebooks/hsos-survey/README.md
+++ b/notebooks/hsos-survey/README.md
@@ -1,5 +1,4 @@
-# Humanitarian Assistance Survey Data
-
+# Humanitarian Assistance Survey
## Data
@@ -7,7 +6,6 @@ We use data from the Humanitarian Situation Overview Survey (HSOS) conducted by
The HSOS includes information on community situations and needs relating to shelter, electricity, water sanitation and hygiene (WASH), food security, livelihoods, health, education, humanitarian assistance, and priority needs. HSOS has disaggregated information on the different conditions and needs of residents and internally displaced persons (IDPs).
-
## Methodology
In the analysis, we look at trends in the data before and after the earthquake, and whether those trends differ depending on severity of exposure to the earthquake. We use data from December 2022 as the most recent data before the earthquake and April 2023 as the first dataset after the earthquake. Note that we do not include data from January 2023 as it was collected for NWS only, and we do not include data from February 2023 as it was collected during the time of the earthquake.
@@ -16,7 +14,6 @@ We compare outcomes for communities severely impacted by the earthquake before/a
The earthquake exposure measure uses the Modified Mercalli Intensity Scale (mmi). From this we derive three groups of communities: light earthquake intensity, moderate intensity, and strong or very strong intensity.
-
## Implementation
Code to replicate the analysis can be found [here](https://github.com/datapartnership/syria-economic-monitor/tree/main/notebooks/hsos-survey/notebooks/hsos-survey/Do%20Files/).
@@ -24,11 +21,10 @@ The [main script](https://github.com/datapartnership/syria-economic-monitor/tree
The script to produce the graphs can be found [here](https://github.com/datapartnership/syria-economic-monitor/tree/main/notebooks/hsos-survey/notebooks/hsos-survey/Do%20Files/4_Bar%Graphs.do).
All data used in the analysis can be downloaded from [Syria Humanitarian Situation Overview Survey](https://reach-info.org/syr/hsos/).
-
-
## Findings
### Humanitarian Assistance
+
Communities strongly affected by the earthquake (in Northwest Syria) were
more likely to receive humanitarian assistance than communities that were
less affected (in Northeast Syria), both before and after the earthquake.
@@ -77,6 +73,7 @@ Access to Humanitarian Aid - Voucher
diff --git a/notebooks/internet-connectivity/README.md b/notebooks/internet-connectivity/README.md index 6949e72b..257963c5 100644 --- a/notebooks/internet-connectivity/README.md +++ b/notebooks/internet-connectivity/README.md @@ -19,4 +19,3 @@ Once the data was obtained from the Ookla Speedtest portal, the point data were The limitation of Ookla's Speedtest connectivity relies on user-generated tests. Although the average number of users who take a test remain relatively consistent per month, it is subject to fluctuation. The dataset also does not contain information about the same latitude and longitude consistently. However, given aggregated data is used (at admin 1 or admin 2 levels), the findings can still be useful. ## Next Steps - diff --git a/notebooks/mobility/README.md b/notebooks/mobility/README.md index 12d5a100..a20b2cfd 100644 --- a/notebooks/mobility/README.md +++ b/notebooks/mobility/README.md @@ -10,7 +10,7 @@ This pilot study seeks to demonstrate the potential of mobility data as a powerf This pilot study resulted in following (experimental) outputs: -- {ref}`mobility-stops` +- [Estimating cross-border movement in Lebanon and Syria through Mobility Data](stops/README.md) - [Türkiye-Syria Earquake Impact](https://datapartnership.org/turkiye-earthquake-impact/notebooks/mobility/README.html) ## Data diff --git a/notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb b/notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb index ca4b428f..8f304150 100644 --- a/notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb +++ b/notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb @@ -71,6 +71,7 @@ "source": [ "# Below, see auxiliary functions. 
Ideally, we'll move them to a package in the future\n", "\n", + "\n", "def get_h3_tessellation(\n", " gdf: geopandas.GeoDataFrame, name=\"shapeName\", resolution=RESOLUTION\n", "):\n", @@ -84,7 +85,9 @@ " match geometry.geom_type:\n", " case \"Polygon\":\n", " hex_ids = h3.polyfill(\n", - " shapely.geometry.mapping(geometry), resolution, geo_json_conformant=True\n", + " shapely.geometry.mapping(geometry),\n", + " resolution,\n", + " geo_json_conformant=True,\n", " )\n", "\n", " h3_tessellation = h3_tessellation.union(set(hex_ids))\n", @@ -93,7 +96,9 @@ " case \"MultiPolygon\":\n", " for x in geometry.geoms:\n", " hex_ids = h3.polyfill(\n", - " shapely.geometry.mapping(x), resolution, geo_json_conformant=True\n", + " shapely.geometry.mapping(x),\n", + " resolution,\n", + " geo_json_conformant=True,\n", " )\n", "\n", " h3_tessellation = h3_tessellation.union(set(hex_ids))\n", @@ -715,7 +720,7 @@ "source": [ "### Region B\n", "\n", - "In this step, we generate **Region B** defined as a border buffer strip between Lebanon and Syria, using boudaries provided by [geoBoundaries](https://www.geoboundaries.org)." + "In this step, we generate **Region B** defined as a border buffer strip between Lebanon and Syria, using boundaries provided by [geoBoundaries](https://www.geoboundaries.org)." 
] }, { @@ -1635,7 +1640,7 @@ "TESSELLATION.to_crs(epsg=3857).plot(ax=ax, alpha=0.25, color=\"red\", edgecolor=\"k\")\n", "\n", "ax.axis(\"off\")\n", - "ax.set_title(f\"Lebanon-Syria H3 Tessellation\", fontsize=24)\n", + "ax.set_title(\"Lebanon-Syria H3 Tessellation\", fontsize=24)\n", "\n", "cx.add_basemap(ax)" ] @@ -1665,7 +1670,9 @@ }, "outputs": [], "source": [ - "TESSELLATION.to_file(\"../../data/interim/tessellation/LNBSYRH3.geojson\", driver=\"GeoJSON\")" + "TESSELLATION.to_file(\n", + " \"../../data/interim/tessellation/LNBSYRH3.geojson\", driver=\"GeoJSON\"\n", + ")" ] }, { @@ -1680,10 +1687,14 @@ "outputs": [], "source": [ "TESSELLATION[\"coordinates\"] = TESSELLATION[\"tile_ID\"].apply(h3.h3_to_geo)\n", - "TESSELLATION[[\"lat\", \"lon\"]] = pd.DataFrame(TESSELLATION[\"coordinates\"].tolist(), columns=[\"lat\", \"lon\"])\n", + "TESSELLATION[[\"lat\", \"lon\"]] = pd.DataFrame(\n", + " TESSELLATION[\"coordinates\"].tolist(), columns=[\"lat\", \"lon\"]\n", + ")\n", "TESSELLATION[\"id\"] = TESSELLATION[\"tile_ID\"]\n", "TESSELLATION[\"name\"] = TESSELLATION[\"tile_ID\"]\n", - "TESSELLATION[[\"id\", \"name\", \"lat\", \"lon\"]].to_csv(\"../../data/final/locations.csv\", index=False)" + "TESSELLATION[[\"id\", \"name\", \"lat\", \"lon\"]].to_csv(\n", + " \"../../data/final/locations.csv\", index=False\n", + ")" ] } ], @@ -1703,7 +1714,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.9" + "version": "3.10.13" } }, "nbformat": 4, diff --git a/notebooks/mobility/stops/01b-convenience-sampling.ipynb b/notebooks/mobility/stops/01b-convenience-sampling.ipynb index 2d464f98..1c9fc66b 100644 --- a/notebooks/mobility/stops/01b-convenience-sampling.ipynb +++ b/notebooks/mobility/stops/01b-convenience-sampling.ipynb @@ -5,7 +5,7 @@ "id": "dba12959-03fc-4875-9ab5-35888785c799", "metadata": {}, "source": [ - "# Constructiong Samples\n", + "# Create Longitudinal Panels\n", "\n", "In this step, we create sub-panels **A**
(formal) and **B** (informal) as described in the [methodological notes](README.md) of this pilot study. The sub-panels are composed of longitudinal mobility data generated by GPS-enabled devices based on whether they were detected within the proximity of [Region A or Region B](01a-aoi-and-tessellation.ipynb#regions-a-b) throughout the time horizon. \n", "\n", @@ -38,8 +38,7 @@ "outputs": [], "source": [ "import dask.dataframe as dd\n", - "import geopandas\n", - "import pandas as pd" + "import geopandas" ] }, { @@ -93,7 +92,7 @@ "source": [ "### Area of Interest\n", "\n", - "On the previous step, we defiend the area(s) of interest. Here, we selected it." + "On the previous step, we defined the area(s) of interest. Here, we select it." ] }, { @@ -326,8 +325,8 @@ "outputs": [], "source": [ "PATH = [\n", - " f\"../../data/external/outlogic/LB/date=*/*.parquet\",\n", - " f\"../../data/external/outlogic/SY/date=*/*.parquet\",\n", + " \"../../data/external/outlogic/LB/date=*/*.parquet\",\n", + " \"../../data/external/outlogic/SY/date=*/*.parquet\",\n", "]" ] }, @@ -529,17 +528,7 @@ "source": [ "## Repartitioning\n", "\n", - "Let's repartition on `country`, `year` and `quarter` (to reduce the overhead and improve performance)." - ] - }, - { - "cell_type": "markdown", - "id": "da365242-cafd-420f-be02-6d09a5c96bae", - "metadata": {}, - "source": [ - "### Apply tranformations\n", - "\n", - "In this step, we convert `datetime` to the **Asia/Damascus** timezone and calculate the quarter. " + "Let's repartition on `country`, `year` and `quarter` (to reduce the overhead and improve performance). In this step, we convert `datetime` to the **Asia/Damascus** timezone and calculate the quarter.
" ] }, { @@ -625,7 +614,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.8" + "version": "3.10.13" }, "toc-showtags": false }, diff --git a/notebooks/mobility/stops/03a-count-within-aoi.ipynb b/notebooks/mobility/stops/03a-count-within-aoi.ipynb index f31e892a..7cddd09f 100644 --- a/notebooks/mobility/stops/03a-count-within-aoi.ipynb +++ b/notebooks/mobility/stops/03a-count-within-aoi.ipynb @@ -5,7 +5,7 @@ "id": "dba12959-03fc-4875-9ab5-35888785c799", "metadata": {}, "source": [ - "# Counting Devices within Areas of Interest\n", + "# Calculate Number of Devices within Areas of Interest\n", "\n", "In this step, we calculate the number of devices detected within the **areas of interest**, creating a time series." ] @@ -23,7 +23,6 @@ "source": [ "import dask.dataframe as dd\n", "import geopandas\n", - "import matplotlib.colors as mcolors\n", "import matplotlib.dates as mdates\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", @@ -123,8 +122,7 @@ " f\"../../data/interim/panels/{NAME}\",\n", "]\n", "\n", - "filters = [\n", - "]" + "filters = []" ] }, { @@ -557,7 +555,10 @@ ], "source": [ "gdf = geopandas.GeoDataFrame(\n", - " df[[\"longitude\", \"latitude\"]].iloc[:10000], geometry=geopandas.points_from_xy(df.longitude.iloc[:10000], df.latitude.iloc[:10000], crs=\"EPSG:4326\")\n", + " df[[\"longitude\", \"latitude\"]].iloc[:10000],\n", + " geometry=geopandas.points_from_xy(\n", + " df.longitude.iloc[:10000], df.latitude.iloc[:10000], crs=\"EPSG:4326\"\n", + " ),\n", ")\n", "gdf.explore()" ] @@ -595,9 +596,7 @@ "metadata": {}, "outputs": [], "source": [ - "count = (\n", - " ddf.groupby([\"date\"])[\"uid\"].nunique().compute().to_frame(\"count\")\n", - ")\n", + "count = ddf.groupby([\"date\"])[\"uid\"].nunique().compute().to_frame(\"count\")\n", "\n", "count.index = pd.to_datetime(count.index)" ] @@ -655,7 +654,7 @@ " fontweight=\"bold\",\n", ")\n", "ax.yaxis.set_label_text(\"Number of devices\")\n", - 
"ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%Y'));" + "ax.xaxis.set_major_formatter(mdates.DateFormatter(\"%b-%Y\"));" ] }, { @@ -669,7 +668,9 @@ }, "outputs": [], "source": [ - "fig.savefig(f\"../../reports/count_{NAME}.png\", dpi=300, transparent=True, bbox_inches=\"tight\")" + "fig.savefig(\n", + " f\"../../reports/count_{NAME}.png\", dpi=300, transparent=True, bbox_inches=\"tight\"\n", + ")" ] }, { @@ -697,7 +698,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.8" + "version": "3.10.13" }, "toc-showtags": false }, diff --git a/notebooks/mobility/stops/03b-estimate-stay-locations.ipynb b/notebooks/mobility/stops/03b-estimate-stay-locations.ipynb index c84ead45..8eb7d22b 100644 --- a/notebooks/mobility/stops/03b-estimate-stay-locations.ipynb +++ b/notebooks/mobility/stops/03b-estimate-stay-locations.ipynb @@ -29,7 +29,7 @@ "from dask.distributed import Client\n", "from shapely.geometry import Polygon\n", "from skmob.measures.individual import distance_straight_line\n", - "from skmob.preprocessing import clustering, compression, detection, filtering" + "from skmob.preprocessing import clustering, detection, filtering" ] }, { diff --git a/notebooks/mobility/stops/README.md b/notebooks/mobility/stops/README.md index 6bf4b76c..deda899f 100644 --- a/notebooks/mobility/stops/README.md +++ b/notebooks/mobility/stops/README.md @@ -1,5 +1,3 @@ -(mobility-stops)= - # Estimating cross-border movement in Lebanon and Syria through Mobility Data This pilot exploratory study seeks to leverage mobility data to estimate the location of activity and, in particular, tentatively estimate the location of formal and informal activity between Lebanon and Syria. 
The working hypothesis is that by identifying devices seen in the proximity of point of interests, such as border checkpoints, and mapping their mobility traces throughout the time horizon, the movement captured may indicate changes in trade patterns and potentially identify new trade centers and corridors. @@ -10,7 +8,7 @@ The results presented are the stay locations (or “stops”) generated by sub-p The WB Data Lab team obtained a high-frequency longitudinal panel of human mobility. The data consisted of anonymized timestamped geographical points generated by 639,233 GPS-enabled devices, located in Lebanon and Syria and spanning the period of January 1, 2020, to October 15, 2022. The mobility data panel has been provided pro-bono by [Outlogic](https://outlogic.io) through the proposal [Syria Economic Monitor (Outlogic)](https://portal.datapartnership.org/readableproposal/407) of the [Development Data Partnership](https://datapartnership.org). -During the project's execution, [Outlogic Observation Panel](https://outlogic.io)'s global daily data feed was ingested and processed through the [Mobility](https://docs.datapartnership.org/collections/mobility/README.html) pipeline maintained by the [Development Data Partnership](https://datapartnership.org). As input, the mobility data has been acessed as an Apache Parquet dataset, partitioned on country, year, and date. For additional information, please refer to the [Mobility](https://docs.datapartnership.org/collections/mobility/README.html) documentation accessible to all World Bank staff. +During the project's execution, [Outlogic Observation Panel](https://outlogic.io)'s global daily data feed was ingested and processed through the [Mobility](https://docs.datapartnership.org/collections/mobility/README.html) pipeline maintained by the [Development Data Partnership](https://datapartnership.org). As input, the mobility data has been accessed as an Apache Parquet dataset, partitioned on country, year, and date. 
For additional information, please refer to the [Mobility](https://docs.datapartnership.org/collections/mobility/README.html) documentation accessible to all World Bank staff. ## Methodology diff --git a/notebooks/ntl-analysis/README.md b/notebooks/ntl-analysis/README.md index 2b664a7f..04263c38 100644 --- a/notebooks/ntl-analysis/README.md +++ b/notebooks/ntl-analysis/README.md @@ -6,7 +6,7 @@ Nighttime lights have become a commonly used resource to estimate changes in loc We use nighttime lights data from the VIIRS Black Marble dataset. Raw nighttime lights data requires correction due to cloud cover and stray light, such as lunar light. The Black Marble dataset applies advanced algorithms to correct raw nighttime light values and calibrate data so that trends in lights over time can be meaningfully analyzed. From VIIRS Black Marble, we use monthly data from January 2012 through August 2022—where data is available at a 500-meter resolution. -For further information, please refer to {ref}`foundational_datasets`. +For more information, please refer to {ref}`datasets`. ## Methodology @@ -25,7 +25,7 @@ Data for the analysis can be downloaded from: * __Black Marble Nighttime Lights:__ There are two options to access the data: * The code [here](https://github.com/datapartnership/syria-economic-monitor/blob/main/notebooks/ntl-analysis/01_download_black_marble.R) downloads raw data from the [NASA archive](https://ladsweb.modaps.eosdis.nasa.gov/missions-and-measurements/products/VNP46A3/) and processes the data for Syria---mosaicing raster tiles together to cover Syria. Running the code requires a NASA bearer token; the documentation [here](https://github.com/ramarty/download_blackmarble) describes how to obtain a token. 
- + * Pre-processed data can be downloaded from [here](https://datacatalog.worldbank.org/int/data/dataset/0063879/syria__night_time_lights), using the __Night Time Lights BlackMarble Data__ ## Findings diff --git a/notebooks/ntl-analysis/ntl-update-12-2023.md b/notebooks/ntl-analysis/ntl-update-12-2023.md index 6bcabf18..4e139807 100644 --- a/notebooks/ntl-analysis/ntl-update-12-2023.md +++ b/notebooks/ntl-analysis/ntl-update-12-2023.md @@ -35,5 +35,3 @@ We examine trends in daily nighttime lights from before and after the February 6 ![](../../reports/figures/pchange_ntl_nogf_2022_2023.png) ![](../../reports/figures/pchange_ntl_monthly.png) - - diff --git a/notebooks/ntl-analysis/ntl-update-6-2023.md b/notebooks/ntl-analysis/ntl-update-6-2023.md index 91f9f56b..2f39a6f8 100644 --- a/notebooks/ntl-analysis/ntl-update-6-2023.md +++ b/notebooks/ntl-analysis/ntl-update-6-2023.md @@ -24,7 +24,7 @@ The below figures show daily trends (and a 7 day moving average from daily data) > *Trends in Nighttime Lights in Border Crossing Locations. The red line indicates February 6, 2023.* -### +### ### Maps of Percent Change in Nighttime Lights diff --git a/notebooks/traffic/README.md b/notebooks/traffic/README.md index f706dcbe..9e15cacb 100644 --- a/notebooks/traffic/README.md +++ b/notebooks/traffic/README.md @@ -12,7 +12,7 @@ We test to the use of three data sources for monitoring trends in traffic at bor * __Mobility Data:__ We leverage mobility data from GPS-enabled devices to monitor the number of unique devices at border crossing locations. Mobility data comes from Outlogic, which is further described in {doc}`../mobility/README`. The number of unique devices observed at border crossing locations can indicate activity---and traffic---at the crossing. A key advantage of mobility data over satellite imagery is that it is available at all points in time; the dataset captures the timestamp and location of GPS-enable devices. 
Consequently, the data can be aggregated hourly, daily, or at other intervals. Mobility data may underestimate activity as not everyone may have a GPS-enabled device; however, we check whether trends in mobility data are similar to trends captured from satellite imagery. -For further information, please refer to {ref}`foundational_datasets`. +For more information, please refer to {ref}`datasets`. ## Implementation diff --git a/notebooks/vegetation-conditions/README.md b/notebooks/vegetation-conditions/README.md index 73e049df..d7ac6840 100644 --- a/notebooks/vegetation-conditions/README.md +++ b/notebooks/vegetation-conditions/README.md @@ -10,13 +10,13 @@ The importance of monitoring vegetation conditions cannot be overstated, particu **Figure 2.** Syria land cover. -Remote sensing techniques, such as those employed through the use of Moderate Resolution Imaging Spectroradiometer ([MODIS](https://modis.gsfc.nasa.gov/about/)) Terra ([MOD13Q1](https://lpdaac.usgs.gov/products/mod13q1v061/)) and Aqua ([MYD13Q1](https://lpdaac.usgs.gov/products/myd13q1v061/)) Vegetation Indices 16-day L3 Global 250m time series data, have revolutionized the way we monitor vegetation conditions [2]. By deriving variables such as ratio anomaly, difference anomaly, standardized anomaly, and vegetation condition index, these analyses enable the quantification of vegetation changes over time and across vast spatial extents. +Remote sensing techniques, such as those employed through the use of Moderate Resolution Imaging Spectroradiometer ([MODIS](https://modis.gsfc.nasa.gov/about/)) Terra ([MOD13Q1](https://lpdaac.usgs.gov/products/mod13q1v061/)) and Aqua ([MYD13Q1](https://lpdaac.usgs.gov/products/myd13q1v061/)) Vegetation Indices 16-day L3 Global 250m time series data, have revolutionized the way we monitor vegetation conditions [2]. 
By deriving variables such as ratio anomaly, difference anomaly, standardized anomaly, and vegetation condition index, these analyses enable the quantification of vegetation changes over time and across vast spatial extents. Regular vegetation monitoring provides numerous benefits, including the ability to detect and mitigate the impacts of deforestation, land degradation, and desertification, as well as monitor crop health and inform agricultural decision-making [3]. Furthermore, such monitoring can guide policymakers in the development of adaptive strategies and environmental policies, ensuring a sustainable future for the country and its people. Building on the foundation of monitoring vegetation conditions using MODIS MOD13Q1 and MYD13Q1 time series data, a more comprehensive understanding of vegetation dynamics in cropland areas can be achieved by incorporating phenological analysis. This is particularly important in a country like Syria, where agriculture is a vital sector for the economy and food security [4]. Accurate and timely information on crop phenology can significantly enhance agricultural management, resource allocation, and the overall resilience of the farming sector. -To achieve this, the use of [TIMESAT](https://web.nateko.lu.se/timesat/timesat.asp), a software tool designed for the analysis of time series data, can be employed to extract critical phenological parameters such as the Start of Season (SOS), Mid of Season (MOS), and End of Season (EOS) from the Enhanced Vegetation Index (EVI) data [5]. By first clipping the EVI data to the cropland extent, the analysis becomes more focused on the regions of interest, ensuring that the extracted parameters are directly relevant to agricultural practices. 
+To achieve this, the use of [TIMESAT](https://web.nateko.lu.se/timesat/timesat.asp), a software tool designed for the analysis of time series data, can be employed to extract critical phenological parameters such as the Start of Season (SOS), Mid of Season (MOS), and End of Season (EOS) from the Enhanced Vegetation Index (EVI) data [5]. By first clipping the EVI data to the cropland extent, the analysis becomes more focused on the regions of interest, ensuring that the extracted parameters are directly relevant to agricultural practices. This phenological information can then be used to guide farmers and agricultural stakeholders in making timely and informed decisions, such as when to plant, irrigate, or harvest their crops. Moreover, it can help identify potential threats to crop health and yield, such as disease outbreaks, pest infestations, or the effects of climate change, allowing for proactive and targeted interventions. Ultimately, incorporating phenological analysis into vegetation monitoring efforts provides a more holistic and actionable understanding of the complex dynamics that govern agricultural productivity and environmental sustainability. @@ -26,7 +26,7 @@ In this study, we utilize a range of high-quality datasets to analyze vegetation ### Crop extent -We used the new [ESA World Cover](https://esa-worldcover.org/en) map 10m LULC to mask out areas which aren't of interest in computing the EVI, i.e. built-up, water, forest, etc. The cropland class has value equal to 40, which will be used within Google Earth Engine to generate the mask. +We used the new [ESA World Cover](https://esa-worldcover.org/en) map 10m LULC to mask out areas which aren't of interest in computing the EVI, i.e. built-up, water, forest, etc. The cropland class has value equal to 40, which will be used within Google Earth Engine to generate the mask. 
There are many ways to download the WorldCover, as explained in the WorldCover [Data Access](https://esa-worldcover.org/en/data-access) page. @@ -66,13 +66,13 @@ Similarly, rainfall patterns play a crucial role in determining water availabili **Figure 6.** Accumulated rainfall, September 2023. -By analyzing these climate data alongside vegetation indices and phenological information, we can correlate climate trends with vegetation dynamics. +By analyzing these climate data alongside vegetation indices and phenological information, we can correlate climate trends with vegetation dynamics. Monthly temperature derived from [ERA5-Land](https://doi.org/10.24381/cds.68d2bb30), and rainfall data come from [CHRIPS](https://www.chc.ucsb.edu/data/chirps). ## Limitations and Assumptions -Getting VI data with good quality for all period are challenging (pixels covered with cloud, snow/ice, aerosol quantity, shadow) for optic data (MODIS). Cultivated area year by year are varies, due to MODIS data quality and crop type is not described, so the seasonal parameters are for general cropland. +Getting VI data with good quality for all period are challenging (pixels covered with cloud, snow/ice, aerosol quantity, shadow) for optic data (MODIS). Cultivated area year by year are varies, due to MODIS data quality and crop type is not described, so the seasonal parameters are for general cropland. At this point, the analysis is also limited to seasonal crops due to difficulty to capture the dynamics of perennial crops within a year. The value may not represent for smaller cropland and presented result are only based upon the most current available remote sensing data. As the climate phenomena is a dynamic situation, the current realities may differ from what is depicted in this document. @@ -136,7 +136,7 @@ A how-to guideline on calculating the phenological metrics are available through We utilized GEE to acquire a time series of EVI data. 
The EVI data was then processed using the ArcPy library in ArcGIS to generate long-term statistics and derive various vegetation indices products. Following this, we employed the TIMESAT software to extract seasonality parameters from the processed vegetation data. -In this study, we employed a three-step coding approach to analyze the time series EVI data and derive vegetation index products. The first step utilized GEE to efficiently batch download the time series EVI data. +In this study, we employed a three-step coding approach to analyze the time series EVI data and derive vegetation index products. The first step utilized GEE to efficiently batch download the time series EVI data. * The code for downloading timeseries EVI in GEE: [gee-batch-export-mxd13q1.js](/gee-batch-export-mxd13q1.js) @@ -150,7 +150,7 @@ Lastly, another ArcPy script was employed to compute various vegetation index de ## Result -We present a summary of the key derived variables employed in our analysis to monitor vegetation conditions and dynamics within Syria's cropland areas. +We present a summary of the key derived variables employed in our analysis to monitor vegetation conditions and dynamics within Syria's cropland areas. 
 ### Anomaly and Vegetation Condition

@@ -479,35 +479,35 @@ This section presents a comprehensive analysis of planting and harvest cycles in

 **Figure 42.** Annual Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Tartous
 :sync: key10
 ![AHP43](./images/plot_syr_adm1_Tartous_annual_pheno.png)

 **Figure 43.** Annual Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Ar-Raqqa
 :sync: key11
 ![AHP44](./images/plot_syr_adm1_Ar-Raqqa_annual_pheno.png)

 **Figure 44.** Annual Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Dar'a
 :sync: key12
 ![AHP45](./images/plot_syr_adm1_Dar'a_annual_pheno.png)

 **Figure 45.** Annual Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} As-Sweida
 :sync: key13
 ![AHP46](./images/plot_syr_adm1_As-Sweida_annual_pheno.png)

 **Figure 46.** Annual Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Quneitra
 :sync: key14
 ![AHP47](./images/plot_syr_adm1_Quneitra_annual_pheno.png)
@@ -581,35 +581,35 @@ This section presents a comprehensive analysis of planting and harvest cycles in

 **Figure 56.** Monthly Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Tartous
 :sync: key10
 ![MHP57](./images/plot_syr_adm1_Tartous_monthly_pheno.png)

 **Figure 57.** Monthly Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Ar-Raqqa
 :sync: key11
 ![MHP58](./images/plot_syr_adm1_Ar-Raqqa_monthly_pheno.png)

 **Figure 58.** Monthly Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Dar'a
 :sync: key12
 ![MHP59](./images/plot_syr_adm1_Dar'a_monthly_pheno.png)

 **Figure 59.** Monthly Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} As-Sweida
 :sync: key13
 ![MHP60](./images/plot_syr_adm1_As-Sweida_monthly_pheno.png)

 **Figure 60.** Monthly Harvest and Planting, 2003-2023
 :::
-
+
 :::{tab-item} Quneitra
 :sync: key14
 ![MHP61](./images/plot_syr_adm1_Quneitra_monthly_pheno.png)
@@ -703,35 +703,35 @@ This section presents a comprehensive analysis of planting and harvest cycles in

 **Figure 72.** Annual Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Tartous
 :sync: key10
 ![ART73](./images/plot_syr_adm1_Tartous_annual_preciptavg.png)

 **Figure 73.** Annual Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Ar-Raqqa
 :sync: key11
 ![ART74](./images/plot_syr_adm1_Ar-Raqqa_annual_preciptavg.png)

 **Figure 74.** Annual Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Dar'a
 :sync: key12
 ![ART75](./images/plot_syr_adm1_Dar'a_annual_preciptavg.png)

 **Figure 75.** Annual Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} As-Sweida
 :sync: key13
 ![ART76](./images/plot_syr_adm1_As-Sweida_annual_preciptavg.png)

 **Figure 76.** Annual Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Quneitra
 :sync: key14
 ![ART77](./images/plot_syr_adm1_Quneitra_annual_preciptavg.png)
@@ -805,35 +805,35 @@ This section presents a comprehensive analysis of planting and harvest cycles in

 **Figure 86.** Monthly Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Tartous
 :sync: key10
 ![MRT87](./images/plot_syr_adm1_Tartous_monthly_preciptavg.png)

 **Figure 87.** Monthly Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Ar-Raqqa
 :sync: key11
 ![MRT88](./images/plot_syr_adm1_Ar-Raqqa_monthly_preciptavg.png)

 **Figure 88.** Monthly Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Dar'a
 :sync: key12
 ![MRT89](./images/plot_syr_adm1_Dar'a_monthly_preciptavg.png)

 **Figure 89.** Monthly Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} As-Sweida
 :sync: key13
 ![MRT90](./images/plot_syr_adm1_As-Sweida_monthly_preciptavg.png)

 **Figure 90.** Monthly Rainfall and Temperature, 2003-2023
 :::
-
+
 :::{tab-item} Quneitra
 :sync: key14
 ![MRT91](./images/plot_syr_adm1_Quneitra_monthly_preciptavg.png)
@@ -1068,7 +1068,7 @@ This section delves into the analysis of annual and monthly trends in planting a

 The aggregate data in admin0, 1, 2 and 3 level, along with the maps and charts are available in the Sharepoint: [link](https://worldbankgroup.sharepoint.com/:f:/r/teams/DevelopmentDataPartnershipCommunity-WBGroup/Shared%20Documents/Projects/Data%20Lab/Syria%20Economic%20Monitor/Data/vegetation-conditions?csf=1&web=1&e=4XhouQ) - accessible from internal network.

-## Potential Application
+## Potential Application

 Above products provides an important starting point for continuous monitoring of the crop planting status. Continuous monitoring could inform the following assessments:

@@ -1095,6 +1095,3 @@ This information is necessary for both policy makers, farmers, and other agricul
 7. Peel, M. C., Finlayson, B. L., and McMahon, T. A.: Updated world map of the Köppen-Geiger climate classification, Hydrol. Earth Syst. Sci., 11, 1633–1644, https://doi.org/10.5194/hess-11-1633-2007, 2007
 8. Porter John R and Semenov Mikhail A 2005Crop responses to climatic variation. Phil. Trans. R. Soc. B. 360:2021–2035. http://doi.org/10.1098/rstb.2005.1752
 9. Rockström, J., & Falkenmark, M. (2000). Semiarid crop production from a hydrological perspective: Gap between potential and actual yields. Critical Reviews in Plant Sciences, 19(4), 319-346. https://doi.org/10.1080/07352680091139259
-
-
-
diff --git a/notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md b/notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md
index 19167e4a..05f78fef 100644
--- a/notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md
+++ b/notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md
@@ -28,7 +28,7 @@ For this tutorial, we are working on these folder `X:/Temp/modis/syr/` directory

 Place to put downloaded VI data, and pre-process temporary files.

-2. `output`
+2. `output`
     1. `01_raw_seasonality_metrics` Place for raw outputs seasonality files generated by TIMESAT
     2. `02_tif_seasonality_metrics` Place for ready-to-used seasonality raster (`.tif`) files
     3. `03_extract_date` Place for extract each seasonality date
@@ -43,7 +43,7 @@ This whole process requires the support of several software.

 ### 1.1. TIMESAT

-To investigate the seasonality of satellite time-series data and their relationship with dynamic properties of vegetation, such as phenology and temporal development, we will use [TIMESAT](https://web.nateko.lu.se/timesat/timesat.asp) software - a software package for analysing time-series of satellite sensor data.
+To investigate the seasonality of satellite time-series data and their relationship with dynamic properties of vegetation, such as phenology and temporal development, we will use [TIMESAT](https://web.nateko.lu.se/timesat/timesat.asp) software - a software package for analysing time-series of satellite sensor data.

 TIMESAT available for download via this link [https://web.nateko.lu.se/timesat/timesat.asp?cat=4](https://web.nateko.lu.se/timesat/timesat.asp?cat=4). You are required to register first before downloading the software.

@@ -69,9 +69,9 @@ The main driver for all TIMESAT processing, Matlab or Fortran, is a menu system.

 **Images input**

-* TIMESAT needs a sequence of vegetation index images covering a particular geographical area. Images should be converted to headerless binary format.
+* TIMESAT needs a sequence of vegetation index images covering a particular geographical area. Images should be converted to headerless binary format.

-* The number of images needs to be identical for each year, and each image should represent the same time interval (e.g. one day, 8-days, 10-days, 1 month etc.).
+* The number of images needs to be identical for each year, and each image should represent the same time interval (e.g. one day, 8-days, 10-days, 1 month etc.).

 * If an image representing a certain date is missing, an image denoting missing data should be added. This image should contain numerical values outside the range of the valid data.

@@ -98,9 +98,9 @@ The first row contains the number of data files (images), then comes one image n

 **Viewing images**

-* Start TSM_imageview from the TIMESAT menu system. Under File, Open image file, browse to the folder `02_bil` and click on one of `.bil` file.
+* Start TSM_imageview from the TIMESAT menu system. Under File, Open image file, browse to the folder `02_bil` and click on one of `.bil` file.

-* The files contain EVI data from the MODIS sensor. Change the choice under Image file type to 16-bit signed integer. Type `2229` under No of rows in image, and `3016` under No of columns per row. Click the Draw button.
+* The files contain EVI data from the MODIS sensor. Change the choice under Image file type to 16-bit signed integer. Type `2229` under No of rows in image, and `3016` under No of columns per row. Click the Draw button.

 ![TIMESAT imageview](./images/climag-timesat-imageview.png)

@@ -110,11 +110,11 @@ The first row contains the number of data files (images), then comes one image n

 **Browsing through several files**

-* If you have made sure that your file list correctly points to your vegetation index image data, you may use the function Open file list under File.
+* If you have made sure that your file list correctly points to your vegetation index image data, you may use the function Open file list under File.

-* Click on the Open file list button and browse to folder the file `syr_data_2022_gee_raw.txt`, select it.
+* Click on the Open file list button and browse to folder the file `syr_data_2022_gee_raw.txt`, select it.

-* Click on one of the files, leave the window open and go over to the main window.
+* Click on one of the files, leave the window open and go over to the main window.

 * Choose the correct settings under Format and click the Draw button. You can then point to another file in the list and just click the Draw button again to view this image file.

@@ -130,9 +130,9 @@ The first row contains the number of data files (images), then comes one image n

 * Click on TSM_GUI in the TIMESAT menu system

-* Then select File, Open ASCII data file. Use the Browse button to open the file `syr_data_2022_gee_raw.txt`.
+* Then select File, Open ASCII data file. Use the Browse button to open the file `syr_data_2022_gee_raw.txt`.

-* This file contains EVI data from MODIS for the time period 2021 – 2023. Note the preview of the file contents loaded into the window.
+* This file contains EVI data from MODIS for the time period 2021 – 2023. Note the preview of the file contents loaded into the window.

 * The first row shows that there are 3 years of data, 46 observations per year. Press Load data. The raw data from the first row of the file will load into the plotting area of TSM_GUI.

@@ -140,9 +140,9 @@ The first row contains the number of data files (images), then comes one image n

 **Figure 7.** Specify input data

-* Next, select and unselect the different check boxes under Data plotting. Note the different fits achieved with `Gaussian`, `Logistic` and `Savitsky-Golay`.
+* Next, select and unselect the different check boxes under Data plotting. Note the different fits achieved with `Gaussian`, `Logistic` and `Savitsky-Golay`.

-* The fits are affected by a number of options for detecting spikes, adapting to the upper envelope etc. These options can be controlled by the check boxes and buttons in the GUI either under Common settings or Class-specific settings.
+* The fits are affected by a number of options for detecting spikes, adapting to the upper envelope etc. These options can be controlled by the check boxes and buttons in the GUI either under Common settings or Class-specific settings.

 * There are more options, including the Spike method, Number of envelope iterations and Adaptation strength, that you might want to explore.

@@ -160,11 +160,11 @@ The first row contains the number of data files (images), then comes one image n

 ### 2.5. Process

-* To start the program click on `TSF_process` in the TIMESAT menu system.
+* To start the program click on `TSF_process` in the TIMESAT menu system.

-* TIMESAT will ask for the input settings file. Select the file `syr_2022_gee_raw.set`.
+* TIMESAT will ask for the input settings file. Select the file `syr_2022_gee_raw.set`.

-* A command window will then open and `TSF_process` will start running immediately.
+* A command window will then open and `TSF_process` will start running immediately.

 * You can also start `TSF_process` directly from a separate command window by typing `TSF_process`.

@@ -184,18 +184,18 @@ As example, we only utilized `TSF_seas2img` to extract the seasonal parameters

 **TSF_seas2img: Creating an image from the seasonality data**

-* This program generates an image from the seasonality parameters generated by Timesat.
+* This program generates an image from the seasonality parameters generated by Timesat.

-* Click on `TSF_seas2img` in TSM_menu.
+* Click on `TSF_seas2img` in TSM_menu.

-* Now enter the seasonality file `syr_2022_gee_raw_TS.tpa`. Then specify the seasonality parameter to map. Here we will map the Time of Middle Season, `5`.
+* Now enter the seasonality file `syr_2022_gee_raw_TS.tpa`. Then specify the seasonality parameter to map. Here we will map the Time of Middle Season, `5`.

-* Then specify an interval that is wide enough to cover the second season. A suggestion is to specify this interval to `47` to `92` since this will overlap the second year.
+* Then specify an interval that is wide enough to cover the second season. A suggestion is to specify this interval to `47` to `92` since this will overlap the second year.

-* Define codes for missing data due to no season found within the interval, and no pixel data at all found at that location.
-Finally, give a name of the output file, and specify its format.
+* Define codes for missing data due to no season found within the interval, and no pixel data at all found at that location.
+Finally, give a name of the output file, and specify its format.

-* Carefully note the format since it is important when viewing the file with TSM_imageview. Here we will give the name begin, and specify output in full precision, `16-bit`.
+* Carefully note the format since it is important when viewing the file with TSM_imageview. Here we will give the name begin, and specify output in full precision, `16-bit`.

 ![TIMESAT seas2img](./images/climag-timesat-seas2img.png)

@@ -234,4 +234,3 @@ Finally, give a name of the output file, and specify its format.
 ```

 * After all the output data available in GeoTIFF format, feel free to use other software for further processing.
-
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 00000000..0335eb85
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,46 @@
+[build-system]
+requires = ["hatchling>=1.21.0", "hatch-vcs>=0.3.0"]
+build-backend = "hatchling.build"
+
+[project]
+name = "syria-economic-monitor"
+description = "Support for the World Bank Syria Economic Monitor"
+readme = { file = "README.md", content-type = "text/markdown" }
+license = { file = "LICENSE" }
+keywords = ["nighttime lights", "nasa black marble"]
+authors = [{ name = "World Bank Data Lab", email = "datalab@worldbank.org" }]
+maintainers = [
+  { name = "Gabriel Stefanini Vicente", email = "gvicente@worldbank.org" },
+  { name = "Robert Marty", email = "rmarty@worldbank.org" },
+]
+classifiers = [
+  "Development Status :: 3 - Alpha",
+  "Intended Audience :: Science/Research",
+  "Topic :: Scientific/Engineering",
+]
+requires-python = ">=3.10"
+dynamic = ["version"]
+
+[project.optional-dependencies]
+docs = [
+  "docutils==0.17.1", # https://jupyterbook.org/en/stable/content/citations.html?highlight=docutils#citations-and-bibliographies
+  "jupyter-book>=0.15.1",
+]
+[project.urls]
+"Homepage" = "https://datapartnership.github.io/syria-economic-monitor"
+"Bug Reports" = "https://github.com/datapartnership/syria-economic-monitor/issues"
+"Source" = "https://github.com/datapartnership/syria-economic-monitor"
+
+[tool.codespell]
+skip = './.git,docs/_build,docs/bibliography.bib,*.py,*.R,*.png,*.gz,*.whl'
+ignore-regex = '^\s*"image\/png":\s.*'
+ignore-words-list = "gost,"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src/*"]
+
+[tool.hatch.version]
+source = "vcs"
+
+[tool.ruff.lint.pydocstyle]
+convention = "numpy"