Introduce a couple of minor changes to the validation notebook.
milos-simic committed Aug 21, 2020
1 parent 1b2ea11 commit 01ce20c
Showing 1 changed file with 62 additions and 21 deletions.
83 changes: 62 additions & 21 deletions validation_and_output.ipynb
@@ -31,7 +31,7 @@
},
"source": [
"<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Initialization\" data-toc-modified-id=\"Initialization-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Initialization</a></span><ul class=\"toc-item\"><li><span><a href=\"#Script-setup\" data-toc-modified-id=\"Script-setup-1.1\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>Script setup</a></span></li><li><span><a href=\"#Load-data\" data-toc-modified-id=\"Load-data-1.2\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Load data</a></span></li><li><span><a href=\"#Download-coastline-data\" data-toc-modified-id=\"Download-coastline-data-1.3\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>Download coastline data</a></span><ul class=\"toc-item\"><li><span><a href=\"#Load-the-list-of-sources\" data-toc-modified-id=\"Load-the-list-of-sources-1.3.1\"><span class=\"toc-item-num\">1.3.1&nbsp;&nbsp;</span>Load the list of sources</a></span></li></ul></li></ul></li><li><span><a href=\"#Validation-Markers\" data-toc-modified-id=\"Validation-Markers-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Validation Markers</a></span><ul class=\"toc-item\"><li><span><a href=\"#Germany-DE\" data-toc-modified-id=\"Germany-DE-2.1\"><span class=\"toc-item-num\">2.1&nbsp;&nbsp;</span>Germany DE</a></span></li><li><span><a href=\"#France-FR\" data-toc-modified-id=\"France-FR-2.2\"><span class=\"toc-item-num\">2.2&nbsp;&nbsp;</span>France FR</a></span></li><li><span><a href=\"#United-Kingdom-UK\" data-toc-modified-id=\"United-Kingdom-UK-2.3\"><span class=\"toc-item-num\">2.3&nbsp;&nbsp;</span>United Kingdom UK</a></span></li></ul></li><li><span><a href=\"#Harmonization\" data-toc-modified-id=\"Harmonization-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>Harmonization</a></span><ul class=\"toc-item\"><li><span><a href=\"#Harmonizing-column-order\" data-toc-modified-id=\"Harmonizing-column-order-3.1\"><span class=\"toc-item-num\">3.1&nbsp;&nbsp;</span>Harmonizing column order</a></span></li><li><span><a 
href=\"#Cleaning-fields\" data-toc-modified-id=\"Cleaning-fields-3.2\"><span class=\"toc-item-num\">3.2&nbsp;&nbsp;</span>Cleaning fields</a></span></li><li><span><a href=\"#Sort\" data-toc-modified-id=\"Sort-3.3\"><span class=\"toc-item-num\">3.3&nbsp;&nbsp;</span>Sort</a></span></li><li><span><a href=\"#Leave-unspecified-cells-blank\" data-toc-modified-id=\"Leave-unspecified-cells-blank-3.4\"><span class=\"toc-item-num\">3.4&nbsp;&nbsp;</span>Leave unspecified cells blank</a></span></li><li><span><a href=\"#Separate-dirty-from-clean\" data-toc-modified-id=\"Separate-dirty-from-clean-3.5\"><span class=\"toc-item-num\">3.5&nbsp;&nbsp;</span>Separate dirty from clean</a></span></li></ul></li><li><span><a href=\"#Capacity-time-series\" data-toc-modified-id=\"Capacity-time-series-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>Capacity time series</a></span><ul class=\"toc-item\"><li><span><a href=\"#Make-separate-series-for-Great-Britain-and-Northern-Ireland\" data-toc-modified-id=\"Make-separate-series-for-Great-Britain-and-Northern-Ireland-4.1\"><span class=\"toc-item-num\">4.1&nbsp;&nbsp;</span>Make separate series for Great Britain and Northern Ireland</a></span></li><li><span><a href=\"#Create-total-wind-columns\" data-toc-modified-id=\"Create-total-wind-columns-4.2\"><span class=\"toc-item-num\">4.2&nbsp;&nbsp;</span>Create total wind columns</a></span></li><li><span><a href=\"#Create-one-time-series-file-containing-al-countries\" data-toc-modified-id=\"Create-one-time-series-file-containing-al-countries-4.3\"><span class=\"toc-item-num\">4.3&nbsp;&nbsp;</span>Create one time series file containing al countries</a></span></li></ul></li><li><span><a href=\"#Make-the-normalized-dataframe-for-all-the-countries\" data-toc-modified-id=\"Make-the-normalized-dataframe-for-all-the-countries-5\"><span class=\"toc-item-num\">5&nbsp;&nbsp;</span>Make the normalized dataframe for all the countries</a></span></li><li><span><a href=\"#Output\" 
data-toc-modified-id=\"Output-6\"><span class=\"toc-item-num\">6&nbsp;&nbsp;</span>Output</a></span><ul class=\"toc-item\"><li><span><a href=\"#Write-data-files\" data-toc-modified-id=\"Write-data-files-6.1\"><span class=\"toc-item-num\">6.1&nbsp;&nbsp;</span>Write data files</a></span><ul class=\"toc-item\"><li><span><a href=\"#Write-CSV-files\" data-toc-modified-id=\"Write-CSV-files-6.1.1\"><span class=\"toc-item-num\">6.1.1&nbsp;&nbsp;</span>Write CSV-files</a></span></li><li><span><a href=\"#Write-XLSX-files\" data-toc-modified-id=\"Write-XLSX-files-6.1.2\"><span class=\"toc-item-num\">6.1.2&nbsp;&nbsp;</span>Write XLSX-files</a></span></li><li><span><a href=\"#Write-SQLite\" data-toc-modified-id=\"Write-SQLite-6.1.3\"><span class=\"toc-item-num\">6.1.3&nbsp;&nbsp;</span>Write SQLite</a></span></li></ul></li><li><span><a href=\"#Write-meta-data\" data-toc-modified-id=\"Write-meta-data-6.2\"><span class=\"toc-item-num\">6.2&nbsp;&nbsp;</span>Write meta data</a></span></li><li><span><a href=\"#Generate-checksums\" data-toc-modified-id=\"Generate-checksums-6.3\"><span class=\"toc-item-num\">6.3&nbsp;&nbsp;</span>Generate checksums</a></span></li></ul></li></ul></div>"
"<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Initialization\" data-toc-modified-id=\"Initialization-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Initialization</a></span><ul class=\"toc-item\"><li><span><a href=\"#Script-setup\" data-toc-modified-id=\"Script-setup-1.1\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>Script setup</a></span><ul class=\"toc-item\"><li><span><a href=\"#Load-the-list-of-sources\" data-toc-modified-id=\"Load-the-list-of-sources-1.1.1\"><span class=\"toc-item-num\">1.1.1&nbsp;&nbsp;</span>Load the list of sources</a></span></li></ul></li><li><span><a href=\"#Load-data\" data-toc-modified-id=\"Load-data-1.2\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Load data</a></span></li><li><span><a href=\"#Download-coastline-data\" data-toc-modified-id=\"Download-coastline-data-1.3\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>Download coastline data</a></span></li></ul></li><li><span><a href=\"#Validation-Markers\" data-toc-modified-id=\"Validation-Markers-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Validation Markers</a></span><ul class=\"toc-item\"><li><span><a href=\"#Define-the-Markers-for-Germany\" data-toc-modified-id=\"Define-the-Markers-for-Germany-2.1\"><span class=\"toc-item-num\">2.1&nbsp;&nbsp;</span>Define the Markers for Germany</a></span></li><li><span><a href=\"#Define-the-Markers-for-France\" data-toc-modified-id=\"Define-the-Markers-for-France-2.2\"><span class=\"toc-item-num\">2.2&nbsp;&nbsp;</span>Define the Markers for France</a></span></li><li><span><a href=\"#Define-the-Markers-for-the-United-Kingdom\" data-toc-modified-id=\"Define-the-Markers-for-the-United-Kingdom-2.3\"><span class=\"toc-item-num\">2.3&nbsp;&nbsp;</span>Define the Markers for the United Kingdom</a></span></li><li><span><a href=\"#Mark-the-data\" data-toc-modified-id=\"Mark-the-data-2.4\"><span class=\"toc-item-num\">2.4&nbsp;&nbsp;</span>Mark the data</a></span></li></ul></li><li><span><a 
href=\"#Harmonization\" data-toc-modified-id=\"Harmonization-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>Harmonization</a></span><ul class=\"toc-item\"><li><span><a href=\"#Harmonizing-column-order\" data-toc-modified-id=\"Harmonizing-column-order-3.1\"><span class=\"toc-item-num\">3.1&nbsp;&nbsp;</span>Harmonizing column order</a></span></li><li><span><a href=\"#Cleaning-columns\" data-toc-modified-id=\"Cleaning-columns-3.2\"><span class=\"toc-item-num\">3.2&nbsp;&nbsp;</span>Cleaning columns</a></span></li><li><span><a href=\"#Sort\" data-toc-modified-id=\"Sort-3.3\"><span class=\"toc-item-num\">3.3&nbsp;&nbsp;</span>Sort</a></span></li><li><span><a href=\"#Leave-unspecified-cells-blank\" data-toc-modified-id=\"Leave-unspecified-cells-blank-3.4\"><span class=\"toc-item-num\">3.4&nbsp;&nbsp;</span>Leave unspecified cells blank</a></span></li><li><span><a href=\"#Separate-dirty-from-clean\" data-toc-modified-id=\"Separate-dirty-from-clean-3.5\"><span class=\"toc-item-num\">3.5&nbsp;&nbsp;</span>Separate dirty from clean</a></span></li></ul></li><li><span><a href=\"#Drop-duplicates\" data-toc-modified-id=\"Drop-duplicates-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>Drop duplicates</a></span></li><li><span><a href=\"#Capacity-time-series\" data-toc-modified-id=\"Capacity-time-series-5\"><span class=\"toc-item-num\">5&nbsp;&nbsp;</span>Capacity time series</a></span><ul class=\"toc-item\"><li><span><a href=\"#Make-separate-series-for-Great-Britain-and-Northern-Ireland\" data-toc-modified-id=\"Make-separate-series-for-Great-Britain-and-Northern-Ireland-5.1\"><span class=\"toc-item-num\">5.1&nbsp;&nbsp;</span>Make separate series for Great Britain and Northern Ireland</a></span></li><li><span><a href=\"#Create-total-wind-columns\" data-toc-modified-id=\"Create-total-wind-columns-5.2\"><span class=\"toc-item-num\">5.2&nbsp;&nbsp;</span>Create total wind columns</a></span></li><li><span><a href=\"#Create-one-time-series-file-containing-al-countries\" 
data-toc-modified-id=\"Create-one-time-series-file-containing-al-countries-5.3\"><span class=\"toc-item-num\">5.3&nbsp;&nbsp;</span>Create one time series file containing al countries</a></span></li></ul></li><li><span><a href=\"#Make-the-normalized-dataframe-for-all-the-countries\" data-toc-modified-id=\"Make-the-normalized-dataframe-for-all-the-countries-6\"><span class=\"toc-item-num\">6&nbsp;&nbsp;</span>Make the normalized dataframe for all the countries</a></span></li><li><span><a href=\"#Output\" data-toc-modified-id=\"Output-7\"><span class=\"toc-item-num\">7&nbsp;&nbsp;</span>Output</a></span><ul class=\"toc-item\"><li><span><a href=\"#Write-data-files\" data-toc-modified-id=\"Write-data-files-7.1\"><span class=\"toc-item-num\">7.1&nbsp;&nbsp;</span>Write data files</a></span><ul class=\"toc-item\"><li><span><a href=\"#Write-CSV-files\" data-toc-modified-id=\"Write-CSV-files-7.1.1\"><span class=\"toc-item-num\">7.1.1&nbsp;&nbsp;</span>Write CSV-files</a></span></li><li><span><a href=\"#Write-XLSX-files\" data-toc-modified-id=\"Write-XLSX-files-7.1.2\"><span class=\"toc-item-num\">7.1.2&nbsp;&nbsp;</span>Write XLSX-files</a></span></li><li><span><a href=\"#Write-SQLite\" data-toc-modified-id=\"Write-SQLite-7.1.3\"><span class=\"toc-item-num\">7.1.3&nbsp;&nbsp;</span>Write SQLite</a></span></li></ul></li><li><span><a href=\"#Write-meta-data\" data-toc-modified-id=\"Write-meta-data-7.2\"><span class=\"toc-item-num\">7.2&nbsp;&nbsp;</span>Write meta data</a></span></li><li><span><a href=\"#Generate-checksums\" data-toc-modified-id=\"Generate-checksums-7.3\"><span class=\"toc-item-num\">7.3&nbsp;&nbsp;</span>Generate checksums</a></span></li></ul></li></ul></div>"
]
},
{
@@ -44,11 +44,16 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:31.407554Z",
+"start_time": "2020-08-21T01:29:31.402967Z"
+}
+},
"outputs": [],
"source": [
"settings = {\n",
-" 'version': '2020-07-25',\n",
+" 'version': '2020-08-25',\n",
" 'changes': 'Updated all countries with new data available (DE, FR, PL, CH, DK, UK), added data for CZ and SE.'\n",
"}\n",
"\n",
@@ -66,6 +71,10 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:36.956012Z",
+"start_time": "2020-08-21T01:29:32.227259Z"
+},
"scrolled": false
},
"outputs": [],
@@ -119,7 +128,12 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:37.286266Z",
+"start_time": "2020-08-21T01:29:36.958463Z"
+}
+},
"outputs": [],
"source": [
"source_df = pd.read_csv(os.path.join('input', 'sources.csv'))\n",
@@ -145,7 +159,12 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:37.299457Z",
+"start_time": "2020-08-21T01:29:37.288117Z"
+}
+},
"outputs": [],
"source": [
"set(source_df['country'].unique().tolist()) - set(['EU'])"
@@ -154,7 +173,12 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:38.385196Z",
+"start_time": "2020-08-21T01:29:38.381027Z"
+}
+},
"outputs": [],
"source": [
"# Fill in this array with the codes of the countries you want to validate.\n",
@@ -1131,7 +1155,9 @@
"# drop column DE_hydro because it is not all of hydro but only subsidised hydro, which could be misleading\n",
"if 'DE' in countries and 'DE_hydro_capacity' in unified_daily_timeseries.columns:\n",
" unified_daily_timeseries.drop(columns='DE_hydro_capacity', inplace=True)\n",
-"\n",
+"# do the same for CH_hydro for the same reason\n",
+"if 'CH' in countries and 'CH_hydro_capacity' in unified_daily_timeseries.columns:\n",
+" unified_daily_timeseries.drop(columns='CH_hydro_capacity', inplace=True)\n",
"# Show some rows\n",
"unified_daily_timeseries.tail(2)"
]
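The two guarded `drop` calls in this hunk follow the same pattern, so they could be folded into one helper. The following is a minimal sketch and not part of the commit: the function name `drop_partial_hydro`, its parameters, and the toy frame standing in for `unified_daily_timeseries` are all assumptions made for illustration.

```python
import pandas as pd


def drop_partial_hydro(timeseries, countries, partial_hydro_countries=("DE", "CH")):
    """Return a copy of `timeseries` without the hydro capacity columns of
    countries whose hydro data covers only part of the fleet.

    `partial_hydro_countries` is a hypothetical parameter; the commit
    hard-codes DE and CH in two separate `if` statements.
    """
    columns_to_drop = [
        f"{code}_hydro_capacity"
        for code in partial_hydro_countries
        # Same two guards as in the notebook: the country is being
        # processed and the column actually exists.
        if code in countries and f"{code}_hydro_capacity" in timeseries.columns
    ]
    return timeseries.drop(columns=columns_to_drop)


# Toy frame standing in for unified_daily_timeseries (made-up values).
example = pd.DataFrame(
    {
        "DE_hydro_capacity": [1.0],
        "CH_hydro_capacity": [2.0],
        "CH_solar_capacity": [3.0],
    }
)
result = drop_partial_hydro(example, countries=["DE", "CH", "FR"])
print(list(result.columns))
```

Unlike the `inplace=True` calls in the notebook, this sketch returns a new frame and leaves its input untouched, which makes the cell safe to re-run.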
@@ -1584,7 +1610,12 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:53.267728Z",
+"start_time": "2020-08-21T01:29:52.153125Z"
+}
+},
"outputs": [],
"source": [
"# Automatically generate some metadata strings such as the list of countries covered, the list of sources etc.\n",
@@ -1639,12 +1670,17 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:29:53.590389Z",
+"start_time": "2020-08-21T01:29:53.269779Z"
+}
+},
"outputs": [],
"source": [
"metadata = \"\"\"\n",
"hide: yes\n",
-"profile: tabular-data-package\n",
+"profile: data-package\n",
"_metadataVersion: 1.2\n",
"name: opsd_renewable_power_plants\n",
"title: Renewable power plants\n",
@@ -2521,6 +2557,16 @@
" - name: country\n",
" description: The country in which the facility is located\n",
" type: string\n",
+" - name: renewable_power_plants_xlsx\n",
+" profile: data-resource\n",
+" path: renewable_power_plants.xlsx\n",
+" title: The whole package\n",
+" description: The whole package as an Excel file with each country written in a sheet except for Germany which is spread across two, alongside with timeseries and validation markers.\n",
+" format: xlsx\n",
+" mediatype: application/vnd.ms-excel\n",
+" encoding: UTF-8\n",
+" schema:\n",
+" missingValues: [\"\"]\n",
" - name: renewable_capacity_timeseries\n",
" path: renewable_capacity_timeseries.csv\n",
" profile: tabular-data-resource\n",
@@ -2545,16 +2591,6 @@
" source:\n",
" title: Own calculation based on plant-level data from Swiss Federal Office of Energy\n",
" path: input/original_data/CH/BFE/9669-Liste aller KEV-Bezüger im Jahr 2018.xlsx\n",
-" - name: CH_hydro_capacity\n",
-" description: Cumulative hydro electrical capacity for Switzerland in MW\n",
-" unit: MW\n",
-" opsdProperties:\n",
-" Region: Switzerland\n",
-" Variable: Hydro\n",
-" type: number\n",
-" source:\n",
-" title: Own calculation based on plant-level data from Swiss Federal Office of Energy\n",
-" path: input/original_data/CH/BFE/9669-Liste aller KEV-Bezüger im Jahr 2018.xlsx\n",
" - name: CH_solar_capacity\n",
" description: Cumulative solar electrical capacity for Switzerland in MW\n",
" unit: MW\n",
@@ -2967,7 +3003,12 @@
{
"cell_type": "code",
"execution_count": null,
-"metadata": {},
+"metadata": {
+"ExecuteTime": {
+"end_time": "2020-08-21T01:30:07.752025Z",
+"start_time": "2020-08-21T01:29:58.513008Z"
+}
+},
"outputs": [],
"source": [
"# Add metadata fields to conform to the OPSD Metadata version 1.2\n",
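Several hunks in this commit change nothing except the per-cell `ExecuteTime` metadata (`start_time` / `end_time` stamps). If that noise is unwanted in future diffs, the key could be stripped from the notebook JSON before committing. The following is a rough sketch, assuming the standard notebook layout with a top-level `cells` list; the helper name `strip_execute_time` and the in-memory example notebook are hypothetical and not part of the repository.

```python
import json


def strip_execute_time(notebook: dict) -> int:
    """Remove the 'ExecuteTime' entry from every cell's metadata.

    Mutates `notebook` in place and returns how many cells were cleaned.
    """
    removed = 0
    for cell in notebook.get("cells", []):
        # pop() with a default never raises, so cells without the key are skipped.
        if cell.get("metadata", {}).pop("ExecuteTime", None) is not None:
            removed += 1
    return removed


# Tiny in-memory stand-in for validation_and_output.ipynb (hypothetical content,
# timestamps copied from the first hunk of this commit).
example_notebook = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {
                "ExecuteTime": {
                    "end_time": "2020-08-21T01:29:31.407554Z",
                    "start_time": "2020-08-21T01:29:31.402967Z",
                }
            },
            "source": [],
        },
        {"cell_type": "code", "metadata": {"scrolled": False}, "source": []},
    ],
    "nbformat": 4,
    "nbformat_minor": 2,
}

removed_count = strip_execute_time(example_notebook)
print(removed_count)
print(json.dumps(example_notebook["cells"][0]["metadata"]))
```

For a real file, the same function would be applied to the result of `json.load` and the notebook written back with `json.dump`; other metadata such as `scrolled` is left untouched.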