update election post the day after

danielroelfs · Nov 23, 2023 · fe2a686 · fe2a686
1 parent b3b26eb
commit fe2a686
Show file tree

Hide file tree

Showing 7 changed files with 582 additions and 12 deletions.
diff --git a/config.toml b/config.toml
@@ -2,10 +2,9 @@ baseurl = "/"
 title = "Daniel Roelfs"
 theme = "typography"
 
-ignoreFiles = ["\\.qmd$", "\\.ipynb$", "\\.Rmd$", "\\.Rmarkdown$", "_cache$", "index\\.html", "about\\.html", "static/blog/\\*_files/", "README.md",
-    "content/blog/2022-dutch-performance-olympic-speed-skating/.venv/",
-    "content/blog/2023-nyt-books-api/.venv",
-    "content/blog/2023-scientific-publishing/.venv"]
+ignoreFiles = ["\\.qmd$", "\\.ipynb$", "\\.Rmd$", "\\.Rmarkdown$",
+    "_cache$", "index\\.html", "about\\.html", "static/blog/\\*_files/", "README.md",
+    ".venv"]
 
 [markup]
   defaultMarkdownHandler = "goldmark"

diff --git a/...ns/index.markdown_strict_files/figure-markdown_strict/plot-municipalities-1.png b/...ns/index.markdown_strict_files/figure-markdown_strict/plot-municipalities-1.png
diff --git a/...dex.markdown_strict_files/figure-markdown_strict/plot-preliminary-results-1.png b/...dex.markdown_strict_files/figure-markdown_strict/plot-preliminary-results-1.png
diff --git a/...wn_strict_files/figure-markdown_strict/plot-seat-historical-distributions-1.png b/...wn_strict_files/figure-markdown_strict/plot-seat-historical-distributions-1.png
diff --git a/content/blog/2023-dutch-elections/index.md b/content/blog/2023-dutch-elections/index.md
@@ -34,6 +34,7 @@ library(tidyverse)
 library(ggtext)
 library(rvest)
 library(showtext)
+library(patchwork)
 
 font_add_google("Bitter", family = "custom")
 showtext_auto()
@@ -366,10 +367,10 @@ data_polls_pre_new |>
 The biggest thing to notice here is the party NSC which was started by Pieter Omtzigt, formerly a member of the CDA. This party participates in the 2023 parliamentary elections for the first time, so it doesn't have anything to compare to. According to these polls, the NSC will go in one go to 26 seats. The VVD is still the biggest, but loses a few seats. Other than the VVD, the biggest losers in these polls are the D66, CDA, FvD, and SP.
 
 {{< sidenote >}}
-Tom Louwerse also contributes to the Irish Polling Indicator using the same methods, which describes methods in English
+Tom Louwerse also contributes to the [Irish Polling Indicator](https://pollingindicator.com) using the same methods, which describes the methods in English (also, see [this article](https://doi.org/10.1080/07907184.2016.1213719))
 {{< /sidenote >}}
 
-This is just one plot from one pollster. There is however a polling aggregator, called the [*Peilingwijzer*](https://peilingwijzer.tomlouwerse.nl), which is maintained by political scientist [Tom Louwerse](https://www.tomlouwerse.nl) at Leiden University. It uses a Bayesian approach to weigh a collection of polls from various sources (description in [Dutch](https://peilingwijzer.tomlouwerse.nl/methode.html#statistisch-model) and [English](https://pollingindicator.com/method/)). This way he gets a better estimate of the uncertainty across several pollsters and polling dates. I know he does a lot of his analyses in R, so I'll try to recreate his plot on the [main website](https://peilingwijzer.tomlouwerse.nl) just as a challenge (and perhaps make one or two things a bit more aesthetically pleasing). It seems he uses the somewhat niche (at least in my field) `geom_crossbar()`.
+This is just one plot from one pollster. There is however a polling aggregator, called the [*Peilingwijzer*](https://peilingwijzer.tomlouwerse.nl), which is maintained by political scientist [Tom Louwerse](https://www.tomlouwerse.nl) at Leiden University. It uses a Bayesian approach to weigh a collection of polls from various sources (description in [Dutch](https://peilingwijzer.tomlouwerse.nl/methode.html#statistisch-model) and [English](https://pollingindicator.com/method/)). An (probably earlier) version of the code is available on [Dataverse](https://dataverse.harvard.edu/file.xhtml?fileId=4459988&version=1.0) This way he gets a better estimate of the uncertainty across several pollsters and polling dates. I know he does a lot of his analyses in R, so I'll try to recreate his plot on the [main website](https://peilingwijzer.tomlouwerse.nl) just as a challenge (and perhaps make one or two things a bit more aesthetically pleasing). It seems he uses the somewhat niche (at least in my field) `geom_crossbar()`.
 
 <details>
 <summary>Show code</summary>
@@ -400,7 +401,7 @@ polls_peilingwijzer <- tribble(
 polls_peilingwijzer |>
   inner_join(data_parties) |>
   inner_join(data_polls_pre_new |>
-               select(party, current_seats)) |>
+    select(party, current_seats)) |>
   mutate(
     color = ifelse(color == "grey92", "grey", color),
     range_max_label = ifelse(range_max <= 2,
@@ -666,4 +667,242 @@ exit_polls_2200 |>
 
 {{< omission >}}
 
-It looks like the results per municipality will take some time longer, so I'll go to bed in the meantime and I'll update this post tomorrow with results per municipality and/or province.
+## 23 November
+
+{{< sidenote br="2em" >}}
+The Dutch government is based on coalitions, so the PVV will need to collaborate with other parties to form a government
+{{< /sidenote >}}
+
+It's the day after and basically all news agencies (e.g. [NOS](https://nos.nl/collectie/13958/artikel/2498903), [BBC](https://www.bbc.com/news/world-europe-67504272), [CNN](https://edition.cnn.com/2023/11/23/europe/geert-wilders-dutch-election-analysis-intl/index.html), [NRK](https://www.nrk.no/urix/1.16648165)) are (justifiably) shocked by the fact that Geert Wilders likely will become the next prime minister in the Netherlands for however long his government will last. The formal results will be announced once all votes are properly tallied and checked again, so for this part of the analyses we can only rely on the latest results with almost all votes counted at least once. The official result will be published by the [*Kiesraad*](https://www.kiesraad.nl/) one about a week, but it publishes the [preliminary results](https://www.kiesraad.nl/verkiezingen/tweede-kamer/uitslagen/uitslagen-per-gemeente-tweede-kamer) also. However, this data is quite a headache to scrape, so I'll use the website [*AlleCijfers.nl*](https://allecijfers.nl/uitslag-tweede-kamer-verkiezingen-2023/) instead that more conveniently lists everything in an HTML table that we can scrape. See [here](https://github.com/danielroelfs/danielroelfs.com/tree/main/content/blog/2023-dutch-elections/scrape_municipality_results.py) for the Python code to scrape the website
+
+<details>
+<summary>Show code</summary>
+
+``` r
+results_municipality <- read_delim("./data/election_results.csv", delim = ";") |>
+  rename(region = Regionaam) |>
+  mutate(
+    across(everything(), ~ str_remove(.x, "%")),
+    across(everything(), ~ str_replace(.x, ",", ".")),
+    across(-region, parse_number)
+  ) |>
+  pivot_longer(cols = -region, names_to = "party", values_to = "perc") |>
+  left_join(data_parties) |>
+  mutate(
+    region = str_remove(region, "Gemeente"),
+    region = str_trim(region)
+  )
+```
+
+</details>
+
+I'll first look at the national results, which I can just copy from the TV. I already created the absolute seat comparison, and since not much changed I thought perhaps I could look at the percentage change from the current seats.
+
+<details>
+<summary>Show code for the plot</summary>
+
+``` r
+preliminary_results <- tribble(
+  ~party, ~prelim_results,
+  "VVD", 24,
+  "PVV", 37,
+  "GL-PvdA", 25,
+  "NSC", 20,
+  "D66", 9,
+  "BBB", 7,
+  "SP", 5,
+  "PvdD", 3,
+  "CU", 3,
+  "CDA", 5,
+  "FvD", 3,
+  "DENK", 3,
+  "Volt", 2,
+  "SGP", 3,
+  "JA21", 1,
+  "BVNL", 0,
+  "BIJ1", 0,
+  "50PLUS", 0
+)
+
+preliminary_results_diff <- preliminary_results |>
+  inner_join(data_parties, by = "party") |>
+  inner_join(data_current_seats, by = "party") |>
+  mutate(
+    rel_diff = (prelim_results / current_seats) - 1,
+  ) |>
+  replace_na(list(rel_diff = 0)) |>
+  mutate(
+    rel_diff_label = str_glue("{round(rel_diff * 100)}%"),
+    rel_diff_label = ifelse(str_detect(rel_diff_label, "Inf"), "New party", rel_diff_label),
+    text_color = ifelse(rel_diff == 0, "#333333", text_color)
+  )
+
+preliminary_results_diff |>
+  ggplot(aes(x = rel_diff, y = reorder(party, rel_diff), fill = color)) +
+  geom_vline(
+    xintercept = 0,
+    color = "#333333"
+  ) +
+  geom_vline(
+    xintercept = -1,
+    color = "#333333",
+    linetype = "dashed"
+  ) +
+  geom_col(
+    width = 0.7
+  ) +
+  geom_text(
+    aes(label = rel_diff_label, color = text_color),
+    nudge_x = ifelse(preliminary_results_diff$rel_diff > 0, -0.5, 0.05),
+    hjust = 0, family = "custom"
+  ) +
+  geom_text(
+    data = tibble(),
+    aes(x = 5.4, y = 15, label = "NSC is a new party"),
+    inherit.aes = FALSE,
+    size = 4, family = "custom",
+    hjust = 1, lineheight = 0.75
+  ) +
+  geom_curve(
+    data = tibble(),
+    aes(x = 5.5, y = 15, xend = 6.5, yend = 17.5),
+    inherit.aes = FALSE, curvature = 0.2,
+    arrow = arrow(length = unit(0.4, "lines"))
+  ) +
+  labs(
+    title = "Relative difference with current seats in parliament",
+    x = "Percentage change",
+    y = NULL,
+    caption = "**Source**: Ipsos, comissioned by NOS and RTL"
+  ) +
+  scale_x_continuous(
+    labels = scales::label_percent(),
+    expand = expansion(add = c(0.2, 1.2)),
+    position = "top"
+  ) +
+  scale_color_identity() +
+  scale_fill_identity() +
+  theme_minimal(base_family = "custom") +
+  theme(
+    plot.title.position = "plot",
+    plot.title = element_markdown(size = 16, face = "bold"),
+    plot.subtitle = element_markdown(lineheight = 0.67),
+    plot.caption.position = "plot",
+    plot.caption = element_markdown(),
+    axis.text.y = element_markdown(size = 10),
+    panel.grid.major.y = element_blank(),
+    legend.position = c(0.8, 0.2)
+  )
+```
+
+</details>
+
+<img src="index.markdown_strict_files/figure-markdown_strict/plot-preliminary-results-1.png" width="768" />
+
+Since the NSC is a new party it's increase (no matter how little or large it would have been) is infinite. Since the party got 20 seats, I think it looks fairly logical to have it appear at the top. If it was a much smaller party, I would have maybe forced it to be shown at the bottom to incidate that this is a statistical anomaly. The *Boer Burger Beweging* (BBB) has one seat in the current parliament after their first participation in the previous election, but will increase to 7 in the next parliament that will be seated early December. There are two parties that according to these numbers will be removed from parliament (indicated by the 100% decrease in seats in parliament). I would say that having the most important numbers be somewhat squished on the left of the plot is perhaps not ideal, but it's a trade-off from showing the large increase in the NSC and BBB.
+
+Next I wanted to look at the results per municipality to see if there were any trends I could identify. This means creating a map, so I downloaded a *geopackage* file from the [*Centraal Bureau voor de Statistiek*](https://www.cbs.nl) (CBS) page on [geographical areas](https://www.cbs.nl/nl-nl/dossier/nederland-regionaal/geografische-data/cbs-gebiedsindelingen) where they share current and historical files on a number of divisions (provinces, municipalities, security regions, etc.). The *geopackage* format can be parsed with the `{sf}` package. The file contains several "layers" that can be listed through the `sf::st_layers(<file>)` functionality.
+
+<details>
+<summary>Show code</summary>
+
+``` r
+geo_municipality <- sf::st_read(
+  "./data/cbsgebiedsindelingen2023.gpkg",
+  layer = "gemeente_gegeneraliseerd"
+) |>
+  janitor::clean_names()
+```
+
+</details>
+
+    Reading layer `gemeente_gegeneraliseerd' from data source 
+      `/Users/dtroelfs/Dropbox/Personal/scripts/danielroelfs/content/blog/2023-dutch-elections/data/cbsgebiedsindelingen2023.gpkg' 
+      using driver `GPKG'
+    Simple feature collection with 342 features and 5 fields
+    Geometry type: MULTIPOLYGON
+    Dimension:     XY
+    Bounding box:  xmin: 13565.4 ymin: 306846.2 xmax: 278026.1 ymax: 619231.6
+    Projected CRS: Amersfoort / RD New
+
+The thing I was most interested in at first is how progressive or conservative (I'll refer to this as "political identity" from now on), and how left- and right-wing the municipalities are ("political color"). As noted before, the political color of the PVV is somewhat controversial, where *KiesKompas* would put in the centre-right due to it's populist agenda, the party is usually mentioned among the far-right parties, both nationally and internationally. I could change the value for this part of the analyses, but I'm not comfortable setting another value, so look at the following plots with this caveat in mind. Considering the electoral victory the PVV won this election cycle, the plots should probabaly look more right-wing than the shown values represent.
+
+{{< sidenote >}}
+The weighted mean is implemented in R through the `weighted.mean()` function
+{{< /sidenote >}}
+
+For these plots I'll calculate two measures, the weighted mean of the political identity and the weighted mean of the political color. I'll use the percentages in each municipality as the weights and the values from *KiesKompas* to aggregate for each measure. It's important to note that there were many more parties the electorate could vote for, but not all were given a political identity or color by *KiesKompas*, so data from these (usually very small) parties is ignored.
+
+<details>
+<summary>Show code for the plot</summary>
+
+``` r
+summary_municipality <- results_municipality |>
+  group_by(region) |>
+  summarise(
+    mean_political_identity = weighted.mean(progcon, w = perc, na.rm = TRUE),
+    mean_political_color = weighted.mean(leftright, w = perc, na.rm = TRUE)
+  ) |>
+  inner_join(geo_municipality, by = c("region" = "statnaam"))
+
+(summary_municipality |>
+  ggplot() +
+  geom_sf(
+    aes(geometry = geom, fill = mean_political_identity)
+  ) +
+  labs(
+    title = "Political identity",
+    subtitle = "Is the municipalitiy more<br><span style='color: #B24334'>progressive</span> or <span style='color: #4180B4'>conservative</span>?",
+    fill = NULL
+  ) +
+  scico::scale_fill_scico(
+    palette = "vik",
+    limits = c(-50, 50),
+    guide = guide_colorbar(ticks = FALSE, barheight = 0.75, barwidth = 15, reverse = TRUE)
+  )
+) + (
+  summary_municipality |>
+    ggplot() +
+    geom_sf(
+      aes(geometry = geom, fill = mean_political_color)
+    ) +
+    labs(
+      title = "Political color",
+      subtitle = "Is the municipalitiy more<br><span style='color: #B24334'>left</span> or <span style='color: #4180B4'>right</span> leaning?",
+      fill = NULL
+    ) +
+    scico::scale_fill_scico(
+      palette = "vik",
+      direction = -1,
+      limits = c(-50, 50),
+      guide = guide_colorbar(ticks = FALSE, barheight = 0.75, barwidth = 15, reverse = FALSE)
+    )
+) +
+  plot_layout(guides = "collect") +
+  plot_annotation(
+    caption = "Election results scraped from _AlleCijfers.nl_,<br>
+      Political identity and color mapped from _KiesKompas_"
+  ) &
+  theme_void(base_family = "custom") &
+  theme(
+    plot.title.position = "plot",
+    plot.title = element_markdown(size = 16, face = "bold"),
+    plot.subtitle = element_markdown(),
+    plot.caption = element_markdown(lineheight = 0.75),
+    plot.caption.position = "plot",
+    legend.text = element_blank(),
+    legend.position = "bottom"
+  )
+```
+
+</details>
+
+<img src="index.markdown_strict_files/figure-markdown_strict/plot-municipalities-1.png" width="768" />
+
+The first important thing to mention is that the *geopackage* file and the dataset with the election results have a slight mismatch. There is a difference of 5 municipalities. Should I use some time to find out which these 5 municipalities are? Probably. Do I have either time or motivation to do it right now? No, I don't. I'm gonna assume it's due to different spellings, outdated files on either side, or some merging or aggregation in either dataset prior to downloading it that may be the cause. Another caveat to keep in mind.
+
+Anyone with some familiarity of Dutch geography might immediately notice the large red spots in some familiar areas. As usual, the big cities and the cities with a significant population of younger people and/or students are markedly more progressive and left-wing than the surrounding areas, in particular the municipality in the east and south and following the *Biblebelt*. This is not a surprise for me, or anyone, but nice to see confirmed. To address the issue of the political color of the PVV, I tested replacing the value for the political color with 50 (the same as the VVD), which is a big increase from 9 given by *KiesKompas*. In this experiment (which I intentionally don't show here to stay true to the data reported), the two plots look very similar in color too in addition to trend already visible in the plot based on the data reported.
+
+{{< omission >}}
+
+*More updates may follow when data changes, or when I get new inspiration for further analyses*