-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmaps3.qmd
342 lines (241 loc) · 11.7 KB
/
maps3.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
# Advanced Spatial Visualization {#sec-maps3}
::: {.callout-note appearance="simple"}
Today we will be focusing on the theory and practice of fancy geospatial data visualization.
:::
## Visual Categories and Encodings
Let's go back to the beginning of this course. There are 3 categories of information that can be displayed.
1. Quantitative
2. Qualitative
3. Spatial
[Lecture 1.2.1](http://radicalresearch.llc/EDVcourse/information.html#categories-of-information-illustrated-by-the-newspaper-weather-visualization)
The three types of data can be encoded in:
- Geometric primitives - points, lines, and areas
- Visual channels - size, color, shape, position, angle, and texture
An advanced spatial visualization covering multiple layers of information needs to use multiple sets of encodings to convey information quickly and intuitively while not overwhelming the audience.
## Circles, Lines, and Polygons - Oh My!
Fancy maps need distinct visual encodings, so the eye can be drawn the salient features.
One key way to do this is through ensuring different types/styles/aesthetics are displayed as unique fingerprints of visual encodings.
Let's combine the three datasets we have showed in class for the sacrifice zones projects as example 1.
First, get all the libraries we need loaded up.
```{r}
#| label: load libraries
#| echo: true
#| message: false
library(tidycensus)
library(tidyverse)
library(leaflet)
library(sf)
library(htmltools)
```
### Example 1 - Race and Toxic Emissions
#### Acquire datasets
##### Demographic data from `tidycensus`
We pulled racial data from the census for Jefferson County, TX last week in Section [23.2.2](http://radicalresearch.llc/EDVcourse/import3.html#example-2---jefferson-county-texas---racial-variables). Repeat that query.
```{r}
#| label: Pull Jefferson, County demographic information
#| echo: true
# These are the list of racial variables from the decennial census
racevars <- c(White = "P2_005N",
Black = "P2_006N",
Asian = "P2_008N",
Hispanic = "P2_002N")
JeffCo <- get_decennial(
geography = "tract",
variables = racevars,
state = "TX",
county = "Jefferson County",
geometry = TRUE,
summary_var = "P2_001N",
year = 2020
)
```
##### Toxics Release Inventory Emissions
The TRI dataset provides self-reported emissions of air toxics from large regulated industrial facilities.
```{r}
#| label: Pull TRI emissions for JeffCo, TX - keep 3 key air toxics, fix lat and lng
#| echo: true
TRI_2021 <- read_csv('https://data.epa.gov/efservice/downloads/tri/mv_tri_basic_download/2021_US/csv') %>%
janitor::clean_names() %>%
filter(x8_st == 'TX') %>%
filter(x7_county == 'JEFFERSON') %>%
filter(x34_chemical %in% c('1,3-Butadiene', 'Benzene', 'Ethylene oxide')) %>%
rename(pollutant = x34_chemical,
lat = x12_latitude,
lng = x13_longitude,
emissions = x62_on_site_release_total,
name = x4_facility_name) %>%
select(pollutant, name, lat, lng, emissions)
```
#### Make some exploratory maps
First, let's show the fraction of African Americans by census tract in Jefferson County. In Figure 23.2, we used `ggplot` and `geom_sf()`. Let's show it using `leaflet` this time.
In `leaflet()`, we need to define a color palette for our variable of interest.
First, let's examine our demographic census dataset of _JeffCo_. Key columns include:
- **GEOID** - [FIPS](https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards) census tract ID
- **NAME** - Census tract name and number
- **variable** - racial category
- **value** - population count for the racial category in a census tract
- **summary_value** - population count for _ALL_ racial categories in a census tract
- **geometry** - geospatial info
```{r}
#| label: Loot at JeffCo
#| echo: true
head(JeffCo, 10)
```
The data I want to show is the **Percentage** of a census tract that is _White_. I can use the `mutate()` function to create a new _percent_ variable and the `filter()` function to only keep the _White_ subset.
```{r}
#| label: munge the demographic data
#| echo: true
JeffCo2 <- JeffCo %>%
mutate(percent = (100*value/summary_value)) %>%
filter(variable == 'White') %>%
filter(summary_value > 0) %>%
st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84")
head(JeffCo2)
```
And now I can make a basic `leaflet()` map. @fig-WhiteJeffCo shows the percent of a census tract that is white. The `addLegend()` function provides a helpful interpretative map.
```{r}
#| label: fig-WhiteJeffCo
#| echo: true
#| fig-cap: Demographics of Jefferson County, TX by census tract 2020
palJeff <- colorNumeric(palette = 'Oranges', domain = JeffCo2$percent, reverse = T)
leaflet(data = JeffCo2) %>%
addTiles() %>%
addPolygons(color = 'black',
weight = 1,
fillColor = ~palJeff(percent),
fillOpacity = 0.6) %>%
addLegend(title = 'White Population (%)',
pal = palJeff,
values = ~percent)
```
Now let's show the TRI emissions locations in a second map.
- use `addCircles()` and the **radius** argument to show magnitude of emissions
- define a `colorFactor()` palette for the pollutants
- show the names of facilities using the **label** argument.
@fig-JeffCoToxics shows the map of emissions sources.
```{r}
#| label: fig-JeffCoToxics
#| echo: true
#| fig-cap: TRI emissions of Benzene, 1,3-butadiene, and ethylene oxice in Jefferson County, TX in 2021
palEmissions <- colorFactor(palette = c('red', 'blue', 'black'), domain = TRI_2021$pollutant)
leaflet(data = TRI_2021) %>%
addTiles() %>%
addCircles(weight = 1,
color = ~palEmissions(pollutant),
radius = ~emissions * 0.1,
label = ~htmlEscape(name)) %>%
addLegend(title = 'Pollutant',
pal = palEmissions,
values = ~pollutant)
```
And we can put those two together. Remember, when there are multiple data layers and datasets, each layer needs to have its data assigned within each function call where it is relied on. So we'll move the `data = <DATALAYER>` argument from the `leaflet()` function to the `addPolygons`, `addCircles`, and `addLegends` functions.
@fig-combinedTox_Race
```{r}
#| label: fig-combinedTox_Race
#| echo: true
#| fig-cap: Racial demographics and TRI emissions of Benzene, 1,3-butadiene, and ethylene oxice in Jefferson County, TX in 2020-2021
leaflet() %>%
addTiles() %>%
addPolygons(data = JeffCo2,
color = 'black',
weight = 1,
fillColor = ~palJeff(percent),
fillOpacity = 0.6) %>%
addLegend(data = JeffCo2,
title = 'White Population (%)',
pal = palJeff,
values = ~percent) %>%
addCircles(data = TRI_2021,
weight = 1,
color = ~palEmissions(pollutant),
radius = ~emissions * 0.1,
label = ~htmlEscape(name)) %>%
addLegend(data = TRI_2021,
title = 'Pollutant',
pal = palEmissions,
values = ~pollutant)
```
#### Discussion and Critique
Do Circles and Polygons overlay in a useful way?
What changes do you think would make this map more usable and intuitive?
#### Exercise 1. Team Sacrifice
1. Add a [minimap](http://radicalresearch.llc/EDVcourse/debt.html#mono-lake-case-study)
2. Adjust the `fillOpacity` of both layers to improve salience
3. Decide as a team to implement another feature.
### Example 2. Water in LA County
LA County runs an annual water deficit, which requires large annual imports of water from multiple sources. Let's try to quantify these flows.
Let's start with the supply of water data from the city of Los Angeles.
Here is a [LADWP water supply in acre-feet.](https://data.lacity.org/City-Infrastructure-Service-Requests/LADWP-Water-Supply-in-Acre-Feet/qyvz-diiw/data)
```{r}
#| label: acquire LA City water data
#| echo: true
library(janitor)
H2O_data <- read_csv('https://data.lacity.org/api/views/qyvz-diiw/rows.csv?accessType=DOWNLOAD') %>%
clean_names()
```
#### Plot water supply over time
@fig-BarWater shows a simple bar chart of LA City water supply sources. Note that there is some fancy data manipulation first though. Also, there's a call to the `scales` package to make the x-axis label nicer.
```{r}
#| label: fig-BarWater
#| echo: true
#| fig-cap: Water supply trends for LA City
H2O_data %>%
select(1, 3:6) %>%
pivot_longer(names_to = 'parameter', values_to = 'acreFeet', cols = 2:5) %>%
mutate(date_value = lubridate::mdy_hms(date_value)) %>%
ggplot(aes(x = date_value, y = acreFeet, fill = parameter)) +
#geom_line() +
#geom_point() +
geom_bar(stat = 'identity') +
theme_bw() +
scale_x_datetime(labels = scales::label_date_short(), date_breaks = '5 years') +
labs(x = '', y = 'Supply in Acre Feet')
```
It is very clear that groundwater and recycling are minimal. Most water is imported from MWD or the LA Aqueduct, with a general long-term trend to rely more on MWD over time.
The LA Aqueduct gets its water from the Owens Valley and by diverting water from Mono Lake. That source is restricted in the last 20 years due to agreements to stop diverting so much water in drought years in order to keep Mono Lake levels stable.
More complicated is the MWD - Municipal water district.
Water in the MWD comes from three main sources.
![MWD Water Sources](https://www.mwdh2o.com/media/voqpwgk1/wwgowchart_rev2-02.svg)
Now, there are three separate sources feeding into MWD supply. The Colorado River Aqueduct, the State Water Project, and the local sources.
Can we show the relative magnitudes for each of the aqueducts on a map?
Let's go get some aqueducts! We did this in [23.2.6](http://radicalresearch.llc/EDVcourse/import3.html#example-5.-aqueducts-in-california)
```{r}
#| label: aqueducts
#| echo: true
geojson.URL <- 'https://gis.data.cnra.ca.gov/datasets/b788fb2628844f54b92e46dac5bb7229_0.geojson?outSR=%7B%22latestWkid%22%3A3857%2C%22wkid%22%3A102100%7D'
aqua3 <- read_sf(dsn = geojson.URL) %>%
st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84")
```
@fig-aqueDucted shows the basic version of aqueducts feeding LA county water supply. Advanced version is possible!
```{r}
#| label: fig-aqueDucted
#| echo: true
#| fig-cap: Aqueducts supplying LA County water
LACounty_aqua <- c('Main Canal', 'Los Angeles Aqueduct', 'Colorado River Aqueduct')
aqua3 %>%
filter(Name %in% LACounty_aqua) %>%
leaflet() %>%
addTiles() %>%
addPolylines(weight = 2,
label = ~htmlEscape(Name))
```
Calculate latest ten-year average water contribution from LA aqueduct and MWD sources. @tbl-H2O_avg shows the values.
```{r}
#| label: tbl-H2O_avg
#| echo: true
#| tbl-cap: Percent of water from sources
H2O_2 <- H2O_data %>%
mutate(yr = lubridate::year(lubridate::mdy_hms(date_value))) %>%
filter(yr > 2006) %>%
summarize(avg_LA_aqueduct = mean(la_aqueduct_percent_of_total),
avg_MWD = mean(mwd_percent_of_total)) %>%
mutate(CO_aqueduct = 0.25*avg_MWD, state_water_project = 0.35*avg_MWD)
kableExtra::kable(H2O_2)
```
Ok, about 32% of water came from the LA aqueduct, 13.7% came from Colorado River Aqueduct, and 19.2% came from the State water project. That's about 65% of the water imported from the three aqueducts.
I can now overlay some markers or change the aqueduct thickness or think of some other way to visualize the source of the water coming to LA county.
#### Exercise #2
Team Debt -
1. Add some markers to the source of the aqueducts with popup labels showing the magnitude of water
2. Add the aqueducts as separte polyline layers with separate weights to indicate relative flows?
3. What other way would you show this information? A [sankey](https://github.com/davidsjoberg/ggsankey) diagram?