From a43abb5200b19eb45ecbce52ea2f47011d20e87b Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci Date: Thu, 11 Apr 2024 08:30:02 +0200 Subject: [PATCH 1/7] Update first raster episode --- episodes/13-intro-to-raster-data.Rmd | 47 +++++++++------------------- 1 file changed, 15 insertions(+), 32 deletions(-) diff --git a/episodes/13-intro-to-raster-data.Rmd b/episodes/13-intro-to-raster-data.Rmd index 1b58a73a..90fa71b7 100644 --- a/episodes/13-intro-to-raster-data.Rmd +++ b/episodes/13-intro-to-raster-data.Rmd @@ -38,7 +38,7 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. - +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. ::: @@ -64,7 +64,7 @@ In this and lesson, we will use: ## View Raster File Attributes -We will be working with a series of GeoTIFF files in this lesson. The GeoTIFF format contains a set of embedded tags with metadata about the raster data. We can use the function `describe()` from the `terra` package to get information about our raster data before we read that data into R. It is recommended to do this before importing your data. We first examine the file `tud-dsm-5m.tif`. +We will be working with a series of GeoTIFF files in this lesson. The GeoTIFF format contains a set of embedded tags with metadata about the raster data. We can use the function `describe()` from the `terra` package to get information about our raster. It is recommended to do this before importing the data. We first examine the file `tud-dsm-5m.tif`. ```{r attr} describe("data/tud-dsm-5m.tif") @@ -73,20 +73,22 @@ We will be using this information throughout this episode. By the end of the epi ## Open a Raster in R -Now that we've previewed the metadata for our GeoTIFF, let's import this raster dataset into R and explore its metadata more closely. We can use the `rast()` function to import a raster file in R. +Now that we've previewed the metadata for our GeoTIFF, let's import this raster file into R and explore its metadata more closely. We can use the `rast()` function to import a raster file in R. ::: callout # Data tip - Object names To improve code readability, use file and object names that make it clear what is in the file. The raster data for this episode contain the TU Delft campus and its surroundings so we’ll use a naming convention of `datatype_TUD`. The first object is a Digital Surface Model (DSM) in GeoTIFF format stored in a file `tud-dsm-5m.tif` which we will load into an object named according to our naming convention `DSM_TUD`. ::: -First we will load our raster file into R and view the data structure. +Let's load our raster file into R and view its data structure. ```{r} DSM_TUD <- rast("data/tud-dsm-5m.tif") DSM_TUD ``` -The information above includes a report on dimension, resolution, extent and CRS, but no information about the values. Similar to other data structures in R like vectors and data frame columns, descriptive statistics for raster data can be retrieved with the `summary()` function. +The information above includes a report on dimension, resolution, extent and CRS, but no information about the values. + +Similar to other data structures in R like vectors and data frame columns, descriptive statistics for raster data can be retrieved with the `summary()` function. ```{r, warning=TRUE} summary(DSM_TUD) @@ -100,7 +102,7 @@ summary(values(DSM_TUD)) With a summary on all cells of the raster, the values range from a smaller minimum of `-5.3907` to a higher maximum of `92.0910`. -To visualise the DSM in R using `ggplot2`, we need to convert it to a data frame. We learned about data frames in an [earlier lesson](../episodes/03-explore-data.Rmd). The `terra` package has the built-in function `as.data.frame()` for conversion to a data frame. +To visualise the DSM in R using `ggplot2`, we need to convert it to a data frame. We learned about data frames in an [earlier lesson](../episodes/03-explore-data.Rmd). The `terra` package has the built-in method `as.data.frame()` for conversion to a data frame. ```{r} DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE) @@ -124,7 +126,7 @@ ggplot() + ::: callout # Plotting tip -More information about the viridis palette used above at [viridis package documentation](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html). +More information about the viridis palette used above can be found in the [viridis package documentation](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html). ::: ::: callout @@ -133,11 +135,11 @@ More information about the viridis palette used above at [viridis package docume For faster previews, you can use the `plot()` function on a `terra` object. ::: -This map shows our Digital Surface Model, that is, the elevation of our study site including buildings and vegetation. From the legend we can confirm that the maximum elevation is around 90, but we cannot tell whether this is 90 feet or 90 meters because the legend doesn’t show us the units. We can look at the metadata of our object to see what the units are. Much of the metadata that we’re interested in is part of the CRS. +This map shows our Digital Surface Model (DSM), that is, the elevation of our study site including buildings and vegetation. From the legend we can confirm that the maximum elevation is around 90, but we cannot tell whether this is 90 feet or 90 meters because the legend doesn’t show us the units. We can look at the metadata of our object to see what the units are. Much of the metadata that we’re interested in is part of the CRS. -Now we will see how features of the CRS appear in our data file and what meanings they have. +Now we will see how features of the CRS appear in our data file and what meaning they have. -## View Raster Coordinate Reference System (CRS) in R +## View Raster Coordinate Reference System (CRS) We can view the CRS string associated with our R object using the `crs()` function. @@ -209,32 +211,13 @@ nlyr(DSM_TUD) This dataset has only 1 band. However, raster data can also be multi-band, meaning that one raster file contains data for more than one variable or time period for each cell. We will discuss multi-band raster data in a [later episode](../episodes/17-work-with-multi-band-rasters.Rmd). +::: callout ## Creating a histogram of raster values -A histogram can be used to inspect the distribution of raster values visually. It can show if there are values above the max or below the min of the expected range. - -We can inspect the distribution of values contained in a raster using the `ggplot2` function `geom_histogram()`. Histograms are often useful in identifying outliers and bad data values in our raster data. - -```{r rast-hist, message=TRUE} -ggplot() + - geom_histogram(data = DSM_TUD_df, aes(`tud-dsm-5m`)) -``` - -Notice that a message is displayed when R creates the histogram: - -``` -`stat_bin()` using `bins = 30`. Pick better value with `binwidth`. -``` - -This message is caused by a default setting in `geom_histogram()` enforcing that there are 30 bins for the data. We can define the number of bins we want in the histogram by giving another value to the `bins` argument. 60 bins, for instance, will give a more detailed histogram of the same distribution: - -```{r} -ggplot() + - geom_histogram(data = DSM_TUD_df, aes(`tud-dsm-5m`), bins = 60) -``` +A histogram can be used to inspect the distribution of raster values visually. It can show if there are values above the max or below the min of the expected range. We can inspect the distribution of values contained in a raster using the `ggplot2` function `geom_histogram()`. Histograms are often useful in identifying outliers and bad data values in our raster data. Read more on the use of histograms in [this lesson](https://datacarpentry.org/r-raster-vector-geospatial/01-raster-structure.html#create-a-histogram-of-raster-values) -Note that the shape of this histogram looks similar to the previous one that was created using the default of 30 bins. The distribution of elevation values for our Digital Surface Model (DSM) looks reasonable. It is likely that there are no bad data values in this particular raster. +::: ::: challenge From 8979ff366ddb0f2149ee7f8d62c74160a9b010ac Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci Date: Thu, 11 Apr 2024 08:43:11 +0200 Subject: [PATCH 2/7] Update raster plotting episode --- episodes/14-plot-raster-data.Rmd | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/episodes/14-plot-raster-data.Rmd b/episodes/14-plot-raster-data.Rmd index dd53a5b2..95962e3b 100644 --- a/episodes/14-plot-raster-data.Rmd +++ b/episodes/14-plot-raster-data.Rmd @@ -22,7 +22,7 @@ DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE) :::::::::::::::::::::::::::::::::::::: questions -- How can I create categorized or customized maps of raster data? +- How can I create categorized maps of raster data? - How can I customize the colour scheme of a raster image? :::::::::::::::::::::::::::::::::::::::::::::::: @@ -41,13 +41,14 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. - +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. ::: -In this part, we will plot our raster object using `ggplot2` with customized coloring schemes. We will continue working with the Digital Surface Model (DSM) raster from the [previous episode](../episodes/13-intro-to-raster-data.Rmd). +In this part, we will plot our raster object using `ggplot2` with customized colouring schemes. We will continue working with the Digital Surface Model (DSM) raster from the [previous episode](../episodes/13-intro-to-raster-data.Rmd). + +## Plotting Data Using Breaks -### Plotting Data Using Breaks In the previous plot, our DSM was coloured with a continuous colour range. For clarity and visibility, we may prefer to view the data “symbolized” or coloured according to ranges of values. This is comparable to a “classified” map. For that, we need to tell `ggplot()` how many groups to break our data into and where those breaks should be. To make these decisions, it is useful to first explore the distribution of the data using a bar plot. To begin with, we will use `dplyr`’s `mutate()` function combined with `cut()` to split the data into 3 bins. ```{r} @@ -69,7 +70,7 @@ DSM_TUD_df %>% count(fct_elevation) ``` -We might prefer to customize the cut-off values for these groups. Lets round the cut-off values so that we have groups for the ranges of -10 - 0m, 0 - 5m, and 5 - 100m. To implement this we will give `cut()` a numeric vector of break points instead of the number of breaks we want. +We might prefer to customize the cut-off values for these groups. Lets round the cut-off values so that we have groups for the ranges of -10-0 m, 0-5 m, and 5-100 m. To implement this we will give `cut()` a numeric vector of break points instead of the number of breaks we want. ```{r} custom_bins <- c(-10, 0, 5, 100) @@ -114,7 +115,7 @@ The plot above uses the default colours inside `ggplot2` for raster objects. We terrain.colors(3) ``` -The `terrain.colors()` function returns hex colours - each of these character strings represents a colour. To use these in our map, we pass them across using the `scale_fill_manual()` function. +The `terrain.colors()` function returns hex colours - each of these character strings represents a colour. To use these in our map, we pass them to the `values` argument in the `scale_fill_manual()` function. ```{r} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + @@ -126,7 +127,7 @@ ggplot() + If we need to create multiple plots using the same colour palette, we can create an R object (`my_col`) for the set of colours that we want to use. We can then quickly change the palette across all plots by modifying the `my_col` object, rather than each individual plot. -We can give the legend a more meaningful title by passing a value to the `name` argument of the `scale_fill_manual()` function. +We can give the legend a more meaningful title with the `name` argument of the `scale_fill_manual()` function. ```{r} my_col <- terrain.colors(3) @@ -136,7 +137,7 @@ ggplot() + scale_fill_manual(values = my_col, name = "Elevation") + coord_quickmap() ``` -The axis labels x and y are not necessary, so we can turn them off by passing `element_blank()` to the relevant part of the `theme()` function. +The axis labels x and y are not necessary, so we can turn them off by passing `element_blank()` to the `axis.title` argumnet in the `theme()` function. ```{r} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, From d3e0754a53d89f3690eca53f0e593d35f3b09a7a Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci Date: Wed, 1 May 2024 12:52:27 +0200 Subject: [PATCH 3/7] Update raster episodes --- episodes/13-intro-to-raster-data.Rmd | 60 ++++++++++---------- episodes/14-plot-raster-data.Rmd | 20 +++---- episodes/15-reproject-raster-data.Rmd | 23 ++++---- episodes/16-raster-calculations.Rmd | 27 +++++---- episodes/17-work-with-multi-band-rasters.Rmd | 38 ++++--------- 5 files changed, 76 insertions(+), 92 deletions(-) diff --git a/episodes/13-intro-to-raster-data.Rmd b/episodes/13-intro-to-raster-data.Rmd index 90fa71b7..6bbe5e11 100644 --- a/episodes/13-intro-to-raster-data.Rmd +++ b/episodes/13-intro-to-raster-data.Rmd @@ -18,15 +18,15 @@ library(terra) ::: questions - What is a raster dataset? -- How do I work with and plot raster data in R? +- How do I import, examine and plot raster data in R? ::: ::: objectives After completing this episode, participants should be able to… -- Explore raster attributes and metadata using R. - Import rasters into R using the `terra` package. +- Explore raster attributes and metadata using the `terra` package. - Plot a raster file in R using the `ggplot2` package. - Describe the difference between single- and multi-band rasters. @@ -38,7 +38,7 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. -This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before attaching it with `library(terra)`. ::: @@ -77,7 +77,7 @@ Now that we've previewed the metadata for our GeoTIFF, let's import this raster ::: callout # Data tip - Object names -To improve code readability, use file and object names that make it clear what is in the file. The raster data for this episode contain the TU Delft campus and its surroundings so we’ll use a naming convention of `datatype_TUD`. The first object is a Digital Surface Model (DSM) in GeoTIFF format stored in a file `tud-dsm-5m.tif` which we will load into an object named according to our naming convention `DSM_TUD`. +To improve code readability, use file and object names that make it clear what is in the file. The raster data for this episode contain the TU Delft campus and its surroundings so we will use the naming convention `_TUD`. The first object is a Digital Surface Model (DSM) in GeoTIFF format stored in a file `tud-dsm-5m.tif` which we will load into an object named according to our naming convention `DSM_TUD`. ::: Let's load our raster file into R and view its data structure. @@ -86,9 +86,7 @@ Let's load our raster file into R and view its data structure. DSM_TUD <- rast("data/tud-dsm-5m.tif") DSM_TUD ``` -The information above includes a report on dimension, resolution, extent and CRS, but no information about the values. - -Similar to other data structures in R like vectors and data frame columns, descriptive statistics for raster data can be retrieved with the `summary()` function. +The information above includes a report on dimension, resolution, extent and CRS, but no information about the values. Similar to other data structures in R like vectors and data frames, descriptive statistics for raster data can be retrieved with the `summary()` function. ```{r, warning=TRUE} summary(DSM_TUD) @@ -108,39 +106,44 @@ To visualise the DSM in R using `ggplot2`, we need to convert it to a data frame DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE) ``` -Now when we view the structure of our data, we will see a standard data frame format. +Now when we view the structure of our data, we will see a standard data frame format in which every row is a cell from the raster, each containing information about the `x` and `y` coordinates and the raster value stored in the `tud-dsm-5m` column. ```{r} str(DSM_TUD_df) ``` -We can use `ggplot()` to plot this data with a specific `geom_` function called `geom_raster()`. We will make the colour scale in our plot colour-blindness friendly with `scale_fill_viridis_c`, introduced in an [earlier lesson](../episodes/04-intro-to-visualisation.Rmd). We will also use the `coord_quickmap()` function to use an approximate Mercator projection for our plots. This approximation is suitable for small areas that are not too close to the poles. Other coordinate systems are available in `ggplot2` if needed, you can learn about them at their help page `?coord_map`. +We can use `ggplot()` to plot this data with a specific `geom_` function called `geom_raster()`. We will make the colour scale in our plot colour-blindness friendly with `scale_fill_viridis_c`, introduced in an [earlier lesson](../episodes/04-intro-to-visualisation.Rmd). We will also use the `coord_equal()` function to ensure that the units (meters in our case) on the two axes are equal. ```{r first-rast-plot, fig.cap="Raster plot with `ggplot2` using the viridis color scale"} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = `tud-dsm-5m`)) + - scale_fill_viridis_c(option = "H") + # `option = "H"` provides a contrasting colour scale - coord_quickmap() + scale_fill_viridis_c(option = "turbo") + + coord_equal() ``` ::: callout # Plotting tip -More information about the viridis palette used above can be found in the [viridis package documentation](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html). +The `"turbo"` scale in our code provides a good contrasting scale for our raster, but another colour scale may be preferred when plotting other rasters. More information about the viridis palette used above can be found in the [viridis package documentation](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html). ::: ::: callout # Plotting tip For faster previews, you can use the `plot()` function on a `terra` object. -::: -This map shows our Digital Surface Model (DSM), that is, the elevation of our study site including buildings and vegetation. From the legend we can confirm that the maximum elevation is around 90, but we cannot tell whether this is 90 feet or 90 meters because the legend doesn’t show us the units. We can look at the metadata of our object to see what the units are. Much of the metadata that we’re interested in is part of the CRS. +```{r} +plot(DSM_TUD) +``` -Now we will see how features of the CRS appear in our data file and what meaning they have. +::: ## View Raster Coordinate Reference System (CRS) +The map above shows our Digital Surface Model (DSM), that is, the elevation of our study site including buildings and vegetation. From the legend we can confirm that the maximum elevation is around 90, but we cannot tell whether that is 90 feet or 90 meters because the legend does not show us the units. We can look at the metadata of our object to see what the units are. Much of the metadata that we are interested in is part of the CRS. + +Now we will see how features of the CRS appear in our data file and what meaning they have. + We can view the CRS string associated with our R object using the `crs()` function. ```{r} @@ -162,12 +165,12 @@ What units are our data in? The CRS for our data is given to us by R in PROJ.4 format. Let’s break down the pieces of a PROJ.4 string. The string contains all of the individual CRS elements that R or another GIS might need. Each element is specified with a `+` sign, similar to how a `.csv` file is delimited or broken up by a `,`. After each `+` we see the CRS element such as projection (`proj=`) being defined. -See more about CRS and PROJ.4 strings in [this lesson](https://datacarpentry.org/organization-geospatial/03-crs). +See more about CRS and PROJ.4 strings in [this lesson](https://datacarpentry.org/organization-geospatial/03-crs#describing-coordinate-reference-systems). ::: ## Calculate Raster Min and Max values -It is useful to know the minimum and maximum values of a raster dataset. In this case, as we are working with elevation data, these values represent the min/max elevation range at our site. +It is useful to know the minimum and maximum values of a raster dataset. In this case, as we are working with elevation data, these values represent the minimum-to-maximum elevation range at our site. Raster statistics are often calculated and embedded in a GeoTIFF for us. We can view these values: ```{r} @@ -184,6 +187,8 @@ DSM_TUD <- setMinMax(DSM_TUD) ``` ::: +A call to `minmax(DSM_TUD)` will now give us the correct values. Alternatively, `min(values())` and `max(values())` will return the minimum and maximum values respectively. + ```{r} min(values(DSM_TUD)) ``` @@ -192,32 +197,25 @@ min(values(DSM_TUD)) max(values(DSM_TUD)) ``` - We can see that the elevation at our site ranges from `-5.39069`m to `92.08102`m. ## Raster bands -The Digital Surface Model object (`DSM_TUD`) that we’ve been working with is a single band raster. This means that there is only one dataset stored in the raster: surface elevation in meters for one time period. - -![Single- and multi-band raster](https://datacarpentry.org/r-raster-vector-geospatial/fig/dc-spatial-raster/single_multi_raster.png) - - +The Digital Surface Model object (`DSM_TUD`) that we have been working with is a single band raster. This means that there is only one layer stored in the raster: surface elevation in meters for one time period. -A raster dataset can contain one or more bands. We can view the number of bands in a raster using the `nlyr()` function. +We can view the number of bands in a raster using the `nlyr()` function. ```{r} nlyr(DSM_TUD) ``` -This dataset has only 1 band. However, raster data can also be multi-band, meaning that one raster file contains data for more than one variable or time period for each cell. We will discuss multi-band raster data in a [later episode](../episodes/17-work-with-multi-band-rasters.Rmd). +![Single- and multi-band raster](https://datacarpentry.org/r-raster-vector-geospatial/fig/dc-spatial-raster/single_multi_raster.png) -::: callout +Our DSM data has only one band. However, raster data can also be multi-band, meaning that one raster file contains data for more than one variable or time period for each cell. We will discuss multi-band raster data in a [later episode](../episodes/17-work-with-multi-band-rasters.Rmd). ## Creating a histogram of raster values -A histogram can be used to inspect the distribution of raster values visually. It can show if there are values above the max or below the min of the expected range. We can inspect the distribution of values contained in a raster using the `ggplot2` function `geom_histogram()`. Histograms are often useful in identifying outliers and bad data values in our raster data. Read more on the use of histograms in [this lesson](https://datacarpentry.org/r-raster-vector-geospatial/01-raster-structure.html#create-a-histogram-of-raster-values) - -::: +A histogram can be used to inspect the distribution of raster values visually. It can show if there are values above the maximum or below the minimum of the expected range. We can plot a histogram using the `ggplot2` function `geom_histogram()`. Histograms are often useful in identifying outliers and bad data values in our raster data. Read more on the use of histograms in [this lesson](https://datacarpentry.org/r-raster-vector-geospatial/01-raster-structure.html#create-a-histogram-of-raster-values) ::: challenge @@ -250,9 +248,9 @@ describe("data/tud-dsm-5m-hill.tif") ::::::::::::::::::::::::::::::::::::: keypoints -- The GeoTIFF file format includes metadata about the raster data. +- The GeoTIFF file format includes metadata about the raster data that can be inspected with the `describe()` function from the `terra` package. - To plot raster data with the `ggplot2` package, we need to convert them to data frames. -- R stores CRS information in the PROJ.4 format. +- PROJ is a widely used standard format to store, represent and transform CRS. - Histograms are useful to identify missing or bad data values. :::::::::::::::::::::::::::::::::::::::::::::::: diff --git a/episodes/14-plot-raster-data.Rmd b/episodes/14-plot-raster-data.Rmd index 95962e3b..016851ff 100644 --- a/episodes/14-plot-raster-data.Rmd +++ b/episodes/14-plot-raster-data.Rmd @@ -41,11 +41,11 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. -This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before attaching it with `library(terra)`. ::: -In this part, we will plot our raster object using `ggplot2` with customized colouring schemes. We will continue working with the Digital Surface Model (DSM) raster from the [previous episode](../episodes/13-intro-to-raster-data.Rmd). +In this part, we will plot our raster object using `ggplot2` with customized colour schemes. We will continue working with the Digital Surface Model (DSM) raster from the [previous episode](../episodes/13-intro-to-raster-data.Rmd). ## Plotting Data Using Breaks @@ -87,7 +87,7 @@ levels(DSM_TUD_df$fct_elevation_cb) Note that 4 break values will result in 3 bins of data. -The bin intervals are shown using `(` to mean exclusive and `]` to mean inclusive. For example: (0, 10] means “from 0 through 10”. +The bin intervals are shown using `(` to mean exclusive and `]` to mean inclusive. For example: (0, 5] means “from 0 through 5”. ::: @@ -107,9 +107,9 @@ We can use those groups to plot our raster data, with each group being a differe ```{r} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + - coord_quickmap() + coord_equal() ``` -The plot above uses the default colours inside `ggplot2` for raster objects. We can specify our own colours to make the plot look a little nicer. R has a built in set of colours for plotting terrain, which are built in to the `terrain.colors()` function. Since we have three bins, we want to create a 3-colour palette: +The plot above uses the default colours inside `ggplot2` for raster objects. We can specify our own colours to make the plot look a little nicer. R has a built in set of colours for plotting terrain available through the `terrain.colors()` function. Since we have three bins, we want to create a 3-colour palette: ```{r} terrain.colors(3) @@ -120,7 +120,7 @@ The `terrain.colors()` function returns hex colours - each of these character st ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + scale_fill_manual(values = terrain.colors(3)) + - coord_quickmap() + coord_equal() ``` ## More Plot Formatting @@ -135,16 +135,16 @@ ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + scale_fill_manual(values = my_col, name = "Elevation") + - coord_quickmap() + coord_equal() ``` -The axis labels x and y are not necessary, so we can turn them off by passing `element_blank()` to the `axis.title` argumnet in the `theme()` function. +The axis labels x and y are not necessary, so we can turn them off by passing `element_blank()` to the `axis.title` argument in the `theme()` function. ```{r} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + scale_fill_manual(values = my_col, name = "Elevation") + theme(axis.title = element_blank()) + - coord_quickmap() + coord_equal() ``` @@ -171,7 +171,7 @@ ggplot() + geom_raster(data = DSM_TUD_df, aes(x = x, y = y, fill = fct_elevation_6)) + scale_fill_manual(values = my_col, name = "Elevation") + - coord_quickmap() + + coord_equal() + labs(title = "Elevation Classes of the Digital Surface Model (DSM)") ``` diff --git a/episodes/15-reproject-raster-data.Rmd b/episodes/15-reproject-raster-data.Rmd index f8f9db82..8ce316d0 100644 --- a/episodes/15-reproject-raster-data.Rmd +++ b/episodes/15-reproject-raster-data.Rmd @@ -43,15 +43,15 @@ See the [setup instructions](../learners/setup.md) for detailed information abou ::: -Sometimes we encounter raster datasets that do not “line up” when plotted or analysed. Rasters that don’t line up are most often in different Coordinate Reference Systems (CRS). This episode explains how to deal with rasters in different CRS. It will walk through reprojecting rasters in R using the `project()` function in the `terra` package. +Sometimes we encounter raster datasets that do not “line up” when plotted or analysed. Rasters that do not line up are most often in different Coordinate Reference Systems (CRS). This episode explains how to deal with rasters in different CRS. It will walk through reprojecting rasters in R using the `project()` function in the `terra` package. ## Raster Projection -For this episode, we will be working with the Digital Terrain Model data. This differs from the surface model data we’ve been working with so far in that the digital surface model (DSM) includes the tops of trees and buildings, while the digital terrain model (DTM) shows the ground level. +For this episode, we will be working with Digital Terrain Model data. This differs from the surface model data we have been working with so far in that the digital surface model (DSM) includes the tops of trees and buildings, while the digital terrain model (DTM) shows the ground level. -We’ll be looking at another model (the canopy/building height model, or CHM) in [a later episode](../episodes/16-raster-calculations.Rmd) and will see how to calculate the CHM from the DSM and DTM. Here, we will create a map of the Digital Terrain Model (`DTM_TUD`) of TU Delft and its surrounding draped (or layered) on top of the hillshade (`DTM_hill_TUD`). +We will be looking at another model (the canopy/building height model, or CHM) in [a later episode](../episodes/16-raster-calculations.Rmd) and will see how to calculate the CHM from the DSM and DTM. Here, we will create a map of the Digital Terrain Model (`DTM_TUD`) of TU Delft and its surrounding draped (or layered) on top of the hillshade (`DTM_hill_TUD`). :::::::::::::::::::::::::::::::::::::: callout @@ -59,7 +59,7 @@ We’ll be looking at another model (the canopy/building height model, or CHM) i We can layer a raster on top of a hillshade raster for the same area, and use a transparency factor to create a 3-dimensional shaded effect. A hillshade is a raster that maps the terrain using light and shadow to create a 3D-looking image that you would see from above when viewing the terrain. We will add a custom colour, making the plot grey. -Read more about layering raster [here](https://datacarpentry.org/r-raster-vector-geospatial/instructor/02-raster-plot.html#layering-rasters). +Read more about layering rasters [here](https://datacarpentry.org/r-raster-vector-geospatial/instructor/02-raster-plot.html#layering-rasters). :::::::::::::::::::::::::::::::::::::::::::::::: @@ -86,9 +86,9 @@ ggplot() + aes(x = x, y = y, alpha = `tud-dtm-5m-hill`)) + scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + - coord_quickmap() + coord_equal() ``` -Our results are curious - neither the Digital Terrain Model (`DTM_TUD_df`) nor the DTM Hillshade (`DTM_hill_TUD_df`) plotted. Let’s try to plot the DTM on its own to make sure there are data there. +Our results are curious - neither the DTM (`DTM_TUD_df`) nor the hillshade (`DTM_hill_TUD_df`) are plotted. Let’s try to plot the DTM on its own to make sure the data are there. ```{r} ggplot() + @@ -96,7 +96,7 @@ ggplot() + aes(x = x, y = y, fill = `tud-dtm-5m`)) + scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + - coord_quickmap() + coord_equal() ``` Our DTM seems to contain data and plots just fine. @@ -108,10 +108,10 @@ ggplot() + geom_raster(data = DTM_hill_TUD_df, aes(x = x, y = y, alpha = `tud-dtm-5m-hill`)) + - coord_quickmap() + coord_equal() ``` -If we look at the axes, we can see that the units, and therefore the projections, of the two rasters are different. When this is the case, `ggplot2` won’t render the image. It won’t even throw an error message to tell you something has gone wrong. We can look at Coordinate Reference Systems (CRSs) of the DTM and the hillshade data to see how they differ. +If we look at the axes, we can see that the units, and therefore the projections, of the two rasters are different. When this is the case, `ggplot2` will not render the image. It will not even throw an error message to tell you something has gone wrong. We can look at Coordinate Reference Systems (CRSs) of the DTM and the hillshade data to see how they differ. ::: challenge @@ -128,13 +128,14 @@ crs(DTM_TUD, parse = TRUE) ```{r} crs(DTM_hill_TUD, parse = TRUE) ``` + `DTM_TUD` is in the Amersfoort / RD New projection, whereas `DTM_hill_TUD` is in WGS 84. ::: ::: -Because the two rasters are in different CRS, they don’t line up when plotted in R. We need to reproject (or change the projection of) `DTM_hill_TUD` into the Amersfoort / RD New CRS. +Because the two rasters are in different CRS, they do not line up when plotted in R. We need to reproject (or change the projection of) `DTM_hill_TUD` into the Amersfoort / RD New CRS. ## Reproject Rasters @@ -187,7 +188,7 @@ res(DTM_hill_EPSG28992_TUD) res(DTM_TUD) ``` -These two resolutions are different, but they’re representing the same data. We can tell R to force our newly reprojected raster to be the same as `DTM_TUD` by adding a line of code `res = res(DTM_TUD)` within the `project()` function. +These two resolutions are different, but they represent the same data. We can tell R to force our newly reprojected raster to be the same as `DTM_TUD` by adding a line of code `res = res(DTM_TUD)` within the `project()` function. ```{r} DTM_hill_EPSG28992_TUD <- project(DTM_hill_TUD, crs(DTM_TUD), diff --git a/episodes/16-raster-calculations.Rmd b/episodes/16-raster-calculations.Rmd index 791326cf..ada80678 100644 --- a/episodes/16-raster-calculations.Rmd +++ b/episodes/16-raster-calculations.Rmd @@ -51,7 +51,7 @@ We often want to combine values of and perform calculations on rasters to create ## Raster calculations in R -We often want to perform calculations on two or more rasters to create a new output raster. For example, if we are interested in mapping the heights of trees and buildings across an entire field site, we might want to calculate the difference between the Digital Surface Model (DSM, tops of trees and buildings) and the Digital Terrain Model (DTM, ground level). The resulting dataset is referred to as a Canopy Height Model (CHM) and represents the actual height of trees, buildings, etc. with the influence of ground elevation removed. +Let's say we are interested in mapping the heights of trees and buildings across an entire urban area. To that end, we can calculate the difference between the Digital Surface Model (DSM, tops of trees and buildings) and the Digital Terrain Model (DTM, ground level). The resulting dataset is referred to as a Canopy Height Model (CHM) and represents the actual height of trees, buildings, etc. with the influence of ground elevation removed. ![Source: National Ecological Observatory Network (NEON).](https://datacarpentry.org/r-raster-vector-geospatial/fig/dc-spatial-raster/lidarTree-height.png) @@ -65,13 +65,13 @@ describe("data/tud-dtm-5m.tif") describe("data/tud-dsm-5m.tif") ``` -We’ve already loaded and worked with these two data files in earlier episodes. Let’s plot them each once more to remind ourselves what this data looks like. First we’ll plot the DTM elevation data: +We have already loaded and worked with these two data files in earlier episodes. Let’s plot them each once more to remind ourselves what this data looks like. First we plot the DTM elevation data: ```{r} ggplot() + geom_raster(data = DTM_TUD_df , aes(x = x, y = y, fill = `tud-dtm-5m`)) + scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + - coord_quickmap() + coord_equal() ``` And then the DSM elevation data: @@ -80,7 +80,7 @@ And then the DSM elevation data: geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = `tud-dsm-5m`)) + scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) + - coord_quickmap() + coord_equal() ``` ## Raster math and Canopy Height Models @@ -101,7 +101,7 @@ We can now plot the output CHM. geom_raster(data = CHM_TUD_df , aes(x = x, y = y, fill = `tud-dsm-5m`)) + scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) + - coord_quickmap() + coord_equal() ``` Let’s have a look at the distribution of values in our newly created Canopy Height Model (CHM). @@ -115,11 +115,11 @@ Notice that the range of values for the output CHM starts right below 0 and rang ### Challenge: Explore CHM Raster Values -It’s often a good idea to explore the range of values in a raster dataset just like we might explore a dataset that we collected in the field. +It is often a good idea to explore the range of values in a raster dataset just like we might explore a dataset that we collected in the field. 1. What is the minimum and maximum value for the Canopy Height Model `CHM_TUD` that we just created? 2. What is the distribution of all the pixel values in the CHM? -3. Plot the `CHM_TUD` raster using breaks that make sense for the data. Include an appropriate colour palette for the data, plot title and no axes ticks / labels. +3. Plot the `CHM_TUD` raster using breaks that make sense for the data. ::: solution @@ -151,7 +151,7 @@ ggplot() + We can calculate the difference between two rasters in two different ways: -- by directly subtracting the two rasters in R using raster math, +- by directly subtracting the two rasters in R using raster math, as we did above, or for more efficient processing, particularly if our rasters are large and/or the calculations we are performing are complex: @@ -163,16 +163,15 @@ See how `lapp()` is used in [this lesson](https://datacarpentry.org/r-raster-vec ## Export a GeoTIFF -Now that we’ve created a new raster, let’s export the data as a GeoTIFF file using the `writeRaster()` function. +Now that we have created a new raster, let’s export the data as a GeoTIFF file using the `writeRaster()` function. -When we write this raster object to a GeoTIFF file we’ll name it `CHM_TUD.tiff`. This name allows us to quickly remember both what the data contains (CHM data) and for where (TU Delft campus and surroundings). The `writeRaster()` function by default writes the output file to your working directory unless you specify a full file path. +When we write this raster object to a GeoTIFF file we name it `CHM_TUD.tiff`. This name allows us to quickly remember both what the data contains (CHM data) and for where (TU Delft campus and surroundings). The `writeRaster()` function by default writes the output file to your working directory unless you specify a full file path. -We will specify the output format (“GTiff”), the no data value `NAflag = -9999`. We will also tell R to overwrite any data that is already in a file of the same name. +We will specify the output format `"GTiff"` and tell R to overwrite any data that is already in a file of the same name. ```{r} writeRaster(CHM_TUD, "fig/CHM_TUD.tiff", - filetype="GTiff", - overwrite=TRUE, - NAflag=-9999) + filetype = "GTiff", + overwrite = TRUE) ``` ::::::::::::::::::::::::::::::::::::: keypoints diff --git a/episodes/17-work-with-multi-band-rasters.Rmd b/episodes/17-work-with-multi-band-rasters.Rmd index 39322ebb..dc74dd3f 100644 --- a/episodes/17-work-with-multi-band-rasters.Rmd +++ b/episodes/17-work-with-multi-band-rasters.Rmd @@ -47,9 +47,9 @@ This episode explores how to import and plot a multi-band raster in R. ## Getting Started with Multi-Band Data in R -In this episode, the multi-band data that we are working with is [Beeldmateriaal Open Data](https://opendata.beeldmateriaal.nl/) from the Netherlands. Each RGB image is a 3-band raster. The same steps would apply to working with a multi-spectral image with 4 or more bands - like Landsat imagery. +The multi-band data that we are working with is [Beeldmateriaal Open Data](https://opendata.beeldmateriaal.nl/) from the Netherlands. Each RGB image is a 3-band raster. The same steps would apply to working with a multi-spectral image with 4 or more bands like Landsat imagery. -By using the `rast()` function along with the `lyrs` parameter, we can read specific raster bands; omitting this parameter would read instead all bands. +By using the `rast()` function along with the `lyrs` parameter, we can read specific raster bands; omitting this parameter would read instead all bands. We specify which band we want with `lyrs = N` (N represents the band number we want to work with). To import the red band, we would use `lyrs = 1`. ```{r} RGB_band1_TUD <- rast("data/tudlib-rgb.tif", lyrs = 1) @@ -59,6 +59,7 @@ We need to convert this data to a data frame in order to plot it with `ggplot`. ```{r} RGB_band1_TUD_df <- as.data.frame(RGB_band1_TUD, xy = TRUE) +# Plotting this can take up to 1 minute ggplot() + geom_raster(data = RGB_band1_TUD_df, aes(x = x, y = y, alpha = `tudlib-rgb_1`)) + @@ -69,29 +70,11 @@ ggplot() + This raster contains values between 0 and 255. These values represent degrees of brightness associated with the image band. In the case of a RGB image (red, green and blue), band 1 is the red band. When we plot the red band, larger numbers (towards 255) represent pixels with more red in them (a strong red reflection). Smaller numbers (towards 0) represent pixels with less red in them (less red was reflected). To plot an RGB image, we mix red + green + blue values into one single colour to create a full colour image - similar to the colour image a digital camera creates. -## Import a Specific Band - -We can use the `rast()` function to import specific bands in our raster object by specifying which band we want with `lyrs = N` (N represents the band number we want to work with). To import the green band, we would use `lyrs = 2`. - -```{r} -RGB_band2_TUD <- rast("data/tudlib-rgb.tif", lyrs = 2) -``` - -We can convert this data to a data frame and plot the same way we plotted the red band: -```{r} -RGB_band2_TUD_df <- as.data.frame(RGB_band2_TUD, xy = TRUE) - -ggplot() + - geom_raster(data = RGB_band2_TUD_df, - aes(x = x, y = y, alpha = `tudlib-rgb_2`)) + - coord_equal() -``` - ## Raster Stacks Next, we will work with all three image bands (red, green and blue) as an R raster object. We will then plot a 3-band composite, or full-colour, image. -To bring in all bands of a multi-band raster, we use the `rast()` function. +To bring in all bands of a multi-band raster, we use the `rast()` function without specifying a `lyrs` value. ```{r} RGB_stack_TUD <- rast("data/tudlib-rgb.tif") ``` @@ -100,7 +83,7 @@ Let’s preview the attributes of our stack object: ```{r} RGB_stack_TUD ``` -We can view the attributes of each band in the stack in a single output. For example, if we had hundreds of bands, we could specify which band we’d like to view attributes for using an index value: +We can view the attributes of each band in the stack in a single output. For example, if we had hundreds of bands, we could specify which band we would like to view attributes for using an index value: ```{r} RGB_stack_TUD[[2]] ``` @@ -110,17 +93,20 @@ We can also use `ggplot2` to plot the data in any layer of our raster object. Re RGB_stack_TUD_df <- as.data.frame(RGB_stack_TUD, xy = TRUE) ``` -Each band in our RasterStack gets its own column in the data frame. Thus we have: +Each band in our raster stack gets its own column in the data frame. Thus we have: ```{r} str(RGB_stack_TUD_df) ``` + Let’s create a histogram of the first band: ```{r} ggplot() + geom_histogram(data = RGB_stack_TUD_df, aes(`tudlib-rgb_1`)) ``` + And a raster plot of the second band: ```{r} +# Plotting this can take up to 1 minute ggplot() + geom_raster(data = RGB_stack_TUD_df, aes(x = x, y = y, alpha = `tudlib-rgb_2`)) + @@ -129,16 +115,16 @@ ggplot() + We can access any individual band in the same way. -## Create a three-band image +## Plotting a three-band image To render a final three-band, coloured image in R, we use the `plotRGB()` function. This function allows us to: -- Identify what bands we want to render in the red, green and blue regions. The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band order. However, you can define what bands you’d like to plot manually. Manual definition of bands is useful if you have, for example a near-infrared band and want to create a colour infrared image. +- Identify what bands we want to render in the red, green and blue regions. The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band order. However, you can define what bands you would like to plot manually. Manual definition of bands is useful if you have, for example a near-infrared band and want to create a colour infrared image. - Adjust the stretch of the image to increase or decrease contrast. -Let’s plot our 3-band image. Note that we can use the `plotRGB()` function directly with our RasterStack object (we don’t need a data frame as this function isn’t part of the `ggplot2` package). +Let’s plot our 3-band image. Note that we can use the `plotRGB()` function directly with our raster stack object (we do not need a data frame as this function is not part of the `ggplot2` package). ```{r} plotRGB(RGB_stack_TUD, From 58188d614454606bd6efc4972481f74ccb1ee237 Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci Date: Wed, 1 May 2024 16:21:08 +0200 Subject: [PATCH 4/7] Add code chunk labels and challenge titles --- episodes/13-intro-to-raster-data.Rmd | 29 +++++++++-------- episodes/14-plot-raster-data.Rmd | 24 +++++++------- episodes/15-reproject-raster-data.Rmd | 34 ++++++++++---------- episodes/16-raster-calculations.Rmd | 16 ++++----- episodes/17-work-with-multi-band-rasters.Rmd | 20 ++++++------ 5 files changed, 62 insertions(+), 61 deletions(-) diff --git a/episodes/13-intro-to-raster-data.Rmd b/episodes/13-intro-to-raster-data.Rmd index 6bbe5e11..eb8d6c49 100644 --- a/episodes/13-intro-to-raster-data.Rmd +++ b/episodes/13-intro-to-raster-data.Rmd @@ -82,19 +82,19 @@ To improve code readability, use file and object names that make it clear what i Let's load our raster file into R and view its data structure. -```{r} +```{r read-dsm} DSM_TUD <- rast("data/tud-dsm-5m.tif") DSM_TUD ``` The information above includes a report on dimension, resolution, extent and CRS, but no information about the values. Similar to other data structures in R like vectors and data frames, descriptive statistics for raster data can be retrieved with the `summary()` function. -```{r, warning=TRUE} +```{r dsm-summary, warning=TRUE} summary(DSM_TUD) ``` This output gives us information about the range of values in the DSM. We can see, for instance, that the lowest elevation is `-5.2235`, the highest is `89.7838`. But note the warning. Unless you force R to calculate these statistics using every cell in the raster, it will take a random sample of 100,000 cells and calculate from them instead. To force calculation all the values, you can use the function `values`: -```{r} +```{r dsm-summary2} summary(values(DSM_TUD)) ``` @@ -102,13 +102,13 @@ With a summary on all cells of the raster, the values range from a smaller minim To visualise the DSM in R using `ggplot2`, we need to convert it to a data frame. We learned about data frames in an [earlier lesson](../episodes/03-explore-data.Rmd). The `terra` package has the built-in method `as.data.frame()` for conversion to a data frame. -```{r} +```{r dsm-df} DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE) ``` Now when we view the structure of our data, we will see a standard data frame format in which every row is a cell from the raster, each containing information about the `x` and `y` coordinates and the raster value stored in the `tud-dsm-5m` column. -```{r} +```{r dsm-df-str} str(DSM_TUD_df) ``` @@ -132,7 +132,7 @@ The `"turbo"` scale in our code provides a good contrasting scale for our raster For faster previews, you can use the `plot()` function on a `terra` object. -```{r} +```{r dsm-plot} plot(DSM_TUD) ``` @@ -146,12 +146,13 @@ Now we will see how features of the CRS appear in our data file and what meaning We can view the CRS string associated with our R object using the `crs()` function. -```{r} +```{r dsm-crs} crs(DSM_TUD, proj = TRUE) ``` ::: challenge -What units are our data in? + +# Challenge: What units are our data in? ::: solution @@ -173,7 +174,7 @@ See more about CRS and PROJ.4 strings in [this lesson](https://datacarpentry.org It is useful to know the minimum and maximum values of a raster dataset. In this case, as we are working with elevation data, these values represent the minimum-to-maximum elevation range at our site. Raster statistics are often calculated and embedded in a GeoTIFF for us. We can view these values: -```{r} +```{r dsm-minmax} minmax(DSM_TUD) ``` @@ -182,18 +183,18 @@ minmax(DSM_TUD) If the `min` and `max` values are `Inf` and `-Inf` respectively, it means that they haven't been calculated. We can calculate them using the `setMinMax()` function. -```{r} +```{r dsm-setminmax} DSM_TUD <- setMinMax(DSM_TUD) ``` ::: A call to `minmax(DSM_TUD)` will now give us the correct values. Alternatively, `min(values())` and `max(values())` will return the minimum and maximum values respectively. -```{r} +```{r dsm-min} min(values(DSM_TUD)) ``` -```{r} +```{r dsm-max} max(values(DSM_TUD)) ``` @@ -205,7 +206,7 @@ The Digital Surface Model object (`DSM_TUD`) that we have been working with is a We can view the number of bands in a raster using the `nlyr()` function. -```{r} +```{r dsm-nlyr} nlyr(DSM_TUD) ``` @@ -232,7 +233,7 @@ Note that this file is a hillshade raster. We will learn about hillshades in the ::: solution -```{r} +```{r dsm-describe} describe("data/tud-dsm-5m-hill.tif") ``` diff --git a/episodes/14-plot-raster-data.Rmd b/episodes/14-plot-raster-data.Rmd index 016851ff..5f773be2 100644 --- a/episodes/14-plot-raster-data.Rmd +++ b/episodes/14-plot-raster-data.Rmd @@ -51,7 +51,7 @@ In this part, we will plot our raster object using `ggplot2` with customized col In the previous plot, our DSM was coloured with a continuous colour range. For clarity and visibility, we may prefer to view the data “symbolized” or coloured according to ranges of values. This is comparable to a “classified” map. For that, we need to tell `ggplot()` how many groups to break our data into and where those breaks should be. To make these decisions, it is useful to first explore the distribution of the data using a bar plot. To begin with, we will use `dplyr`’s `mutate()` function combined with `cut()` to split the data into 3 bins. -```{r} +```{r dsm-fct} DSM_TUD_df <- DSM_TUD_df %>% mutate(fct_elevation = cut(`tud-dsm-5m`, breaks = 3)) @@ -60,19 +60,19 @@ ggplot() + ``` To see the cut-off values for the groups, we can ask for the levels of `fct_elevation`: -```{r} +```{rdsm-fct-levels} levels(DSM_TUD_df$fct_elevation) ``` And we can get the count of values (that is, number of pixels) in each group using `dplyr`’s `count()` function: -```{r} +```{r dsm-fct-count} DSM_TUD_df %>% count(fct_elevation) ``` We might prefer to customize the cut-off values for these groups. Lets round the cut-off values so that we have groups for the ranges of -10-0 m, 0-5 m, and 5-100 m. To implement this we will give `cut()` a numeric vector of break points instead of the number of breaks we want. -```{r} +```{r dsm-fct-cb} custom_bins <- c(-10, 0, 5, 100) DSM_TUD_df <- DSM_TUD_df %>% @@ -92,31 +92,31 @@ The bin intervals are shown using `(` to mean exclusive and `]` to mean inclusiv ::: And now we can plot our bar plot again, using the new groups: -```{r} +```{r plot-dsm-fct-cb} ggplot() + geom_bar(data = DSM_TUD_df, aes(fct_elevation_cb)) ``` And we can get the count of values in each group in the same way we did before: -```{r} +```{r dsm-fct-cb-count} DSM_TUD_df %>% count(fct_elevation_cb) ``` We can use those groups to plot our raster data, with each group being a different colour: -```{r} +```{r plot-dsm-fct-cb2} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + coord_equal() ``` The plot above uses the default colours inside `ggplot2` for raster objects. We can specify our own colours to make the plot look a little nicer. R has a built in set of colours for plotting terrain available through the `terrain.colors()` function. Since we have three bins, we want to create a 3-colour palette: -```{r} +```{r terrain-colours} terrain.colors(3) ``` The `terrain.colors()` function returns hex colours - each of these character strings represents a colour. To use these in our map, we pass them to the `values` argument in the `scale_fill_manual()` function. -```{r} +```{r plot-dsm-fct-cb3} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + scale_fill_manual(values = terrain.colors(3)) + @@ -128,7 +128,7 @@ ggplot() + If we need to create multiple plots using the same colour palette, we can create an R object (`my_col`) for the set of colours that we want to use. We can then quickly change the palette across all plots by modifying the `my_col` object, rather than each individual plot. We can give the legend a more meaningful title with the `name` argument of the `scale_fill_manual()` function. -```{r} +```{r plot-dsm-fct-cb4} my_col <- terrain.colors(3) ggplot() + @@ -138,7 +138,7 @@ ggplot() + coord_equal() ``` The axis labels x and y are not necessary, so we can turn them off by passing `element_blank()` to the `axis.title` argument in the `theme()` function. -```{r} +```{r plot-dsm-fct-cb5} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) + @@ -159,7 +159,7 @@ Create a plot of the TU Delft Digital Surface Model (`DSM_TUD`) that has: ::: solution -```{r} +```{r plot-dsm-fct-cb6} DSM_TUD_df <- DSM_TUD_df %>% mutate(fct_elevation_6 = cut(`tud-dsm-5m`, breaks = 6)) diff --git a/episodes/15-reproject-raster-data.Rmd b/episodes/15-reproject-raster-data.Rmd index 8ce316d0..920da70b 100644 --- a/episodes/15-reproject-raster-data.Rmd +++ b/episodes/15-reproject-raster-data.Rmd @@ -65,19 +65,19 @@ Read more about layering rasters [here](https://datacarpentry.org/r-raster-vecto First, we need to import the DTM and DTM hillshade data. -```{r} +```{r dtm-hill-read} DTM_TUD <- rast("data/tud-dtm-5m.tif") DTM_hill_TUD <- rast("data/tud-dtm-5m-hill-WGS84.tif") ``` Next, we will convert each of these datasets to a data frame for plotting with `ggplot`. -```{r} +```{r dtm-hill-df} DTM_TUD_df <- as.data.frame(DTM_TUD, xy = TRUE) DTM_hill_TUD_df <- as.data.frame(DTM_hill_TUD, xy = TRUE) ``` Now we can create a map of the DTM layered over the hillshade. -```{r} +```{r plot-dtm-hill} ggplot() + geom_raster(data = DTM_TUD_df , aes(x = x, y = y, @@ -90,7 +90,7 @@ ggplot() + ``` Our results are curious - neither the DTM (`DTM_TUD_df`) nor the hillshade (`DTM_hill_TUD_df`) are plotted. Let’s try to plot the DTM on its own to make sure the data are there. -```{r} +```{r plot-dtm} ggplot() + geom_raster(data = DTM_TUD_df, aes(x = x, y = y, @@ -103,7 +103,7 @@ Our DTM seems to contain data and plots just fine. Next we plot the DTM Hillshade on its own to see whether everything is OK. -```{r} +```{r plot-hill} ggplot() + geom_raster(data = DTM_hill_TUD_df, aes(x = x, y = y, @@ -115,17 +115,17 @@ If we look at the axes, we can see that the units, and therefore the projections ::: challenge -# Exercise +# Challenge: check the CRS View the CRS for each of these two datasets. What projection does each use? ::: solution -```{r} +```{r dtm-crs} crs(DTM_TUD, parse = TRUE) ``` -```{r} +```{r hill-crs} crs(DTM_hill_TUD, parse = TRUE) ``` @@ -159,19 +159,19 @@ The syntax is `project(RasterObject, crs)` We want the CRS of our hillshade to match the `DTM_TUD` raster. We can thus assign the CRS of our `DTM_TUD` to our hillshade within the `project()` function as follows: `crs(DTM_TUD)`. Note that we are using the `project()` function on the raster object, not the `data.frame()` we use for plotting with `ggplot`. First we will reproject our `DTM_hill_TUD` raster data to match the `DTM_TUD` raster CRS: -```{r} +```{r hill-projected} DTM_hill_EPSG28992_TUD <- project(DTM_hill_TUD, crs(DTM_TUD)) ``` Now we can compare the CRS of our original DTM hillshade and our new DTM hillshade, to see how they are different. -```{r} +```{r hill-projected-crs} crs(DTM_hill_EPSG28992_TUD, parse = TRUE) crs(DTM_hill_TUD, parse = TRUE) ``` We can also compare the extent of the two objects. -```{r} +```{r hill-projected-ext} ext(DTM_hill_EPSG28992_TUD) ext(DTM_hill_TUD) ``` @@ -180,34 +180,34 @@ ext(DTM_hill_TUD) Let’s next have a look at the resolution of our reprojected hillshade versus our original data. -```{r} +```{r hill-projected-res} res(DTM_hill_EPSG28992_TUD) ``` -```{r} +```{r dtm-res} res(DTM_TUD) ``` These two resolutions are different, but they represent the same data. We can tell R to force our newly reprojected raster to be the same as `DTM_TUD` by adding a line of code `res = res(DTM_TUD)` within the `project()` function. -```{r} +```{r hill-projected-crs-res} DTM_hill_EPSG28992_TUD <- project(DTM_hill_TUD, crs(DTM_TUD), res = res(DTM_TUD)) ``` Now both our resolutions and our CRSs match, so we can plot these two data sets together. Let’s double-check our resolution to be sure: -```{r} +```{r dtm-hill-res} res(DTM_hill_EPSG28992_TUD) res(DTM_TUD) ``` For plotting with `ggplot()`, we will need to create a data frame from our newly reprojected raster. -```{r} +```{r hill-df2} DTM_hill_TUD_2_df <- as.data.frame(DTM_hill_EPSG28992_TUD, xy = TRUE) ``` We can now create a plot of this data. -```{r} +```{r plot-dtm-hill-projected} ggplot() + geom_raster(data = DTM_TUD_df , aes(x = x, y = y, diff --git a/episodes/16-raster-calculations.Rmd b/episodes/16-raster-calculations.Rmd index ada80678..93e1203f 100644 --- a/episodes/16-raster-calculations.Rmd +++ b/episodes/16-raster-calculations.Rmd @@ -60,13 +60,13 @@ Let's say we are interested in mapping the heights of trees and buildings across For this episode, we will use the DTM and DSM data which we already have loaded from previous episodes. We use the `describe()` function to view information about the DTM and DSM data files. -```{r} +```{r dtm-dsm-describe} describe("data/tud-dtm-5m.tif") describe("data/tud-dsm-5m.tif") ``` We have already loaded and worked with these two data files in earlier episodes. Let’s plot them each once more to remind ourselves what this data looks like. First we plot the DTM elevation data: -```{r} +```{r plot-dtm} ggplot() + geom_raster(data = DTM_TUD_df , aes(x = x, y = y, fill = `tud-dtm-5m`)) + @@ -75,7 +75,7 @@ We have already loaded and worked with these two data files in earlier episodes. ``` And then the DSM elevation data: -```{r} +```{r plot-dsm} ggplot() + geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = `tud-dsm-5m`)) + @@ -90,13 +90,13 @@ We can perform raster calculations by subtracting (or adding, multiplying, etc.) Let’s subtract the DTM from the DSM to create a Canopy Height Model. After subtracting, let’s create a data frame so we can plot with `ggplot`. -```{r} +```{r chm} CHM_TUD <- DSM_TUD - DTM_TUD CHM_TUD_df <- as.data.frame(CHM_TUD, xy = TRUE) ``` We can now plot the output CHM. -```{r} +```{r plot-chm} ggplot() + geom_raster(data = CHM_TUD_df , aes(x = x, y = y, fill = `tud-dsm-5m`)) + @@ -105,7 +105,7 @@ We can now plot the output CHM. ``` Let’s have a look at the distribution of values in our newly created Canopy Height Model (CHM). -```{r} +```{r chm-hist} ggplot(CHM_TUD_df) + geom_histogram(aes(`tud-dsm-5m`)) ``` @@ -123,7 +123,7 @@ It is often a good idea to explore the range of values in a raster dataset just ::: solution -```{r} +```{r chm-challenge} min(CHM_TUD_df$`tud-dsm-5m`, na.rm = TRUE) max(CHM_TUD_df$`tud-dsm-5m`, na.rm = TRUE) @@ -168,7 +168,7 @@ Now that we have created a new raster, let’s export the data as a GeoTIFF file When we write this raster object to a GeoTIFF file we name it `CHM_TUD.tiff`. This name allows us to quickly remember both what the data contains (CHM data) and for where (TU Delft campus and surroundings). The `writeRaster()` function by default writes the output file to your working directory unless you specify a full file path. We will specify the output format `"GTiff"` and tell R to overwrite any data that is already in a file of the same name. -```{r} +```{r chm-write} writeRaster(CHM_TUD, "fig/CHM_TUD.tiff", filetype = "GTiff", overwrite = TRUE) diff --git a/episodes/17-work-with-multi-band-rasters.Rmd b/episodes/17-work-with-multi-band-rasters.Rmd index dc74dd3f..8933ae6d 100644 --- a/episodes/17-work-with-multi-band-rasters.Rmd +++ b/episodes/17-work-with-multi-band-rasters.Rmd @@ -51,12 +51,12 @@ The multi-band data that we are working with is [Beeldmateriaal Open Data](https By using the `rast()` function along with the `lyrs` parameter, we can read specific raster bands; omitting this parameter would read instead all bands. We specify which band we want with `lyrs = N` (N represents the band number we want to work with). To import the red band, we would use `lyrs = 1`. -```{r} +```{r rgb-band1} RGB_band1_TUD <- rast("data/tudlib-rgb.tif", lyrs = 1) ``` We need to convert this data to a data frame in order to plot it with `ggplot`. -```{r} +```{r plot-rgb-band1} RGB_band1_TUD_df <- as.data.frame(RGB_band1_TUD, xy = TRUE) # Plotting this can take up to 1 minute @@ -75,37 +75,37 @@ This raster contains values between 0 and 255. These values represent degrees of Next, we will work with all three image bands (red, green and blue) as an R raster object. We will then plot a 3-band composite, or full-colour, image. To bring in all bands of a multi-band raster, we use the `rast()` function without specifying a `lyrs` value. -```{r} +```{r rgb-stack} RGB_stack_TUD <- rast("data/tudlib-rgb.tif") ``` Let’s preview the attributes of our stack object: -```{r} +```{r rgb-stack-examine} RGB_stack_TUD ``` We can view the attributes of each band in the stack in a single output. For example, if we had hundreds of bands, we could specify which band we would like to view attributes for using an index value: -```{r} +```{r rgb-stack-index} RGB_stack_TUD[[2]] ``` We can also use `ggplot2` to plot the data in any layer of our raster object. Remember, we need to convert to a data frame first. -```{r} +```{r rgb-stack-df} RGB_stack_TUD_df <- as.data.frame(RGB_stack_TUD, xy = TRUE) ``` Each band in our raster stack gets its own column in the data frame. Thus we have: -```{r} +```{r rgb-stack-df-str} str(RGB_stack_TUD_df) ``` Let’s create a histogram of the first band: -```{r} +```{r plot-rgb-band1-hist} ggplot() + geom_histogram(data = RGB_stack_TUD_df, aes(`tudlib-rgb_1`)) ``` And a raster plot of the second band: -```{r} +```{r plot-rgb-band2} # Plotting this can take up to 1 minute ggplot() + geom_raster(data = RGB_stack_TUD_df, @@ -126,7 +126,7 @@ This function allows us to: Let’s plot our 3-band image. Note that we can use the `plotRGB()` function directly with our raster stack object (we do not need a data frame as this function is not part of the `ggplot2` package). -```{r} +```{r plot-rgb} plotRGB(RGB_stack_TUD, r = 1, g = 2, b = 3) ``` From 51011a4f18616c3fc0438164baa8c09525279797 Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci <33600128+cforgaci@users.noreply.github.com> Date: Mon, 20 May 2024 18:06:17 +0200 Subject: [PATCH 5/7] Apply Ignacio's suggestions from code review Co-authored-by: iurriayanez <152204909+iurriayanez@users.noreply.github.com> --- episodes/13-intro-to-raster-data.Rmd | 2 +- episodes/17-work-with-multi-band-rasters.Rmd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/episodes/13-intro-to-raster-data.Rmd b/episodes/13-intro-to-raster-data.Rmd index eb8d6c49..2469559d 100644 --- a/episodes/13-intro-to-raster-data.Rmd +++ b/episodes/13-intro-to-raster-data.Rmd @@ -94,7 +94,7 @@ summary(DSM_TUD) This output gives us information about the range of values in the DSM. We can see, for instance, that the lowest elevation is `-5.2235`, the highest is `89.7838`. But note the warning. Unless you force R to calculate these statistics using every cell in the raster, it will take a random sample of 100,000 cells and calculate from them instead. To force calculation all the values, you can use the function `values`: -```{r dsm-summary2} +```{r dsm-summary-values} summary(values(DSM_TUD)) ``` diff --git a/episodes/17-work-with-multi-band-rasters.Rmd b/episodes/17-work-with-multi-band-rasters.Rmd index 8933ae6d..19d3814a 100644 --- a/episodes/17-work-with-multi-band-rasters.Rmd +++ b/episodes/17-work-with-multi-band-rasters.Rmd @@ -121,7 +121,7 @@ To render a final three-band, coloured image in R, we use the `plotRGB()` functi This function allows us to: -- Identify what bands we want to render in the red, green and blue regions. The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band order. However, you can define what bands you would like to plot manually. Manual definition of bands is useful if you have, for example a near-infrared band and want to create a colour infrared image. +- Identify what bands we want to render in the red, green and blue regions. The `plotRGB()` function defaults to a 1=red, 2=green, and 3=blue band order. However, you can define what bands you would like to plot manually. Manual definition of bands is useful if you have, for example, a near-infrared band and want to create a colour infrared image. - Adjust the stretch of the image to increase or decrease contrast. Let’s plot our 3-band image. Note that we can use the `plotRGB()` function directly with our raster stack object (we do not need a data frame as this function is not part of the `ggplot2` package). From 8f5e087b36f0ceb234e60e3a293b5f628706be1a Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci <33600128+cforgaci@users.noreply.github.com> Date: Tue, 21 May 2024 08:37:47 +0200 Subject: [PATCH 6/7] Update episodes/13-intro-to-raster-data.Rmd --- episodes/13-intro-to-raster-data.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/13-intro-to-raster-data.Rmd b/episodes/13-intro-to-raster-data.Rmd index 2469559d..b6adfc4d 100644 --- a/episodes/13-intro-to-raster-data.Rmd +++ b/episodes/13-intro-to-raster-data.Rmd @@ -38,7 +38,7 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. -This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before attaching it with `library(terra)`. +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. ::: From 9bb22178c93ad52b7a5e15c6c5b9f2ce5589e95c Mon Sep 17 00:00:00 2001 From: Claudiu Forgaci <33600128+cforgaci@users.noreply.github.com> Date: Tue, 21 May 2024 08:39:11 +0200 Subject: [PATCH 7/7] Update episodes/14-plot-raster-data.Rmd --- episodes/14-plot-raster-data.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/14-plot-raster-data.Rmd b/episodes/14-plot-raster-data.Rmd index 5f773be2..ac72fbab 100644 --- a/episodes/14-plot-raster-data.Rmd +++ b/episodes/14-plot-raster-data.Rmd @@ -41,7 +41,7 @@ After completing this episode, participants should be able to… See the [setup instructions](../learners/setup.md) for detailed information about the software, data, and other prerequisites you will need to work through the examples in this episode. -This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before attaching it with `library(terra)`. +This lesson uses the `terra` package in particular. If you have not installed it yet, do so by running `install.packages("terra")` before loading it with `library(terra)`. :::