-
Notifications
You must be signed in to change notification settings - Fork 9
/
01-introduction.Rmd
129 lines (72 loc) · 8.54 KB
/
01-introduction.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
\mainmatter
```{r, include=FALSE}
knitr::opts_chunk$set(out.width = "100%")
```
# (PART) Foundations {-}
# Introduction: GIS ideas and concepts
## Learning Objectives
* Gain a basic understanding of what a GIS is and its relevance to transport studies
* Know what some of the key data types and file formats
* Import pre-cleaned vector data into QGIS and navigate basic QGIS features
* Select a subset of the data
* Make a map showing point, line and polygon features and export it.
* Understand the key cartographic elements that must be present to allow interpretation of a map
In this part you will learn about the following key concepts:
* GIS ideas and concepts - we very briefly introduce some key GIS concepts and ideas.
* Opening QGIS and essential features - We then explain what the main components of QGIS are, explain how to set up a project and save work and what some of the most useful buttons do
* Downloading and loading data - we show you how to get some data we have prepared for the workshop and load it into QGIS
* Style and select features - change the way features look on the screen
* Making maps - Make a map which you can export and include in a document
## What is GIS?
GIS is an abbreviation. Sometimes it is used to mean **Geographic Information System** and sometimes it is used to mean **Geographic Information Science**. A more formal definition of a Geographical Information System is:
>A powerful set of tools for collecting, storing, retrieving at will, transforming and >displaying spatial data from the real world
(Burrough, 1986)
The emphasis here is on pieces of software like QGIS. In practical terms, a Geographic Information System is a database to store data, a calculator which can manipulate and analyse data, a visualisation window to show results. GIS software like QGIS also has a Graphical User Interface (GUI) with menus and buttons which lets the user 'do stuff'.
Part 1 focusses the basic features of the Geographic Information System application called QGIS.
A more formal definition of Geographical Information Science is given by
Goodchild, (1992) who argues that Geographical Information Science involves research that investigates spatial data acquisition, spatial statistics, modelling and theories of spatial data, development of analytical tools and consideration of the management and ethics of working with spatial data.
As an introduction, the Wikipedia entry contains a lot of information. In the further reading section, there are references to crucial GIS textbooks for a more formal treatment of key GIS concepts.
https://en.wikipedia.org/wiki/Geographic_information_science
GIS in both of the systems and science contexts described above is vital for anyone studying transport because it is an inherently spatial topic of study. Data used in the study of transport can be visualised and analysed using GIS. Learning how to use GIS software like QGIS is a handy skill in its own right. It is also an advantageous start point for people wanting to learn and apply GIScience.
There are other terms you may come across, including "spatial data analytics". The emphasis here is on a branch of science at the nexus of statistics, computer science and quantitative geography. You may also come across the term "geo-computation". Geocomputation is associated with computational methods that have been customised to address the unique characteristics of spatial data ( https://dx.doi.org/10.4135/9780857024442.d64). There are technical academic differences between these terms, but in practice, they get used interchangeably. This range of terms might sound confusing, but they are all related.
## Free and Open Source Software
QGIS is free to download and use. The university has a licence for commercial GIS software (e.g. ArcGIS), but we are aware that not all students will have access to commercial software after completing the course. QGIS is becoming popular with many organisations. For more information about FOSS see: https://en.wikipedia.org/wiki/Free_and_open-source_software
## Vector data
Vector data represents the features of the world as either 'points' 'lines' or 'areas' (also called polygons).
Each type of feature is displayed in the GIS as a distinct layer. A layer will only contain either points, or lines, or polygons. It is also good practice to have different layers for different types of line features. For example, it is good to have a layer for roads and another for rivers.
Maps can be made of several layers of vector data as seen here:
https://upload.wikimedia.org/wikipedia/commons/3/3b/USGS_The_National_Map.jpg
## Raster data:
'In its simplest form, a raster consists of a matrix of cells (or pixels) organised into rows and columns (or a grid) where each cell contains a value representing information, such as temperature. Rasters are digital aerial photographs, imagery from satellites, digital pictures, or even scanned maps'. (
webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=What_is_raster_data%3F)
(source: http://support.esri.com/other-resources/gis-dictionary/term/raster)
## Network data
A network dataset takes a line dataset and defines its topology explicitly. Defining topology means having data tables that explicitly lists which lines are connected and at which nodes.
In its simplest form, this means recording the connections between the ends of different lines. Data in this form is often called a graph as the branch of mathematics that deals with networks is called [graph theory](https://en.wikipedia.org/wiki/Graph_theory).
## Data formats
Data formats refer to how data stored in a GIS and on your computer.
QGIS can handle a vast number of different data formats.
In today's exercises, we will start by using use a file format called "Shapefile" as it is a prevalent type of vector GIS data format. The Shapefile format was developed by ESRI who make the commercial GIS software ArcGIS (ArcGIS is also available on university machine).
The Shapefile format appears on your computer as several files which may seem a little confusing. Each file contains a different type of information that the GIS needs to represent the spatial data.
NB. If you ever want to share a shapefile with someone, you have to send the whole group of files.
```{r shp,echo=FALSE, fig.cap="A shapefile is a group of files. It is a common file format"}
knitr::include_graphics("figures/Shapefile.jpg")
```
While shapefiles are still used, they have many limitations. Resulting in a [campaign against shapefiles](http://switchfromshapefile.org/).
There are other types of GIS files, including GeoJSON and geo-package formats. Data is stored in GIS as database tables.
GIS data can be stored and shared in a very large number of ways. (For an introduction see https://en.wikipedia.org/wiki/GIS_file_formats as well as the further reading in the appendix.
## Projections and coordinate systems
The world is not flat, but computer computer screens are.
Coordinate systems, also known as coordinate reference systems (CRS), allow GIS to represent the curved surface of the Earth on a flat screen or page.
```{r qgis-window-projection, echo=FALSE,fig.cap="How do we fit a curving earth onto a flat piece of paper"}
knitr::include_graphics("figures/Projection_mjfoster.png")
```
Image source comes from this presentation: http://mjfoster83.github.io/projections/#/5 .
The projection is a mathematical formula explaining how places in the real world which are on a near spherical globe can be represented on a flat map. More information on projections and coordinate systems can be found in GIS test books.
Coordinate Reference Systems (CRS) refer to different ways of defining the X and Y coordinates used in different projections. Broadly they fall into two categories:
* Geographical Coordinate Systems: use latitude and longitude to represent any place on the Earth
* Projected Coordinate Systems: use distances from an origin point to represent a small part of the Earth, e.g. a country. The advantage of a projects CRS is that it is easier to calculate properties such as distance and area as coordinates are in metres.
You can find a catalogue of different CRSs at http://spatialreference.org/
CRSs are often referred to by the EPSG number. The European Petroleum Survey Group publish a database of different coordinate systems. Two useful projections to commit to memory are:
* 4326 - the World Geodetic System 1984 which is a widely used geographical coordinate system, used in GPS datasets and the .geojson file format, for example.
* 27700 - the British National Grid