---
title: Network analysis (R)
output: html_document
editor_options:
  markdown:
    wrap: 72
---
# Network analysis
Network analysis focuses on seeing the world via nodes and ties between
nodes. It can be used for various purposes, ranging from [textual
data](https://sicss.io/2020/materials/day3-text-analysis/text-networks/rmarkdown/Text_Networks.html)
to analysis of [sexual relationships in a high
school](https://people.duke.edu/~jmoody77/chains.pdf).

Now we learn how to

- read data in as a network
- visualise the network
- compute various network characteristics

For this, we use the package [igraph](https://igraph.org/r/).

[Narrated code walkthrough](https://www.youtube.com/watch?v=lXC2egj8lmY)

```{r}
library(igraph)
```
```{r}
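# Read the adjacency matrix; the file has no header row
data <- read.csv( 'data/org_x_collaboration.csv', header = FALSE )
# Named `adjacency` rather than `matrix` to avoid confusion with base R's matrix()
adjacency <- as.matrix( data )
# Build a directed network from the adjacency matrix
network <- graph_from_adjacency_matrix( adjacency, mode = 'directed' )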
```
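
To make the expected input concrete, here is a minimal sketch with a made-up three-node adjacency matrix. The names `toy_matrix` and `toy_network` and the values are illustrative and are not taken from the course data. With `mode = 'directed'`, a 1 in row i, column j is read as a tie from node i to node j.

```{r}
library(igraph)

# Hypothetical 3 x 3 adjacency matrix: rows are senders, columns are receivers
toy_matrix <- matrix(
  c(0, 1, 1,
    0, 0, 1,
    0, 0, 0),
  nrow = 3, byrow = TRUE
)

toy_network <- graph_from_adjacency_matrix(toy_matrix, mode = 'directed')

vcount(toy_network)       # number of nodes
ecount(toy_network)       # number of ties
as_edgelist(toy_network)  # one row per tie: sender, receiver
```

Checking `vcount()` and `ecount()` right after reading a file is a quick way to confirm that the data were interpreted as intended.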
## Computing network characteristics
Network measures can be computed both at the level of individual nodes
and at the level of the whole network. Below we look at degree, the
number of ties each node has.
```{r}
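# Degree of each node (incoming and outgoing ties combined)
degree( network )
# Summary statistics of the degree distribution
summary( degree( network ) )
# Histogram of the degree distribution
hist( degree( network ) )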
```
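
The difference between node-level and network-level measures can be sketched on a small made-up directed graph (the object `toy` is hypothetical). For directed networks, `degree()` counts incoming and outgoing ties together unless a `mode` is given.

```{r}
library(igraph)

# Hypothetical directed graph with ties 1->2, 1->3, 2->3 and 3->1
toy <- make_graph(c(1, 2, 1, 3, 2, 3, 3, 1), directed = TRUE)

# Node-level measures: one value per node
degree(toy)                # incoming and outgoing ties together
degree(toy, mode = 'in')   # incoming ties only
degree(toy, mode = 'out')  # outgoing ties only

# Network-level measures: one value for the whole network
edge_density(toy)  # share of possible ties that are present
reciprocity(toy)   # share of ties that are returned
```

Which `mode` is appropriate depends on what a tie means in the data, for example whether receiving a tie or sending one is the more interesting quantity.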
## Exercises
- Instead of this ugly histogram, use `ggplot` to draw a nice
  histogram.
- Instead of degree, use the
  [betweenness](https://igraph.org/r/doc/betweenness.html) metric.
- Do degree and betweenness correlate?
- What is the diameter of the network?
## Plotting networks
Network analysis often involves drawing the network. Here we learn how
to tweak the drawing by changing node attributes, tie attributes and
the [layout of the graph](https://igraph.org/r/doc/layout_.html). See
the [documentation](https://igraph.org/r/doc/plot.common.html) for the
various options you can use.
```{r}
plot( network )
```
```{r}
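# Node attributes: white fill, size equal to degree
V(network)$color <- 'white'
V(network)$size <- degree( network )
# Tie attribute: green ties
E(network)$color <- 'green'
# Place the nodes on a circle
circle_layout <- layout_in_circle( network )
plot( network, layout = circle_layout )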
```
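
Using raw degree as node size can make nodes too small or too large to read, depending on the data. One possible workaround, sketched here on a made-up graph, is to rescale degree into a fixed size range. The helper `rescale_size()` and the range 5 to 20 are assumptions for illustration and are not part of igraph.

```{r}
library(igraph)

# Hypothetical helper: map values linearly onto [min_size, max_size]
rescale_size <- function(x, min_size = 5, max_size = 20) {
  if (max(x) == min(x)) {
    # All values equal: use the midpoint to avoid dividing by zero
    return(rep((min_size + max_size) / 2, length(x)))
  }
  min_size + (x - min(x)) / (max(x) - min(x)) * (max_size - min_size)
}

# Made-up star graph: one hub tied to five other nodes
toy <- make_star(6, mode = 'undirected')

# The hub gets the largest size, the other nodes the smallest
V(toy)$size <- rescale_size(degree(toy))
plot(toy, layout = layout_in_circle(toy), vertex.color = 'white')
```

A linear rescaling keeps the ordering of nodes by degree while keeping every node visible; other mappings, such as a square root, may suit skewed degree distributions better.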
## Exercises
- Draw a network which you would use to describe the outcomes to the
organisation.
- Examine different layout algorithms and see how different they are.
## Community detection
Beyond visualisations, one can try to find communities (or clusters) of
more densely connected nodes in the network. There are various
community detection algorithms (see, for example, [the igraph
documentation](https://www.rdocumentation.org/packages/igraph/versions/1.3.5/topics/membership)),
and they produce somewhat different outcomes.
```{r}
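# Find the partition with maximal modularity (exact, so only practical for small networks)
communities <- cluster_optimal( network )
# Store each node's community id as a node attribute
V(network)$community <- membership( communities )
print( V(network)$community )
# Colour nodes by community and size them by degree
V(network)$color <- V(network)$community
V(network)$size <- degree( network )
E(network)$color <- 'green'
circle_layout <- layout_in_circle( network )
plot( network, layout = circle_layout )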
```
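
Whichever algorithm is used, the result can be inspected in the same way. The sketch below uses a made-up undirected graph (`toy`, two triangles joined by one tie) and swaps in `cluster_fast_greedy()`, a greedy modularity heuristic, in place of `cluster_optimal()`. The swap is a choice made for this illustration: the exact method can be slow on larger networks and depends on igraph having been built with GLPK support.

```{r}
library(igraph)

# Made-up undirected graph: two triangles joined by a single tie (3-4)
toy <- make_graph(c(1, 2, 2, 3, 1, 3,
                    4, 5, 5, 6, 4, 6,
                    3, 4), directed = FALSE)

# Greedy modularity optimisation, used here instead of cluster_optimal()
toy_communities <- cluster_fast_greedy(toy)

membership(toy_communities)  # community id of each node
sizes(toy_communities)       # number of nodes in each community
modularity(toy_communities)  # modularity score of the partition found
```

Comparing `sizes()` and `modularity()` across algorithms gives a first impression of how much their results differ on the same network.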
## Exercises
- Try out different clustering approaches: what kinds of insights do
  they provide into the data?