---
title: "6 Nations, 3 Games - Data Analysis and Visualisation of Women’s Rugby using R"
author:
- name: Denise O'Sullivan (Data Analyst)
- name: Dana Smith (Data Analyst)
- name: Paola Vercesi (Data Analyst)
- name: Denise Earle (Project Supervisor)
output: github_document
---
```{r library load, message = FALSE, include = FALSE}
library(tidyverse)
library(RColorBrewer)
library(readxl)
library(fmsb) # For radar charts
library(ggrepel) # For clearer visualisations
library(ggiraph) # For interactive radar charts
library(ggiraphExtra)
library(devtools)
require(plyr)
require(reshape2)
require(moonBook)
require(sjmisc)
library(ggpubr)
```
```{r import datasets, message = FALSE, include = FALSE}
turnovers <- read_csv('Turnover Dataset.csv')
team_data_2021 <- read_excel('Match Results & Team Stats.xlsx', sheet='Team Level Stats - 2021')
data_dictionary <- read_excel('Match Results & Team Stats.xlsx',range='Dictionary!A1:B30')
historic_points <- read_csv("previous_results.csv")
rank <- read_csv("6N_rankings_performance.csv")
```
# Introduction
In a “normal” world, the Women’s Rugby World Cup (RWC) 2021 would have started in New Zealand in October 2021. However, due to the COVID-19 pandemic, the tournament has been rescheduled to October 2022, with a few spots still open for national teams across continents.
Representing Europe so far are England, France and Wales, who won their spots through direct qualification following their top-7 placement in the previous RWC. One more place representing Europe is still to be assigned to the best team that will emerge from a qualification round-robin tournament among Ireland, Italy, Scotland and Spain.
All seven teams mentioned above compete, or have competed at some point, in the European “Rugby’s Greatest Championship”, the Women’s Six Nations (W6N).
Like the men’s tournament, the competition started as a Home Nations Championship, first played in 1996 between Ireland, England, Scotland and Wales. It was extended to France in 1999, and the W6N was born in 2001 with the addition of Spain. After five editions, in 2006 Italy won their spot in the championship on merit, having overtaken Spain in the Women’s World Ranking (WWR).
The goal of this project was to showcase how R/RStudio can be used to carry out in-depth exploration and visualisation of W6N match performance indicators and, in doing so, to identify potential strengths and weaknesses in Ireland's playing style that could be used to improve their chances of qualifying for the Women's Rugby World Cup in 2022.
## Data
To collect data for the project, we used the tool [YouTubeCoder by FC Python](https://fcpythonvideocoder.netlify.app/?pitch=rugby). This tool involves inputting the YouTube URL of a sports game and noting when specific events occur using the pitch marker. With this tool, you can select which type of pitch the game will be played on. There are buttons for each player and each event that you use to mark the pitch to record details of the events, including the location on the pitch where the event occurred. When the match is over, the data recorded can then be exported into a CSV file that can then be imported into R.
We also sourced data from the following websites:
* Team and match statistics from the [Six Nations website](https://www.sixnationsrugby.com/women/statistics-2021/) in order to compare and contrast each competing team's losses and achievements.
* Match results from the [Livesport Website](https://www.livesport.com/en/rugby-union/europe/six-nations-women-2019/results/).
* Women's World Ranking data came from the [World Rugby website](https://www.world.rugby/tournaments/rankings/wru).
## Data Analysis Software
All analysis was carried out exclusively in RStudio. The primary data manipulation package used was dplyr, with ggplot2 the primary data visualisation package. Various other packages were also used. The final report was then written using R Markdown and published via [RPubs](https://rpubs.com/).
# Women's World Rankings
The purpose of this section is to provide some background and context to each team's playing ability through a detailed analysis of their World Rankings and overall performance in the 2018, 2019, 2020 and 2021 W6N tournaments.
Calculations to maintain the WWR are complex and take a number of factors into consideration, as explained on the [World Rugby website](https://www.world.rugby/tournaments/rankings/explanation).
However, W6N championship performance and WWR seem to be more closely linked than they are for the Men’s Six Nations.
We wanted to investigate how a team’s ranking and performance metrics correlated.
```{r data manipulation, message = FALSE, echo = FALSE}
historic_points <- dplyr::rename(historic_points,
home_team = "Home Team",
away_team = "Away Team",
home_points = "Home Points",
away_points = "Away Points")
# England historic stats
## Home points for/against
ENG_hp <- historic_points %>%
filter(home_team == "England") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
ENG_ap <- historic_points %>%
filter(away_team == "England") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
ENG_jointstats <- inner_join(ENG_hp, ENG_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
ENG_jointstats$Team <- "England"
# France historic stats
## Home points for/against
FRA_hp <- historic_points %>%
filter(home_team == "France") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
FRA_ap <- historic_points %>%
filter(away_team == "France") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
FRA_jointstats <- inner_join(FRA_hp, FRA_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
FRA_jointstats$Team <- "France"
# Ireland historic stats
## Home points for/against
IRE_hp <- historic_points %>%
filter(home_team == "Ireland") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
IRE_ap <- historic_points %>%
filter(away_team == "Ireland") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
IRE_jointstats <- inner_join(IRE_hp, IRE_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
IRE_jointstats$Team <- "Ireland"
# Italy historic stats
## Home points for/against
ITA_hp <- historic_points %>%
filter(home_team == "Italy") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
ITA_ap <- historic_points %>%
filter(away_team == "Italy") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
ITA_jointstats <- inner_join(ITA_hp, ITA_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
ITA_jointstats$Team <- "Italy"
# Scotland historic stats
## Home points for/against
SCO_hp <- historic_points %>%
filter(home_team == "Scotland") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
SCO_ap <- historic_points %>%
filter(away_team == "Scotland") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
SCO_jointstats <- inner_join(SCO_hp, SCO_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
SCO_jointstats$Team <- "Scotland"
# Wales historic stats
## Home points for/against
WAL_hp <- historic_points %>%
filter(home_team == "Wales") %>%
group_by(Year) %>%
dplyr::summarise(home_points_for = sum(home_points),
home_points_against = sum(away_points))
## Away points for/against
WAL_ap <- historic_points %>%
filter(away_team == "Wales") %>%
group_by(Year) %>%
dplyr::summarise(away_points_against = sum(home_points),
away_points_for = sum(away_points))
## Collate
WAL_jointstats <- inner_join(WAL_hp, WAL_ap) %>%
group_by(Year) %>%
dplyr::summarise(points_for = sum(home_points_for,
away_points_for),
points_against = sum(home_points_against,
away_points_against))
WAL_jointstats$Team <- "Wales"
# Combine into dataset
ds_histpoints <- bind_rows(ENG_jointstats,
FRA_jointstats,
IRE_jointstats,
ITA_jointstats,
SCO_jointstats,
WAL_jointstats) %>%
mutate(diff_points = points_for - points_against, .before = "Team") %>%
arrange(Year)
```
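The six per-team blocks in the chunk above all follow the same pattern, so the same totals could be sketched more compactly by stacking a "home view" and an "away view" of every match and aggregating once. The following is a minimal base-R sketch, not the code used for the report: the `matches` data frame and its scores are invented for illustration, and only the column names mirror the renamed `historic_points` columns.

```r
# Invented scores, for illustration only (not taken from previous_results.csv)
matches <- data.frame(
  Year        = c(2020, 2020, 2021),
  home_team   = c("England", "France", "Ireland"),
  away_team   = c("France", "Ireland", "England"),
  home_points = c(20, 15, 10),
  away_points = c(10, 5, 30)
)

# One row per team per match: first from the home side, then from the away side
long <- rbind(
  data.frame(Year = matches$Year, Team = matches$home_team,
             points_for = matches$home_points,
             points_against = matches$away_points),
  data.frame(Year = matches$Year, Team = matches$away_team,
             points_for = matches$away_points,
             points_against = matches$home_points)
)

# Sum points for and against per team and year, then add the points difference
totals <- aggregate(cbind(points_for, points_against) ~ Year + Team,
                    data = long, FUN = sum)
totals$diff_points <- totals$points_for - totals$points_against
totals <- totals[order(totals$Year, totals$Team), ]
totals
```

One difference worth noting: this version keeps a team-year that has only home or only away matches, whereas an `inner_join()` of separate home and away summaries would drop it.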
Overall team performance in W6N 2018 - 2021 (see the line graph on the left below) shows at first glance that England performed consistently at the top, finishing second only once, when France took first place. At the other end is Scotland, who always finished the tournament in fifth or sixth position. In between, the performances of the remaining four teams fluctuate from year to year.
When each team's WWR before and after each W6N tournament is considered, however, the line graph on the right below shows that the nations compete on three (possibly four) levels. England sits at the top of the WWR almost regardless of the tournament’s outcome, France enjoys similar safety a couple of ranking positions below, and Scotland is consistently the lowest-ranked of the tournament participants. Ireland, Italy and Wales are the teams really competing for a better WWR placement year after year.
```{r ranking logic, fig.show="hold", fig.dim = c(10, 6), echo = FALSE}
rank <- mutate (rank, Tot_Win = Home_Win + Away_Win,
WR_Change = WR_after_6N - WR_before_6N) %>%
inner_join(ds_histpoints, by = c("Year", "Team"))
# Overall performance
perf_lines <- ggplot(data = rank) +
geom_line(mapping = aes(x = Year, y = Performance_6N, colour = Team), size = 1.5) +
geom_point(aes(x = Year, y = Performance_6N, colour = Team), size = 3) +
scale_colour_manual(values=c("#A0A0A0", "dodgerblue2", "seagreen", "#00CCCC", "#000066", "firebrick3")) +
scale_y_reverse(breaks = c(6, 5, 4, 3, 2, 1)) +
labs(title = "Team performance at Women's 6 Nations \n2018 - 2021",
x = "Tournament Year", y = "W6N Performance") +
theme_classic() +
theme(plot.title = element_text(size=15),
axis.title = element_text(size=14),
axis.text = element_text(size=12),
legend.text = element_text(size=14),
legend.title = element_text(size=16))
# World Ranking after 6 Nations
WR_lines <- ggplot(data = rank) +
geom_line(mapping = aes(x = Year, y = WR_before_6N, colour = Team), linetype = "dashed", size = 1.5) +
geom_point(aes(x = Year, y = WR_before_6N, colour = Team), size = 3) +
geom_line(mapping = aes(x = Year, y = WR_after_6N, colour = Team), size = 1.5) +
geom_point(aes(x = Year, y = WR_after_6N, colour = Team), size = 3) +
scale_colour_manual(values=c("#A0A0A0", "dodgerblue2", "seagreen", "#00CCCC", "#000066", "firebrick3")) +
scale_y_reverse(breaks = c(12, 10, 8, 6, 4, 2)) +
labs(title = "World Ranking before/after Women's 6 Nations \n2018-2021",
subtitle="Legend: Dashed line: WR before W6N
Solid line: WR after W6N",
x = "Tournament Year", y = "World Ranking") +
theme_classic() +
theme(plot.title = element_text(size=15),
plot.subtitle = element_text(size=13),
axis.title = element_text(size=14),
axis.text = element_text(size=12))
ggarrange(perf_lines, WR_lines, common.legend = TRUE)
```
When performance and ranking statistics are combined for each participating team (see plots below), we can also observe that the outcome of the tournament had no effect on England’s ranking in any of the years considered. France dropped one ranking place in 2019 after falling from first to fourth in the W6N, while ranking fluctuations were more evident for the remaining nations. Wales in particular saw their ranking change after every W6N and share with Ireland the largest fluctuation, with two ranking spots gained or lost.
```{r, fig.align='center', fig.dim = c(9, 6), echo = FALSE}
# Plot combined stats
combined_stats <- ggplot(data = rank) +
geom_line(mapping = aes(x = Year, y = Performance_6N, size = 1.5, colour = Team)) +
scale_colour_manual(values=c("#A0A0A0", "dodgerblue2", "seagreen", "#00CCCC", "#000066", "firebrick3")) +
geom_point(aes(x = Year, y = WR_before_6N, size = 2)) +
geom_point(aes(x = Year, y = WR_after_6N, size = 1.6, colour = Team)) +
scale_y_reverse(breaks = c(12, 10, 8, 6, 4, 2)) +
facet_wrap ( ~ Team, nrow = 2) +
theme(legend.position ="none") +
labs(title = "Team performance at Women's 6 Nations 2018 - 2021 and World Ranking before/after each Tournament
Legend: Team colour line: W6N Performance
Black dot: WR before W6N
Team colour dot: WR after W6N",
x = "Tournament Year", y = "W6N Performance & World Ranking")
combined_stats
```
Considering that the W6N is the most important tournament of the year for these national teams, apart from the World Championship every four years, can we explain the different effect that a team’s performance has on their ranking after the tournament?
We looked at two important components of each team’s performance, total wins (home wins + away wins) and points difference (points for – points against), as two separate predictors of their WWR after the W6N championship. However, neither variable seems to offer a good answer to our question: in both linear regression models below, Ireland and Scotland escape prediction and appear consistently “under-ranked”. At the opposite end, Italy looks “over-ranked”, especially when points difference is considered. The points difference model does, however, fit the WWR assigned to France and Wales well.
```{r performance/ranking correlations, fig.show="hold", out.width="50%", message = FALSE, echo = FALSE, warning = FALSE}
# Assess correlation between W6N performance and World Ranking
perform_corr <- ggplot(data = rank) +
geom_jitter(mapping = aes(x = Tot_Win, y = WR_after_6N, colour = Team),
size = 3, width = 0.1, height = 0.1) +
scale_colour_manual(values = c("#A0A0A0", "dodgerblue2", "seagreen", "#00CCCC", "#000066", "firebrick3")) +
scale_y_reverse(breaks = c(12, 10, 8, 6, 4, 2)) +
geom_text_repel(mapping = aes(x = Tot_Win, y = WR_after_6N, label = Year)) +
labs(
title = "Can wins at Women's 6 Nations explain World Ranking?",
x = "Tournament Wins", y = "World Ranking after W6N") +
theme_classic()
perform_corr +
geom_smooth(mapping = aes(x = Tot_Win, y = WR_after_6N),
method = "lm")
point_corr <- ggplot(data = rank) +
geom_jitter(mapping = aes(x = diff_points, y = WR_after_6N, colour = Team),
size = 3, width = 0.1, height = 0.1) +
scale_colour_manual(values = c("#A0A0A0", "dodgerblue2", "seagreen", "#00CCCC", "#000066", "firebrick3")) +
scale_y_reverse(breaks = c(12, 10, 8, 6, 4, 2)) +
geom_text_repel(mapping = aes(x = diff_points, y = WR_after_6N, label = Year)) +
labs(
title = "Can points difference at Women's 6 Nations explain World Ranking?",
x = "Points Difference", y = "World Ranking after W6N") +
theme_classic()
point_corr +
geom_smooth(mapping = aes(x = diff_points, y = WR_after_6N),
method = "lm")
```
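The “under-ranked” and “over-ranked” readings of the plots above can be made explicit by inspecting the residuals of the fitted line. The sketch below is illustrative only: the `toy` data frame uses invented values and placeholder team names, and it assumes that “under-ranked” means “ranked worse than the model predicts”, which is our reading of the term rather than a formal definition.

```r
# Invented values, for illustration only (not the report's ranking data)
toy <- data.frame(
  Team        = c("A", "B", "C", "D", "E", "F"),
  Tot_Win     = c(5, 4, 3, 2, 1, 0),
  WR_after_6N = c(1, 4, 9, 6, 7, 11)
)

# Fit the same kind of straight line that geom_smooth(method = "lm") draws
fit <- lm(WR_after_6N ~ Tot_Win, data = toy)

# Residual = actual ranking number minus predicted ranking number.
# A larger ranking number is a worse ranking, so a positive residual means the
# team is ranked worse than its wins would predict ("under-ranked") and a
# negative residual means it is ranked better ("over-ranked").
toy$residual <- residuals(fit)
toy$reading  <- ifelse(toy$residual > 0, "under-ranked", "over-ranked")

# Largest positive residuals first
toy[order(-toy$residual), ]
```

With only a handful of points per team, residuals like these are descriptive rather than evidence of a systematic effect.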
We also explored each Nation’s home/away win level and points for/points against level by visualising these components of each team’s performance across the years using the radar charts below. Note that variables were normalised in order to offer a comparable view as opposed to what would have been produced if we used absolute numbers with different scales and ranges for each variable.
The interactive radar charts derived from this analysis show that England is consistent in all four KPIs, minimising points against and maximising the other three factors. France is also strong across the factors considered, apart from a large slip in 2019 when the team recorded a peak in points against. Ireland and Italy have more mixed results; however, one notable feature of Ireland’s performance is the lack of away wins in 2018 and 2020. By contrast, away wins are a low but consistent presence in Italy’s radar chart.
Scotland spikes almost exclusively in the direction of points against, with occasional away wins in 2018 and home wins in 2021.
With a peak of away wins in 2019 and occasional home wins in 2018 and 2019, Wales's charts suggest that low scoring is an issue (minimal points for) and that they struggle in defence (overlapping peaks of points against across the four years).
```{r W6N performance interactive radar charts, echo = FALSE}
# interactive radar chart of Team KPIs over years considered
Team_by_Year <- rank %>%
select(Year, Team,
away_win = Away_Win, home_win = Home_Win,
p_against = points_against, p_for = points_for) %>%
group_by(Team)
ggRadar(data = Team_by_Year,
aes(colour = Year, facet = Team),
rescale = TRUE, interactive = TRUE)
#Possible to reduce font size, but requires removing interactive element.
#ggRadar(data = Team_by_Year, aes(colour = Year, facet = Team), rescale = TRUE) +
# theme(text = element_text(size = 8))
```
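The normalisation mentioned above can be pictured as a min-max rescale, which maps each variable onto a common 0 to 1 range so that wins and points can share one set of axes. The sketch below assumes that this is the kind of rescaling applied by `rescale = TRUE`; the exact method used inside `ggRadar()` is not confirmed here, and `rescale01()` and the `kpis` values are hypothetical.

```r
# Hypothetical helper: min-max rescaling, smallest value -> 0, largest -> 1
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  # A constant column has no spread to rescale; 0.5 is an arbitrary midpoint
  if (rng[1] == rng[2]) return(rep(0.5, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Invented values, for illustration only
kpis <- data.frame(home_win = c(0, 1, 3), p_for = c(40, 95, 150))

# Rescale every column independently
kpis_scaled <- as.data.frame(lapply(kpis, rescale01))
kpis_scaled
```

After rescaling, a value of 1 only means "the highest value observed for that variable", so the charts compare shapes rather than absolute numbers.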
# Women's Six Nations 2021
This section carries out data exploration and data visualisation of the 2021 Women's Six Nations Tournament. We begin by analysing each team's overall performance before deep-diving into Ireland's performance in the three matches they played against Italy, Wales and France.
## A Comparison of Team Performance Indicators
This section provides an overview of each team's overall performance in the 2021 W6N. Several performance indicators are available from the [Six Nations website](https://www.sixnationsrugby.com/women/statistics-2021/).
We used radar charts to compare the following performance indicators (see definitions at the [Ruck website](https://www.ruck.co.uk/rugby-glossary-a-dictionary-of-rugby-terms/)) for Ireland, Wales, Italy and France (note that these teams were chosen because Ireland played each of these teams in the 2021 W6N).
* Tackles Made
* Knock Ons
* Lineouts Stolen
* Lineouts Won
* Penalties Conceded
* Turnovers Conceded
* Turnovers Won in the Tackle
* Turnovers Won
```{r, fig.show="hold", out.width="50%", echo = FALSE}
maximum_value = team_data_2021 %>%
dplyr::summarise(dplyr::across(PF:RC, ~ max(.x, na.rm = TRUE)))
minimum_value = team_data_2021 %>%
dplyr::summarise(across(PF:RC, ~ min(.x, na.rm = TRUE)))
important_metrics = data_dictionary[which(data_dictionary$Definition %in% c('Turnovers Won',
'Turnovers Won in the Tackle',
'Turnovers Conceded',
'Lineouts Won',
'Lineouts Stolen',
'Tackles Made',
'Knock Ons',
'Pens Conceded')), ]
team_data_2021_min_max =
rbind(cbind('Year'=2021, 'Team'='maximum_value', maximum_value), cbind('Year'=2021, 'Team'='minimum_value', minimum_value)) %>%
rbind(team_data_2021) %>% rename_with(~ data_dictionary$`Definition`[which(data_dictionary$`Variable Name` == .x)], .cols = data_dictionary$`Variable Name`)
radar_dataset = team_data_2021_min_max[,which(colnames(team_data_2021_min_max) %in% important_metrics$`Definition` | colnames(team_data_2021_min_max) =='Team')]
# Ireland
radar_dataset %>%
filter(Team %in% c('maximum_value', 'minimum_value', 'Ireland')) %>%
select(-Team) %>%
radarchart(axistype = 2, pcol = "seagreen", pfcol = scales::alpha('seagreen', 0.5),
plwd = 2, title='Ireland', cglcol = "grey", cglty = 1, axislabcol = "grey")
# Wales
radar_dataset %>%
filter(Team %in% c('maximum_value', 'minimum_value', 'Wales')) %>%
select(-Team) %>%
radarchart(axistype = 2, pcol = "firebrick3", pfcol = scales::alpha('firebrick3', 0.5), plwd = 2, title='Wales',
cglcol = "grey", cglty = 1, axislabcol = "grey")
# Italy
radar_dataset %>%
filter(Team %in% c('maximum_value', 'minimum_value', 'Italy')) %>%
select(-Team) %>%
radarchart(axistype = 2, pcol = "#00CCCC", pfcol = scales::alpha('#00CCCC', 0.5), plwd = 2, title='Italy',
cglcol = "grey", cglty = 1, axislabcol = "grey")
# France
radar_dataset %>%
filter(Team %in% c('maximum_value', 'minimum_value', 'France')) %>%
select(-Team) %>%
radarchart(axistype = 2, pcol = "dodgerblue2", pfcol = scales::alpha('dodgerblue2', 0.5), plwd = 2, title='France',
cglcol = "grey", cglty = 1, axislabcol = "grey")
```
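The chunk above binds a `maximum_value` row and a `minimum_value` row on top of the team data because `fmsb::radarchart()`, with its default `maxmin = TRUE`, reads the first row as each axis maximum and the second row as each axis minimum. A minimal sketch of that layout follows; the team names, metric names and numbers are invented for illustration.

```r
# Invented team statistics, for illustration only
stats <- data.frame(
  Team          = c("Team A", "Team B", "Team C"),
  tackles       = c(310, 280, 350),
  turnovers_won = c(20, 14, 9),
  knock_ons     = c(12, 18, 25)
)
metrics <- stats[, -1]

# Row 1 = per-axis maximum, row 2 = per-axis minimum, then the data row(s)
limits <- rbind(sapply(metrics, max), sapply(metrics, min))
radar_input <- rbind(as.data.frame(limits), metrics[stats$Team == "Team A", ])
radar_input

# Draw the chart only if the fmsb package is installed
if (requireNamespace("fmsb", quietly = TRUE)) {
  fmsb::radarchart(radar_input, axistype = 2, title = "Team A")
}
```

Because every team is drawn against the same maximum and minimum rows, the four charts share a scale and can be compared side by side.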
Italy had the highest number of knock ons, turnovers conceded and penalties conceded, all of which would have given the opposition advantage and negatively impacted Italy's performance. We can also see Italy made the most tackles but this didn't lead to a high number of turnovers won. In comparison we can see Wales also made a lot of tackles and also won many turnovers. Wales had a strong advantage in relation to knock ons and penalties conceded and we can also see they didn't concede many turnovers. One area where Wales could improve is winning the lineouts.
France had a very good performance and came second in the tournament but comparing their turnover metrics to the other teams does not seem to suggest this. This is probably because France had possession a lot more and there was no need for them to turn the ball over. France did have quite a few knock ons and their tactics in relation to lineouts could also be improved.
Ireland made fewer tackles compared to the other teams but were much better at turning the ball over. They were also very good at stealing lineouts from the other team. Areas where they could improve are in relation to knock ons and penalties conceded.
## A Deep-Dive into Turnovers
Turnovers are a key feature in rugby matches as they are the moments when a team can transform defence into attack.
A turnover won often lets a team capitalise on suddenly gained momentum and score a try, or it can be key to neutralising dangerous phased attacks within 10 metres of the try line. Being a highly strategic feature of the game, turnovers have numerous nuances that often [characterise a team’s style of play](https://www.rugbyworldcup.com/news/464334). Given the importance of turnovers, we wanted to carry out a deeper analysis of Ireland's turnover-related performance when they played against Wales, Italy and France in the 2021 W6N.
For clarity, a turnover is defined as follows: when a team concedes possession of the ball, they are said to have turned the ball over to the other team. This can happen in various ways, including the defending team tackling an attacker, the attacker knocking the ball on, or the defending team stealing the ball at a lineout.
Each of the three analysts on this project took responsibility for one of Ireland's games and used the FC Python YouTubeCoder tool to track turnover-related data during that match. To keep our analysis at a conversational level, we decided to define turnovers in line with a [Six Nations pre-match media report template](https://d2cx26qpfwuhvu.cloudfront.net/sixnations/wp-content/uploads/2021/02/23161036/Wales-v-England-Six-Nations-EN-Pre-Match-Report.pdf), which targets a wide media audience with informative and easily understood content.
The following performance indicators were recorded for each match:
* **Opposition**: indicates Ireland's opposition and takes a value of "Wales", "Italy" or "France".
* **Team who won turnover**: indicates whether Ireland or their opposition won a turnover.
* **Turnover method**: indicates the method by which the turnover was won. Takes possible values of "kick", "penalty", "lineout", "tackle", "knock-on" and "scrum".
* **X/Y**: this records the X/Y location of where the turnover happened on the pitch.
* **Min/secs**: records the time at which the turnover happened in the match.
* **Quarter**: indicates if the turnover happened in the 1st, 2nd, 3rd or 4th quarter of the match. Each quarter represents 20 minutes.
* **Num passes before turnover**: the number of successful passes made between players before a turnover occurred.
* **Num tackles before turnover**: the number of tackles made before a turnover occurred.
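As an illustration of how the **Quarter** indicator relates to **Min/secs**, the quarter could be derived from the match minute rather than entered by hand. This is a sketch under stated assumptions, not the method used when coding the matches: `quarter_from_minute()` is a hypothetical helper, the quarter boundaries are assumed to fall at minutes 20, 40 and 60, and any time from minute 80 onwards is assumed to count as the fourth quarter.

```r
# Hypothetical helper: derive the quarter from the match minute.
# Minutes 0-19 fall in quarter 1, 20-39 in quarter 2, and so on; anything
# from minute 80 onwards (added time) is counted in quarter 4.
quarter_from_minute <- function(minute) {
  pmin(minute %/% 20 + 1, 4)
}

quarter_from_minute(c(0, 19, 20, 45, 79, 83))
# returns 1 1 2 3 4 4
```

A rule like this cannot tell first-half added time apart from the start of the second half, so it only works if minutes are recorded on a continuous match clock.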
First, we analysed the total number of turnovers won and lost by each team across the entire W6N tournament. The bar chart on the left below shows the number of turnovers won by each team. Ireland won the most turnovers of the tournament and Scotland the fewest. The number of turnovers won did not necessarily indicate a successful game. For example, when France played Ireland they did not win many turnovers but still produced an impressive winning performance: they retained possession of the ball throughout most of the match, which led to fewer turnovers.
The bar chart on the right below shows the number of turnovers conceded per team. Italy conceded the most and Wales the fewest. As with turnovers won, conceding few turnovers did not mean that a team was more successful: Wales finished last in the final standings despite their relatively low number of turnovers conceded.
```{r, fig.show="hold", out.width="50%", echo = FALSE}
#TW with geom_col
ggplot(team_data_2021) +
geom_col(aes(x = reorder(Team, -TW), y = TW), fill='seagreen4')+
labs(title = "Turnovers won by each team", x = 'Country', y = 'Turnovers Won')+
theme(plot.title = element_text(hjust = 0.5, size=18),
axis.title = element_text(size=16),
axis.text = element_text(size=14))
#TC with geomcol
ggplot(team_data_2021) +
geom_col(aes(x = reorder(Team, -TC), y = TC), fill='darkcyan') +
labs(title = "Turnovers conceded by each team", x = 'Country', y = 'Turnovers Conceded') +
theme(plot.title = element_text(hjust = 0.5, size=18),
axis.title = element_text(size=16),
axis.text = element_text(size=14))
```
We then analysed the data collected via the FC Python YouTubeCoder tool, which covers turnover-related events in the 3 matches that Ireland played. The heatmap below was made using `geom_tile()` and shows how frequently each team won turnovers by each method. Darker shades indicate more frequent occurrences and lighter shades less frequent ones (note that Ireland played in all 3 matches, so the heatmap shows their average number of turnovers per match by method, in order to provide a fair comparison against the other teams). We can see that Ireland and Wales won a number of turnovers from penalties, whereas France and Italy won none this way. France also did not win any turnovers via scrums, unlike the other teams.
```{r geom_tile, echo = FALSE}
turnovers %>%
dplyr::count(team_who_won_turnover, turnover_method) %>%
dplyr::mutate(n_adj = case_when(
team_who_won_turnover == "Ireland" ~ n/3,
team_who_won_turnover == "Wales" ~ n/1,
team_who_won_turnover == "Italy" ~ n/1,
team_who_won_turnover == "France" ~ n/1)) %>%
ggplot(mapping = aes(x = team_who_won_turnover, y = turnover_method)) +
geom_tile(mapping = aes(fill = n_adj)) +
scale_fill_gradient(low = 'skyblue1',
high = 'midnightblue',
na.value = 'white') +
labs(y = 'Turnover Method',
x = 'Team who won Turnover',
fill = "Count") +
theme(panel.background = element_rect(fill = 'white'))
```
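The per-match adjustment in the chunk above divides each count by the number of coded matches the team appears in (3 for Ireland, 1 for each opponent). The same idea can be sketched with a named lookup vector instead of one `case_when()` branch per team. This is a base-R illustration only: `toy_turnovers` is an invented log, and `matches_played` is a hypothetical name.

```r
# Invented turnover log, for illustration only
toy_turnovers <- data.frame(
  team_who_won_turnover = c("Ireland", "Ireland", "Ireland",
                            "Wales", "Wales", "Italy"),
  turnover_method       = c("tackle", "tackle", "penalty",
                            "tackle", "penalty", "kick")
)

# Coded matches each team appears in (Ireland played in all three)
matches_played <- c(Ireland = 3, Wales = 1, Italy = 1, France = 1)

# Count turnovers per team and method
toy_turnovers$n <- 1
counts <- aggregate(n ~ team_who_won_turnover + turnover_method,
                    data = toy_turnovers, FUN = sum)

# Look up each team's match count by name and convert to a per-match average
team_names   <- as.character(counts$team_who_won_turnover)
counts$n_adj <- counts$n / unname(matches_played[team_names])
counts
```

Keeping the match counts in one vector means a newly coded match only requires changing a single number.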
```{r generate heatmaps, echo = FALSE}
# Create function for creating rugby pitch.
rugby_pitch <- function(background_colour = NULL){
return_list =
list(geom_segment(aes(x = 0, y = 0, xend = 0, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 100, y = 0, xend = 100, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 0, y = 0, xend = 100, yend = 0), size=1, colour='white'),
geom_segment(aes(x = 0, y = 100, xend = 100, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 50, y = 0, xend = 50, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 22, y = 0, xend = 22, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 78, y = 0, xend = 78, yend = 100), size=1, colour='white'),
geom_segment(aes(x = 5, y = 0, xend = 5, yend = 100), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 95, y = 0, xend = 95, yend = 100), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 40, y = 0, xend = 40, yend = 100), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 60, y = 0, xend = 60, yend = 100), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 0, y = 5, xend = 100, yend = 5), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 0, y = 95, xend = 100, yend = 95), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 0, y = 15, xend = 100, yend = 15), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 0, y = 85, xend = 100, yend = 85), linetype=2, size=1, colour='white'),
geom_segment(aes(x = 49.5, y = 50, xend = 50.5, yend = 50), linetype=2, size=1, colour='white'),
geom_segment(aes(x=-10, y=47.5, xend=0, yend = 47.5), size=1, colour='white'),
geom_segment(aes(x=-10, y=52.5, xend=0, yend = 52.5), size=1, colour='white'),
geom_segment(aes(x=-4, y=47.5, xend=-4, yend = 52.5), size=1, colour='white'),
geom_segment(aes(x=110, y=47.5, xend=100, yend = 47.5), size=1, colour='white'),
geom_segment(aes(x=110, y=52.5, xend=100, yend = 52.5), size=1, colour='white'),
geom_segment(aes(x=104, y=47.5, xend=104, yend = 52.5), size=1, colour='white') )
if(is.null(background_colour)){
return(return_list)
}
else{
append(return_list, list(theme_void(),
theme(panel.background = element_rect(fill=background_colour))))
}
}
```
### Ireland vs France
We now go deeper into our analysis of turnover-related indicators by examining Ireland's tactics in each of the three games they played. The first game analysed here was played against France. The heatmaps below show the location of turnovers won by Ireland and France in the first and second halves of the match. The black arrows on the heatmaps indicate the direction Ireland were playing in for that half.
During the first half, most of the turnovers won by France were in the middle of the pitch, towards Ireland's goal posts. Some were also won in the top right quarter of the pitch, close to their own goal posts.
Ireland won very few turnovers during the first half of the match, but they were spread more widely across the pitch. Their turnovers were concentrated in the middle of the pitch vertically, but extended over the majority of the right half of the pitch.
During the second half, France’s turnovers were spread over a larger area, but France won far fewer turnovers than Ireland. Ireland's turnovers were more concentrated, with most of them occurring at the top and centre of the field.
```{r, fig.show="hold", out.width="50%", echo = FALSE}
# Heatmap of turnovers won by Ireland in the 1st half
turnovers %>% filter(opponent == 'France' & team_who_won_turnover=='Ireland' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.002),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) + # change colour scheme
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v France: Women's Six Nations 2021",
subtitle='First Half Locations of Turnovers Won by Ireland') +
theme_void() +
theme(plot.title = element_text(hjust=0.5), # move plot title to center
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by France in the 1st half
turnovers %>% filter(opponent == 'France' & team_who_won_turnover=='France' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.002),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v France: Women's Six Nations 2021",
subtitle='First Half Locations of Turnovers Won by France') +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Ireland in 2nd half
turnovers %>% filter(opponent == 'France' & team_who_won_turnover=='Ireland' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.002),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v France: Women's Six Nations 2021",
subtitle='Second Half Locations of Turnovers Won by Ireland') +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by France in the 2nd half
turnovers %>% filter(opponent == 'France' & team_who_won_turnover=='France' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.002),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v France: Women's Six Nations 2021",
subtitle='Second Half Locations of Turnovers Won by France') +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
```
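The four heatmaps above differ only in the team, the half, the direction of the arrow and the upper limit of the colour scale. One possible way to reduce the repetition is a small helper function. The sketch below is hypothetical: the name `turnover_heatmap()` and its arguments are our own, it assumes the column names of the `turnovers` data used above, and it expects the pitch markings to be passed in (for example `rugby_pitch()` from the earlier chunk). It uses `after_stat(density)`, the newer spelling of `..density..` available from ggplot2 3.3.0, and omits the legend breaks because the legend is hidden.

```r
library(dplyr)
library(ggplot2)
library(RColorBrewer)

# Hypothetical helper (not part of the original analysis): draws one turnover
# density map for a given opponent, winning team and set of quarters.
turnover_heatmap <- function(data, opp_name, team_name, quarters,
                             ire_attacks_right, density_max,
                             pitch_layers = NULL) {
  half_label <- if (all(quarters %in% c(1, 2))) "First Half" else "Second Half"
  # Position of the 'IRE' label and arrow, using the same coordinates as above.
  label_x <- if (ire_attacks_right) 85 else 15
  arrow_x <- if (ire_attacks_right) c(90, 99) else c(10, 1)

  data %>%
    filter(opponent == opp_name,
           team_who_won_turnover == team_name,
           quarter %in% quarters) %>%
    ggplot() +
    geom_point(aes(x = x, y = y)) +
    stat_density2d(aes(x = x, y = y, fill = after_stat(density)),
                   geom = "tile", contour = FALSE) +
    scale_fill_gradientn(colors = rev(brewer.pal(10, "Spectral")),
                         limits = c(0, density_max)) +
    pitch_layers +  # adding NULL leaves the plot unchanged
    annotate("text", x = label_x, y = 90, label = "IRE", size = 6) +
    annotate("segment", x = arrow_x[1], xend = arrow_x[2], y = 90, yend = 90,
             colour = "black", size = 2, arrow = arrow()) +
    labs(title = paste0("Ireland v ", opp_name, ": Women's Six Nations 2021"),
         subtitle = paste(half_label, "Locations of Turnovers Won by", team_name)) +
    theme_void() +
    theme(plot.title = element_text(hjust = 0.5),
          plot.subtitle = element_text(hjust = 0.5),
          legend.position = "none")
}
```

Under these assumptions, the first-half France map would be drawn with `turnover_heatmap(turnovers, "France", "France", c(1, 2), ire_attacks_right = TRUE, density_max = 0.002, pitch_layers = rugby_pitch())`.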
### Ireland vs Italy
Turning to the Italy match, in a game characterised by a flow of handling errors on both sides, turnovers won by Ireland occurred mostly at midfield and on the blindside in both halves of the game. This suggests that their influence on both the attack and defence strategy of the Girls in Green was very low.
A similar pattern is visible in the density map of turnovers won by Italy in both halves of the match. The turnover area is slightly more spread out and positioned deeper in the Italian defensive zone, as a backwards protective response.
```{r IREvITA heatmaps, fig.show="hold", out.width="50%", echo = FALSE}
# Heatmap of turnovers won by Ireland in the 1st half
turnovers %>% filter(opponent == 'Italy' & team_who_won_turnover=='Ireland' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00049),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title = "Ireland v Italy: Women's Six Nations 2021",
subtitle = "First Half Locations of Turnovers Won by Ireland") +
theme_void() +
theme(plot.title = element_text(hjust=0.5), # move plot title to center
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Italy in the 1st half
turnovers %>% filter(opponent == 'Italy' & team_who_won_turnover=='Italy' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00049),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title = "Ireland v Italy: Women's Six Nations 2021",
subtitle = "First Half Locations of Turnovers Won by Italy") +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Ireland in the 2nd half
turnovers %>% filter(opponent == 'Italy' & team_who_won_turnover=='Ireland' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00049),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title = "Ireland v Italy: Women's Six Nations 2021",
subtitle = "Second Half Locations of Turnovers Won by Ireland")+
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Italy in 2nd half
turnovers %>% filter(opponent == 'Italy' & team_who_won_turnover=='Italy' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00049),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow())+
labs(title = "Ireland v Italy: Women's Six Nations 2021",
subtitle ="Second Half Locations of Turnovers Won by Italy") +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
```
What we observed in the density maps above is confirmed by a visual breakdown of turnovers by type, shown in the two plots below. There is a variety of turnover types in the first half of the game; however, knock-on turnovers are the most frequent for both teams. In the second half, they become almost the only type of turnover, which suggests that handling errors increased as the match went on.
```{r turnover by half, fig.dim = c(9, 8), echo = FALSE}
# Need to create a dataset to show which direction Ireland were playing in each half
arrow_df = data.frame(x_start = c(10, 90),
x_end = c(1, 99),
y = c(90, 90),
half = c('First Half', 'Second Half'))
## Turnovers by half and type
turnovers %>%
mutate(half=ifelse(quarter %in% c(1,2), 'First Half', 'Second Half')) %>%
filter(opponent == 'Italy') %>%
ggplot() +
rugby_pitch(background_colour = 'chartreuse4') +
geom_point(aes(x = x, y = y, shape = turnover_method, col=team_who_won_turnover), size=3) +
scale_color_manual(values=c('black', 'orange')) +
labs(title = "Ireland v Italy: Women's Six Nations 2021",
subtitle = 'Position of turnover types',
col = "Team who won turnover",
shape = "Turnover method") +
facet_wrap(vars(half), ncol=1, strip.position = 'top') +
geom_text(aes(x=ifelse(half=='First Half', 15, 85), y=90, label='IRE'), size=6) +
geom_segment(data = arrow_df, aes(x = x_start, xend = x_end, y = y, yend = y),
colour = "black", size=2, arrow=arrow()) +
theme(legend.position = 'bottom',
plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5),
strip.text.x = element_text(size=12))
```
### Ireland vs Wales
Below we can see where turnovers were won in the final match, played against Wales. In the first half, Wales won most of their turnovers on the left wing, and Ireland were able to get the ball into Wales' inner third before being turned over. In the second half, Ireland were able to get closer to the goal line before being turned over, with most of Wales' turnovers occurring around the 22m line. In contrast, most of the turnovers won by Ireland were in the middle of the pitch in the first half and in Wales' inner third in the second half, showing how Wales struggled to get close to the end line to score tries.
```{r, fig.show="hold", out.width="50%", echo = FALSE}
# Heatmap of turnovers won by Ireland in the 1st half
turnovers %>% filter(opponent == 'Wales' & team_who_won_turnover=='Ireland' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00045),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v Wales: Women's Six Nations 2021",
subtitle='First Half Locations of Turnovers Won by Ireland') +
theme_void() + # Removes axis labels, tick marks and grey areas outside the actual pitch.
theme(plot.title = element_text(hjust=0.5), # move plot title to center
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Wales in the 1st half
turnovers %>% filter(opponent == 'Wales' & team_who_won_turnover=='Wales' & quarter %in% c(1,2)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00045),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=85, y=90, label='IRE'), size=6) +
annotate("segment", x = 90, xend = 99, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v Wales: Women's Six Nations 2021",
subtitle='First Half Locations of Turnovers Won by Wales') +
theme_void() + # Removes axis labels, tick marks and grey areas outside the actual pitch.
theme(plot.title = element_text(hjust=0.5), # move plot title to center
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Ireland in 2nd half
turnovers %>% filter(opponent == 'Wales' & team_who_won_turnover=='Ireland' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00045),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v Wales: Women's Six Nations 2021",
subtitle='Second Half Locations of Turnovers Won by Ireland') +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
# Heatmap of turnovers won by Wales in the 2nd half
turnovers %>% filter(opponent == 'Wales' & team_who_won_turnover=='Wales' & quarter %in% c(3,4)) %>%
ggplot() +
geom_point(aes(x = x, y = y)) +
stat_density2d(aes(x = x, y = y, fill=..density..), geom='tile', contour=FALSE) +
scale_fill_gradientn(colors=rev(brewer.pal(10, "Spectral")), limits=c(0,0.00045),
breaks=c(0, 0.0001, 0.0002, 0.0003, 0.0004)) +
rugby_pitch() +
geom_text(aes(x=15, y=90, label='IRE'), size=6) +
annotate("segment", x = 10, xend = 1, y = 90, yend = 90, colour = "black", size=2, arrow=arrow()) +
labs(title="Ireland v Wales: Women's Six Nations 2021",
subtitle='Second Half Locations of Turnovers Won by Wales') +
theme_void() +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5), legend.position = "None")
```
The graphs below show the positioning of the different turnover types. Turnovers on the wings usually came from kicks and lineouts, while turnovers in the middle of the pitch usually came from tackles. There is no obvious difference between Ireland's and Wales' styles of play: kicks are a feature of both teams' tactics, as shown by the turnovers won from kicks, and lineouts are usually a result of kicks as well. Wales lost the ball three times from knock-ons, twice in the first half and once in the second half. Ireland lost the ball five times from knock-ons, all in the second half. In the second half, more turnovers came from tackles.
```{r, fig.dim = c(9, 8), echo = FALSE}
# Need to create a dataset to show which direction Ireland were playing in each half
arrow_df = data.frame(x_start = c(90, 10),
x_end = c(99, 1),
y = c(90, 90),
half = c('First Half', 'Second Half'))
turnovers %>% mutate(half=ifelse(quarter %in% c(1,2), 'First Half', 'Second Half')) %>%
filter(opponent == 'Wales') %>%
ggplot() +
rugby_pitch(background_colour = 'chartreuse4') +
geom_point(aes(x = x, y = y, shape = turnover_method, col=team_who_won_turnover), size=3) +
scale_color_manual(values=c('black', 'orange')) +
labs(title = "Ireland v Wales: Women's Six Nations 2021",
subtitle = 'Position of turnover types',
col = 'Team who won turnover',
shape = 'Turnover method') +
facet_wrap(vars(half), ncol=1, strip.position = 'top') +
geom_text(aes(x=ifelse(half=='First Half', 85, 15), y=90, label='IRE'), size=6) +
geom_segment(data = arrow_df, aes(x = x_start, xend = x_end, y = y, yend = y),
colour = "black", size=2, arrow=arrow()) +
theme(legend.position = 'bottom',
plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5),
strip.text.x = element_text(size=12))
```
Digging deeper into the turnover methods, we also created the bar charts below to look at the average number of passes and tackles before a turnover, to determine whether there is any association between phases and turnover method. We can see that Ireland passed more before kicking the ball, with an average of 7 passes before kicking, whereas Wales averaged 2 passes before kicking. Ireland also made more passes before losing the ball to a tackle, which suggests that Ireland were quicker than Wales at getting tackles in.
```{r, warning=FALSE, echo = FALSE}
turnovers %>%
filter(opponent == 'Wales') %>%
mutate(team_passing = ifelse(opponent==team_who_won_turnover, 'Ireland', opponent)) %>%
ggplot() +
geom_bar(aes(x=turnover_method, y=num_passes_before_to, fill=team_passing), stat = "summary", fun = "mean") +
scale_fill_manual(values=c('seagreen', 'firebrick3')) +
facet_grid(cols=vars(team_passing)) +
labs(title = "Ireland v Wales: Women's Six Nations 2021",
subtitle='Average number of passes before a turnover',
x='turnover method',
y='avg # passes before a turnover') +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5),
legend.position = 'none')
```
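The bar chart above computes its means inside `geom_bar()`. As a rough sketch, the same averages could also be produced as a table, which makes it easy to see how many turnovers each mean is based on. The rows below are made up for illustration and only the column names follow the `turnovers` data.

```r
library(dplyr)

# Made-up rows for illustration only; column names follow the `turnovers` data.
toy_turnovers <- data.frame(
  opponent = "Wales",
  team_who_won_turnover = c("Wales", "Wales", "Ireland", "Ireland"),
  turnover_method = c("Kick", "Kick", "Kick", "Tackle"),
  num_passes_before_to = c(4, 6, 1, 3)
)

avg_passes <- toy_turnovers %>%
  # The team that lost the ball is the team that made the passes.
  mutate(team_passing = if_else(team_who_won_turnover == opponent,
                                "Ireland", opponent)) %>%
  group_by(team_passing, turnover_method) %>%
  summarise(avg_passes = mean(num_passes_before_to, na.rm = TRUE),
            n_turnovers = n(),
            .groups = "drop")

avg_passes
```

Reporting `n_turnovers` next to each mean is a design choice: with only one match coded, some averages may rest on one or two events.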
Data on the number of tackles before a turnover was won reinforces what we learned from the passing data. The bar charts below show that Ireland's tackling helped them win the ball back more quickly than Wales, and that Wales had to tackle more before forcing Ireland to kick the ball.
```{r, warning=FALSE, echo = FALSE}
turnovers %>%
filter(opponent == 'Wales') %>%
ggplot() +
geom_bar(aes(x=turnover_method, y=num_tackles_before_to, fill=team_who_won_turnover), stat = "summary", fun = "mean") +
scale_fill_manual(values=c('seagreen', 'firebrick3')) +
facet_grid(cols=vars(team_who_won_turnover)) +
labs(title = "Ireland v Wales: Women's Six Nations 2021",
subtitle='Average number of tackles before a turnover',
x='turnover method',
y='avg # tackles before a turnover') +
theme(plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5),
legend.position = 'none')
```
# Other Analysis & Future Research
Although the main focus of this project was to analyse turnover-related performance indicators, we also explored other areas. For example, the line graph below shows the cumulative scores throughout the Ireland vs Wales match. Wales did not score at all and we can see a lull in Ireland's scoring for a period of over 40 minutes between the 30th and 80th minute.
```{r, echo = FALSE}
turnovers %>% filter(opponent == 'Wales') %>%
mutate(real_time_minutes = ifelse(mins < 58, mins-11, mins-23)) %>%
pivot_longer(cols=c(ireland_score, opponent_score), names_to='team', values_to='score') %>%
ggplot() +
geom_line(aes(x=real_time_minutes, y=score, group=team, col=team), size=1) +
labs(title = "Ireland v Wales: Women's Six Nations 2021",
subtitle='Timeline of Scores',
x='Minutes',
y='Score',
col='Team') +
scale_colour_manual(labels = c('Ireland', 'Wales'),
values=c('seagreen', 'firebrick3')) +
theme(legend.position = 'top',
plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5))
```
The following plot shows the location of Ireland's conversion attempts when playing against Wales. We can see that Ireland missed 2 out of 7 conversions, with both misses taken close to the wing.
```{r, echo = FALSE}
events = read.csv('events_six_nations_2021_ire_vs_wales.csv')
events %>% filter(grepl("conversion", Event)) %>%
mutate(half = if_else(Mins < 42, 'First Half', 'Second Half')) %>%
ggplot() +
rugby_pitch(background_colour = 'chartreuse4') +
geom_point(aes(x=X, y=Y, group=Event, shape=Event, col=half), size=3) +
scale_shape_manual(values = c(7, 19), labels = c('Miss', 'Score')) +
scale_colour_manual(values = c('black', 'red3')) +
labs(title = "Ireland v Wales: Women's Six Nations 2021",
subtitle = 'Ireland Conversion Positions') +
theme(legend.position = 'top',
legend.title = element_blank(),
plot.title = element_text(hjust=0.5),
plot.subtitle = element_text(hjust=0.5))
```
There are many other things that could be done using video analysis data. We decided to analyse turnovers, but we could also have looked at tries and at what happened before each try, to learn what teams were doing well in this regard. Individual players could also have been tracked to see where they were contributing most positively to the game and what their strengths and weaknesses were in relation to tackling, passing and turnovers.
In relation to the performance/ranking analysis, we considered only four years of data. Considering a longer period of time or breaking down the dataset by game may offer additional information on which KPIs are better predictors of a team’s WWR.
# Limitations
One limitation of this project was the lack of time to run a reliability test on the data captured using the FC Python YouTubeCoder tool. Normally, a sports analyst would code the game more than once, or a second analyst would watch the same game and record the same events. The results of each pass would then be compared to find discrepancies or unreliable data. Due to time constraints, and because this project was focused on learning and using R, we decided not to run reliability tests and instead focused on visualisation and analysis in R using the data we had. Had time allowed, we would have conducted reliability tests.
With more time, we could also have analysed more games in detail, which would have given us a larger dataset with which to build models predicting a team's performance. One thing we would have liked to explore was using teams' Six Nations performance to predict which team could secure the last available place at the Women's Rugby World Cup. We decided not to do this because of the lack of historical data and time, and because the 2020 and 2021 data are anomalous due to COVID-19.
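To illustrate what the reliability test described in this section might look like, the sketch below compares the turnover methods recorded by two coders for the same events. It is not something we ran: the labels are made up, and the choice of percentage agreement and Cohen's kappa as agreement measures is our assumption rather than part of the project.

```r
# Made-up labels from two hypothetical coders for the same six turnovers.
coder_a <- c("Tackle", "Kick", "Knock-on", "Tackle", "Kick", "Tackle")
coder_b <- c("Tackle", "Kick", "Tackle",   "Tackle", "Kick", "Knock-on")

# Cross-tabulate the two codings over a shared set of categories.
all_methods <- union(coder_a, coder_b)
confusion <- table(factor(coder_a, levels = all_methods),
                   factor(coder_b, levels = all_methods))

n <- sum(confusion)
# Proportion of events on which the coders agree.
p_observed <- sum(diag(confusion)) / n
# Agreement expected by chance, from each coder's marginal frequencies.
p_expected <- sum(rowSums(confusion) * colSums(confusion)) / n^2
# Cohen's kappa: agreement beyond chance, scaled to a maximum of 1.
cohen_kappa <- (p_observed - p_expected) / (1 - p_expected)

# Events that would need to be re-watched.
disagreements <- which(coder_a != coder_b)

p_observed
cohen_kappa
disagreements
```

The indices in `disagreements` point to the events where the two codings differ, which is where a second viewing of the footage would be most useful.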
# Conclusion
Undertaking this project allowed us to experience sport performance analysis and how R can be used to analyse sports matches such as the Women's Six Nations. We gained further understanding of the process of sports analysis from using the FC Python YouTubeCoder tool to collect data on specific match events, which then led to creating visualisations and making inferences from the data.
Our findings allowed us to confirm existing beliefs, for example, that England would perform well and Scotland would take a low position. They also led to new realisations, such as that tackles don’t always lead to dominance and that a high number of turnovers doesn’t necessarily indicate successful play.
As mentioned in the Limitations section, the scope of the project was constrained by time. There is a lot more that we could have explored and there is definitely potential to take matches such as the Six Nations and explore them much further.
Overall, this was an interesting and insightful introduction to the field of sport performance analysis.