<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="author" content="Gregor Pirs, Jure Demsar and Erik Strumbelj" />
<title>R introduction and dplyr</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
a.sourceLine { display: inline-block; line-height: 1.25; }
a.sourceLine { pointer-events: none; color: inherit; text-decoration: inherit; }
a.sourceLine:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
a.sourceLine { text-indent: -1em; padding-left: 1em; }
}
pre.numberSource a.sourceLine
{ position: relative; left: -4em; }
pre.numberSource a.sourceLine::before
{ content: attr(title);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; pointer-events: all; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
a.sourceLine::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style type="text/css">@font-face{font-family:'Open Sans';font-style:normal;font-weight:400;src:local('Open Sans'),local(OpenSans),url(data:application/font-woff;base64,d09GRgABAAAAAE8YABIAAAAAhWwAAQABAAAAAAAAAAAAAAAAAAAAAAAAAABHREVGAAABlAAAABYAAAAWABAA3UdQT1MAAAGsAAAADAAAAAwAFQAKR1NVQgAAAbgAAABZAAAAdN3O3ptPUy8yAAACFAAAAF8AAABgoT6eyWNtYXAAAAJ0AAAAmAAAAMyvDbOdY3Z0IAAAAwwAAABZAAAAog9NGKRmcGdtAAADaAAABJsAAAe0fmG2EWdhc3AAAAgEAAAAEAAAABAAFQAjZ2x5ZgAACBQAADWFAABReBn1yj5oZWFkAAA9nAAAADYAAAA293bipmhoZWEAAD3UAAAAHwAAACQNzAapaG10eAAAPfQAAAIIAAADbLTLWYhrZXJuAAA//AAAChcAAB6Qo+uk42xvY2EAAEoUAAABuQAAAbz3ewp/bWF4cAAAS9AAAAAgAAAAIAJ2AgpuYW1lAABL8AAAAKwAAAEyFNwvSnBvc3QAAEycAAABhgAAAiiYDmoRcHJlcAAATiQAAADyAAABCUO3lqQAAQAAAAwAAAAAAAAAAgABAAAA3AABAAAAAQAAAAoACgAKAAB4AR3HNcJBAQDA8d+rLzDatEXOrqDd4S2ayUX1beTyDwEyyrqCbXrY+xPD8ylAsF0tUn/4nlj89Z9A7+tETl5RXdNNZGDm+vXYXWjgLDRzEhoLBAYv0/0NHAAAAHgBY2Bm2cY4gYGVgYN1FqsxAwOjPIRmvsiQxviRg4mJm42NmZWFiYnlAQPTewcGhWgGBgYNBiAwdAx2ZgAK/P/LJv9PhKGFo5cpQoGBcT5IjsWDdRuQUmBgBgD40BA5AHgBY2BgYGRgBmIGBh4GFoYDQFqHQYGBBcjzYPBkqGM4zXCe4T+jIWMw0zGmW0x3FEQUpBTkFJQU1BSsFFwUShTWKAn9/w/UpQBU7cWwgOEMwwWg6iCoamEFCQUZsGpLhOr/jxn6/z/6f5CB9//e/z3/c/7++vv877MHGx6sfbDmwcoHyx5MedD9IOGByr39QHeRAABARzfieAFjE2EQZ/Bj3QYkS1m3sZ5lQAEsHgwiDBMZGP6/AfEQ5D8REAnUJfxnyv+3/1r/v/q3Eigi8W8PA1mAA0J1MzQy3GWYwdDP0Mcwk6GDoZGRn6ELAE09H/8AAAB4AXVUR3fbxhPfhRqr/6Cr3h8pi4wpN9K9V4QEYCrq7b2F0gC1R+XkS3rjKWXlfJeBfaF88jH1M6TfoqNzdWaXxZ0NM7/ftJ2ZpXfzzeVILi0uzM/NzkxPTU68Md64GQZ+vfa6d+P6tatXLl+6eOH8uVMnTxyvVg4fGisfhNfcV0f3luz/7Srmc9nMyPDQ4IDFWUUgjwMcKItSmEAASaNaEcFo069WAghjFIlAegyOQaNhIEhQxALHEqIeg2P0yHLjKUuvY+n1LbktrrKrOgUI/MUH0ebLc5Lk73yIBO4YeUrL5GGUIimuSx6mKl2tCDD8oKmCmGrkaT5Xh/p6rlphaS5PYp4kPAy3Un74OjeCdTi4nFosU6Qg+qRBsoazczLwHdeNqpVx3AW+oVjdhMThOo6YkGJTl862RFq5r263bbYSHyuswVrylsSBhHzVQKDU11g6hkfAxyOf/DVKJ1/HCvgBHtNRJ+b7eSYepeQ4VLZBqAeMjgM7/zyJJF1kuGw/YFpEq458Xrr65YTUa6VCEKGKVdJ+2FoBYYNKCwV1K6B2s1mJnPB7Ww6GtyO04ya/HHWPHs5P4J65NyVa5VA0E0LocwPci45b6tvMvohm1BYc1h12Xd2GrbbHVkjB1pzs6IKtOHeYd+JYhFasmfs9Zt+SZlo
9pu8eg0utWZAKB8vjaxBQx7cSbK3Qdr2nBwM27vrXcUHtLolLJyJjK3CAbDcFDo3hsPZ63IH2RrsoWyskdB47jiKitFtcAgqj4wQQxN3PB81RCiCo0Y1jnUVYlOj5JHhJd2JBevIEeSQxDWzTN8PEE3AL90KtP11dVrC5II1L1w331pHFq10vPBGYeyUCFRvB7PAEzMltdubhb+lZ4dw9w86yyNfG++u0ZWOBkmsb+GrsrKGIN4R0XPQimnAEcj3CI6ZDR35zzHJEZlcW5cQCTMwty4umkB5B4ajHwVNhQDqdMLSAmClnhLScgYgMbQJESALUrtIvjpQz9LVxuIPSiYgQkjusZ01l4BERrPtdO9KfDErKQLne6EUbJlXHqTccNzL163tuES26ickjo5va6FIkCyIyaFEYA+lejuqlFxLWIYKmQG9W0tlMe0yXu80wPe/OavEJrd8srSFziSal30wMj5H2mH7T6H218RQ93qOFysDEgtLBoRuQUeXjyPQKexdLjoa4vtAQJiBsEXYutEo9T1/m5mUdBMbXFCzIq8Z6Yl5+7nyic+1mE3xisVatpBarpcC/mUs9/s3Csty2GRPfLMo7FrfqcS1KDxIntwVjnkEtjRJoFKEVHWmelIyxd7Y9xlqGHTSA0VfbnBks08M4W21bHczuJBrTiYixiBnsMF7PepCwTAdrGcy8UqZb5uWGvIyX9QpW0XJSrqE7hNzjjGU5u1vgRe6k5DVv4DZvpVnP6Vi0yMKLOhUvPUq9tCzvFhi5mV9KVNMvWpfRJg1bggjEml6Uz6KmiiN92dh+Gg19OHK4TmOC61TIcAFzsF7DPNQ0fkPjNzr4sMZHaEX5fk7uLZr9LHK9AW9KF2wU///BUfaOnlREfyrK/rv6Hyn3ISkAAAEAAwAIAAoADQAH//8AD3gBhXwHfFRV1vg5974yvZdMQspkSIYkQkgmhdAyIIQQWsSADCLSpajUiMgiAkuJNGmhKyJGDCyybCiyiGBHRGQtyLIuf2UX19UPy7oWyFz+972ZBxOE72N+L2+Yd+be0+5p99wBAscBBIN4ACjI4D4oUJEIVAbIL8wPYX4oP1TQ3um3+0v5dZz2bj44nsyKLhYPXKkaL1wCAhuuXcQ69dsWyAu7qF5PBMFqQzQRkzQgYvIQCuXleXYHlCXl2x1YZg+F7HxMDNAQLQoVetwuKZCZjRUTQqc/f7RjebisqAeuEQJXmpZUdA/3KgcgsJA2kL1xDNPDZqCyQAWdXiIy5YOHThUq4/KB1XFpgPr5heVtJuSQvJzxOeKB6HfEplzKWCEA4Sc+Vgqkw8bwIF16K7fg0ttNJr3DajEKBqfT5UlNkwXJKyD4hCRRlFySwU+TvTTJkJTh1wkms6l/pBWa08Fmt/WP+Nz2AWYcYEez3WwXvU5qECE/VB5ylJXl5993Hyc3zw6hkHaPoerldxVjh7eMX/F3hYWxu0KF382pcKpXsV+9QlS93Mj/Sz/ujinsVE1dDTszcEk1u4LpPdjXmDdw6UAsqFlUg7rmf2J+d3aGLmC757GBuEe55mHNXGxifZVrLtuNNUBhwbU6wSQ5IAOyoS2MCxcH7VmpXkHIdZlFP4BPtOvFdvlZZsncL0Kl1pZcS99Iam5eK1erfhFvrkviL9HDKc5X6OV/ChUq7aGEvw5U6QuFVCbEhOSSZHegODM7WOzxhOzZ2cVFJaXFIbfHK2cH7WlELuK3EnR5vHZJEkzvHZw35S933n0ucur5ky/MO7SraN2mrVuqGiNPnIt+NnTy6HF4fMkfvf+6EEjfkpWPh7rtXrJgp+NAk9hzQScj6194/+yxlZE72Ow0KvcdloMLbPcBiDD+2jdSW/Ek6MENfk55AfQMtwabaPC0aZWZ2a6Nob1NKgxRc3qemb/aF0jtk3xZPtkpc4Xjr3KVXE7WDfpi+sfVJ1RotwUyJVFVbE4ZV3JUPi0pLsq++XMM4A9Vd+/YcXcVvrtx7bLN61av2oI
NVTU11dU1NVV4cuPaFRvXrV7xDGPNH6+heQJpbMQaHLiz8R9fXb5w8dLl5vO7XnzhD7uef37Xxa8u//3ipa9pxpUqrt5AYeq1b8QPxVNg5BQWw13h9k4PpEqB3Lx2eW0DlmxfqkdfUhoy9Y6EnNZgW0t7MZ/6smlubka+I0NfFckQoDwPkjih+d4yrpTleTdRqoinJE6Ts7AULcTt8mRxQbYjMeLcXMpYwucgMgaCkrrMn668Z97YBwZHJm/+/hnWZ/KwOzazl5c2DerS+o2Xth9eshXXd7jTu7NHHeb98+VHfqw/+z/Cmp5zhvSZe3e/kSOubt2EO3tExnWrrbsy/51x94+aWFa/84V1k/bfx2Z1fWE0+2It+2zfxGEfAaBiMbBctRiug0CpIBLFUpyK2R+OumYgYrZB+cZAdoT4+TfM0CpsksEggGCxGoNUsV4J5sVpc5SGJE6pwxvIJgM3r97+1Kq1S7et2UQKUI/v7znOCn/8jpW80ohvKaN24aOatFEFAx8XLFYDFYItR0UbkQMljuIiEgx5HMS0efW2pWtXPbVdGZb9yjruPIInv/sR3z/+EisAhMFkrmCRXGCB9uEUKgoomw16o95qEwxoJiaT2cDtl84CUP5G4XWJOTBmWLK8olOmNOjMKhUpWZWHK5LZgl9279229we2OBUX50kuVjv5QDo7PBwnsvrhWJF+YDIuVagZDxeFHOF1MEKbsBMEQS+KJjOVdXJ1BKw61EH+feqSTzTz3I7ZA3Zuv+whshy3sDFL2TjctJR6n2SDsfFJ3A0I5ewXfAgugw7s+0XQG0SAfFVWHOEsr6TyphSHW5NHFc9J6Wa+7B3Dfp42HguHAUINniPlZCpQ/l0CogDIrW/8u85iv7sGv8ZzGzYAxjwV/MCxTwobJQCTWU8HRPQeruaaXpRqestVdUOXso7dupeF7px4Z8+ed3arKFc44AIg51W9ch4kIIiUEocmSk4sBpCcj15oUDRJXYYExl37RmirrkIv55rLASYJJF+S3t0nopeptU+E+mLrLK+lPgQyid3mCBU6UP1rVz8R2n770zc/Xf7x8s/Nn9fvaFi3rmFHPfmMLWRP4lycho/jNPY4W82Os88wiJ34K4tdAIQjAOQkx8YArcM2PaAOjSZBL8uolzAJFFvGDXd8ej67P2AvKpUkOYghcnK7zl300RBcsExwzJ/hbrd7GuYBwhgAIYtbTx/3+d4klJ3gtKCQnGIz9InYZEzqG8EkjSzNavCB/cXYlcQshhyMsZrI6PYLWc3lOG/vlA4rHr/3uTFD3r38/r+3fMKOke9W4oJ9G566u7au84CpOz/ct5R99wF7W6dIYjjnawrHIAh3hlungFOWgXoyzVKbHOr1eD19Il6vISsrrU8kSzbY+0QMGpdjgYh60zDTHJKHoyP4404pw27zB4o1o62gq+BLL299am8j+zv774zj995/dgTOZsOfWr3rnTWPj2h8qGbo1/M//kYYvmxfms7TtPrM54E7ns4vwBw0rFy/aNJjRRVTet31OgCBPABhongUDOCAzuE0h6gnxChToCJ1ulB0iH0jeqvscFBZotflk+hMQ5oJDqhrC/l//FxmAUlGYeK5Z6Jl5MDec2yJQdc+l5ViNduL1avoZ805eGll04jy6COKheT8S+U6kQwdw+lW6nPpXF4qtEoBziwAye3mMnRLkqlPRLqZdQlsKxTcLghkqhzjrLL5M+WgUwldSkjbL1HPLrCf51d8MHbv66zu/mcGl5Kz0YNZ0+mcf759kbEB29qGGrZiYWop2b2R9fYqnKnlWOVzqXqgNfQIB5LtRr8fQLLT7CyT0ZLaL2K0WFzU5e0TcfmojkckcgvcyhJ4pNlr8Bd63VyEhIbiGhfIBFGTq8R9lqcWB2Dl1G79Rn/9i8n08OU3L/760UX2E369YuvqVUPrI9VryFR8CXc5V/rYefbW7svv/YNdxUHv/OnFVQ1V8yse2Dde0UcAIY/zU4L
0sA1FEQg3jJT0jVAJFBlqbOOrALk1dCOmkuHNF+mpaKOYunHhldNAlZhEyFGpz4R20C+c47Vmu+6gqXo9lewuq5TfXrLnZORk9Ink5JjAlNwvYvJBoF8E5N8qd9nN3jrmj7mOx8OPLDXqolpgwv0zZkpuzaeTynf+vWjNvnr22b+bsfDJR7+e+cL6dQ1bXlu3CDvOWfHIMytnrhJPHt7x4L7eg/48+8C5U0euLuu/f8ozr1xteHTRssdGru8V3kwfeHTMsN937/zksLEzFdlO5NQpNsMLWdAtnJlizzQYAAQu26AljUvWZbEQlyuJi1Ymcr8Iaal2jjKNg5qJ9Ctqx02jMyDFKHJw8TpUIvjHKhXZQlZ0/Iwe1eO++6/RVHpg2mv/uPbBuguPMtfKLU+tuXfjkIFraEVzg2tlMuZg6O57/vXBP1C3kZ3H9od2PPV81RMVE/aNAy3HEcaokRS34Ta+LAA8XotzQMRiizkRDVfN87X0JXae6NzkVR6Znehb6J8XL+Y3IKovXMjn0oEDMrkmmc2iXu9yGm0DIkab6hgTZklwj/T6FDccpXsmn6Rjlxv+knyrTFMR8+U/cF9+DiRwh/UCiChwdeXD58cDhSwsRjeikNNcTo83/0AtP2DDKLywji1nhxSezMTjgo9eVHOy3LBbJgIQ0OsEsToiIFRHrIjI4wHOlfxEz6a4ZOTXTLq9eTjdTofW1bEH6up+g5GIBDhGEr2BkRNVlMZTa/P3HKVyrMMKrF3H/KPYUAWjlGsXaRnXrxTIhrJwqp/bMtnphFYWIdgGoLWtddqASGuPzdA7YhNaqFZLvVJSEa48LZwUd4YSN4mJ+aq/ctSSXgtmD6gf2emV91/9KNj38bHd9l3PX0tq19dMnzFw3OSsgsWjj+zqPXn0w4On3e9nZ+NJLYFZ1yqkQ2ITFEM5zzwyA+1KLJ1kVwpAjsvSTgx3S+rQQeiisxv5Ky+9kGbnqUmllmSFEhOP6/G4ug6C2nJQUPdSt0td36R1IFMgbsUalrqlQAbw4KK1v1BwIH/udKqm8NCQbeMHP2LUtVk3rv7Fb4712N3Tt/DeaWvZt3+8wA7swe6Y/5cvjv3I1rHJn+AyhLM44ODVn14/7bBUDpq/hpxb8c388XfdM+rU3veu+Tws17Pv7O79aFvzMnvxc3aaHRq8sAZX4jgUsP7CfvYntoNhGYquJiAAAKJNPAIyWLjk0ojFqENR0SwqyILNaiG9I0bRYhFECoKD518xh6iplZYz+5W8H0OIlBsz/tURB6IHmnaT7itJORvb6A94cnbjGZYvHrnSg0zENwfPGTGddQIKJwCEo9xyW8ALGdA7nO0UUg1Wn89iEGQLjwd01iRrUlXEarWAxVcVsTjAWxUBevt4QnM9/gxBMbluwe4SAjxpj/mcgN0ef3cCt2IAhVVLsR/7+TIjjZjU9PTeY1ew4I9/Ovhn8cCeI/Nf9BnK2Pk3/kZ7TF00+6HoquhndauXPAGAMIdb09Oqr8gOu6jFpbdQb5IDekccglHi/HK2DL+4emRymUNIE3+Ro3WokKfbtNP37Cs0/7rxjQ0X2Cvs2Rex/NNLuysbxBB7lX3FPmdvl64rwyU44QusOVSzuj8AUTgmDuEc04FdsYcWQQ8COJyiuSoiUsFSFREct4ppwc9rSBlA+ZuAPZTBx2Az2Uo2CY/hIHysic/1z59PI/dU5CtWz+aJB9gi9gKmYebVKZgHgMq89Bc+r1GJWSSDAQXQoWAyS/reEUlCQsTeEUKRr3B03DZmUZBwxy/6S/MZmh+dTYZHt5OF4oH1LKc+eilhJj0UhpMlAKQ6pAbjTRPxSW45Q0CbAac3asPzwaNfrY9LTuyi2ilOhUvnI8SSohNapUJK7wiAaDLZe0dMgujtHRGdt4+8/HaphRyV9+rq5lT1xe9nfPc0a2IrDuKQL//9bve3DrL/so/Qj0kbVrGXCYuWZWXjUhzzD7xn/+D6GvYau8Q+Ze8H8LU
Y7WK6yuVQ2KdHBJ0giCCaTTraO6LTiQaJoshJV81RgnG/Qbydi5f/DYnpjc2ssZGSRrI3Ws1z7dXkYQC8NoLNxfFqVpwaNht1OotVT4GzFDJj9GrpGI15+JJiPpxLMg0v6dVv9AONx9jclFWuR6fyFGvI0TNxvRC+UjHmnkjBViRGg4Ix0Yn6RGzLWkgJZRVRDKHw1TvRrzc2NpL1J6JN5M0l0dc5snnk4+jCBF0QIT1soQCCJCMFzgtw3EBXxTekkO0+0aio0pV/bIp9V+KIgpPrUZJOFCUev/JSmsuNBjuVjDK1gKQgp2DnLbuZlRjwuJUAn2MY4nce4COtZjadZSsCntbhh6zRomMm0bbpo+bh4oGrVQLPOume7Uev/BCXo1IDsUG7sFsvcaytVpDB7jBS2aqjKCdypaUI4xPzabNJKZdj+WvNn+tsW4/RVB2xkGeEk582NR/nE3ZMwaxy2guAqFp99FZ5bu+IXqDW3hHqvLVNiOltBiTmueJRtpW9oZgjHIE9sBOOujo9+v1/fvn5h/9Eeb77LHuYa+94HIt1bArbxs6yU1iIuRjEAnYqZp+E8erqdUBRONnA+c75DE6XQaiKGAySLDuqIjKVEtavhpXmSgW/mlplYChutYXx7Ay7tLsRZ5PWUePGL949euKoYPr7t1HOh2jK6mdXrVC5wHaoXLBCCp+Zp8MeAIEa+OqmZtns6x0xC7KTL2yZM+MtlRs3J6I2pViG8q258sX7OOxndrH0tpz5ki3rzuqxivyf/DnN+WMCN1SGs8yIxKS3y0aDQdYTwePVm8EMVRGzmVDK5UepkSi6cntnp2Ku8ktw20SOf5bGNm4BcRXyGdhfcfkJ9jQ7/VXTzl2vfEZGRLeJB94/zf4+LjqZjFi9cuWqJwDVHIFw29ha4V6a0wSQ5BSFrGxTGvV4uH30CFSfoEoJiY4mt0CGlozy8D+o5jgx+6jmBbwy4BEI+9d3rHnZ0I/GN+7usnL1ey+xM389WLx/1+INHRbWXfoDLjz+6Z07su+YN73vyIFFvd959sV3qtf2nfFA35F3FQw8AoDgABCGcv7JvJ7iABSRUp1epgK3CYLmFeJ5qGYSi7k3IEsbWYFQyQrE9PWqJzjM14yPj2OHrLDdhgYZZafDrqOCmQ8UpzGUuFzsLkUnVHMYs4uij/2F/cJfFxrfee3ld8QDzf2vsC8wo5nuaa44+Mabh+ghQAAA4XW1/pMcNqJgMuooCJQqiPLlrxWvQhjgF8//SgXTwej3O6M/NmF1x8zWHdVaFh/5uU3bnwXkmg1yXz6aT6km+QwpyW6LRdQn2Q0U9TGTotqUGOKqNclWAjJldKcyenwSZ0h8cyc75y5CT3v2xU42u+nL9p6UYpSa0Nne7yy+1EQ/7PaW6/dbm0N88llHNx18ic5qnrv59RXv0YUK93QAQr1q9QNhhyCJ3ORLiskXFJMvtDT5KhocAz63Yu7rj/PIY0oTXmKdjuAkfHg/60QWROeQZnI4+gq5M9oX4lybrUY5GWGrIBJRpnoDiChTUeOcJmE+qKL+GCJdcNEhlrSb+Q6T8+R887zoCZJPFyv1ZQBBscZ6pWKmQyqDLKBgMIoCNwcUdUrMcuuKmVot8AvlzU6qi9roq82/0LSFwoaNC69OAIQGdoRMVnSRY2mRUFAYoxcJlTDIOdBSfeJRD5nMSvEEu4B+dkS6svyKX6HWC0A+i1c2Kd5c2XRy3h0mgYbo/4spg/KNEDuCzdrMFFACSacHOUgFevPMXj5rMb9CfMoLfOrSA+KF5b9KyigFJCgExOMgQVJYD1TWiQQEwrO+G5rpVFUTC3DfaPxsA1vG9pEg3dQ8jnwV9QJea2Zv0k3XKtUKsJLHIlEqwBgjmU/LQUfRp9mbCwCxTjhHHZIf9OA8AILRID2BkJ+s1ZoxwDW1OMStBHU83G1fm5MZ0+4QzhUdK3f33F8MRKk50lPCUEXzoVc4K1NnTEvz+Rw6yqMpYkzrFSF
GI7jd1ooIt4LJFRHRA24o/98LVH4tX7NllapJZ7zS6LZn8QVeLKsVKjrQrxv43GPPvUychyc/VveH0F3HR77xCrNs/mPDWy89tOWB3js3Y1+b1GPe7Jq5dxTuORZ11TZuHC3LD00fOhwI7OVWtVZygRPSeVUt0+D1Wq2mVGqiGX4zmNwOu8HOhccRljzgqoiArYV5DSXF1SDB1sddEk825YBijeRQiVcrvHAqyJ5Pv/3+k0l/7GwKzGzQ6Wa811i/qXFjfb0wlJ1jP/DXxwMGLpdcbNHcsTuWvv7ll29fOPPJXwAQpnMOLxWGxbIaK6VuPU3ySmaOmQ0cHDPPzVmNGM9qlJ1DHgNzu6hmOGTcZXYV9f8d8HTbUOn8QrbvuW11Tz3swiw0oRPvyPQu96Sywe9+2mlNGRBlVqGU88fB+dM97E+VvGCx2CV7ht/htgIgmqhez9mjt1FnRYR6bscerSYTkLTqvTcUDPLPA6osi+JOiG7ST//n2W+/++TCTLMsNCxmTzdu3Ny4evOmNS9gNlr5647tA/rh0V+/mfny+4Gv3r54+i+fxLF0cN44IRk6hdOTDF4jpdzqtkrxGit4uRskyaUyyqIw6paZQyiRZQ632++JsUuivNbh53Kb+x/2JYp/e/+7qFl8eecf/zBk65bfb7WQLstc2AZl1GMH9v3fJxx/p2pttp/+c/eGrS8oUksFoBYpHVxK3cVlMjkJ4UaSuj0GvhQMgKIsVkScspUqq0GtY98IAxWmOZS1p2QNgeJSXkPW3DX3mE+zrxreeANH3lObN6LH8KHopW83l9G3+3TugmsDC9PnPNkLgEKQuYQCzplcKIVu8HC4a56vQ5YpvYtY4ESnSHIzW6Vn+Qzd72xlLbYWV0R0nXpFDJm6XKvOqvPk5pJekVxrm/JekTY2T7teEU9KnHUa+zj/8pXd+rzbxD1uragaVBdAqDC+jaAUkrJv/OXKcGMXmJOnbhQXF/F3QsHJVnf87VhB3sSqoa/te5X9jf3r7FdPzMgtC/ccNOnTtwb3ZPb6ZWdOPLzh7amPD50/4z8/1T4uVE5ICkzt9ewxXYdBbfPqVx54ddvqMauTndXFnYfmBnY+2PS66ypEhs2ZFOn5IO08/ZFvfn4cEPYCCD24nnuUzM5i0nFz7dF7vEkWvcMhVEQcNgOA3q0Y7xjlCatesVT2mALbtRUfM1P06cfm/+GZhgadoWD/jBMnyJuLfn/kk+jrfHXnDOow4N5XP4gWAxDYDoDjxAtAwcr9tZ3PJCDa7Ga5MmImVlQ04/3EwqZSIqAJJVQc3NDQ1CG3TceObXI7CJWYU1Zc0qFDaSkAubaKudSxTZAEd4Q9TqPRrNP5kj22yognrLcC1z6ISzW5xSTOhATTljhb3v2det7Zv/eNGZnLt9g16B6h+aqNHZHv0yaP8TSV89QGJTzetxgMRqNOEkSdYHeYAGw2nY7KRje1xiKGfD5zeUyFyuJsRTUiQi0bdclYkzcER73JeuD5E2zOnB07dKSgy2icydpGlxLpQTZOcjW/XTo9NjcO5nNT4GQCoiASQHfca2tMVBjHYVRo6SRfJQGoCAfcdruDiz+gdwRo66xWHrfb4RPMPm5p0302p1UPDkUPuCLEt534Igi1bHVIVIgEzfAqepHh1bRDypryyOa1DVNmblnVsDhFl79rIuIAXcHhmYdfJicWLNj3cnSLcv/zx9HjQmV99dDDg8e8+heuMZq2cnxdUBBOApeiri69x23S22xcWW02g/V2ytpSV72Jmrp7m4JG6NDUt95RNPXwJ+q8d0XUSWM2dhSfU9EknsU6wSyDnOwzeLgds1GbYvxvmcVylSHFilGFxE4PYRT74fKaf/wOTZcvobX5lZ3PPffii88/10Cy2I/swyeR/AFNmMfeZ1f/8rfzH545p1j5vdyW1apU+6E8nOEzCrKsS3foHJkBwQhWq7siYrXprboUaHXDzMdZ0GLBqpaeO2hPAhMUr62Y+gR
HrThpU8Niry7c+PBf/+f7yzvryabGFc8+6xowcMRg1kUqqh9azT5h/1GcNr14+GTWl29fevfUeYVXHNNSlVexqMKW6qHJyT6bL8OfnOK1pqalecxOp8wtv80MFRHz/+Y2VT5yJ1l63Ul6r3vQ0njtQyL9GzaIW15cvXnjnI8uf/fJ57P0SQsajObpM/d9mHXp3YunT59birloRDO2a6z/9T38eEzFCzE9okGOpw1ywy6zXm8wEF4DsZrB4FYtg03rc2nRkaE5IY15ZEfvjt4eRQtfaahz6rrsFoaZNlk/fTbaJFSenDQjlrnS6XyW1twOtIplrqLzeuZaEfHYJKq/rj/5t8pdueG5kbsG25Hfpq50+j/e/+tjA/bXzF82+dmN88r/evSPL3Z6ftEjj7Yds+J13jSzsaHnpjbt7h4Uvrdr2aAH+yzaXLm4R1W3O7p2KO71FCCkX/uG7BQrwKPWJlwu3jPioEKS1+C0OXtFLGGbVeaCkj1xU3kqIVjV5ONWqo52xVGXhtxKNuHyEMcdA5NSJuSy17ZurRiBXdlrw2vN8lyzHQeQZdU9/83mRWePngiAsIOvrjKhElx8fh86ZZPJ4DS4PSaz2aZzWdVV7TFqEbMS/4daVmW0rJcrhBY127EvX9TPNNQl6UP7Z7zztlAZLeMO6GMSvnpozV2Dj54hp7RcjgiVau+HAQ0ms6hHK6jhiJZl+NX0NFTicIYQt7ER+76ptuiMte/tYyP4oI/8o0cx9iPtrx6K5UpSgI/Winsblz4lNc3rsZipYBZ0yQ7ubnTuxCyYK7c2A1U2Z2Rlk8LhUHSq1BmbsoRPKeSfcBbp2qSdPsY+3jNxsk5nLHCcaHqjg0snBF7dzc6QBZ3OvHR/dK5QyUaz6j5l+4tJbXTp7trW9eRvHClACAIIOpXGzLBdFiVAUWlxQZ3RLaD1pnQ4ngmjmhUfYgteQT9m/JktwFVH2Cn27hFSQLxsGO6IfhU9jUdYD0AgfL1LfHw3z/sVMqnHK5jB7OBLO0UHfIJCVam1GRJo46KKOdrSUrLvuwFOnfnuS/tYTsWfl/StKu2xq3cXzuCVn9wf+pn87mrGy5vtC03HtkAsZ6YPCZW3yJl7RUQr6npF0P2/5cz0oeZ/ksHR0+TL6D5y31Q6eN685sPxrixetlPl5/YlJxu9AFbZRbmnpqlpTq09K3F7TdV/bpXcPJZTfEtxCddDvj7d3EK4ZLfHjedrpx794PFH58/49MClCxdM44aRZaRxE+aPjywnw0Zg4ebdS6Xj7NzZoCl4FhAvMxuZrfluorSo0RSABN+tlHzx8nKeJv3cDAiV7Ijaw5Oq4OwWDQ4H8UFqqsXiE2laujso0QScEzYFFXSDxYr7U7DPVNCV5Dj2pcRw4eKhDx+Z/9jjp45OnvHwVFIePIvB49LSPRvZ+yPvJcsjvOq5cRenZNg4zJn2qEvdpyXVQg6tAS/XAzu1JvkcpuoIdVglCaojEuTngS3pjfw38rSkOlOZT8nQVNOmbD9lKoU5HFg8t2TMUz2mRrqPyi95omTcisrHK/sMJSfuLFn/UKvsVinhsvqH/RkZSeoOPFuKdcJwrcuYCALV8343AGpSu4xtNPOWXcZcCQNO1/Xt0PNKk/Gszp3Ly0IVZPfVC2Lfxb3C5ZVhQDjK7fd5dVemazjNozNTahCARxo62irVJxKnwUz4SzDKgg+07k9ljt9sw2apra1KOJCldLR6NAOuqD89OWHNwpPHcdniPisKChY+tHv7My8sX/FdifTO+xlov4LNXXfvoH7vstCH5z462QkQypUYSDzBpV4Zzk5y6s3mZI+dGD1OMS3dlORL6h/R+3xOcNr6RpxJIPa5uRWkRdPQzZ6Nm29lf5Lfinl2ypuduEqQxqONXTatnD0HG9jQblU05erVU2+99f/EEzUL+/1uGTs397MxS+7YtDz/xwtzsfO+U4psZqMkeIVtnHNByAibW0GmBSxtctLd7iwZeNSYn1g
JchaVBku9il8r9co82Ja9clCxDnKwNLs0IXQ6VLV4+OLx8+eOq7t/UVXVgmF14+YuGrN42MKqeVtnzHh627QZW8mHj01aNmxh794Lhz059ZEFD/CHvfj7JZN+N2XbM1Onbd8BiscDEJT9Fw8MDrdzWGSj0WYS9URPTS6LW/YmGSwW2So5HBScbqsz3UmsTqvThG7JlATlWg+33RHrzL7lpjuGUOGj1uaovjBEKnH2HjYCJfY6dmGv72BvYGd+ARu7j1wgZ5vZ3Ma57Ec08RslQBKsgaxUVYkkUR726QUqUDlmFjgmiYqtbgjFLYRiI5p/YebmnxVpXPuF1kupUABdeGdcdiE4pdy0Dj5fmkmCgNS13E07lbRqK/n1/mCviN+tt/WK6OGGznh/s4t9I39VVFmLztSUlwuwZdCiRC2l/Kk33lG0dHD/qprTbw5/ZmTxqMV9Z8yYvelw/cCqjf/+6K9P9H9t4KLl7R+cvmJR99W/f6Ggbs3LPQbRnMF1WW0mD5q1NDW4IJjSKdy5prTH+klDl+fctXrZxm5rs9r27dWuY8e8oqHTRvWb0MVZPfnuKWXOMUCwWLTQ8eKH6u5TWpiTanKAI8lnpW495N90QCAhzctKeI/FxVnZpaXZWcU4pzgrq7Q0K6tYnFrUrl1RYUFBYfwOQGEM7xzvEdt5hxKeSwWDXmrNT0936a1esbSDZAKH1ZRuIuCwOYjJYXKk5AWcoRQByhNPBdhblgFRMxHuG90bnN2obu8KDjc3eYHM1py5DiFU2NqhNXTQOXMWz10weE77sRWvffDZq0880vHB5vXv4PB3les1tv2D02z76xP2YNvdezD3pT3s7N497JOXhMCeTTu3t/2dq9X3n575qfMjIXZI/Q7b/u6brOGD0zj0rT+wD/+wB3P2xr8GQKCCushU8W1OdzqUhlt5pRQDokeJazP8rQwGh88D1EYJNTvSOakf3feGku9qVGpqG4xTV8ojfbXWGSt18iYUtdZJXEnDlt0/edPztWvHjM+btnB+HauecmLUlAeov2bk6HHjJkhCcGFoRIcJs1jnI2OaCgRBqd8NhFraSI+CBGbICTupxI21YNTrBbMkWKwmUYegHGS5WbPRiyhjVuw2EAfPVEriM1kjLsUhtexzTK9lO0kQ1/dk29mzvXB9yo23qh9EHfeDXhAhJWwiKKAki0J1RCSQr20nattixUJOXfM71Bv9Hhc+CdeuaV3LRAIbAAjXdUoX16r7wqGgF3iOLui5Zpn1JodXKu1gsnFoi9Pi0DmtjnQHAR63E4fT4bythikCCP22ZKVVoUS+hp0Bqm51Fnr+L2UjHz5YPXLwfRNx36B+l3eeXrwWxYbNVy/8n+pGrtwd7tNtSfXsNFaLo9jTdPZ89ub/pXB47YrkEiRpzW3r+oJ09UfBJLnmAoG5dBi5LJ5U83Z/2GIGp7L7nGwzHPNQhS3J7yWaAKe27LkytvA6c/fPn39g4Oqa+fun195VPX3qwLunC2vmH9i/oGZlTdOCgdOm3l0zdZoiv/GASic8yQYLAMhwBiA6Q93NqCLLub9OUmpcstOLaHGCwAsItnQvZqjyadHEUVx6cz+0JMt+sjy645vIQH91edGont0XbPj9msiaPXiIVI2/NHhk35IePbMLh0yeP6V6/ZPPA4KflKlzBqAsnGkVRaCONIPUOstxn/MhJ+nrRKMzxUmcTl2yP92s88eVhKvIfTe2KDHRmKtlyd/2PpPpA3vsPbRzw4w1sz/8snbmA6Or7+w+pUPP8mXDl2wVvqx+wJu//YmVHWb32L5q0oAeXXrkBYa2LZl5056LnkfvwhP6xD0X5YAIN3pyAOvaT85494494cnCD133dnN3O1oEqNZDegiV4IHicLJoMOhs4HS6dC6+LeC2ulLMRKks6LWkMWHX6XqfaELKyMnTOhsGs13PNCxJNkz+Z/0Qg6GhAeewK698pKaNLwyr2caOScrsU1mzMEJygRWCYYcgIoB
opDa7TidSq4jaQa/8RJkG7MortqVTEvILI6Z9PL1rzacn//ov0pY1S3t/raYhx5WrKDBA2ED6Yh0dqvitsEECMJuofkCEQsyAJOqq2jzatUOseZR82L1nz+7xMwlZzIVNAOBQIge7xQhgUfrILXa7jtog/71CzQq3qDNoZYbSkOzBpo31obZtOw24a8BDQx4ubWIXRk7UT9S1Kckrtu+bHgSEvqQKP1d3kPleHwFKDSZuX2mGBGlK3sc5EGO7FpnEzw8MXLlQ8pQsvpNv4K4ld9471NP2/hFAoDt1kaPi26q3zgo7lONnEnBvHfMfbr3iP964r4XTTjgzJSYsWHJ0V/3qF3eu3/B8lN07fsKwYRMeGCZM3nHw8LPP7T+w/TH+b/YjjwCBau4hdsY9BF+ZRr1AgMrEoJdu5R/4fBhELEUxdqM72c5aTGef1+IQVnvjPTGxCb3wfhzek01IufGW24c+AOIZzq8gnCYLACAbHrsGKMNHNDV6EPR/osTBA8ziYuCw7Tjs+ThseQz2CwV2Ou3PYeV9xMZBVchkAMkvnuAQM34FFf4CxEZ9KD5qXmxUIBBiM2mNMBxSoY3Sba1zpQWwlbVVwCXk5EIqmmhqKj93lzEgkm2zG3tH7IEWecP9w+9rGZ4ohslCYnXDUm9MGF2J0ihbnJBfkf59Rs7q4vv9Y9X1ozq9+dbRTwPhSMnYbk2zOnXtXqqkXKHH1tZM7NOvw5ip2e0XjzjcWDEhMjB/yIz70jFvcU/eGRvmVKrdoPJ0bltbq9R1v/YaDgTdn4hNzIa84ltA1MLCGETS7SCOQSAGkdoSIv86xGsg3HKMrOsQE6CUQxiaKGmtgtyAkWIwIMNxKIN5QK4xAIk3MIIVnNA/fAdPM+wIOhPaRNEtuvROycm7kHm7iMHM7wabASUqOtByowkglmHm5an5G8bOiYau9y/SAF7vYVQ2zqR5UUeUXdxLDtMT0SMkNXqR9Lhag0cfURpetbZG/AvZr2jRHOZSOkc5ztkqzrMIAf55rM9N5VmbON8PqhxBs8aRmyFqoTwG4b4dxLFrV2MQyS0hsq5DTACHylWC/hhXgUA+gFip9id54Z5wod3t1glmAKcgCUk+rogS11erXC6/JJ+WL8jcIsuyoNfbqiJ6Kri17tNEXW55EDWhHZV7uVhLarxnM5QhVqpNqbM3bcJ9eBf+bn/07S9xNlt4lIyKtaWSunqyntWxHSQcba5nhhhNYrmqS+3jurSmJdWx7jiVLwUx3sKsmLb5bgdRi4YYhP92EMegKQaR3RIiX4PgeGy65RhZ1yEmwMdxnW4b5z7CQrQJJmEDGMEX1st6ino0mXXgy0+0x2rMHLeOu0ewbTh8BHua7RiLw9m2MThS2DCa/3fbaLyfPTsaR+CIsWwrAOXzv877434CJ6RAQFkZnnRvmsAPExtcAA6rqFMCF0+a32f2945YHTpRoDazQHnjnES1lrm3+Fq4+YgL/ygm0lglwc7fxSoM1BZEj3qKzovZ1zsLv1479tEH9ykddGe2jnx04rGmh6Mjpu/9zy/NwbFk68SdWpPhmOUDNr2FDyl9dMMXV699l61D26bmvgOVZjp2ZRN9qTc7xVdOrI9LlUxpXLoVMfk7Nb7fDFELp2MQKbeDOAZzYhAZLSGyrkNMgA3xlRNMtEfCbHWUTvF5CmKjOFSQeO/frHjvH9+pMOtFUbKDBB6vWeALiC8fs96sl2LdkZoVarkRrHVH8v9lCDcaJGexM+zzQ42NZ9GHnuYrO3mL5LvvUdvFy4zXWq/B6ei/V+5Y9yQAqv0oW6R0aK94ppxcMTUAXpMJUu25YkGhw5Hbrl12RaQd5LrV3S5tj+vm0xpaZCBL2vZIQjWCo6Q2/2lnOTKUqE/1UYJv5ZAOKb36Lxv32p+OTCrfUnn27ofnjujZq094yVz2TcPf/v7+58IPi6dX3OnPyC0L3b917LZdPTcF8w/0mVQxcHZN+cTisqHF1YMuXO0r7Nv
3562c52pXkOTnPL8TACXovgLUVWlXOH6L57V56vN2t3t+7FP1eajFc/Gz689fe+UW3xc/vP58whegruiOKsCNGRZehzj+cwyiTQwCqAIhKbtXOVDENWdkOJQLre3tedlIaF+WlJTe3ghi5y4pbYNtKyK+AqGgV6RD66BdECyZQU+xzqKriLgsNtBaO9R97viBxZsNL1corarUot3Jy/+qHSkOv7bLFExMz5TiAMaaVIb/wg7NmPnUc0VVb4+a/3xO8a6Hj/0reqcOO967tWbwurHswpy73lz03Mt7Jg1ZtfPpwzvoK7OWGon8BOY/+yddrEUqp/ie+4eMYP/9+yRWGwjyVpav5k5sXH9/5MVNo2XdQ6Sw4ektO5V1zXc4lW4kzreeMU+JFaqnVDtxVIn1ikl8vyqRVppEbn5e21993vp2z4/9rD7PafGcS1R7PsEQk1d7TaLX/gqAo9URXolZHHYXKGOgqI3xIgApTICovZYRgzDHIa79iUMMSoA4xl6IQTg0iG84RDrHQ4OYwA4CqBbHZ9d89VRlx1zyq6euqsJ5fsnUqhXwYN5jsTttkj7YRp9eETFSj91nsfLIR0+9LqSttY3QmLJw6/3b430QyITiIlAqxdlBMcj/lHpUk+6gRVqnV4kwil39+e/sK5T/9sUYXdkp9n3vr4YN77ll3OW+pzc8v7NpC3vppe0vPUtC7Ev2FzR/cQmlWcInr25+cGHXgtrefZ6cNHMlm8b+taaRbXjh4Aku21jXgbraqmOrzaLyJC1RNqNUrt0Vk/1HquySb/e8drD6PPN2z4+p45Ngi+d8fu35a9/f4vtcJtrzCSkx3Wh3fS2Ph2YhR9gJVO1CD4WTPAaDTSACKjsZTifKZjMqJ/QQ8tX1yhOfG8nPjUN6iccXE96Pp8ejezqVFHXsFCrqot3J8iefZP/q3KW8Y1m4nPwYfwOUY3tEGCUsjvv7PvxEa3orl8vQ6iZn76u47uxt1M+b2Kjnf3P2ZWVxBdGcfXw7QXSpTl4Si1SnX6L2X2yaUjNt+Dw0Xd40o6Z25NzmV4rxTJ9pvAljfYjl95r63Iuxboyetf0XbEBQGjL6zuy7cMOvu8aRRcWffLRjTHRO6DzXjNjutSq5e2KSf0PVDI8mmZuf107VNOfWz4851OeBFs+5ZLXnE/yxtZarrfrYDqw6wr2xGWIjpKsAWu+I2t+VyXex0jOkFJfNZpfsrQMOsKeYPHqqT+NdjB7q5euvRZPnb3oYUWsXUUomXo/W9JUVbx7J4HugOKR748Sz333/yd8fMwk63mSElTs38OYRzF9LmyID2Efsvwpjn83sV86KdcDaFQ1NOXQi58u3ce/ZMxo1nF6Nmgn7Y/TmxejV+puEyuv9TaJArLfsb+Iw6gkU6UvxFLggHe4Ot0uSrE5nKpjtqZKY4bc6eDxpBaOR51hGGj+Vwg8UUAc4b5zk4det2ia1fWVJO2TlvZF9aafq7NnSl1EYN4y9zJ7BYRgeN5RaonxdR8+Rfs09fmXXEH+ecs89LqzDiTgeF3ljSZmwlZ1m55QTGn6hNi32qy1yujAU0iAXCmBQuG26zkI8nqx8t7tVlk4oDOW1Mbbh0RHvSCKixdiunWg32pIyxcyKCIieFj7YoVjVRAeseV9R9a0q5rdyvYktTFkxnyvWs/Nzup6pu8B+ROnrBae6djz2+InL0aAOq4Y/e8+QDVf9G154buPm5xvWCb3mrjKRjN+7vp4xEwtQh3q8Y+a0KbPYz19MYDO5tw1mkLIPz3985rOPP/10x9NP7wBEE68Q7pH8YFF6wGWwWXmN0KJs3CSfKkwsE/Igzx1QzhIE0DR3nLfB89CcmUMWLuFF2u+WPJGTu3C+t3TBoiIAgpP5iG2lhdp+kEMyxSpMejflw753u9KSrHUfcfpp29njxj46a8zY3z3YPRTq3rmsqJu4b9TM2lGjps8c3qFLlw78AkQdn+k78TN1N5wPn+Szg2gC/nKrZc73En4
mKLYb3o4vKU6BwvQ0olRTQpJEXXkDB/TOLAxZRpmn39tucP/KjIL21tHmqcL5rLZZnbvMquO3Tl1n1aldEci5Ff/FEyCCePMvngykw+K/eMIh5f8VUtYgffQ49lB7+R0HUNTpQenhP6WBBkscHEs5y+QZ1WF29yx63DMUTVyicNM3RdTpRZly061Rq55Od5RisXIk/bGKDPGARzmLjqmfcouq/e4LkcAKAEQZizSpY1khOWwS0KwXbHbQUZP2M1+x3pUgbyrhA/vjeGG9tcNjs9M6maNnb2B4FnXTeR1Tw7TF6DZldL0ZRcHuMIs2WRn9LW10DWe/ei9JQJ4ELUkjOsxJ7m6+QYbnXvbTY2Ow6D6FHh/7lTTBZZSVLOtqB8g4iCCHzeZK+dC1Y38ymWJ3vb5SBnteXszG7cAfyXB6EYzgPBD/URrIP3Wr6u+OqQ9OmDF94qRp5JtZj/9u9sx5C/icym8TiHvgB8gGOwAEwU4c/M4nELJA1RaoJelK5ZPTbBAIlYikk0WuCInpvPM3e2CJ+16ASv2UpGqjUBAIkMRRWhRNSeqtK6QAyGYBkJXxUyYgEkE7ZYLxAQJIVjbPWkkXx4+ZIJRzr1gnnuT0TQ2Xp3rTPZ5kI5Hl5NZ2wZDslYJtjN4kb/+ILklMTUvtHyFp1rT0tPw0qqdJaUlpzsxM6BvJlJ0W3iDhg5ZN3bwwdMsfKruRW2ZQbuRlt9evdcorVpPyolGwuJT/dUDsCHUKOz4AWfRHQvA065Z1snHLxtW7/oddaNewgZANO4LY+n9OPN+rQSxmD80rC7ed1/Rm9/puaEacl3tH9TwUsfXIpYPVzprl6o4iBXdYT0AUtDAtYc3y+EuJtrjkUwGEVlI650ylKvE+5ABA/HNTwuf9lc+BgItUcf0/AgZwQedwuks0ypTyaYjSqY+iqLe60l3E5aIWOZ1mxPuV70toergeGwR4g0v8V2eKi0otVJZJ05xV7GHcsHQO+0ESk9LSjDup6913x/KzVKdeX9THFGzb1v5TDDfpQ45bECoJ9+43cBcf0nCXXr/F8/43notvxJ6rVEnqc1TWG05X9cp+AAQRKWiHl2Knck80KgqljCAC4Aq1QvJpPHP6XaxCImp1FiUv6pwAUXstt2Ud9NrbHGJCAsQx9ufEKktsFtJBzroOMYF9EK/V+GK1mv8PflNJUQAAAAABAAAAARmahXJJOF8PPPUACQgAAAAAAMk1MYsAAAAAyehMTPua/dUJoghiAAAACQACAAAAAAAAeAFjYGRg4Oj9u4KBgXPN71n/qjkXAUVQwU0Ap6sHhAB4AW2SA6wYQRRF786+2d3atm3b9ldQ27atsG6D2mFt2zaC2ra2d/YbSU7u6C3OG7mIowAgGQFlKIBldiXM1CVQQRZiurMEffRtDLVOYqbqhBBSS/ohgnt9rG+ooxYiTOXDMvUBGbnWixwgPUgnUoLMJCOj5n1IP3Oe1ImajzZpD0YOtxzG6rSALoOzOiUm6ps4K8NJPs6vc/4cZ1UBv4u85FoRnHWr4azjkRqYKFej8hP3eqCfDER61uyT44DbBzlkBTwZD8h8/sMabOD3ZmFWkAiUs5f4f2SFNZfv6iTPscW+jOHynEzEcLULuaQbivCdW5SDNcrx50uFYLzFHYotZl1umvNM1tgNWX+V/3gdebi3ThTgVEMWKYci4kHZhxBie3TYx3rHbGr+Pdo7x4dIHTKe5DFn+O/j+W2VnE3ooW6isf0LIUENvZs1gf/LHojJwdpplCP5gn/5gi26FoYa19ZVFOJ6Sxuoz/q2Ti20IKVJdnqvYJwnhfPH/2f6YHoQF30aZaK9J8T026RxH5fA/WPW/8IW4zkpnIfoFLifGB86v0ffm5nbyRs5iaHR3hNBD0HSfTzoPugRM+hdN0x052KoHLBS0tdgpidAiEesDsgWYO73RWQz2LWIwjqnMe/uYISQtlbyf2NlT9Q9PoBcBnrO6I5ELoMeyHkNnIXGdv8
09H/DXNOTeAEc0jWMJFcQxvFnto/5LjEvHrdbmh2Kji9aPL4839TcKPNAa6mlZUyOmZk6lzbPJ3bo56//Cz+Vaqqrat5rY8x7xnzxl3nvo+27jFnz8c/mI9Nmh2XBdMsilrBitsnD9rI8aiN5DI/jSftC9mIf9pMfIB4kHiI+hWfQY5aPAYYYYYwpcyfpMMX0aZzBWZzDeVygchGXcBlX8ApexWt4HW/gLbzNbnfwLt7DJ/p0TX4+Uucji1hCnY/U+cijVB7D46jzkb3Yh/3kB4gHiYeIT+EZ9JjlY4AhRhhjytxJOkwxfRpncBbncB4XqFzEJVzGFbyCV/EaXscbeAtvs9sdvIv3cjmftWavuWs2mg6byt3ooIsFOyx77Kos2kiWsIK/UVPDOjawiQmO4CgdxnAcJzClz2PVbNKsy2ZzvoncjQ66qE2kNpHaRJawgr9RU8M6NrCJCY6gNpFjOI4TmNIn36TNfGSH5RrssKtyN+59b410iF0sUFO0l2UJtY/8jU9rWMcGNjHBEUypf0z8mm7vZLvZaC/LzdhmV2XBvpBF25IlLJOvEFfRI+NjgCFGGGNK5Rs6Z7Ij/45yNzro4m9Ywzo2sIkJjuBj2ZnvLDdjGxntLLWzLGGZfIW4ih4ZHwMMMcIYUyq1s8xkl97bH0y3JkZyM36j/+58rvTQxwBDjDDGNzyVyX35Ccjd6KCLv2EN69jAJiY4go/lfr05F+Ua7CCzGx10sYA9tiWLxCWs2BfyN+Ia1rGBTUxwBEfpMIbjOIEpfdjHvGaTd9LJb0duRp2S1O1I3Y4sYZl8hbiKHhkfAwwxwhhTKt/QOZPfmY3//Ss3Y5tNpTpL9ZQeGR8DDDHCGN/wbCbdfHO5GbW51OZSm8sSlslXiKvokfExwBAjjDGlUpvLTBY0K5KbiDcT672SbXZY6k7lbnTQxQI1h+1FeZTKY3gcT2KvTWUf9pMZIB4kHiI+xcQzxGfpfA7P4wW8yG4eT/kYYIgRxvgb9TWsYwObmOAITlI/xf7TOIOzOIfzuEDlIi7hMq7gFbyK1/A63sBbeJtvdwfv4j28zyaP8QmVL/imL/ENJ5PJHt3RqtyMbbYlPfQxwBAjjPEN9ZksqkMqN6PuV7bZy7LDtuRudNDFwzx1FI/hcTzJp73Yh/3kB4gHiYeIT+EZ9JjlY4AhRhjjb1TWsI4NbGKCIzjJlCmcxhmcxTmcxwVcxCVcxhW8glfxGl7HG3gLbzPxDt7Fe/gY/+egvq0YCAEoCNa1n+KVyTUl3Q0uIhoe+3DnRfV7nXGOc5zjHOc4xznOcY5znOMc5zjHOc5xjnOc4xznOMc5znGOc5zjHOc4xznOcY5znOMc5zjHOc5xjnOc4xznOMc5znGOc5zjHOc4xznOcY5znOM8XZouTZemS1OAKcAUYAowBZgCTAHm3x31O7p3vNf5c1iXeBkEAQDFcbsJX0IqFBwK7tyEgkPC3R0K7hrXzsIhePPK/7c77jPM1yxSPua0WmuDzNcuNmuLtmq7sbyfsUu7De/xu9fvvvDNfN3ioN9j5pq0ximd1hmd1TmlX7iky7qiq7qmG3pgXYd6pMd6oqd6pud6oZd6pdd6p/f6oI/6pC/KSxvf9F0/1LFl1naRcwwzrAu7AHNarbW6oEu6rCu6qmu6ob9Y7xu+kbfHH1ZopCk25RVrhXKn4LCO6KiOGfvpd+R3is15xXmVWKGRptgaysQKpUwc1hEdVcpEysTI7xTbKHMcKzTSFDtCmVihkab4z0FdI0QQBAEUbRz6XLh3Lc7VcI/WN54IuxXFS97oH58+MBoclE1usbHHW77wlW985wcHHHLEMSecsUuPXMNRqfzib3pcllj5xd+0lSVW5nNIL3nF6389h+Y5NG3Thja0oQ1taEMb2tCGNrQn+QwjrcwxM93gJre4Y89mvsdb3vGeD3zkE5/5wle+8Z0fHHDIEceccMaOX67wNz3747gObCQAQhCKdjlRzBVD5be
7rwAmfOMQsUvPLj279OzSYBks49Ibl97In/HCuNDGO+NOW6qlWqqlWqqlWqqlWqqYUkwpphTzifnEfII92IM92IM92IM92IM92IM92I/D4/A4PA6Pw+PwODwOj8M/f7kaaDXQyt7K3mqglcCVwNVAq4FWA60GWglZCVkJWQlZCVkJWQlZDbQyqhpoNdAPh3NAwCAAwwDM+7b2sg8kCjIO4zAO4zAO4zAO4zAO4zAO4zAO4zAO4zAO4zAO4zAO47AO67AO67AO67AO67AO67AO67AO67AO67AO67AO67AO63AO53AO53AO53AO53AO53AO53AO53AO53AO53AO53AO5xCHOMQhDnGIQxziEIc4xCEOcYhDHOIQhzjEIQ5xiEMd6lCHOtShDnWoQx3qUIc61KEOdahDHepQhzrUoQ6/h+P6RpIjiKEoyOPvCARUoK9LctP5ZqXTop7q/6H/0H+4P9yfPz82bdm2Y9ee/T355bS3/divDW9reFtDb4beDL0ZejP0ZujN0JuhN0Nvht4MvRl6M/Rm6M3w1of3PVnJSlaykpWsZCUrWclKVrKSlaxkJStZySpWsYpVrGIVq1jFKlaxilWsYhWrWMUqVrGa1axmNatZzWpWs5rVrGY1q1nNalazmtWsYQ1rWMMa1rCGNaxhDWtYwxrWsIY1rGENa1nLWtaylrWsZS1rWcta1rKWtaxlLWtZyzrWsY51rGMd61jHOtaxjnWsYx3rWMc61rEeTf1o6kdTP/84rpMqCKAYhmH8Cfy2JjuLCPiYPDH1Y+rH1I+pH1M/pn5M/Zh6FEZhFEZhFEZhFEZhFEZhFFZhFVZhFVZhFVZhFVZhFVbhFE7hFE7hFE7hFE7hFE7hFCKgCChPHQFlc7I52ZxsTgQUAUVAEVAEFAFFQBFQBBQBRUARUAQUAUVAEVAEFAFFQBFQti5bl63L1mXrsnXZuggoAoqAIqAIKAKKgCKgCCgCioAioAgoAoqAIqAIKAKKgCKgCCgCyt5GQBFQBPTlwD7OEIaBKAxSOrmJVZa2TsJcwJ6r0/+9sBOGnTDshOF+DndyXG7k7vfh9+n35fft978Thp2wKuqqqKtarmq58cYbb7zzzjvvfPDBBx988sknn3zxxRdfPHnyVPip8FPhp8JPhZ8KP78czLdxBDAMAMFc/bdAk4AERoMS5CpQOW82uWyPHexkJzvZyU52spOd7GQnu9jFLnaxi13sYhe72MVudrOb3exmN7vZzW52s8EGG2ywwQYbbLDBBnvZy172spe97GUve9nLJptssskmm2yyySabbLHFFltsscUWW2yxxX6+7P+rH/qtf6+2Z3u2Z3u2Z3u2Z3u2Z3s+O66jKoYBGASA/iUFeLO2tqfgvhIgVkOshvj/8f/jF8VqiL8dqyG+d4klllhiiSWWWGKJJY444ogjjjjiiCOO+Pua0gPv7paRAHgBLcEDFOsGAADAurFtJw/bt23btm3btm3btm3btq27UCik/1sq1CH0I9wl/DTSONInsjxyKcpGc0VrRNtGx0dXRF/FpFiV2KbYl3j++Jz4vkTaxKjEgcSXpJzMm6yb3ALkAnoCV0ARLAcOBjdCAJQJqgWNhJZDT2EbbgTPhz8h+ZFJyDbkFSqgVdGh6Br0BhbFFCwHVhNrj43DXuH58V74WcIkahHvyDRkLXIGeY18SxWl+lMHaIVuSc+h3zHpmNbMJOYuy7DF2E7sFvYMJ3Clf+3DHecNvjm/m38g1BYmioxYS5wqbhZ3S0Wl2tJkab50U04pl5CHy9vlmwqlZFJaK4uVnco55YlaUK2kNla7qEPV6epi9aMW01jN0zJohbRZ2mptj3ZWu6e91wE9vT5LX63v0c/q9/UPRiZjprHS2GmcNG4ar8yIOcycZC4yN5mHzMvmE/OrhVq6NcCaYC2wNlgHrAvWQ/t/e6w9115r77XP2fecrE4xp65zwM3lNnZnuBfdZ17E071sXj6vrTfP2+Hd8F74lJ/
eL+Hv86/6D/23Qfogf1A+qB10CAYGk4LFwdaf2C+JfQAAAAABAAAA3QCKABYAVgAFAAIAEAAvAFwAAAEOAPgAAwABeAFljgNuBEAUhr/ajBr3AHVY27btds0L7MH3Wysz897PZIAO7mihqbWLJoahiJvpl+Wxc4HRIm6tyrQxwkMRtzNIooj7uSDDMRE+Cdk859Ud50z+TZKAPMaqyjsm+HDGzI37GlqiNTu/tj7E00x5rrBBXDWMWdUJdMrtUveHhCfCHJOeNB4m9CK+d91PWZgY37oBfov/iTvjKgfsss4mR5w7x5kxPZUFNtEoQ3gBbMEDjJYBAADQ9/3nu2zbtm3b5p9t17JdQ7Zt21zmvGXXvJrZe0LA37Cw/3lDEBISIVKUaDFixYmXIJHEkkgqmeRSSCmV1NJIK530Msgok8yyyCqb7HLIKZfc8sgrn/wKKKiwIooqprgSSiqltDLKKqe8CiqqpLIqqqqmuhpqqqW2Ouqqp74GGmqksSaaaqa5FlpqpbU22mqnvQ466qSzLrrqprs9NpthprNWeWeWReZba6ctQYR5QaTplvvhp4VWm+Oyt75bZ5fffvljk71uum6fHnpaopfbervhlvfCHnngof36+Gappx57oq+PPpurv34GGGSgwTYYYpihhhthlJFGG+ODscYbZ4JJJjphoykmm2qaT7445ZkDDnrujRcOOeyY46444qirZtvtnPPOBFG+BtFBTBAbxAXxQYJC7rvjrnv/xpJXmpPDXpqXaWDg6MKZX5ZaVJycX5TK4lpalA8SdnMyMITSRjxp+aVFxaUFqUWZ+UVQQWMobcKUlgYAHQ14sAAAeAFNSzVaxFAQfhP9tprgntWkeR2PGvd1GRwqaiyhxd1bTpGXbm/BPdAbrFaMzy+T75H4YoxiYFN0UaWoDWhP2IGtZtNuNJMW0fS8E3XHLHJEiga66lFTq0cNtR5dXhLRpSbXJTpJB5U00XSrgOqEGqjqwvxA9GsekiJBw2KIekUPdQCSJZAQ86hE8QMVxDoqhgKMQDDaZ6csYH9Msxic9YIOVXgLK2XO01WzXkrLSGFTwp10yq05WdyQxp1ktLG5FgK8rF8/P7PpkbQcLa/J2Mh6Wu42D2sk7GXT657H+Y7nH/NW+Nzz+f9ov/07DXE7QQYAAA==) format("woff")}@font-face{font-family:'Open Sans';font-style:normal;font-weight:700;src:local('Open Sans Bold'),local(OpenSans-Bold),url(data:application/font-woff;base64,) format("woff")}html{font-family:sans-serif;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%}body{margin:0}article,aside,details,figcaption,figure,footer,header,hgroup,main,menu,nav,section,summary{display:block}audio,canvas,progress,video{display:inline-block;vertical-align:baseline}audio:not([controls]){display:none;height:0}[hidden],template{display:none}a{background-color:transparent}a:active,a:hover{outline:0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:700}dfn{font-style:italic}h1{margin:.67em 
0;font-size:2em}mark{color:#000;background:#ff0}small{font-size:80%}sub,sup{position:relative;font-size:75%;line-height:0;vertical-align:baseline}sup{top:-.5em}sub{bottom:-.25em}img{border:0}svg:not(:root){overflow:hidden}figure{margin:1em 40px}hr{height:0;-moz-box-sizing:content-box;box-sizing:content-box}pre{overflow:auto}code,kbd,pre,samp{font-family:monospace,monospace;font-size:1em}table{border-spacing:0;border-collapse:collapse;width:100%;margin:20px auto}th,td{border-bottom:1px solid #bbb;text-align:left;padding:10px}th{background-color:#63a0e1;color:#fff}tr:nth-child(odd){background-color:#eee}tr:nth-child(even){background-color:#fff}body{font-family:'Open Sans','Helvetica Neue',Helvetica,Arial,sans-serif;font-size:16px;font-weight:400;line-height:1.5;color:#666;background:#fafafa url() 0 0 repeat}p{margin-top:0}a{color:#2879d0}a:hover{color:#2268b2}header{padding-top:40px;padding-bottom:20px;background:#2e7bcf url() 0 0 repeat-x;border-bottom:solid 1px #275da1;text-align:center}header h1{margin-top:0;margin-bottom:.5em;font-size:2em;font-weight:700;line-height:1;color:#fff;letter-spacing:-1px}header h2{margin-top:0;margin-bottom:1em;font-size:1.5em;font-weight:400;line-height:1.3;color:#9ddcff;letter-spacing:0}header h3{margin-top:0;margin-bottom:1em;font-size:1.2em;font-weight:400;line-height:1.2;color:#9ddcff;letter-spacing:0}.inner,.toc{position:relative;width:840px;font-size:1.1em;margin:0 auto}.toc{padding-top:1em;padding-bottom:0}.toc ul{margin-bottom:0}#content-wrapper{padding-top:30px;border-top:solid 1px #fff}#main-content img{max-width:100%}code,pre{margin-bottom:30px;font-family:Monaco,Consolas,"Bitstream Vera Sans Mono","Lucida Console",Terminal,monospace;font-size:1em;color:#222}code{padding:0 3px;background-color:#f2f8fc;border:solid 1px #dbe7f3}pre{padding:20px;overflow:auto;text-shadow:none;background:#fff;border:solid 1px #f2f2f2;font-size:.9em}pre 
code{padding:0;color:#2879d0;background-color:#fff;border:none}ul,ol,dl{margin-bottom:20px}hr{height:1px;margin-top:1em;margin-bottom:1em;border:0;background:#aaa;background-image:linear-gradient(to right,#eee,#aaa,#eee)}form{padding:20px;background:#f2f2f2}#main-content h1{margin-top:0;margin-bottom:0;font-size:2em;font-weight:700;color:#474747;letter-spacing:-1px}#main-content h1:before{padding-right:.3em;margin-left:-.8em;color:#9ddcff;content:"/"}#main-content h2{margin-bottom:8px;font-size:1.5em;font-weight:700;color:#474747}#main-content h2:before{padding-right:.3em;margin-left:-1.2em;content:"//";color:#9ddcff}#main-content h3{margin-top:24px;margin-bottom:8px;font-size:1.2em;font-weight:700;color:#474747}#main-content h3:before{padding-right:.3em;margin-left:-1.7em;content:"///";color:#9ddcff}#main-content h4{margin-bottom:8px;font-size:1.1em;font-weight:700;color:#474747}h4:before{padding-right:.3em;margin-left:-2em;content:"////";color:#9ddcff}#main-content h5{margin-bottom:8px;font-size:1em;color:#474747}h5:before{padding-right:.3em;margin-left:-2.4em;content:"/////";color:#9ddcff}#main-content h6{margin-bottom:8px;font-size:.9em;color:#474747}h6:before{padding-right:.3em;margin-left:-3em;content:"//////";color:#9ddcff}p{margin-bottom:20px}a{text-decoration:none}p a{font-weight:400}blockquote{padding:0 0 0 30px;margin-bottom:20px;font-size:1.1em;border-left:10px solid #e9e9e9}ul,ol{padding-left:30px}dl dd{font-style:italic;font-weight:100}.clearfix:after{display:block;height:0;clear:both;visibility:hidden;content:'.'}.clearfix{display:inline-block}* html .clearfix{height:1%}.clearfix{display:block}@media only screen and (max-width: 850px){.toc,.inner{width:93%;font-size:1em}header{padding:10px 0}header h1,header h2{width:100%}header h1{font-size:1.75em}header h2{font-size:1.2em}header h3{font-size:1em}#main-content h1:before,#main-content h2:before,#main-content h3:before,#main-content h4:before,#main-content h5:before,#main-content 
h6:before{padding-right:0;margin-left:0;content:none}}
code > span.kw { color: #a71d5d; font-weight: normal; }
code > span.dt { color: #795da3; }
code > span.dv { color: #0086b3; }
code > span.bn { color: #0086b3; }
code > span.fl { color: #0086b3; }
code > span.ch { color: #4070a0; }
code > span.st { color: #183691; }
code > span.co { color: #969896; font-style: italic; }
code > span.ot { color: #007020; }
</style>
</head>
<body>
<header>
<div class="inner">
<h1 class="title toc-ignore">R introduction and dplyr</h1>
<h3 class="author">Gregor Pirs, Jure Demsar and Erik Strumbelj</h3>
<h3 class="date">25/7/2019</h3>
</div>
</header>
<div id="TOC" class="toc">
<ul>
<li><a href="#r-and-rstudio">R and RStudio</a></li>
<li><a href="#variables">Variables</a></li>
<li><a href="#basic-data-structures">Basic data structures</a><ul>
<li><a href="#vector">Vector</a></li>
<li><a href="#factor">Factor</a></li>
<li><a href="#matrix">Matrix</a></li>
<li><a href="#array">Array</a></li>
<li><a href="#data-frame">Data frame</a></li>
<li><a href="#list">List</a></li>
</ul></li>
<li><a href="#packages">Packages</a></li>
<li><a href="#bpod">Data import</a></li>
<li><a href="#if-statement">If statement</a></li>
<li><a href="#loops">Loops</a></li>
<li><a href="#functions">Functions</a><ul>
<li><a href="#writing-functions">Writing functions</a></li>
<li><a href="#other-useful-functions-for-data-summarizing">Other useful functions for data summarizing</a></li>
</ul></li>
<li><a href="#debugging">Debugging</a></li>
<li><a href="#data-wrangling-with-dplyr">Data wrangling with dplyr</a><ul>
<li><a href="#filter">Filter</a></li>
<li><a href="#arrange">Arrange</a></li>
<li><a href="#select">Select</a></li>
<li><a href="#mutate">Mutate</a></li>
<li><a href="#summarise">Summarise</a></li>
<li><a href="#the-pipe">The pipe</a></li>
</ul></li>
<li><a href="#long-and-wide-data-formats">Long and wide data formats</a></li>
</ul>
</div>
<div id="content-wrapper">
<div class="inner clearfix">
<section id="main-content">
<div id="r-and-rstudio" class="section level1">
<h1>R and RStudio</h1>
<p>R (<a href="https://www.r-project.org/" class="uri">https://www.r-project.org/</a>) is free, open-source software for statistical computing. The basic interface to R is the console, which is quite rigid. RStudio (<a href="https://www.rstudio.com" class="uri">https://www.rstudio.com</a>) provides a better user interface and additional functionality (R notebooks, R Markdown, …).</p>
<p>By default, the RStudio interface is split into four panes. The upper-left pane is used for scripts: R files (or similar) that contain our code and represent the main building blocks of our programs. The lower-left pane is the console, equivalent to the basic console interface of R. The upper-right pane is dedicated to the environment and history. The lower-right pane shows files, plots, packages, and help.</p>
<p>To create a new script, go to File -> New File -> R Script. To run code, highlight the desired part and press Ctrl + Enter. Alternatively, click the Run icon in the top-right corner of the script pane.</p>
<p>To specify the working directory, use the <code>setwd()</code> function and provide the path in parentheses. For example, to set the working directory to C:/Author, you would call</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1"><span class="kw">setwd</span>(<span class="st">"C:/Author"</span>)</a></code></pre></div>
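<p>A minimal sketch of how <code>setwd()</code> is typically paired with <code>getwd()</code>, which returns the current working directory. The directory <code>tempdir()</code> and the file name <code>my_data.csv</code> are placeholders chosen only so that the example runs on any machine; they are assumptions, not part of the original material.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Remember the current working directory so it can be restored later
old_wd <- getwd()

# tempdir() is a stand-in for a real project folder
setwd(tempdir())
getwd()                      # prints the path of the new working directory

# Relative paths are now resolved against the working directory
file.exists("my_data.csv")   # hypothetical file name

# Restore the original working directory
setwd(old_wd)</code></pre></div>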
</div>
<div id="variables" class="section level1">
<h1>Variables</h1>
<p>Variables are the basic building blocks of every program. In R, we assign values to variables with the operator <code><-</code>. We do not need to declare the type of a variable, as R infers it from the assigned value. We denote strings with <code>""</code>. Comments start with <code>#</code>.</p>
<p>Let’s create some variables.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" title="1">n <-<span class="st"> </span><span class="dv">20</span></a>
<a class="sourceLine" id="cb2-2" title="2">x <-<span class="st"> </span><span class="fl">2.7</span></a>
<a class="sourceLine" id="cb2-3" title="3">m <-<span class="st"> </span>n <span class="co"># m gets value 20</span></a>
<a class="sourceLine" id="cb2-4" title="4">my_flag <-<span class="st"> </span><span class="ot">TRUE</span></a>
<a class="sourceLine" id="cb2-5" title="5">student_name <-<span class="st"> "Luke"</span></a>
<a class="sourceLine" id="cb2-6" title="6">student_name <-<span class="st"> </span>Luke <span class="co"># because there is no variable named Luke, it returns an error</span></a></code></pre></div>
<pre><code>## Error in eval(expr, envir, enclos): object 'Luke' not found</code></pre>
<p>By using the function <code>typeof()</code> we can check the type of a variable.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="kw">typeof</span>(n)</a></code></pre></div>
<pre><code>## [1] "double"</code></pre>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" title="1"><span class="kw">typeof</span>(student_name)</a></code></pre></div>
<pre><code>## [1] "character"</code></pre>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1"><span class="kw">typeof</span>(my_flag)</a></code></pre></div>
<pre><code>## [1] "logical"</code></pre>
<p>We can change the type of a variable with the <code>as.*()</code> family of functions (for example <code>as.integer()</code> or <code>as.character()</code>). The main types are <strong>integer</strong>, <strong>double</strong>, <strong>character</strong> (strings), and <strong>logical</strong>. Note that the type character is used for strings; there is no separate type for single characters.</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb10-1" title="1"><span class="kw">typeof</span>(<span class="kw">as.integer</span>(n))</a></code></pre></div>
<pre><code>## [1] "integer"</code></pre>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1"><span class="kw">typeof</span>(<span class="kw">as.character</span>(n))</a></code></pre></div>
<pre><code>## [1] "character"</code></pre>
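<p>The conversions above always succeed, but that is not guaranteed in general. The following sketch, using illustrative values that are not from the original text, shows a few more conversions and what happens when a string cannot be interpreted as a number.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">as.integer("42")      # character to integer
as.integer(2.7)       # the fractional part is dropped, giving 2
as.numeric("3.14")    # character to double
as.logical("TRUE")    # character to logical
as.integer(TRUE)      # TRUE becomes 1 (FALSE would become 0)

# A string that cannot be parsed becomes NA and R issues a warning;
# suppressWarnings() is used here only to keep the output clean
not_a_number <- suppressWarnings(as.numeric("twenty"))
is.na(not_a_number)   # TRUE</code></pre></div>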
<p>Another common type is date. We can convert a character string to a date with the <code>as.Date()</code> function. When using this function, we have to be careful to provide the correct format of the date.</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb14-1" title="1">some_date <-<span class="st"> </span><span class="kw">as.Date</span>(<span class="st">"2019-01-01"</span>, <span class="dt">format =</span> <span class="st">"%Y-%m-%d"</span>)</a>
<a class="sourceLine" id="cb14-2" title="2">some_date</a></code></pre></div>
<pre><code>## [1] "2019-01-01"</code></pre>
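<p>To illustrate why the format matters, here is a short sketch with a date written as day/month/year. The variable names <code>workshop_date</code> and <code>new_year</code> are made up for this example.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># The format string must describe the input: day/month/year here
workshop_date <- as.Date("25/07/2019", format = "%d/%m/%Y")
workshop_date                          # printed in the standard form "2019-07-25"

new_year <- as.Date("2019-01-01")      # the year-month-day form needs no format

# Subtracting two dates gives the difference in days
as.numeric(workshop_date - new_year)   # 205

# A format that does not match the input produces NA instead of an error
as.Date("25/07/2019", format = "%Y-%m-%d")</code></pre></div>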
<p>To access the values of the variables, we use variable names.</p>
<div class="sourceCode" id="cb16"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb16-1" title="1">n</a></code></pre></div>
<pre><code>## [1] 20</code></pre>
<div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb18-1" title="1">m</a></code></pre></div>
<pre><code>## [1] 20</code></pre>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" title="1">my_flag</a></code></pre></div>
<pre><code>## [1] TRUE</code></pre>
<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb22-1" title="1">student_name</a></code></pre></div>
<pre><code>## [1] "Luke"</code></pre>
<p>We can apply arithmetic and comparison operations to numerical variables.</p>
<div class="sourceCode" id="cb24"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb24-1" title="1">n <span class="op">+</span><span class="st"> </span>x</a></code></pre></div>
<pre><code>## [1] 22.7</code></pre>
<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb26-1" title="1">n <span class="op">-</span><span class="st"> </span>x</a></code></pre></div>
<pre><code>## [1] 17.3</code></pre>
<div class="sourceCode" id="cb28"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb28-1" title="1">diff <-<span class="st"> </span>n <span class="op">-</span><span class="st"> </span>x <span class="co"># variable diff gets the difference between n and x</span></a>
<a class="sourceLine" id="cb28-2" title="2">diff</a></code></pre></div>
<pre><code>## [1] 17.3</code></pre>
<div class="sourceCode" id="cb30"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb30-1" title="1">n <span class="op">*</span><span class="st"> </span>x</a></code></pre></div>
<pre><code>## [1] 54</code></pre>
<div class="sourceCode" id="cb32"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb32-1" title="1">n <span class="op">/</span><span class="st"> </span>x</a></code></pre></div>
<pre><code>## [1] 7.407407</code></pre>
<div class="sourceCode" id="cb34"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb34-1" title="1">x<span class="op">^</span><span class="dv">2</span></a></code></pre></div>
<pre><code>## [1] 7.29</code></pre>
<div class="sourceCode" id="cb36"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb36-1" title="1"><span class="kw">sqrt</span>(x)</a></code></pre></div>
<pre><code>## [1] 1.643168</code></pre>
<div class="sourceCode" id="cb38"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb38-1" title="1">n <span class="op">></span><span class="st"> </span><span class="dv">2</span> <span class="op">*</span><span class="st"> </span>n <span class="co"># greater than</span></a></code></pre></div>
<pre><code>## [1] FALSE</code></pre>
<div class="sourceCode" id="cb40"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb40-1" title="1">n <span class="op">==</span><span class="st"> </span>n <span class="co"># equals</span></a></code></pre></div>
<pre><code>## [1] TRUE</code></pre>
<div class="sourceCode" id="cb42"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb42-1" title="1">n <span class="op">==</span><span class="st"> </span><span class="dv">2</span> <span class="op">*</span><span class="st"> </span>n</a></code></pre></div>
<pre><code>## [1] FALSE</code></pre>
<div class="sourceCode" id="cb44"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb44-1" title="1">n <span class="op">!=</span><span class="st"> </span>n <span class="co"># not equals</span></a></code></pre></div>
<pre><code>## [1] FALSE</code></pre>
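<p>Two further points are worth a quick sketch: R also offers integer division and remainder operators, and comparing doubles with <code>==</code> can give unexpected results because of limited floating-point precision. The numbers below are illustrative.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">n <- 20

n %/% 3    # integer division: 6
n %% 3     # remainder (modulo): 2

# Doubles are stored with limited precision, so == can surprise us
0.1 + 0.2 == 0.3                    # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE, compares with a small tolerance</code></pre></div>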
<p>We can concatenate strings with the functions <code>paste()</code> and <code>paste0()</code>. The difference is that the first one inserts a space between inputs by default, while the second one inserts nothing.</p>
<div class="sourceCode" id="cb46"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb46-1" title="1"><span class="kw">paste</span>(student_name, <span class="st">"is"</span>, n, <span class="st">"years old"</span>)</a></code></pre></div>
<pre><code>## [1] "Luke is 20 years old"</code></pre>
<div class="sourceCode" id="cb48"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb48-1" title="1"><span class="kw">paste0</span>(student_name, <span class="st">"is"</span>, n, <span class="st">"years old"</span>)</a></code></pre></div>
<pre><code>## [1] "Lukeis20years old"</code></pre>
<div class="sourceCode" id="cb50"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb50-1" title="1">L_username <-<span class="st"> </span><span class="kw">paste0</span>(student_name, n)</a></code></pre></div>
<p>The function <code>paste()</code> accepts an additional argument <code>sep</code>, the separator that is placed between the inputs. If we want to find out more about a function, we put a question mark before the function’s name in the console.</p>
<div class="sourceCode" id="cb51"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb51-1" title="1"><span class="co"># ?paste</span></a>
<a class="sourceLine" id="cb51-2" title="2"><span class="kw">paste</span>(student_name, <span class="st">"is"</span>, n, <span class="st">"years_old"</span>, <span class="dt">sep =</span> <span class="st">"_"</span>)</a></code></pre></div>
<pre><code>## [1] "Luke_is_20_years_old"</code></pre>
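<p>As a small sketch of related options: <code>paste()</code> with an empty separator behaves like <code>paste0()</code>, the <code>collapse</code> argument joins the elements of a vector (vectors are introduced in the next section) into a single string, and <code>sprintf()</code> fills in a template. The values are illustrative.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">student_name <- "Luke"
n <- 20

# An empty separator gives the same result as paste0()
paste(student_name, n, sep = "") == paste0(student_name, n)   # TRUE

# collapse joins the elements of a vector into one string
paste(c("Luke", "Jen", "Mike"), collapse = ", ")   # "Luke, Jen, Mike"

# sprintf() builds a string from a template; %d expects an integer
sprintf("%s is %d years old", student_name, as.integer(n))   # "Luke is 20 years old"</code></pre></div>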
</div>
<div id="basic-data-structures" class="section level1">
<h1>Basic data structures</h1>
<div id="vector" class="section level2">
<h2>Vector</h2>
<p>Vectors are the most common data structure in R. They consist of several elements of the same type. We create them with the function <code>c()</code> (combine).</p>
<div class="sourceCode" id="cb53"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb53-1" title="1">student_ages <-<span class="st"> </span><span class="kw">c</span>(<span class="dv">20</span>, <span class="dv">23</span>, <span class="dv">21</span>)</a>
<a class="sourceLine" id="cb53-2" title="2">student_names <-<span class="st"> </span><span class="kw">c</span>(<span class="st">"Luke"</span>, <span class="st">"Jen"</span>, <span class="st">"Mike"</span>)</a>
<a class="sourceLine" id="cb53-3" title="3">passed <-<span class="st"> </span><span class="kw">c</span>(<span class="ot">TRUE</span>, <span class="ot">TRUE</span>, <span class="ot">FALSE</span>)</a></code></pre></div>
<p>To access individual elements of a vector, we use square brackets with the index of the element we want. <strong>Indexing in R starts with 1</strong>, as opposed to 0 (C++, Java, …).</p>
<div class="sourceCode" id="cb54"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb54-1" title="1">student_ages[<span class="dv">2</span>]</a></code></pre></div>
<pre><code>## [1] 23</code></pre>
<div class="sourceCode" id="cb56"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb56-1" title="1">student_names[<span class="dv">2</span>]</a></code></pre></div>
<pre><code>## [1] "Jen"</code></pre>
<div class="sourceCode" id="cb58"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb58-1" title="1">passed[<span class="dv">2</span>]</a></code></pre></div>
<pre><code>## [1] TRUE</code></pre>
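<p>Square brackets accept more than a single index. The following sketch, reusing the <code>student_ages</code> values from above, shows ranges, negative indices, and replacement of an element.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">student_ages <- c(20, 23, 21)

student_ages[1:2]                    # first two elements: 20 23
student_ages[-1]                     # everything except the first element: 23 21
student_ages[length(student_ages)]   # last element: 21
student_ages[5]                      # an index past the end gives NA

student_ages[2] <- 24                # elements can be replaced in place
student_ages                         # 20 24 21</code></pre></div>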
<p>To get the length of the vector use <code>length()</code>.</p>
<div class="sourceCode" id="cb60"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb60-1" title="1"><span class="kw">length</span>(student_names)</a></code></pre></div>
<pre><code>## [1] 3</code></pre>
<p>We can use element-wise arithmetic operations on vectors, and we can compute the scalar product with <code>%*%</code>. Be careful with vector lengths: if the two vectors in an operation are not of the same length, the shorter one is recycled, that is, repeated from its start until it reaches the length of the longer one. R issues a warning only when the longer length is not a multiple of the shorter one.</p>
<div class="sourceCode" id="cb62"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb62-1" title="1">a <-<span class="st"> </span><span class="kw">c</span>(<span class="dv">1</span>, <span class="dv">3</span>, <span class="dv">5</span>)</a>
<a class="sourceLine" id="cb62-2" title="2">b <-<span class="st"> </span><span class="kw">c</span>(<span class="dv">2</span>, <span class="dv">2</span>, <span class="dv">1</span>)</a>
<a class="sourceLine" id="cb62-3" title="3">d <-<span class="st"> </span><span class="kw">c</span>(<span class="dv">6</span>, <span class="dv">7</span>)</a>
<a class="sourceLine" id="cb62-4" title="4">a <span class="op">+</span><span class="st"> </span>b</a></code></pre></div>
<pre><code>## [1] 3 5 6</code></pre>
<div class="sourceCode" id="cb64"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb64-1" title="1">a <span class="op">*</span><span class="st"> </span>b</a></code></pre></div>
<pre><code>## [1] 2 6 5</code></pre>
<div class="sourceCode" id="cb66"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb66-1" title="1">a <span class="op">+</span><span class="st"> </span>d <span class="co"># not the same length, d becomes (6, 7, 6)</span></a></code></pre></div>
<pre><code>## Warning in a + d: longer object length is not a multiple of shorter object
## length</code></pre>
<pre><code>## [1] 7 10 11</code></pre>
<div class="sourceCode" id="cb69"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb69-1" title="1">a <span class="op">+</span><span class="st"> </span><span class="dv">2</span> <span class="op">*</span><span class="st"> </span>b</a></code></pre></div>
<pre><code>## [1] 5 7 7</code></pre>
<div class="sourceCode" id="cb71"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb71-1" title="1">a <span class="op">%*%</span><span class="st"> </span>b <span class="co"># scalar product</span></a></code></pre></div>
<pre><code>## [,1]
## [1,] 13</code></pre>
<div class="sourceCode" id="cb73"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb73-1" title="1">a <span class="op">></span><span class="st"> </span>b <span class="co"># logical relations between elements</span></a></code></pre></div>
<pre><code>## [1] FALSE TRUE TRUE</code></pre>
<div class="sourceCode" id="cb75"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb75-1" title="1">b <span class="op">==</span><span class="st"> </span>a</a></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE</code></pre>
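<p>Recycling does not always come with a warning. The sketch below uses made-up vectors <code>long</code> and <code>short</code> to show a silent case, and also shows how to obtain the scalar product as a plain number rather than a 1 x 1 matrix.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">long <- c(1, 2, 3, 4)
short <- c(10, 20)

# 4 is a multiple of 2, so short is silently recycled to (10, 20, 10, 20)
long + short   # 11 22 13 24

# A single number is a vector of length 1 and is recycled to any length
long * 2       # 2 4 6 8

# sum() of the element-wise product gives the scalar product as a number
a <- c(1, 3, 5)
b <- c(2, 2, 1)
sum(a * b)     # 13</code></pre></div>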
<p>We often want to select only specific elements of a vector. There are several ways to do that; for example, all of the calls below return the first two elements of vector <code>a</code>.</p>
<div class="sourceCode" id="cb77"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb77-1" title="1">a[<span class="kw">c</span>(<span class="ot">TRUE</span>, <span class="ot">TRUE</span>, <span class="ot">FALSE</span>)] <span class="co"># selection based on logical vector</span></a></code></pre></div>
<pre><code>## [1] 1 3</code></pre>
<div class="sourceCode" id="cb79"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb79-1" title="1">a[<span class="kw">c</span>(<span class="dv">1</span>,<span class="dv">2</span>)] <span class="co"># selection based on indexes</span></a></code></pre></div>
<pre><code>## [1] 1 3</code></pre>
<div class="sourceCode" id="cb81"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb81-1" title="1">a[a <span class="op"><</span><span class="st"> </span><span class="dv">5</span>] <span class="co"># selection based on logical condition</span></a></code></pre></div>
<pre><code>## [1] 1 3</code></pre>
<p>We can also combine several conditions. If we want both conditions to hold, we use and (<code>&</code>); if at least one has to hold, we use or (<code>|</code>). Note that for element-wise comparisons we use a single symbol for each, as opposed to some other programming languages that use two.</p>
<div class="sourceCode" id="cb83"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb83-1" title="1">a[a <span class="op">></span><span class="st"> </span><span class="dv">2</span> <span class="op">&</span><span class="st"> </span>a <span class="op"><</span><span class="st"> </span><span class="dv">4</span>]</a></code></pre></div>
<pre><code>## [1] 3</code></pre>
<div class="sourceCode" id="cb85"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb85-1" title="1">a[a <span class="op"><</span><span class="st"> </span><span class="dv">2</span> <span class="op">|</span><span class="st"> </span>a <span class="op">></span><span class="st"> </span><span class="dv">4</span>]</a></code></pre></div>
<pre><code>## [1] 1 5</code></pre>
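<p>A few related tools for selection, sketched on the same vector <code>a</code>: <code>which()</code> returns positions, <code>%in%</code> tests membership, and <code>!</code> negates a condition. The doubled operators <code>&&</code> and <code>||</code> also exist in R, but they are intended for single values.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">a <- c(1, 3, 5)

which(a > 2)           # positions where the condition holds: 2 3
a[a %in% c(3, 5, 7)]   # elements that appear in another vector: 3 5
a[!(a > 2)]            # negation with !: 1

# & and | work element-wise; && and || expect single values
# and are meant for conditions such as those in if statements
length(a) == 3 && a[1] < 2   # TRUE</code></pre></div>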
</div>
<div id="factor" class="section level2">
<h2>Factor</h2>
<p>Factors are used for coding categorical variables, which can only take a finite number of predetermined values. We can further divide categorical variables into nominal and ordinal. Nominal variables do not have an ordering (for example, car brand), while ordinal variables do (for example, frequency: never, rarely, sometimes, often, always). Although ordinal variables are ordered, we usually cannot quantify the differences between their values (for example, sometimes is more than rarely, but we do not know how much more).</p>
<p>In R we create factors with the function <code>factor()</code>. When creating a factor, we can determine in advance which values it can take with the argument <code>levels</code>. If we assign a value that is not among the levels, R replaces it with <code>NA</code> and issues a warning.</p>
<div class="sourceCode" id="cb87"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb87-1" title="1">car_brand <-<span class="st"> </span><span class="kw">factor</span>(<span class="kw">c</span>(<span class="st">"Audi"</span>, <span class="st">"BMW"</span>, <span class="st">"Mercedes"</span>, <span class="st">"BMW"</span>), <span class="dt">ordered =</span> <span class="ot">FALSE</span>)</a>
<a class="sourceLine" id="cb87-2" title="2">car_brand</a></code></pre></div>
<pre><code>## [1] Audi BMW Mercedes BMW
## Levels: Audi BMW Mercedes</code></pre>
<div class="sourceCode" id="cb89"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb89-1" title="1">freq <-<span class="st"> </span><span class="kw">factor</span>(<span class="dt">x =</span> <span class="ot">NA</span>,</a>
<a class="sourceLine" id="cb89-2" title="2"> <span class="dt">levels =</span> <span class="kw">c</span>(<span class="st">"never"</span>,<span class="st">"rarely"</span>,<span class="st">"sometimes"</span>,<span class="st">"often"</span>,<span class="st">"always"</span>),</a>
<a class="sourceLine" id="cb89-3" title="3"> <span class="dt">ordered =</span> <span class="ot">TRUE</span>)</a>
<a class="sourceLine" id="cb89-4" title="4">freq[<span class="dv">1</span><span class="op">:</span><span class="dv">3</span>] <-<span class="st"> </span><span class="kw">c</span>(<span class="st">"rarely"</span>, <span class="st">"sometimes"</span>, <span class="st">"rarely"</span>)</a>
<a class="sourceLine" id="cb89-5" title="5">freq</a></code></pre></div>
<pre><code>## [1] rarely sometimes rarely
## Levels: never < rarely < sometimes < often < always</code></pre>
<div class="sourceCode" id="cb91"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb91-1" title="1">freq[<span class="dv">4</span>] <-<span class="st"> "quite_often"</span> <span class="co"># non-existing level, returns NA</span></a></code></pre></div>
<pre><code>## Warning in `[<-.factor`(`*tmp*`, 4, value = "quite_often"): invalid factor
## level, NA generated</code></pre>
<div class="sourceCode" id="cb93"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb93-1" title="1">freq</a></code></pre></div>
<pre><code>## [1] rarely sometimes rarely <NA>
## Levels: never < rarely < sometimes < often < always</code></pre>
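<p>To see what the ordering of an ordered factor gives us, here is a brief sketch that builds the same kind of factor directly from its values and inspects it with a few base R functions.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">freq <- factor(c("rarely", "sometimes", "rarely"),
               levels = c("never", "rarely", "sometimes", "often", "always"),
               ordered = TRUE)

levels(freq)        # all allowed values, in order
as.integer(freq)    # internal integer codes: 2 3 2
freq[1] < freq[2]   # ordered factors can be compared: TRUE
table(freq)         # counts per level, including levels that do not occur</code></pre></div>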
</div>
<div id="matrix" class="section level2">
<h2>Matrix</h2>
<p>Matrices are two-dimensional generalizations of vectors. We create them with the function <code>matrix()</code>, where we have to provide the values and either the number of rows or the number of columns. Additionally, the argument <code>byrow = TRUE</code> fills the matrix with the provided elements by rows (the default is by columns).</p>
<div class="sourceCode" id="cb95"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb95-1" title="1">my_matrix <-<span class="st"> </span><span class="kw">matrix</span>(<span class="kw">c</span>(<span class="dv">1</span>, <span class="dv">2</span>, <span class="dv">1</span>,</a>
<a class="sourceLine" id="cb95-2" title="2"> <span class="dv">5</span>, <span class="dv">4</span>, <span class="dv">2</span>),</a>
<a class="sourceLine" id="cb95-3" title="3"> <span class="dt">nrow =</span> <span class="dv">2</span>,</a>
<a class="sourceLine" id="cb95-4" title="4"> <span class="dt">byrow =</span> <span class="ot">TRUE</span>)</a>
<a class="sourceLine" id="cb95-5" title="5">my_matrix</a></code></pre></div>
<pre><code>## [,1] [,2] [,3]
## [1,] 1 2 1
## [2,] 5 4 2</code></pre>
<div class="sourceCode" id="cb97"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb97-1" title="1">my_square_matrix <-<span class="st"> </span><span class="kw">matrix</span>(<span class="kw">c</span>(<span class="dv">1</span>, <span class="dv">3</span>,</a>
<a class="sourceLine" id="cb97-2" title="2"> <span class="dv">2</span>, <span class="dv">3</span>),</a>
<a class="sourceLine" id="cb97-3" title="3"> <span class="dt">nrow =</span> <span class="dv">2</span>)</a>
<a class="sourceLine" id="cb97-4" title="4">my_square_matrix</a></code></pre></div>
<pre><code>## [,1] [,2]
## [1,] 1 2
## [2,] 3 3</code></pre>
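<p>The effect of <code>byrow</code> is easiest to see when the same values are used twice. This sketch uses the illustrative names <code>by_col</code> and <code>by_row</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">values <- 1:6

by_col <- matrix(values, nrow = 2)                 # default: column by column
by_row <- matrix(values, nrow = 2, byrow = TRUE)   # row by row

by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# The same values end up in different positions
identical(by_col, by_row)   # FALSE</code></pre></div>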
<p>To access individual elements we use square brackets, where we separate the dimensions with a comma.</p>
<div class="sourceCode" id="cb99"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb99-1" title="1">my_matrix[<span class="dv">1</span>,<span class="dv">2</span>] <span class="co"># first row, second column</span></a></code></pre></div>
<pre><code>## [1] 2</code></pre>
<div class="sourceCode" id="cb101"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb101-1" title="1">my_matrix[<span class="dv">2</span>, ] <span class="co"># second row</span></a></code></pre></div>
<pre><code>## [1] 5 4 2</code></pre>
<div class="sourceCode" id="cb103"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb103-1" title="1">my_matrix[ ,<span class="dv">3</span>] <span class="co"># third column</span></a></code></pre></div>
<pre><code>## [1] 1 2</code></pre>
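<p>Note that selecting a single row or column returns a plain vector. A minimal sketch, assuming we want to keep the result as a matrix, uses the <code>drop = FALSE</code> argument of the bracket operator (the names <code>row_as_vector</code> and <code>row_as_matrix</code> are only illustrative):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">my_matrix <- matrix(c(1, 2, 1,
                      5, 4, 2),
                    nrow = 2,
                    byrow = TRUE)
row_as_vector <- my_matrix[2, ]               # plain vector of length 3
row_as_matrix <- my_matrix[2, , drop = FALSE] # still a matrix with 1 row and 3 columns
dim(row_as_vector) # NULL, because a vector has no dimensions
dim(row_as_matrix) # 1 3</code></pre></div>
<p>Keeping the matrix structure can be useful when the result is later used in matrix multiplication.</p>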
<p>Some useful functions for matrices.</p>
<div class="sourceCode" id="cb105"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb105-1" title="1"><span class="kw">nrow</span>(my_matrix) <span class="co"># number of matrix rows</span></a></code></pre></div>
<pre><code>## [1] 2</code></pre>
<div class="sourceCode" id="cb107"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb107-1" title="1"><span class="kw">ncol</span>(my_matrix) <span class="co"># number of matrix columns</span></a></code></pre></div>
<pre><code>## [1] 3</code></pre>
<div class="sourceCode" id="cb109"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb109-1" title="1"><span class="kw">dim</span>(my_matrix) <span class="co"># matrix dimension</span></a></code></pre></div>
<pre><code>## [1] 2 3</code></pre>
<div class="sourceCode" id="cb111"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb111-1" title="1"><span class="kw">t</span>(my_matrix) <span class="co"># transpose</span></a></code></pre></div>
<pre><code>## [,1] [,2]
## [1,] 1 5
## [2,] 2 4
## [3,] 1 2</code></pre>
<div class="sourceCode" id="cb113"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb113-1" title="1"><span class="kw">diag</span>(my_matrix) <span class="co"># the diagonal of the matrix as vector</span></a></code></pre></div>
<pre><code>## [1] 1 4</code></pre>
<div class="sourceCode" id="cb115"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb115-1" title="1"><span class="kw">diag</span>(<span class="dv">1</span>, <span class="dt">nrow =</span> <span class="dv">3</span>) <span class="co"># creates a diagonal matrix</span></a></code></pre></div>
<pre><code>## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1</code></pre>
<div class="sourceCode" id="cb117"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb117-1" title="1"><span class="kw">det</span>(my_square_matrix) <span class="co"># matrix determinant</span></a></code></pre></div>
<pre><code>## [1] -3</code></pre>
<p>We can also use arithmetic operations on matrices. Note that the matrix dimensions have to be compatible with the operation. For matrix multiplication, we use <code>%*%</code>; the <code>*</code> operator multiplies element-wise.</p>
<div class="sourceCode" id="cb119"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb119-1" title="1">my_matrix <span class="op">+</span><span class="st"> </span><span class="dv">2</span> <span class="op">*</span><span class="st"> </span>my_matrix</a></code></pre></div>
<pre><code>## [,1] [,2] [,3]
## [1,] 3 6 3
## [2,] 15 12 6</code></pre>
<div class="sourceCode" id="cb121"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb121-1" title="1">my_matrix <span class="op">*</span><span class="st"> </span>my_matrix <span class="co"># element-wise multiplication</span></a></code></pre></div>
<pre><code>## [,1] [,2] [,3]
## [1,] 1 4 1
## [2,] 25 16 4</code></pre>
<div class="sourceCode" id="cb123"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb123-1" title="1">my_matrix <span class="op">%*%</span><span class="st"> </span><span class="kw">t</span>(my_matrix) <span class="co"># matrix multiplication</span></a></code></pre></div>
<pre><code>## [,1] [,2]
## [1,] 6 15
## [2,] 15 45</code></pre>
<div class="sourceCode" id="cb125"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb125-1" title="1">my_square_matrix <span class="op">%*%</span><span class="st"> </span>my_matrix</a></code></pre></div>
<pre><code>## [,1] [,2] [,3]
## [1,] 11 10 5
## [2,] 18 18 9</code></pre>
<div class="sourceCode" id="cb127"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb127-1" title="1">my_matrix <span class="op">%*%</span><span class="st"> </span>my_square_matrix <span class="co"># wrong dimensions</span></a></code></pre></div>
<pre><code>## Error in my_matrix %*% my_square_matrix: non-conformable arguments</code></pre>
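<p>The error above occurs because matrix multiplication requires the number of columns of the left matrix to equal the number of rows of the right matrix. One possible way to check this before multiplying is sketched below; <code>can_multiply()</code>, <code>A</code> and <code>B</code> are hypothetical names introduced only for this illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">A <- matrix(1:6, nrow = 2) # 2 rows, 3 columns
B <- matrix(1:4, nrow = 2) # 2 rows, 2 columns

# hypothetical helper: TRUE when x %*% y is defined
can_multiply <- function(x, y) {
  ncol(x) == nrow(y)
}

can_multiply(B, A) # TRUE:  (2 x 2) times (2 x 3)
can_multiply(A, B) # FALSE: (2 x 3) times (2 x 2)

if (can_multiply(B, A)) {
  dim(B %*% A) # 2 3
}</code></pre></div>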
<p>We can transform a matrix into a vector.</p>
<div class="sourceCode" id="cb129"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb129-1" title="1">my_vec <-<span class="st"> </span><span class="kw">as.vector</span>(my_matrix)</a>
<a class="sourceLine" id="cb129-2" title="2">my_vec</a></code></pre></div>
<pre><code>## [1] 1 5 2 4 1 2</code></pre>
</div>
<div id="array" class="section level2">
<h2>Array</h2>
<p>Arrays generalize matrices to more than two dimensions.</p>
<div class="sourceCode" id="cb131"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb131-1" title="1">my_array <-<span class="st"> </span><span class="kw">array</span>(<span class="kw">c</span>(<span class="dv">1</span>, <span class="dv">2</span>, <span class="dv">3</span>, <span class="dv">4</span>, <span class="dv">5</span>, <span class="dv">6</span>, <span class="dv">7</span>, <span class="dv">8</span>), <span class="dt">dim =</span> <span class="kw">c</span>(<span class="dv">2</span>, <span class="dv">2</span>, <span class="dv">2</span>))</a>
<a class="sourceLine" id="cb131-2" title="2">my_array[<span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">1</span>]</a></code></pre></div>
<pre><code>## [1] 1</code></pre>
<div class="sourceCode" id="cb133"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb133-1" title="1">my_array[<span class="dv">2</span>, <span class="dv">2</span>, <span class="dv">1</span>]</a></code></pre></div>
<pre><code>## [1] 4</code></pre>
<div class="sourceCode" id="cb135"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb135-1" title="1">my_array[<span class="dv">1</span>, , ]</a></code></pre></div>
<pre><code>## [,1] [,2]
## [1,] 1 5
## [2,] 3 7</code></pre>
<div class="sourceCode" id="cb137"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb137-1" title="1"><span class="kw">dim</span>(my_array)</a></code></pre></div>
<pre><code>## [1] 2 2 2</code></pre>
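<p>Functions that work along the dimensions of a matrix typically also work on arrays. As a small sketch, <code>apply()</code> with margin 3 computes one value per 2 x 2 slice of the array (the name <code>slice_sums</code> is only illustrative):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">my_array <- array(1:8, dim = c(2, 2, 2))
slice_sums <- apply(my_array, 3, sum) # margin 3: sum over each 2 x 2 slice
slice_sums # 10 26</code></pre></div>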
</div>
<div id="data-frame" class="section level2">
<h2>Data frame</h2>
<p>Data frames are the basic data structure used in R for data analysis. A data frame has the form of a table, where columns represent individual variables and rows represent observations. Data frames differ from matrices in that the columns can be of different types. We access elements the same way as in matrices.</p>
<p>We can combine vectors into data frames with <code>data.frame()</code>. In R versions before 4.0.0, the function converts character variables to factors by default. If we do not want that, we have to add the argument <code>stringsAsFactors = FALSE</code> (this has been the default since R 4.0.0). We can assign column names with the function <code>colnames()</code>.</p>
<div class="sourceCode" id="cb139"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb139-1" title="1">student_data <-<span class="st"> </span><span class="kw">data.frame</span>(student_names, student_ages, passed,</a>
<a class="sourceLine" id="cb139-2" title="2"> <span class="dt">stringsAsFactors =</span> <span class="ot">FALSE</span>)</a>
<a class="sourceLine" id="cb139-3" title="3"><span class="kw">colnames</span>(student_data) <-<span class="st"> </span><span class="kw">c</span>(<span class="st">"Name"</span>, <span class="st">"Age"</span>, <span class="st">"Pass"</span>)</a>
<a class="sourceLine" id="cb139-4" title="4">student_data</a></code></pre></div>
<pre><code>## Name Age Pass
## 1 Luke 20 TRUE
## 2 Jen 23 TRUE
## 3 Mike 21 FALSE</code></pre>
<p>We can also assign column names directly when creating a data frame.</p>
<div class="sourceCode" id="cb141"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb141-1" title="1">student_data <-<span class="st"> </span><span class="kw">data.frame</span>(<span class="st">"Name"</span> =<span class="st"> </span>student_names, </a>
<a class="sourceLine" id="cb141-2" title="2"> <span class="st">"Age"</span> =<span class="st"> </span>student_ages, </a>
<a class="sourceLine" id="cb141-3" title="3"> <span class="st">"Pass"</span> =<span class="st"> </span>passed)</a>
<a class="sourceLine" id="cb141-4" title="4">student_data</a></code></pre></div>
<pre><code>## Name Age Pass
## 1 Luke 20 TRUE
## 2 Jen 23 TRUE
## 3 Mike 21 FALSE</code></pre>
<p>Similar to vectors, we can access the elements of data frames (and matrices) with logical indexing. Here we need to be careful about whether we are selecting rows or columns. To access a specific column, we can also use the name of the column preceded by <code>$</code>.</p>
<div class="sourceCode" id="cb143"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb143-1" title="1">student_data[ ,<span class="kw">colnames</span>(student_data) <span class="op">%in%</span><span class="st"> </span><span class="kw">c</span>(<span class="st">"Name"</span>, <span class="st">"Pass"</span>)]</a></code></pre></div>
<pre><code>## Name Pass
## 1 Luke TRUE
## 2 Jen TRUE
## 3 Mike FALSE</code></pre>
<div class="sourceCode" id="cb145"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb145-1" title="1">student_data[student_data<span class="op">$</span>Pass <span class="op">==</span><span class="st"> </span><span class="ot">TRUE</span>, ]</a></code></pre></div>
<pre><code>## Name Age Pass
## 1 Luke 20 TRUE
## 2 Jen 23 TRUE</code></pre>
<div class="sourceCode" id="cb147"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb147-1" title="1">student_data<span class="op">$</span>Pass</a></code></pre></div>
<pre><code>## [1] TRUE TRUE FALSE</code></pre>
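<p>Logical conditions can also be combined, and rows and columns can be selected in one call. The following sketch rebuilds the small student table so that it runs on its own; the name <code>older_passed</code> is only illustrative.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">student_data <- data.frame(Name = c("Luke", "Jen", "Mike"),
                           Age = c(20, 23, 21),
                           Pass = c(TRUE, TRUE, FALSE),
                           stringsAsFactors = FALSE)

# rows: students older than 20 who passed; column: only the name
older_passed <- student_data[student_data$Age > 20 & student_data$Pass, "Name"]
older_passed # "Jen"</code></pre></div>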
</div>
<div id="list" class="section level2">
<h2>List</h2>
<p>Lists are a very useful data structure, especially when we are dealing with different data sets and data structures. We can imagine a list as a vector where each element can be a different data structure. For example, a list can have a vector stored at index 1, a matrix at index 2, and a data frame at index 3. Moreover, a list can itself be an element of another list, and so on.</p>
<div class="sourceCode" id="cb149"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb149-1" title="1">first_list <-<span class="st"> </span><span class="kw">list</span>(student_ages, my_matrix, student_data)</a>
<a class="sourceLine" id="cb149-2" title="2">second_list <-<span class="st"> </span><span class="kw">list</span>(student_ages, my_matrix, student_data, first_list)</a></code></pre></div>
<p>We access the elements of a list with double square brackets.</p>
<div class="sourceCode" id="cb150"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb150-1" title="1">first_list[[<span class="dv">1</span>]]</a></code></pre></div>
<pre><code>## [1] 20 23 21</code></pre>
<div class="sourceCode" id="cb152"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb152-1" title="1">second_list[[<span class="dv">4</span>]]</a></code></pre></div>
<pre><code>## [[1]]
## [1] 20 23 21
##
## [[2]]
## [,1] [,2] [,3]
## [1,] 1 2 1
## [2,] 5 4 2
##
## [[3]]
## Name Age Pass
## 1 Luke 20 TRUE
## 2 Jen 23 TRUE
## 3 Mike 21 FALSE</code></pre>
<div class="sourceCode" id="cb154"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb154-1" title="1">second_list[[<span class="dv">4</span>]][[<span class="dv">1</span>]] <span class="co"># first element of the fourth element of second_list</span></a></code></pre></div>
<pre><code>## [1] 20 23 21</code></pre>
<p>We can also apply <code>length()</code> to get the number of elements in the list.</p>
<div class="sourceCode" id="cb156"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb156-1" title="1"><span class="kw">length</span>(second_list)</a></code></pre></div>
<pre><code>## [1] 4</code></pre>
<p>To append an element to a list, we use the call below.</p>
<div class="sourceCode" id="cb158"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb158-1" title="1">second_list[[<span class="kw">length</span>(second_list) <span class="op">+</span><span class="st"> </span><span class="dv">1</span>]] <-<span class="st"> "add_me"</span></a>
<a class="sourceLine" id="cb158-2" title="2">second_list[[<span class="kw">length</span>(second_list)]] <span class="co"># check what is at the last index</span></a></code></pre></div>
<pre><code>## [1] "add_me"</code></pre>
<p>Additionally, we can name the elements of the list, and access them by name. For that we use the <code>names()</code> function.</p>
<div class="sourceCode" id="cb160"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb160-1" title="1"><span class="kw">names</span>(first_list) <-<span class="st"> </span><span class="kw">c</span>(<span class="st">"Age"</span>, <span class="st">"Matrix"</span>, <span class="st">"Data"</span>)</a>
<a class="sourceLine" id="cb160-2" title="2">first_list<span class="op">$</span>Age</a></code></pre></div>
<pre><code>## [1] 20 23 21</code></pre>
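<p>A common source of confusion is the difference between single and double square brackets on lists. The sketch below uses a small stand-alone list (its contents are made up for this illustration) to show that single brackets return a shorter list, while double brackets return the stored element itself.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">small_list <- list(Age = c(20, 23, 21), Label = "students")
sub_list <- small_list[1]   # single brackets: a list of length 1
element <- small_list[[1]]  # double brackets: the numeric vector itself
class(sub_list) # "list"
class(element)  # "numeric"</code></pre></div>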
</div>
</div>
<div id="packages" class="section level1">
<h1>Packages</h1>
<p>R is an open-source programming language and anyone can contribute to its development. Many packages exist that make our work in R easier. Additionally, some packages include different statistical models, some of which are implemented in other languages (for example C++) for efficiency. The open-source repository CRAN hosts most of the packages that you are going to need. To install a specific package, we use the function <code>install.packages()</code>, or we can use RStudio’s UI. Once a package is installed, we can load it into our workspace with <code>library()</code>. We will get to know several useful packages during this workshop.</p>
<div class="sourceCode" id="cb162"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb162-1" title="1"><span class="kw">install.packages</span>(<span class="st">"stats"</span>) <span class="co"># install package</span></a>
<a class="sourceLine" id="cb162-2" title="2"><span class="kw">library</span>(stats) <span class="co"># load the package into workspace</span></a></code></pre></div>
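<p>Since a package only needs to be installed once, scripts often check whether it is already available before installing it. A minimal sketch of this pattern follows; <code>load_or_install()</code> is a hypothetical helper written for this example, not a function from R or any package.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

load_or_install("stats") # stats ships with R, so nothing is installed here</code></pre></div>
<p>The argument <code>character.only = TRUE</code> is needed because the package name is passed as a string stored in a variable.</p>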
</div>
<div id="bpod" class="section level1">
<h1>Data import</h1>
<p>We often encounter data in the csv (comma-separated values) format. Different packages in R allow us to read data from csv, txt, xlsx, and other formats. Here we will go through reading data from the csv and xlsx formats.</p>
<p>To read csv data, we use <code>read.csv()</code> from the package <code>utils</code>. Before we read the data, we need to check two things. First, which character separates the columns and how the decimal mark is written (comma or dot). Second, whether the data have a header (does the first row contain column names?). The function returns a data frame. <code>read.csv()</code> assumes that a comma is the separator and a dot is the decimal mark. However, we can change these defaults by providing the corresponding arguments. It also assumes that we have a header by default. When saving your data in the csv format, we recommend using a semi-colon as the separator, as a comma is often used a) in text, b) as the decimal separator, or c) as the thousands separator.</p>
<p>In our <strong>data</strong> folder, we have a medical insurance data set acquired from Kaggle (<a href="https://www.kaggle.com/easonlai/sample-insurance-claim-prediction-dataset/" class="uri">https://www.kaggle.com/easonlai/sample-insurance-claim-prediction-dataset/</a>). To show different reading functions, we saved the data set in three different formats: csv with a comma separator, csv with a semi-colon separator, and an xlsx file. Each file also contains a header. The function <code>head()</code> returns the first six rows of the data frame.</p>
<div class="sourceCode" id="cb163"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb163-1" title="1"><span class="kw">library</span>(utils)</a>
<a class="sourceLine" id="cb163-2" title="2">claim_data <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="st">"./data/insurance01.csv"</span>)</a>
<a class="sourceLine" id="cb163-3" title="3"><span class="kw">head</span>(claim_data)</a></code></pre></div>
<pre><code>## age sex bmi children smoker region charges
## 1 19 female 27.900 0 yes southwest 16884.924
## 2 18 male 33.770 1 no southeast 1725.552
## 3 28 male 33.000 3 no southeast 4449.462
## 4 33 male 22.705 0 no northwest 21984.471
## 5 32 male 28.880 0 no northwest 3866.855
## 6 31 female 25.740 0 no southeast 3756.622</code></pre>
<p>The dot in the string represents the current working directory. In R versions before 4.0.0, <code>read.csv()</code> automatically converts string variables (sex, smoker, region) to factors. In our case this is sensible. However, sometimes we want strings to remain strings. In those cases, set the argument <code>stringsAsFactors = FALSE</code> (the default since R 4.0.0).</p>
<p>Along with a semi-colon as the separator, the second file has a decimal comma. Therefore, we set the arguments <code>sep</code> and <code>dec</code>.</p>
<div class="sourceCode" id="cb165"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb165-1" title="1">claim_data <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="st">"./data/insurance02.csv"</span>, <span class="dt">sep =</span> <span class="st">";"</span>, <span class="dt">dec =</span> <span class="st">","</span>)</a></code></pre></div>
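<p>To try the <code>sep</code> and <code>dec</code> arguments without the workshop files, we can write a tiny file ourselves. The sketch below uses a temporary file with made-up values (not the insurance data) and also shows <code>read.csv2()</code>, which uses a semi-colon separator and a decimal comma by default.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">tmp_file <- tempfile(fileext = ".csv")
writeLines(c("age;bmi", "19;27,9", "18;33,77"), tmp_file) # made-up toy values

toy_data <- read.csv(tmp_file, sep = ";", dec = ",")
toy_data$bmi # read as numbers, not as text

same_data <- read.csv2(tmp_file) # sep = ";" and dec = "," are the defaults here
identical(toy_data, same_data)

unlink(tmp_file) # remove the temporary file</code></pre></div>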
<p>Data are often saved as xlsx. To read data from xlsx, we use the <code>read.xlsx()</code> function from the package <strong>xlsx</strong>. However, this function can be quite slow, so if you are dealing with large data frames, it might be better to save the Excel file as a csv file and then read it as csv.</p>
<div class="sourceCode" id="cb166"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb166-1" title="1"><span class="kw">library</span>(xlsx)</a>
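<a class="sourceLine" id="cb166-2" title="2">claim_data <-<span class="st"> </span><span class="kw">read.xlsx</span>(<span class="st">"./data/insurance03.xlsx"</span>, <span class="dt">sheetIndex =</span> <span class="dv">1</span>) <span class="co"># assumes the data are in the first sheet</span></a></code></pre></div>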
</div>
<div id="if-statement" class="section level1">
<h1>If statement</h1>
<p>We often want to execute code based on some condition. For that we use the <code>if</code>-<code>else</code> pair.</p>
<div class="sourceCode" id="cb167"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb167-1" title="1">x <-<span class="st"> </span><span class="dv">5</span></a>
<a class="sourceLine" id="cb167-2" title="2"><span class="cf">if</span> (x <span class="op"><</span><span class="st"> </span><span class="dv">0</span>) {</a>
<a class="sourceLine" id="cb167-3" title="3"> <span class="kw">print</span>(<span class="st">"x is smaller than 0"</span>)</a>
<a class="sourceLine" id="cb167-4" title="4">} <span class="cf">else</span> <span class="cf">if</span> (x <span class="op">==</span><span class="st"> </span><span class="dv">0</span>) {</a>
<a class="sourceLine" id="cb167-5" title="5"> <span class="kw">print</span>(<span class="st">"x is 0"</span>)</a>
<a class="sourceLine" id="cb167-6" title="6">} <span class="cf">else</span> {</a>
<a class="sourceLine" id="cb167-7" title="7"> <span class="kw">print</span>(<span class="st">"x is greater than 0"</span>)</a>
<a class="sourceLine" id="cb167-8" title="8">}</a></code></pre></div>
<pre><code>## [1] "x is greater than 0"</code></pre>
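<p>The <code>if</code> statement expects a single condition. When we want to apply the same kind of test to every element of a vector, one option is the vectorized function <code>ifelse()</code>. The following is a small sketch; the vector values and the name <code>sign_label</code> are only illustrative.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">x <- c(-2, 0, 5)
# the inner ifelse() handles the elements that are not smaller than 0
sign_label <- ifelse(x < 0, "smaller than 0",
                     ifelse(x == 0, "0", "greater than 0"))
sign_label # "smaller than 0" "0" "greater than 0"</code></pre></div>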
</div>
<div id="loops" class="section level1">
<h1>Loops</h1>
<p>The most useful loop in R is the <code>for</code> loop. In the <code>for</code> loop, we have to define a new variable, which takes a different value in each iteration of the loop. Then we have to define the values over which that variable will iterate. Often, these are sequential numbers. For example, let us add the first 10 natural numbers.</p>
<div class="sourceCode" id="cb169"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb169-1" title="1">my_sum <-<span class="st"> </span><span class="dv">0</span></a>
<a class="sourceLine" id="cb169-2" title="2"><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="op">:</span><span class="dv">10</span>) { <span class="co"># 1:10 returns a vector of natural numbers between 1 and 10</span></a>
<a class="sourceLine" id="cb169-3" title="3"> my_sum <-<span class="st"> </span>my_sum <span class="op">+</span><span class="st"> </span>i</a>
<a class="sourceLine" id="cb169-4" title="4">}</a>
<a class="sourceLine" id="cb169-5" title="5">my_sum</a></code></pre></div>
<pre><code>## [1] 55</code></pre>
<p>The values in a for loop do not have to be sequential numbers.</p>
<div class="sourceCode" id="cb171"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb171-1" title="1">my_sum <-<span class="st"> </span><span class="dv">0</span></a>
<a class="sourceLine" id="cb171-2" title="2">some_numbers <-<span class="st"> </span><span class="kw">c</span>(<span class="dv">2</span>, <span class="fl">3.5</span>, <span class="dv">6</span>, <span class="dv">100</span>)</a>
<a class="sourceLine" id="cb171-3" title="3"><span class="cf">for</span> (i <span class="cf">in</span> some_numbers) {</a>
<a class="sourceLine" id="cb171-4" title="4"> my_sum <-<span class="st"> </span>my_sum <span class="op">+</span><span class="st"> </span>i</a>
<a class="sourceLine" id="cb171-5" title="5">}</a>
<a class="sourceLine" id="cb171-6" title="6">my_sum</a></code></pre></div>
<pre><code>## [1] 111.5</code></pre>
<p>As a more practical example, let us calculate the average charges per region in our data set.</p>
<div class="sourceCode" id="cb173"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb173-1" title="1">regions <-<span class="st"> </span><span class="kw">unique</span>(claim_data<span class="op">$</span>region) <span class="co"># returns unique values in region column</span></a>
<a class="sourceLine" id="cb173-2" title="2"><span class="cf">for</span> (reg <span class="cf">in</span> regions) {</a>
<a class="sourceLine" id="cb173-3" title="3"> tmp_data <-<span class="st"> </span>claim_data[claim_data<span class="op">$</span>region <span class="op">==</span><span class="st"> </span>reg, ]</a>
<a class="sourceLine" id="cb173-4" title="4"> charges <-<span class="st"> </span>tmp_data<span class="op">$</span>charges</a>
<a class="sourceLine" id="cb173-5" title="5"> <span class="kw">print</span>(<span class="kw">paste0</span>(<span class="st">"Region: "</span>, reg, </a>
<a class="sourceLine" id="cb173-6" title="6"> <span class="st">", average charges: "</span>, <span class="kw">mean</span>(charges)))</a>
<a class="sourceLine" id="cb173-7" title="7">}</a></code></pre></div>
<pre><code>## [1] "Region: southwest, average charges: 12346.9373772923"
## [1] "Region: southeast, average charges: 14735.4114376099"
## [1] "Region: northwest, average charges: 12417.5753739692"
## [1] "Region: northeast, average charges: 13406.3845163858"</code></pre>
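<p>Group-wise summaries like the one above can often be written without an explicit loop. As a hedged sketch, <code>tapply()</code> applies a function to the values of one vector, grouped by another. The data frame <code>toy_claims</code> below is made up for this illustration and is not the insurance data set.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># toy data, not the insurance data set
toy_claims <- data.frame(region = c("north", "south", "north", "south"),
                         charges = c(100, 200, 300, 600))

# mean of charges within each region
avg_charges <- tapply(toy_claims$charges, toy_claims$region, mean)
avg_charges # north: 200, south: 400</code></pre></div>
<p>The result is a named vector, so a single value can be accessed by name, for example <code>avg_charges[["north"]]</code>.</p>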
</div>
<div id="functions" class="section level1">
<h1>Functions</h1>
<p>Base R includes many functions intended for easier work with data, for example <code>length()</code>, <code>dim()</code>, and <code>colnames()</code>. We can extend the set of functions with packages. For example, the package <strong>stats</strong> allows us to create statistical models with a single function call, such as the linear model <code>lm()</code>. Here we will present some useful functions; more complex functions will follow in later chapters. Remember, if you want additional information about a function, you can type its name preceded by a question mark in the console (for example <code>?length</code>).</p>
<div class="sourceCode" id="cb175"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb175-1" title="1"><span class="dv">1</span><span class="op">:</span><span class="dv">10</span> <span class="co"># special function that returns a sequence of numbers</span></a></code></pre></div>
<pre><code>## [1] 1 2 3 4 5 6 7 8 9 10</code></pre>
<div class="sourceCode" id="cb177"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb177-1" title="1"><span class="kw">sum</span>(<span class="dv">1</span><span class="op">:</span><span class="dv">10</span>) <span class="co"># sum of first 10 natural numbers</span></a></code></pre></div>
<pre><code>## [1] 55</code></pre>
<div class="sourceCode" id="cb179"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb179-1" title="1"><span class="kw">sum</span>(<span class="kw">c</span>(<span class="dv">3</span>,<span class="dv">5</span>,<span class="dv">6</span>,<span class="dv">3</span>))</a></code></pre></div>
<pre><code>## [1] 17</code></pre>
<div class="sourceCode" id="cb181"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb181-1" title="1"><span class="kw">rep</span>(<span class="dv">1</span>, <span class="dt">times =</span> <span class="dv">5</span>) <span class="co"># returns a vector of length 5, where all values are 1</span></a></code></pre></div>
<pre><code>## [1] 1 1 1 1 1</code></pre>
<div class="sourceCode" id="cb183"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb183-1" title="1"><span class="kw">rep</span>(<span class="kw">c</span>(<span class="dv">1</span>,<span class="dv">2</span>), <span class="dt">times =</span> <span class="dv">5</span>) <span class="co"># repeats the vector c(1, 2) five times, giving a vector of length 10</span></a></code></pre></div>
<pre><code>## [1] 1 2 1 2 1 2 1 2 1 2</code></pre>
<div class="sourceCode" id="cb185"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb185-1" title="1"><span class="kw">seq</span>(<span class="dv">0</span>, <span class="dv">2</span>, <span class="dt">by =</span> <span class="fl">0.5</span>) <span class="co"># vector from 0 to 2, by adding 0.5</span></a></code></pre></div>
<pre><code>## [1] 0.0 0.5 1.0 1.5 2.0</code></pre>
<div class="sourceCode" id="cb187"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb187-1" title="1"><span class="kw">prod</span>(<span class="dv">1</span><span class="op">:</span><span class="dv">10</span>) <span class="co"># product of the first 10 natural numbers</span></a></code></pre></div>
<pre><code>## [1] 3628800</code></pre>
<div class="sourceCode" id="cb189"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb189-1" title="1"><span class="kw">round</span>(<span class="fl">5.24</span>)</a></code></pre></div>
<pre><code>## [1] 5</code></pre>
<div class="sourceCode" id="cb191"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb191-1" title="1"><span class="dv">5</span><span class="op">^</span><span class="dv">5</span> <span class="co"># exponentiation: 5 to the power of 5</span></a></code></pre></div>
<pre><code>## [1] 3125</code></pre>
<div class="sourceCode" id="cb193"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb193-1" title="1"><span class="kw">sqrt</span>(<span class="dv">16</span>) <span class="co"># square root</span></a></code></pre></div>
<pre><code>## [1] 4</code></pre>
<div class="sourceCode" id="cb195"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb195-1" title="1"><span class="kw">as.character</span>(<span class="kw">c</span>(<span class="dv">1</span>,<span class="dv">6</span>,<span class="dv">3</span>)) <span class="co"># transforms a numerical vector to a character vector</span></a></code></pre></div>
<pre><code>## [1] "1" "6" "3"</code></pre>
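<p>Several of the functions above accept additional arguments that are worth knowing. The calls below are a brief sketch of a few such variants, using only base R functions.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">round(5.246, digits = 2)     # round to two decimal places
seq(0, 1, length.out = 5)    # five evenly spaced values from 0 to 1
seq_len(3)                   # the integers 1 2 3
as.numeric(c("1", "6", "3")) # character vector back to a numerical vector</code></pre></div>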
<p>We often want a summary of our data. We can get it with <code>summary()</code>. We can use it on vectors and on data frames. The returned values depend on the types of the variables.</p>
<div class="sourceCode" id="cb197"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb197-1" title="1"><span class="kw">summary</span>(student_ages)</a></code></pre></div>
<pre><code>## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 20.00 20.50 21.00 21.33 22.00 23.00</code></pre>
<div class="sourceCode" id="cb199"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb199-1" title="1"><span class="kw">summary</span>(student_names)</a></code></pre></div>
<pre><code>## Length Class Mode
## 3 character character</code></pre>
<div class="sourceCode" id="cb201"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb201-1" title="1"><span class="kw">summary</span>(passed)</a></code></pre></div>
<pre><code>## Mode FALSE TRUE
## logical 1 2</code></pre>
<div class="sourceCode" id="cb203"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb203-1" title="1"><span class="kw">summary</span>(car_brand)</a></code></pre></div>
<pre><code>## Audi BMW Mercedes
## 1 2 1</code></pre>
<div class="sourceCode" id="cb205"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb205-1" title="1"><span class="kw">summary</span>(freq)</a></code></pre></div>
<pre><code>## never rarely sometimes often always NA's
## 0 2 1 0 0 1</code></pre>
<div class="sourceCode" id="cb207"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb207-1" title="1"><span class="kw">summary</span>(student_data) <span class="co"># summary of the whole data frame</span></a></code></pre></div>
<pre><code>## Name Age Pass
## Jen :1 Min. :20.00 Mode :logical
## Luke:1 1st Qu.:20.50 FALSE:1
## Mike:1 Median :21.00 TRUE :2
## Mean :21.33
## 3rd Qu.:22.00
## Max. :23.00</code></pre>
<div id="writing-functions" class="section level2">
<h2>Writing functions</h2>
<p>We can write our own functions with <code>function()</code>. In the parentheses, we define the parameters the function takes, and in the curly brackets we define what the function does. We use <code>return()</code> to return values.</p>
<div class="sourceCode" id="cb209"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb209-1" title="1">sum_first_n_elements <-<span class="st"> </span><span class="cf">function</span> (n) {</a>
<a class="sourceLine" id="cb209-2" title="2"> my_sum <-<span class="st"> </span><span class="dv">0</span></a>
<a class="sourceLine" id="cb209-3" title="3"> <span class="cf">for</span> (i <span class="cf">in</span> <span class="kw">seq_len</span>(n)) { <span class="co"># seq_len(n) is empty when n is 0, whereas 1:0 would give 1 0</span></a>
<a class="sourceLine" id="cb209-4" title="4"> my_sum <-<span class="st"> </span>my_sum <span class="op">+</span><span class="st"> </span>i</a>
<a class="sourceLine" id="cb209-5" title="5"> }</a>
<a class="sourceLine" id="cb209-6" title="6"> <span class="kw">return</span> (my_sum)</a>
<a class="sourceLine" id="cb209-7" title="7">}</a>
<a class="sourceLine" id="cb209-8" title="8"><span class="kw">sum_first_n_elements</span>(<span class="dv">10</span>)</a></code></pre></div>
<pre><code>## [1] 55</code></pre>
<p>If we want a function to return several different data structures, we use a list. For example, let us look at a function that takes a matrix as input and returns its transpose and determinant.</p>
<div class="sourceCode" id="cb211"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb211-1" title="1">get_transpose_and_det <-<span class="st"> </span><span class="cf">function</span> (mat) {</a>
<a class="sourceLine" id="cb211-2" title="2"> trans_mat <-<span class="st"> </span><span class="kw">t</span>(mat)</a>
<a class="sourceLine" id="cb211-3" title="3"> det_mat <-<span class="st"> </span><span class="kw">det</span>(mat)</a>
<a class="sourceLine" id="cb211-4" title="4"> out <-<span class="st"> </span><span class="kw">list</span>(<span class="st">"transposed"</span> =<span class="st"> </span>trans_mat,</a>
<a class="sourceLine" id="cb211-5" title="5"> <span class="st">"determinant"</span> =<span class="st"> </span>det_mat)</a>
<a class="sourceLine" id="cb211-6" title="6"> <span class="kw">return</span> (out)</a>
<a class="sourceLine" id="cb211-7" title="7">}</a>
<a class="sourceLine" id="cb211-8" title="8">mat_vals <-<span class="st"> </span><span class="kw">get_transpose_and_det</span>(my_square_matrix)</a>
<a class="sourceLine" id="cb211-9" title="9">mat_vals<span class="op">$</span>transposed</a></code></pre></div>
<pre><code>## [,1] [,2]
## [1,] 1 3
## [2,] 2 3</code></pre>
<div class="sourceCode" id="cb213"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb213-1" title="1">mat_vals<span class="op">$</span>determinant</a></code></pre></div>
<pre><code>## [1] -3</code></pre>
</div>
<div id="other-useful-functions-for-data-summarizing" class="section level2">
<h2>Other useful functions for data summarizing</h2>
<p>There are several functions that are useful when working with data. We already mentioned the <code>summary()</code> function. Let’s look at some other functions.</p>
<p>To generate random numbers we can use a variety of random number generators. Which one we select depends on the data that we wish to generate. Usually, we want to be able to replicate our analysis exactly, so we recommend setting a seed with R’s <code>set.seed()</code> function. With the same seed, the generators produce the same random numbers every time the code is run.</p>
<div class="sourceCode" id="cb215"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb215-1" title="1"><span class="kw">set.seed</span>(<span class="dv">0</span>)</a>
<a class="sourceLine" id="cb215-2" title="2">norm_dat <-<span class="st"> </span><span class="kw">rnorm</span>(<span class="dv">1000</span>, <span class="dv">5</span>, <span class="dv">6</span>) <span class="co"># generate 1000 samples from the normal</span></a>
<a class="sourceLine" id="cb215-3" title="3"> <span class="co"># distribution with mean 5 and standard deviation 6</span></a>
<a class="sourceLine" id="cb215-4" title="4">count_dat <-<span class="st"> </span><span class="kw">rpois</span>(<span class="dv">2000</span>, <span class="dv">8</span>) <span class="co"># generate 2000 samples from the Poisson</span></a>
<a class="sourceLine" id="cb215-5" title="5"> <span class="co"># distribution with mean 8</span></a>
<a class="sourceLine" id="cb215-6" title="6">unif_dat <-<span class="st"> </span><span class="kw">runif</span>(<span class="dv">1000</span>, <span class="dv">-2</span>, <span class="dv">5</span>) <span class="co"># generate 1000 samples from the uniform</span></a>
<a class="sourceLine" id="cb215-7" title="7"> <span class="co"># distribution from -2 to 5</span></a></code></pre></div>
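<p>To see what the seed does, we can draw the same sample twice. This is a small sketch. The variable names <code>first_draw</code> and <code>second_draw</code> are ours, and the seed value is arbitrary.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">set.seed(0)
first_draw <- rnorm(5)   # five draws from the standard normal distribution
set.seed(0)              # reset the generator to the same state
second_draw <- rnorm(5)  # repeats exactly the same five draws
identical(first_draw, second_draw) # TRUE</code></pre></div>
<p>Without the second call to <code>set.seed()</code>, the generator would continue from where it stopped and the two samples would differ.</p>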
<p>In data science, we often work with statistics, so let’s look at some functions which provide us with meaningful information about our data.</p>
<div class="sourceCode" id="cb216"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb216-1" title="1"><span class="kw">mean</span>(norm_dat)</a></code></pre></div>
<pre><code>## [1] 4.905023</code></pre>
<div class="sourceCode" id="cb218"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb218-1" title="1"><span class="kw">var</span>(norm_dat) <span class="co"># variance</span></a></code></pre></div>
<pre><code>## [1] 35.85649</code></pre>
<div class="sourceCode" id="cb220"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb220-1" title="1"><span class="kw">sd</span>(norm_dat) <span class="co"># standard deviation</span></a></code></pre></div>
<pre><code>## [1] 5.988029</code></pre>
<div class="sourceCode" id="cb222"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb222-1" title="1"><span class="kw">max</span>(norm_dat)</a></code></pre></div>
<pre><code>## [1] 24.59849</code></pre>
<div class="sourceCode" id="cb224"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb224-1" title="1"><span class="kw">min</span>(norm_dat)</a></code></pre></div>
<pre><code>## [1] -14.41831</code></pre>
<div class="sourceCode" id="cb226"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb226-1" title="1"><span class="kw">quantile</span>(norm_dat) <span class="co"># calculates 5 quantiles of the data</span></a></code></pre></div>
<pre><code>## 0% 25% 50% 75% 100%
## -14.4183144 0.7492647 4.6467753 9.1258324 24.5984871</code></pre>
<p>We often want to standardize the data before doing analysis, that is, subtract the mean and divide by the standard deviation. We can do that manually, or we can use R’s <code>scale()</code> function.</p>
<div class="sourceCode" id="cb228"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb228-1" title="1">st_dat <-<span class="st"> </span><span class="kw">scale</span>(norm_dat)</a>
<a class="sourceLine" id="cb228-2" title="2"><span class="kw">mean</span>(st_dat)</a></code></pre></div>
<pre><code>## [1] -1.257609e-17</code></pre>
<div class="sourceCode" id="cb230"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb230-1" title="1"><span class="kw">var</span>(st_dat)</a></code></pre></div>
<pre><code>## [,1]
## [1,] 1</code></pre>
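<p>For comparison, here is a sketch of the manual standardization mentioned above. It generates its own sample so that it can be run on its own. The names <code>x</code> and <code>manual_st</code> are ours.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">set.seed(0)
x <- rnorm(1000, 5, 6)
# subtract the mean and divide by the standard deviation
manual_st <- (x - mean(x)) / sd(x)
# scale() returns a one-column matrix, so we convert it to a vector first
all.equal(manual_st, as.vector(scale(x))) # TRUE
sd(manual_st)                             # 1</code></pre></div>
<p>The matrix return value of <code>scale()</code> is also why <code>var(st_dat)</code> above prints a 1 x 1 matrix instead of a single number.</p>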
</div>
</div>
<div id="debugging" class="section level1">
<h1>Debugging</h1>
<p>For debugging in R we will use the <code>browser()</code> function. It pauses the execution of the code and lets you inspect the variables in the environment at the point where <code>browser()</code> was called.</p>
<p>For the available browser commands see <code>?browser</code>, or type <code>help</code> at the prompt while the browser is active.</p>
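<p>As an illustration, the sketch below places a <code>browser()</code> call inside a loop. The function name <code>sum_first_n_debug</code> is ours. The call is commented out because <code>browser()</code> is meant for interactive sessions.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">sum_first_n_debug <- function (n) {
  my_sum <- 0
  for (i in 1:n) {
    if (i == 3) {
      browser() # execution pauses here on the third iteration
    }
    my_sum <- my_sum + i
  }
  return (my_sum)
}
# Run this line in an interactive R session:
# sum_first_n_debug(5)</code></pre></div>
<p>At the browser prompt we can type the name of a variable, for example <code>my_sum</code> or <code>i</code>, to print its current value. The command <code>n</code> executes the next statement, <code>c</code> continues execution, and <code>Q</code> quits the browser.</p>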
</div>
<div id="data-wrangling-with-dplyr" class="section level1">
<h1>Data wrangling with dplyr</h1>
<p>Dplyr is a package for easier data manipulation. It is part of a collection of packages called <strong>tidyverse</strong>, which consists of several R packages intended for data science. Dplyr is especially useful for data frame manipulation.</p>
<p>The main format for working with data in tidyverse is a <strong>tibble</strong>. This data structure is very similar to base R’s data frame, but it is designed for easier work with other packages in tidyverse and also provides a different print output. Let’s look at it on our insurance data set.</p>
<div class="sourceCode" id="cb232"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb232-1" title="1"><span class="kw">library</span>(dplyr)</a></code></pre></div>
<div class="sourceCode" id="cb233"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb233-1" title="1">claim_data <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="st">"./data/insurance01.csv"</span>)</a>
<a class="sourceLine" id="cb233-2" title="2"><span class="kw">head</span>(claim_data)</a></code></pre></div>
<pre><code>## age sex bmi children smoker region charges
## 1 19 female 27.900 0 yes southwest 16884.924
## 2 18 male 33.770 1 no southeast 1725.552
## 3 28 male 33.000 3 no southeast 4449.462
## 4 33 male 22.705 0 no northwest 21984.471
## 5 32 male 28.880 0 no northwest 3866.855
## 6 31 female 25.740 0 no southeast 3756.622</code></pre>
<div class="sourceCode" id="cb235"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb235-1" title="1">claim_data <-<span class="st"> </span><span class="kw">as_tibble</span>(claim_data)</a>
<a class="sourceLine" id="cb235-2" title="2">claim_data</a></code></pre></div>
<pre><code>## # A tibble: 1,338 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 19 female 27.9 0 yes southwest 16885.
## 2 18 male 33.8 1 no southeast 1726.
## 3 28 male 33 3 no southeast 4449.
## 4 33 male 22.7 0 no northwest 21984.
## 5 32 male 28.9 0 no northwest 3867.
## 6 31 female 25.7 0 no southeast 3757.
## 7 46 female 33.4 1 no southeast 8241.
## 8 37 female 27.7 3 no northwest 7282.
## 9 37 male 29.8 2 no northeast 6406.
## 10 60 female 25.8 0 no northwest 28923.
## # ... with 1,328 more rows</code></pre>
<p>A tibble only shows the first 10 rows of the data set for clarity. Additionally, it only prints as many columns as fit into a page, and lists other columns below. If we wish to see all of the tibble, we can use the function <code>View()</code>. Under the variable names, a tibble shows the type of the variables.</p>
<p>Now that we have our starting data set, we can begin manipulating it. This usually consists of selecting specific rows and columns, and adding statistics derived from variables in the data frame. Below we describe five functions that let us manipulate a data set dynamically.</p>
<div id="filter" class="section level2">
<h2>Filter</h2>
<p>The function <code>filter()</code> allows us to select rows based on the values of the variables. It takes a tibble and one or more conditions as input, and outputs a new tibble that consists only of the rows that satisfy them.</p>
<div class="sourceCode" id="cb237"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb237-1" title="1"><span class="kw">filter</span>(claim_data, region <span class="op">==</span><span class="st"> "southwest"</span>)</a></code></pre></div>
<pre><code>## # A tibble: 325 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 19 female 27.9 0 yes southwest 16885.
## 2 23 male 34.4 0 no southwest 1827.
## 3 19 male 24.6 1 no southwest 1837.
## 4 56 male 40.3 0 no southwest 10602.
## 5 30 male 35.3 0 yes southwest 36837.
## 6 30 female 32.4 1 no southwest 4150.
## 7 31 male 36.3 2 yes southwest 38711
## 8 22 male 35.6 0 yes southwest 35586.
## 9 19 female 28.6 5 no southwest 4688.
## 10 28 male 36.4 1 yes southwest 51195.
## # ... with 315 more rows</code></pre>
<div class="sourceCode" id="cb239"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb239-1" title="1"><span class="kw">filter</span>(claim_data, region <span class="op">==</span><span class="st"> "southwest"</span>, age <span class="op">>=</span><span class="st"> </span><span class="dv">30</span>)</a></code></pre></div>
<pre><code>## # A tibble: 226 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 56 male 40.3 0 no southwest 10602.
## 2 30 male 35.3 0 yes southwest 36837.
## 3 30 female 32.4 1 no southwest 4150.
## 4 31 male 36.3 2 yes southwest 38711
## 5 60 male 39.9 0 yes southwest 48173.
## 6 55 male 37.3 0 no southwest 20630.
## 7 48 male 28 1 yes southwest 23568.
## 8 61 female 39.1 2 no southwest 14235.
## 9 53 female 28.1 3 no southwest 11742.
## 10 44 male 27.4 2 no southwest 7727.
## # ... with 216 more rows</code></pre>
<p>Conditions separated by commas are combined with a logical and, so all of them have to be satisfied. If we want a logical or, we have to combine the conditions with the <code>|</code> operator.</p>
<div class="sourceCode" id="cb241"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb241-1" title="1"><span class="kw">filter</span>(claim_data, region <span class="op">==</span><span class="st"> "southwest"</span> <span class="op">|</span><span class="st"> </span>region <span class="op">==</span><span class="st"> "northwest"</span>)</a></code></pre></div>
<pre><code>## # A tibble: 650 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 19 female 27.9 0 yes southwest 16885.
## 2 33 male 22.7 0 no northwest 21984.
## 3 32 male 28.9 0 no northwest 3867.
## 4 37 female 27.7 3 no northwest 7282.
## 5 60 female 25.8 0 no northwest 28923.
## 6 23 male 34.4 0 no southwest 1827.
## 7 19 male 24.6 1 no southwest 1837.
## 8 56 male 40.3 0 no southwest 10602.
## 9 30 male 35.3 0 yes southwest 36837.
## 10 30 female 32.4 1 no southwest 4150.
## # ... with 640 more rows</code></pre>
<p>The same can be achieved with the <code>%in%</code> operator.</p>
<div class="sourceCode" id="cb243"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb243-1" title="1"><span class="kw">filter</span>(claim_data, region <span class="op">%in%</span><span class="st"> </span><span class="kw">c</span>(<span class="st">"southwest"</span>, <span class="st">"northwest"</span>))</a></code></pre></div>
<pre><code>## # A tibble: 650 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 19 female 27.9 0 yes southwest 16885.
## 2 33 male 22.7 0 no northwest 21984.
## 3 32 male 28.9 0 no northwest 3867.
## 4 37 female 27.7 3 no northwest 7282.
## 5 60 female 25.8 0 no northwest 28923.
## 6 23 male 34.4 0 no southwest 1827.
## 7 19 male 24.6 1 no southwest 1837.
## 8 56 male 40.3 0 no southwest 10602.
## 9 30 male 35.3 0 yes southwest 36837.
## 10 30 female 32.4 1 no southwest 4150.
## # ... with 640 more rows</code></pre>
<p>For example, let’s say we are interested in doing further analysis on people aged 30 or older who live in the south. We can construct a new tibble that keeps only these rows.</p>
<div class="sourceCode" id="cb245"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb245-1" title="1">claim_df <-<span class="st"> </span><span class="kw">filter</span>(claim_data, region <span class="op">%in%</span><span class="st"> </span><span class="kw">c</span>(<span class="st">"southwest"</span>, <span class="st">"southeast"</span>), </a>
<a class="sourceLine" id="cb245-2" title="2"> age <span class="op">>=</span><span class="st"> </span><span class="dv">30</span>)</a></code></pre></div>
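<p>The comma-separated conditions used in this section behave like conditions joined with <code>&</code>. A small sketch on a made-up tibble (the tibble <code>toy</code> and its values are invented for illustration and are not part of the insurance data):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
# A made-up tibble, used only for this illustration
toy <- tibble(region = c("southwest", "southeast", "northwest", "southwest"),
              age    = c(25, 40, 35, 52))
with_comma <- filter(toy, region == "southwest", age >= 30)
with_and   <- filter(toy, region == "southwest" & age >= 30)
nrow(with_comma) # 1, only the last row satisfies both conditions
all.equal(as.data.frame(with_comma), as.data.frame(with_and)) # TRUE</code></pre></div>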
</div>
<div id="arrange" class="section level2">
<h2>Arrange</h2>
<p>To sort the rows we use dplyr’s function <code>arrange()</code>, which takes a tibble and the variables to sort by. The default order is ascending. For a descending order, we wrap the variable in <code>desc()</code>.</p>
<div class="sourceCode" id="cb246"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb246-1" title="1"><span class="kw">arrange</span>(claim_df, age)</a></code></pre></div>
<pre><code>## # A tibble: 475 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 30 male 35.3 0 yes southwest 36837.
## 2 30 female 32.4 1 no southwest 4150.
## 3 30 male 35.5 0 yes southeast 36950.
## 4 30 female 30.9 3 no southwest 5326.
## 5 30 female 33.3 1 no southeast 4151.
## 6 30 female 27.7 0 no southwest 3554.
## 7 30 female 28.4 1 yes southeast 19522.
## 8 30 female 43.1 2 no southeast 4754.
## 9 30 male 37.8 2 yes southwest 39241.
## 10 30 male 31.4 1 no southwest 3659.
## # ... with 465 more rows</code></pre>
<div class="sourceCode" id="cb248"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb248-1" title="1"><span class="kw">arrange</span>(claim_df, age, <span class="kw">desc</span>(charges))</a></code></pre></div>
<pre><code>## # A tibble: 475 x 7
## age sex bmi children smoker region charges
## <int> <fct> <dbl> <int> <fct> <fct> <dbl>
## 1 30 female 39.0 3 yes southeast 40932.
## 2 30 male 37.8 2 yes southwest 39241.
## 3 30 male 35.5 0 yes southeast 36950.
## 4 30 male 35.3 0 yes southwest 36837.
## 5 30 female 28.4 1 yes southeast 19522.
## 6 30 male 38.8 1 no southeast 18963.
## 7 30 male 24.4 3 yes southwest 18259.
## 8 30 female 30.9 3 no southwest 5326.
## 9 30 male 31.6 3 no southeast 4838.
## 10 30 female 43.1 2 no southeast 4754.
## # ... with 465 more rows</code></pre>
</div>
<div id="select" class="section level2">
<h2>Select</h2>
<p>In our current data set we have a relatively small number of columns, so working with our tibble is not too complicated. However, we often encounter data sets with large numbers of columns. In such situations, we might want to select a subset of columns. For that we have the function <code>select</code>.</p>
<p>To select certain columns, pass their names to <code>select()</code>.</p>
<div class="sourceCode" id="cb250"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb250-1" title="1"><span class="kw">select</span>(claim_df, age, sex)</a></code></pre></div>
<pre><code>## # A tibble: 475 x 2
## age sex
## <int> <fct>
## 1 31 female
## 2 46 female
## 3 62 female
## 4 56 female
## 5 56 male
## 6 30 male
## 7 30 female
## 8 59 female
## 9 31 male
## 10 60 male
## # ... with 465 more rows</code></pre>
<p>We can also select all columns between two columns with a colon. Using a minus sign will select all columns except the ones in the expression.</p>
<div class="sourceCode" id="cb252"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb252-1" title="1"><span class="kw">select</span>(claim_df, bmi<span class="op">:</span>region)</a></code></pre></div>
<pre><code>## # A tibble: 475 x 4
## bmi children smoker region
## <dbl> <int> <fct> <fct>
## 1 25.7 0 no southeast
## 2 33.4 1 no southeast
## 3 26.3 0 yes southeast
## 4 39.8 0 no southeast
## 5 40.3 0 no southwest
## 6 35.3 0 yes southwest
## 7 32.4 1 no southwest
## 8 27.7 3 no southeast
## 9 36.3 2 yes southwest
## 10 39.9 0 yes southwest
## # ... with 465 more rows</code></pre>
<div class="sourceCode" id="cb254"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb254-1" title="1"><span class="kw">select</span>(claim_df, <span class="op">-</span>(bmi<span class="op">:</span>region))</a></code></pre></div>
<pre><code>## # A tibble: 475 x 3
## age sex charges
## <int> <fct> <dbl>
## 1 31 female 3757.
## 2 46 female 8241.
## 3 62 female 27809.
## 4 56 female 11091.
## 5 56 male 10602.
## 6 30 male 36837.
## 7 30 female 4150.
## 8 59 female 14001.
## 9 31 male 38711
## 10 60 male 48173.
## # ... with 465 more rows</code></pre>
<p>There are several utility functions that let us select columns based on their names, for example <code>ends_with</code>, <code>starts_with</code>, or <code>contains</code>.</p>
<div class="sourceCode" id="cb256"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb256-1" title="1"><span class="kw">select</span>(claim_df, <span class="kw">starts_with</span>(<span class="st">"c"</span>))</a></code></pre></div>
<pre><code>## # A tibble: 475 x 2
## children charges
## <int> <dbl>
## 1 0 3757.
## 2 1 8241.
## 3 0 27809.
## 4 0 11091.
## 5 0 10602.
## 6 0 36837.
## 7 1 4150.
## 8 3 14001.
## 9 2 38711
## 10 0 48173.
## # ... with 465 more rows</code></pre>
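<p>The other helpers work the same way. Below is a short sketch on a made-up tibble whose column names mirror the insurance data. The tibble <code>toy</code> and its values are invented for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
# A made-up tibble, used only for this illustration
toy <- tibble(age = c(31, 46), bmi = c(25.7, 33.4),
              children = c(0L, 1L), charges = c(3757, 8241))
names(select(toy, ends_with("s")))  # "charges"
names(select(toy, contains("ch")))  # "children" "charges"
names(select(toy, -starts_with("c"))) # "age" "bmi"</code></pre></div>
<p>As the last line shows, the helpers can be combined with the minus sign to drop the matching columns.</p>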
</div>
<div id="mutate" class="section level2">
<h2>Mutate</h2>
<p>To create new variables in the data frame, dependent on the existing variables, we can use the <code>mutate()</code> function. For example, let’s create a new variable, which will consist of charges per insured person.</p>
<div class="sourceCode" id="cb258"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb258-1" title="1">claim_df <-<span class="st"> </span><span class="kw">mutate</span>(claim_df, <span class="dt">charges_per_person =</span> charges <span class="op">/</span><span class="st"> </span>(children <span class="op">+</span><span class="st"> </span><span class="dv">1</span>))</a>
<a class="sourceLine" id="cb258-2" title="2">claim_df</a></code></pre></div>
<pre><code>## # A tibble: 475 x 8
## age sex bmi children smoker region charges charges_per_person
## <int> <fct> <dbl> <int> <fct> <fct> <dbl> <dbl>
## 1 31 female 25.7 0 no southeast 3757. 3757.
## 2 46 female 33.4 1 no southeast 8241. 4120.
## 3 62 female 26.3 0 yes southeast 27809. 27809.
## 4 56 female 39.8 0 no southeast 11091. 11091.
## 5 56 male 40.3 0 no southwest 10602. 10602.
## 6 30 male 35.3 0 yes southwest 36837. 36837.
## 7 30 female 32.4 1 no southwest 4150. 2075.
## 8 59 female 27.7 3 no southeast 14001. 3500.
## 9 31 male 36.3 2 yes southwest 38711 12904.
## 10 60 male 39.9 0 yes southwest 48173. 48173.
## # ... with 465 more rows</code></pre>
<p>We can also use our own functions when creating new variables. For example, let us create a new variable which classifies the insured according to the standard BMI categories.</p>
<div class="sourceCode" id="cb260"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb260-1" title="1">classify_bmi <-<span class="st"> </span><span class="cf">function</span> (bmi) {</a>
<a class="sourceLine" id="cb260-2" title="2"> bmi_classes <-<span class="st"> </span><span class="kw">rep</span>(<span class="st">"underweight"</span>, <span class="dt">times =</span> <span class="kw">length</span>(bmi))</a>
<a class="sourceLine" id="cb260-3" title="3"> bmi_classes[bmi <span class="op">>=</span><span class="st"> </span><span class="fl">18.5</span> <span class="op">&</span><span class="st"> </span>bmi <span class="op"><</span><span class="st"> </span><span class="dv">25</span>] <-<span class="st"> "normal"</span></a>
<a class="sourceLine" id="cb260-4" title="4"> bmi_classes[bmi <span class="op">>=</span><span class="st"> </span><span class="dv">25</span>] <-<span class="st"> "overweight"</span></a>
<a class="sourceLine" id="cb260-5" title="5"> bmi_classes <-<span class="st"> </span><span class="kw">factor</span>(bmi_classes, <span class="dt">levels =</span> <span class="kw">c</span>(<span class="st">"underweight"</span>, </a>
<a class="sourceLine" id="cb260-6" title="6"> <span class="st">"normal"</span>, </a>
<a class="sourceLine" id="cb260-7" title="7"> <span class="st">"overweight"</span>),</a>
<a class="sourceLine" id="cb260-8" title="8"> <span class="dt">ordered =</span> <span class="ot">TRUE</span>)</a>
<a class="sourceLine" id="cb260-9" title="9"> <span class="kw">return</span>(bmi_classes)</a>
<a class="sourceLine" id="cb260-10" title="10">}</a>
<a class="sourceLine" id="cb260-11" title="11">claim_df <-<span class="st"> </span><span class="kw">mutate</span>(claim_df, <span class="dt">bmi_class =</span> <span class="kw">classify_bmi</span>(bmi))</a>
<a class="sourceLine" id="cb260-12" title="12">claim_df</a></code></pre></div>
<pre><code>## # A tibble: 475 x 9
## age sex bmi children smoker region charges charges_per_per~
## <int> <fct> <dbl> <int> <fct> <fct> <dbl> <dbl>
## 1 31 fema~ 25.7 0 no south~ 3757. 3757.
## 2 46 fema~ 33.4 1 no south~ 8241. 4120.
## 3 62 fema~ 26.3 0 yes south~ 27809. 27809.
## 4 56 fema~ 39.8 0 no south~ 11091. 11091.
## 5 56 male 40.3 0 no south~ 10602. 10602.
## 6 30 male 35.3 0 yes south~ 36837. 36837.
## 7 30 fema~ 32.4 1 no south~ 4150. 2075.
## 8 59 fema~ 27.7 3 no south~ 14001. 3500.
## 9 31 male 36.3 2 yes south~ 38711 12904.
## 10 60 male 39.9 0 yes south~ 48173. 48173.
## # ... with 465 more rows, and 1 more variable: bmi_class <ord></code></pre>
<p>The tibble is too wide to show all variables. Let us use select to check the values of our new variable.</p>
<div class="sourceCode" id="cb262"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb262-1" title="1"><span class="kw">select</span>(claim_df, bmi, bmi_class)</a></code></pre></div>
<pre><code>## # A tibble: 475 x 2
## bmi bmi_class
## <dbl> <ord>
## 1 25.7 overweight
## 2 33.4 overweight
## 3 26.3 overweight
## 4 39.8 overweight
## 5 40.3 overweight
## 6 35.3 overweight
## 7 32.4 overweight
## 8 27.7 overweight
## 9 36.3 overweight
## 10 39.9 overweight
## # ... with 465 more rows</code></pre>
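<p>The same classification can also be written with base R’s <code>cut()</code>. This is only an alternative sketch, not the approach used above. The name <code>classify_bmi_cut</code> is ours.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical alternative to classify_bmi(), based on cut().
classify_bmi_cut <- function (bmi) {
  cut(bmi,
      breaks = c(-Inf, 18.5, 25, Inf),
      labels = c("underweight", "normal", "overweight"),
      right = FALSE,          # intervals are closed on the left: [18.5, 25)
      ordered_result = TRUE)  # return an ordered factor
}
classify_bmi_cut(c(17, 18.5, 24.9, 25, 30))
# underweight, normal, normal, overweight, overweight</code></pre></div>
<p>Setting <code>right = FALSE</code> makes a BMI of exactly 18.5 fall into the normal class and exactly 25 into the overweight class, which matches the comparisons in <code>classify_bmi()</code>.</p>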
</div>
<div id="summarise" class="section level2">
<h2>Summarise</h2>
<p>The <code>summarise()</code> function aggregates the data, optionally within groups. The groups are defined with the function <code>group_by()</code>. If no groups are defined, the data are aggregated over the whole tibble.</p>
<div class="sourceCode" id="cb264"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb264-1" title="1"><span class="kw">summarise</span>(claim_df, <span class="dt">mean_age =</span> <span class="kw">mean</span>(age), <span class="dt">mean_charges =</span> <span class="kw">mean</span>(charges))</a></code></pre></div>
<pre><code>## # A tibble: 1 x 2
## mean_age mean_charges
## <dbl> <dbl>
## 1 46.7 15341.</code></pre>
<p>To get something more meaningful, we first need to group the data. For example, let us look at the mean charges, dependent on whether the insured is a smoker and on their BMI class.</p>
<div class="sourceCode" id="cb266"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb266-1" title="1">g_data <-<span class="st"> </span><span class="kw">group_by</span>(claim_df, smoker, bmi_class)</a>
<a class="sourceLine" id="cb266-2" title="2"><span class="kw">summarise</span>(g_data, <span class="dt">mean_charges =</span> <span class="kw">mean</span>(charges))</a></code></pre></div>
<pre><code>## # A tibble: 5 x 3
## # Groups: smoker [?]
## smoker bmi_class mean_charges
## <fct> <ord> <dbl>
## 1 no normal 10454.
## 2 no overweight 9931.
## 3 yes underweight 19023.
## 4 yes normal 20420.
## 5 yes overweight 38326.</code></pre>
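<p>A single call to <code>summarise()</code> can compute several statistics per group. Below is a sketch on a made-up tibble (<code>toy</code> and its values are invented for illustration), where dplyr’s <code>n()</code> counts the rows in each group.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
# A made-up tibble, used only for this illustration
toy <- tibble(smoker  = c("no", "no", "yes", "yes", "yes"),
              charges = c(1000, 3000, 20000, 30000, 40000))
g_toy <- group_by(toy, smoker)
res <- summarise(g_toy,
                 n = n(),                       # number of rows in the group
                 mean_charges = mean(charges),  # 2000 for "no", 30000 for "yes"
                 max_charges = max(charges))    # 3000 for "no", 40000 for "yes"
res</code></pre></div>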
</div>
<div id="the-pipe" class="section level2">
<h2>The pipe</h2>
<p>To arrive at the above results we made several changes to the original data set. However, we can use the pipe <code>%>%</code> to chain all these calls sequentially, without creating an additional data set or changing the original.</p>
<p>Let us demonstrate how to get the same result as above using the pipe, starting from the original <code>claim_data</code>.</p>
<div class="sourceCode" id="cb268"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb268-1" title="1">claim_data <span class="op">%>%</span></a>
<a class="sourceLine" id="cb268-2" title="2"><span class="st"> </span><span class="kw">filter</span>(age <span class="op">>=</span><span class="st"> </span><span class="dv">30</span>, region <span class="op">%in%</span><span class="st"> </span><span class="kw">c</span>(<span class="st">"southwest"</span>, <span class="st">"southeast"</span>)) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb268-3" title="3"><span class="st"> </span><span class="kw">mutate</span>(<span class="dt">bmi_class =</span> <span class="kw">classify_bmi</span>(bmi)) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb268-4" title="4"><span class="st"> </span><span class="kw">group_by</span>(smoker, bmi_class) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb268-5" title="5"><span class="st"> </span><span class="kw">summarise</span>(<span class="dt">mean_charges =</span> <span class="kw">mean</span>(charges))</a></code></pre></div>
<pre><code>## # A tibble: 5 x 3
## # Groups: smoker [?]
## smoker bmi_class mean_charges
## <fct> <ord> <dbl>
## 1 no normal 10454.
## 2 no overweight 9931.
## 3 yes underweight 19023.
## 4 yes normal 20420.
## 5 yes overweight 38326.</code></pre>
<p>To count the number of cases in each group, use <code>count()</code>.</p>
<div class="sourceCode" id="cb270"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb270-1" title="1">claim_data <span class="op">%>%</span></a>
<a class="sourceLine" id="cb270-2" title="2"><span class="st"> </span><span class="kw">filter</span>(age <span class="op">>=</span><span class="st"> </span><span class="dv">30</span>, region <span class="op">%in%</span><span class="st"> </span><span class="kw">c</span>(<span class="st">"southwest"</span>, <span class="st">"southeast"</span>)) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb270-3" title="3"><span class="st"> </span><span class="kw">mutate</span>(<span class="dt">bmi_class =</span> <span class="kw">classify_bmi</span>(bmi)) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb270-4" title="4"><span class="st"> </span><span class="kw">group_by</span>(smoker, bmi_class) <span class="op">%>%</span></a>
<a class="sourceLine" id="cb270-5" title="5"><span class="st"> </span><span class="kw">count</span>()</a></code></pre></div>
<pre><code>## # A tibble: 5 x 3
## # Groups: smoker, bmi_class [5]
## smoker bmi_class n
## <fct> <ord> <int>
## 1 no normal 35
## 2 no overweight 340
## 3 yes underweight 1
## 4 yes normal 15
## 5 yes overweight 84</code></pre>
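<p>The pipe passes the result on its left as the first argument of the function on its right, so a piped chain is equivalent to nested function calls. A sketch on a made-up tibble (<code>toy</code> and its values are invented for illustration):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
# A made-up tibble, used only for this illustration
toy <- tibble(smoker  = c("no", "yes", "yes"),
              charges = c(1000, 20000, 30000))
# Nested calls are read from the inside out
nested <- summarise(group_by(filter(toy, charges > 500), smoker),
                    mean_charges = mean(charges))
# The piped version is read from top to bottom
piped <- toy %>%
  filter(charges > 500) %>%
  group_by(smoker) %>%
  summarise(mean_charges = mean(charges))
all.equal(as.data.frame(nested), as.data.frame(piped)) # TRUE</code></pre></div>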
</div>
</div>
<div id="long-and-wide-data-formats" class="section level1">
<h1>Long and wide data formats</h1>
<p>Usually we encounter data in a wide format. In a wide format each row represents an object, some columns hold identifiers of this object, and several columns contain measurements associated with it. In a long format, on the other hand, each row represents a single measurement: the columns that contain object identifiers remain unchanged, but we get a new row for each of the measured values. The long format is usually easier to process, while the wide format is easier to comprehend. Also, several R functions (for example <code>ggplot</code>) typically expect data in a long format.</p>
<p>The functions for conversion between the formats in <strong>tidyr</strong> are <code>gather</code> (wide to long) and <code>spread</code> (long to wide). Let us look at how to use them on stock market data (acquired from the R package <strong>datasets</strong>). Here we have the daily closing prices of four major European stock indices between the years 1991 and 1998. Each row represents an object, namely the day on which the closing prices were recorded. Then we have four measurements (prices). This data frame is therefore in a wide format. Let us convert it to a long format, and then back to wide, to see how to use <code>gather</code> and <code>spread</code>.</p>
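<p>To make the difference concrete, here is a sketch that writes the same made-up measurements in both formats by hand. The data frames <code>wide</code> and <code>long</code>, their column names, and their values are invented for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Wide format: one row per day, one column per measured index
wide <- data.frame(day = c(1, 2),
                   DAX = c(1629, 1614),
                   SMI = c(1678, 1688))
# Long format: one row per measurement, the identifier (day) is repeated
long <- data.frame(day   = c(1, 2, 1, 2),
                   stock = c("DAX", "DAX", "SMI", "SMI"),
                   price = c(1629, 1614, 1678, 1688))
nrow(wide) # 2
nrow(long) # 4, one row for each of the 2 days x 2 indices</code></pre></div>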
<div class="sourceCode" id="cb272"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb272-1" title="1"><span class="kw">library</span>(tidyr)</a>
<a class="sourceLine" id="cb272-2" title="2">stock_df <-<span class="st"> </span>datasets<span class="op">::</span>EuStockMarkets</a>
<a class="sourceLine" id="cb272-3" title="3">stock_df <-<span class="st"> </span><span class="kw">as_tibble</span>(<span class="kw">data.frame</span>(<span class="dt">X =</span> <span class="kw">as.matrix</span>(stock_df), <span class="dt">time=</span><span class="kw">time</span>(stock_df)))</a>
<a class="sourceLine" id="cb272-4" title="4">stock_df</a></code></pre></div>
<pre><code>## # A tibble: 1,860 x 5
## X.DAX X.SMI X.CAC X.FTSE time
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1629. 1678. 1773. 2444. 1991.
## 2 1614. 1688. 1750. 2460. 1992.
## 3 1607. 1679. 1718 2448. 1992.
## 4 1621. 1684. 1708. 2470. 1992.
## 5 1618. 1687. 1723. 2485. 1992.
## 6 1611. 1672. 1714. 2467. 1992.
## 7 1631. 1683. 1734. 2488. 1992.
## 8 1640. 1704. 1757. 2508. 1992.
## 9 1635. 1698. 1754 2510. 1992.
## 10 1646. 1716. 1754. 2497. 1992.
## # ... with 1,850 more rows</code></pre>
<div class="sourceCode" id="cb274"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb274-1" title="1">df_long <-<span class="st"> </span><span class="kw">gather</span>(stock_df, <span class="dt">key =</span> <span class="st">"stock"</span>, <span class="dt">value =</span> <span class="st">"price"</span>, <span class="op">-</span>time)</a>
<a class="sourceLine" id="cb274-2" title="2">df_long</a></code></pre></div>
<pre><code>## # A tibble: 7,440 x 3
## time stock price
## <dbl> <chr> <dbl>
## 1 1991. X.DAX 1629.
## 2 1992. X.DAX 1614.
## 3 1992. X.DAX 1607.