<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta http-equiv="X-UA-Compatible" content="IE=EDGE" />
<title>ch-010-core-r.knit</title>
<script src="site_libs/header-attrs-2.11/header-attrs.js"></script>
<script src="site_libs/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="site_libs/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script>
<style>h1 {font-size: 34px;}
h1.title {font-size: 38px;}
h2 {font-size: 30px;}
h3 {font-size: 24px;}
h4 {font-size: 18px;}
h5 {font-size: 16px;}
h6 {font-size: 12px;}
code {color: inherit; background-color: rgba(0, 0, 0, 0.04);}
pre:not([class]) { background-color: white }</style>
<script src="site_libs/navigation-1.1/tabsets.js"></script>
<link href="site_libs/highlightjs-9.12.0/default.css" rel="stylesheet" />
<script src="site_libs/highlightjs-9.12.0/highlight.js"></script>
<link href="site_libs/font-awesome-5.1.0/css/all.css" rel="stylesheet" />
<link href="site_libs/font-awesome-5.1.0/css/v4-shims.css" rel="stylesheet" />
<style type="text/css">
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
</style>
<style type="text/css">code{white-space: pre;}</style>
<script type="text/javascript">
if (window.hljs) {
hljs.configure({languages: []});
hljs.initHighlightingOnLoad();
if (document.readyState && document.readyState === "complete") {
window.setTimeout(function() { hljs.initHighlighting(); }, 0);
}
}
</script>
<link rel="stylesheet" href="textbook.css" type="text/css" />
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
img {
max-width:100%;
}
.tabbed-pane {
padding-top: 12px;
}
.html-widget {
margin-bottom: 20px;
}
button.code-folding-btn:focus {
outline: none;
}
summary {
display: list-item;
}
pre code {
padding: 0;
}
</style>
<style type="text/css">
.dropdown-submenu {
position: relative;
}
.dropdown-submenu>.dropdown-menu {
top: 0;
left: 100%;
margin-top: -6px;
margin-left: -1px;
border-radius: 0 6px 6px 6px;
}
.dropdown-submenu:hover>.dropdown-menu {
display: block;
}
.dropdown-submenu>a:after {
display: block;
content: " ";
float: right;
width: 0;
height: 0;
border-color: transparent;
border-style: solid;
border-width: 5px 0 5px 5px;
border-left-color: #cccccc;
margin-top: 5px;
margin-right: -10px;
}
.dropdown-submenu:hover>a:after {
border-left-color: #adb5bd;
}
.dropdown-submenu.pull-left {
float: none;
}
.dropdown-submenu.pull-left>.dropdown-menu {
left: -100%;
margin-left: 10px;
border-radius: 6px 0 6px 6px;
}
</style>
<script type="text/javascript">
// manage active state of menu based on current page
$(document).ready(function () {
// active menu anchor
href = window.location.pathname
href = href.substr(href.lastIndexOf('/') + 1)
if (href === "")
href = "index.html";
var menuAnchor = $('a[href="' + href + '"]');
// mark it active
menuAnchor.tab('show');
// if it's got a parent navbar menu mark it active as well
menuAnchor.closest('li.dropdown').addClass('active');
// Navbar adjustments
var navHeight = $(".navbar").first().height() + 15;
var style = document.createElement('style');
var pt = "padding-top: " + navHeight + "px; ";
var mt = "margin-top: -" + navHeight + "px; ";
var css = "";
// offset scroll position for anchor links (for fixed navbar)
for (var i = 1; i <= 6; i++) {
css += ".section h" + i + "{ " + pt + mt + "}\n";
}
style.innerHTML = "body {" + pt + "padding-bottom: 40px; }\n" + css;
document.head.appendChild(style);
});
</script>
<!-- tabsets -->
<style type="text/css">
.tabset-dropdown > .nav-tabs {
display: inline-table;
max-height: 500px;
min-height: 44px;
overflow-y: auto;
border: 1px solid #ddd;
border-radius: 4px;
}
.tabset-dropdown > .nav-tabs > li.active:before {
content: "";
font-family: 'Glyphicons Halflings';
display: inline-block;
padding: 10px;
border-right: 1px solid #ddd;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open > li.active:before {
content: "";
border: none;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open:before {
content: "";
font-family: 'Glyphicons Halflings';
display: inline-block;
padding: 10px;
border-right: 1px solid #ddd;
}
.tabset-dropdown > .nav-tabs > li.active {
display: block;
}
.tabset-dropdown > .nav-tabs > li > a,
.tabset-dropdown > .nav-tabs > li > a:focus,
.tabset-dropdown > .nav-tabs > li > a:hover {
border: none;
display: inline-block;
border-radius: 4px;
background-color: transparent;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open > li {
display: block;
float: none;
}
.tabset-dropdown > .nav-tabs > li {
display: none;
}
</style>
<!-- code folding -->
</head>
<body>
<div class="container-fluid main-container">
<div class="navbar navbar-inverse navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="index.html">DATA SCIENCE I</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://ds4ps.org/dp4ss/">
<span class="fa fa-university fa-2x"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
<div id="header">
</div>
<div id="TOC">
<ul>
<li><a href="#the-r-language" id="toc-the-r-language"><span class="toc-section-number">1</span> The <strong>R</strong> Language</a>
<ul>
<li><a href="#key-concepts" id="toc-key-concepts"><span class="toc-section-number">1.1</span> Key Concepts</a></li>
<li><a href="#r-an-open-source-language-for-statistical-computing" id="toc-r-an-open-source-language-for-statistical-computing"><span class="toc-section-number">1.2</span> R: An Open Source Language for Statistical Computing</a></li>
<li><a href="#r-as-a-social-network" id="toc-r-as-a-social-network"><span class="toc-section-number">1.3</span> R as a Social Network</a></li>
<li><a href="#r-as-an-operating-system" id="toc-r-as-an-operating-system"><span class="toc-section-number">1.4</span> R as an Operating System</a></li>
<li><a href="#downloading-installing-r" id="toc-downloading-installing-r"><span class="toc-section-number">1.5</span> Downloading & Installing R</a></li>
<li><a href="#r-console" id="toc-r-console"><span class="toc-section-number">1.6</span> R Console</a></li>
<li><a href="#extending-rs-functionality-packages" id="toc-extending-rs-functionality-packages"><span class="toc-section-number">1.7</span> Extending R’s Functionality: Packages</a></li>
<li><a href="#resources" id="toc-resources"><span class="toc-section-number">1.8</span> Resources</a></li>
</ul></li>
</ul>
</div>
<div id="the-r-language" class="section level1" number="1">
<h1><span class="header-section-number">1</span> The <strong>R</strong> Language</h1>
<p><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/R_logo.svg/724px-R_logo.svg.png" width="30%" style="display: block; margin: auto auto auto 0;" /></p>
<p><br>
<br></p>
<div id="key-concepts" class="section level2 tip" number="1.1">
<h2><span class="header-section-number">1.1</span> Key Concepts</h2>
<p>R is a specialized programming language created by statisticians for data analysis and visualization.</p>
<ul>
<li>R Console</li>
<li>Base R</li>
<li>Packages</li>
<li>Comprehensive R Archive Network (CRAN)</li>
</ul>
<p><br>
<br></p>
</div>
<div id="r-an-open-source-language-for-statistical-computing" class="section level2" number="1.2">
<h2><span class="header-section-number">1.2</span> R: An Open Source Language for Statistical Computing</h2>
<p>R is a language that was designed for <strong>statistical computing</strong>, the art of combining computer science tools for problem-solving with models from statistics. The goal is to turn raw data into useful, actionable insights. This field has come to be known as <strong>Data Science</strong>.</p>
<p>R is an <strong>open source</strong> language, which means that applications built in R are not only free, but users are allowed to access and modify the <em>source code</em>.</p>
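<p>As a small illustration of what access to source code looks like in practice, the sketch below prints the code behind <code>sd()</code>, the standard deviation function that ships with R. The choice of <code>sd()</code> and the name <code>sd_source</code> are only for illustration; the same approach should work for most functions written in R.</p>
<pre class="r"><code># Typing a function's name without parentheses prints its source code.
sd

# The body of the function can also be captured as text for closer inspection.
sd_source &lt;- deparse( body( sd ) )
print( sd_source )</code></pre>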
<p>As a result of this design approach, it is extremely easy to develop and adapt code in R. Because of the freedom this provides, R users have expanded the power and functionality of <strong>Core R</strong> for nearly a quarter century.</p>
<p>Custom applications and tools that users create for R are called <strong>packages</strong> (also called <strong>libraries</strong> when you are loading them). Packages are programs designed to perform a specific type of analysis or visualization.</p>
<p>The best part of R is how easy it is to access cutting-edge software by installing new packages in two lines of code:</p>
<pre class="r"><code>install.packages( "tidyverse" ) # install the package
library( "tidyverse" ) # load the package</code></pre>
<p><br></p>
<p><strong>Popular R Packages:</strong> [ <a href="https://support.rstudio.com/hc/en-us/articles/201057987-Quick-list-of-useful-R-packages"><strong>A RECENT LIST</strong></a> ]</p>
<p><br></p>
</div>
<div id="r-as-a-social-network" class="section level2" number="1.3">
<h2><span class="header-section-number">1.3</span> R as a Social Network</h2>
<p>The R Foundation is a nonprofit that maintains the R language and ensures it remains free and accessible to everyone in the world. Packages are shared through the Comprehensive R Archive Network (<strong>CRAN</strong>), a group of servers housed primarily at universities that store R packages so they can be quickly downloaded and deployed.</p>
<p>There are over 18,000 packages that users have created for R and shared on CRAN. They perform a wide variety of tasks such as data preparation, specialized statistical analysis, custom data visualizations, or specific analytical tasks such as text analysis or network analysis.</p>
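<p>To get a feel for packages on your own machine, the sketch below lists what is already installed locally. It assumes nothing beyond a standard R installation; the name <code>local_pkgs</code> is made up for the example. The counterpart for CRAN, <code>available.packages()</code>, needs an internet connection and is left out here.</p>
<pre class="r"><code># A matrix with one row per package installed on this machine.
local_pkgs &lt;- installed.packages()
nrow( local_pkgs )                # how many packages are installed locally
head( rownames( local_pkgs ) )    # a few of their names

# Packages that ship with R itself are flagged with Priority "base".
rownames( local_pkgs )[ which( local_pkgs[ , "Priority" ] == "base" ) ]</code></pre>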
<p>This functionality is a primary reason R has become one of the most popular languages used by academics and data scientists. It provides a very simple way for people to develop cool tools and share them with the world. R became popular because it was built by smart and creative people, who attracted other smart and creative people, who created cool tools, which then attracted more smart and creative people.
<a href="http://gallery.shinyapps.io/087-crandash"><strong>R Package Downloads</strong></a></p>
<p><br></p>
</div>
<div id="r-as-an-operating-system" class="section level2" number="1.4">
<h2><span class="header-section-number">1.4</span> R as an Operating System</h2>
<p>R is a <strong>programming language</strong>. We can think of a programming language as a set of instructions that are evaluated and carried out by a computer. R, then, is simply one way to give instructions to computers.</p>
<p>This is a limited view of R, though. It is better understood as an operating system for data science software. Just as Windows allows you to turn on your computer, open a web browser, move files around, and write a paper using MS Word, R allows you to access CRAN, install and run packages, and manage files while organizing large data projects. Just as Windows would be a very boring piece of software without all of the applications you run on the computer, R would be a boring language without all of the packages it can run.</p>
<div class="figure" style="text-align: center">
<img src="figures/ch-001-image1.jpg" alt="R is both a programming language and programming environment." width="75%" />
<p class="caption">
<em>R is both a programming language and programming environment.</em>
</p>
</div>
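<p>The file-management side of this analogy can be sketched with a few base R functions. The folder name <code>demo_project</code> and the file name <code>cars.csv</code> are hypothetical, and the example works inside R's temporary directory so it does not touch your own files.</p>
<pre class="r"><code>getwd()    # the folder R is currently working in

# Create a scratch folder inside R's temporary directory.
project_dir &lt;- file.path( tempdir(), "demo_project" )    # hypothetical name
dir.create( project_dir, showWarnings = FALSE )

# Save a few rows of a built-in dataset, then list the folder's contents.
write.csv( head( cars ), file.path( project_dir, "cars.csv" ), row.names = FALSE )
list.files( project_dir )
file.exists( file.path( project_dir, "cars.csv" ) )</code></pre>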
<p><br></p>
</div>
<div id="downloading-installing-r" class="section level2" number="1.5">
<h2><span class="header-section-number">1.5</span> Downloading & Installing R</h2>
<p>You can download and install R quickly and easily from the <a href="https://cran.r-project.org/"><strong>Comprehensive R Archive Network</strong></a>, or <strong>CRAN</strong>. It is a decentralized website that’s hosted and updated by academic institutions all over the world. In other words, R would survive a semi-global catastrophic event. It contains:</p>
<ul>
<li>The latest version and past versions of R</li>
<li>Extensions, also called <strong>packages</strong>, for R</li>
<li>Package and version documentation</li>
<li>Books, blogs, conferences, news, etc.</li>
</ul>
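<p>Once R is installed, a quick way to confirm that it works is to ask it which version is running. The lines below are a minimal check; the version number <code>"3.0.0"</code> in the comparison is an arbitrary value chosen for illustration.</p>
<pre class="r"><code>R.version.string            # full version label of the running R installation
getRversion()               # the version number on its own
getRversion() &gt;= "3.0.0"    # version numbers can be compared directly
getOption( "repos" )        # the package repository R is set to download from</code></pre>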
<p><br></p>
</div>
<div id="r-console" class="section level2" number="1.6">
<h2><span class="header-section-number">1.6</span> R Console</h2>
<p>After installing R, when you open base R directly you will see the <strong>command-line interface</strong>, or <strong>console</strong>. Code typed into the console is evaluated directly by the environment, a process known as working <em>interactively</em>.</p>
<p>While this practice is a quick way to run some simple code, it is difficult to develop complex programs in real time (it would be like writing a play while it is being acted out). A more typical and organized way to create data recipes is through <strong>scripting</strong>, which we address below.</p>
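<p>To make working interactively concrete, here are a few lines you might type into the console one at a time. R evaluates each line as soon as you press Enter. The <code>heights</code> values are made-up data for illustration.</p>
<pre class="r"><code>2 + 2                             # arithmetic is evaluated immediately: 4
sqrt( 16 )                        # so are function calls: 4
heights &lt;- c( 150, 160, 170 )     # store some values under a name
mean( heights )                   # then reuse them on the next line: 160</code></pre>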
</div>
<div id="extending-rs-functionality-packages" class="section level2" number="1.7">
<h2><span class="header-section-number">1.7</span> Extending R’s Functionality: Packages</h2>
<p><strong>Packages</strong> are collections of new commands, a.k.a. <strong>functions</strong>, that are developed and shared by the worldwide R userbase. Packages greatly expand the power and functionality of <strong>base R</strong>, the “vanilla” or unmodified version of R. While <a href="https://cran.r-project.org/">CRAN</a> is the most popular package archive, others include <a href="http://www.bioconductor.org/">Bioconductor</a> and <a href="https://cran.r-project.org/web/packages/githubinstall/vignettes/githubinstall.html">GitHub</a>.</p>
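<p>To see that a package really is a collection of functions, the sketch below looks inside <code>stats</code>, one of the packages that ships with base R and is loaded by default. The name <code>stats_contents</code> is made up for the example.</p>
<pre class="r"><code># Names of the functions and other objects that the stats package provides.
stats_contents &lt;- ls( "package:stats" )
length( stats_contents )
head( stats_contents )

# The package::function form calls a function from one specific package.
stats::median( c( 3, 1, 2 ) )</code></pre>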
<p><br></p>
<p>If R were the Constitution of a nation, packages would be its amendments - they not only provide more freedom for the user, they address new ideas and practices that were unforeseen by R’s founders. More on packages:</p>
<ul>
<li>Functions and packages are developed in response to identified needs</li>
<li>If your needs are unmet by base R, there’s likely a package for it</li>
<li>Altogether, there are over 18,000 packages on CRAN alone</li>
<li>There are tens of thousands of unpublished packages</li>
<li>Entire ecosystems of packages exist, e.g., the <em>Tidyverse</em></li>
</ul>
<p><br></p>
<div class="figure" style="text-align: center">
<img src="figures/ch-001-image2.jpg" alt="Packages give users more freedom and resolve issues unforeseen by R's founders." width="75%" />
<p class="caption">
<em>Packages give users more freedom and resolve issues unforeseen by R’s founders.</em>
</p>
</div>
<p><br></p>
<p>You can <strong>install packages</strong> in R by calling the <code>install.packages()</code> function, <em>with the package name in quotations</em>:</p>
<pre class="r"><code>install.packages("my_package")</code></pre>
<p>Once installed, you can <strong>load packages</strong> by calling the <code>library()</code> function. Here the quotation marks are optional, so the package name is usually written <em>without quotations</em>.</p>
<p><strong>Note:</strong> You only need to install a package <em>once</em>. However, you must load each package <em>every time you start R</em>:</p>
<pre class="r"><code>library(my_package)</code></pre>
<p><strong>Note:</strong> “Packages” and “libraries” are two words for the same thing. They both refer to a set of <strong>functions</strong> that have been “packaged” or are organized into a “library” to be shared.</p>
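<p>Because installing happens once and loading happens every session, a common pattern is to install a package only if it is missing and then load it. The lines below are a minimal sketch of that idea, not the only way to do it. They use <code>stats</code>, which ships with R, so nothing is downloaded when you run them; swap in the name of any CRAN package you need.</p>
<pre class="r"><code>pkg &lt;- "stats"    # ships with R; replace with any package name

# requireNamespace() returns TRUE if the package is installed, FALSE if not.
if ( !requireNamespace( pkg, quietly = TRUE ) ) {
  install.packages( pkg )    # runs only when the package is missing
}

# character.only = TRUE tells library() that pkg holds the name as text.
library( pkg, character.only = TRUE )</code></pre>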
<p><br></p>
<div class="note">
<p><strong>Fun Fact:</strong> R is an implementation of an older programming language, <strong>S</strong>. John Chambers first developed S in 1976 to make statistical analysis an interactive and user-friendly process. Chambers’ underlying philosophy is still reflected in the use of R packages to this day:</p>
<blockquote>
<p>“We wanted users to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs became clearer and their sophistication increased, they should be able to slide gradually into programming, when the language and system aspects would become more important.”</p>
</blockquote>
</div>
<p><br></p>
</div>
<div id="resources" class="section level2" number="1.8">
<h2><span class="header-section-number">1.8</span> Resources</h2>
<p>There are many online and print resources introducing the R language. Here are a few that we find instructive:</p>
<p><br></p>
<p><strong>I) Full-Length Introductions to R:</strong></p>
<ul>
<li><a href="http://ds4ps.org/dp4ss-textbook/ch-000-introduction-to-r">“Part I: Foundations, Introduction to R”</a> (Lecy, 2018)</li>
<li><a href="https://rpubs.com/jamisoncrawford/nutsandbolts">“Intro to R: Nuts &amp; Bolts”</a> (Crawford, 2018)</li>
</ul>
<p><strong>II) Publications & Articles:</strong></p>
<ul>
<li><a href="https://www.r-project.org/about.html">“What is R? Introduction to R and the R Environment”</a> (CRAN, 2001)</li>
<li><a href="https://www.tandfonline.com/doi/abs/10.1080/10618600.1996.10474713">“R: A Language for Data Analysis and Graphics”</a> (Ihaka & Gentleman, 1996)</li>
</ul>
<p><strong>III) Handouts & Cheat Sheets:</strong></p>
<ul>
<li><a href="https://github.com/DS4PS/dp4ss-textbook/raw/master/resources/ch-001_handout1_r_vocabulary.pdf">“R: Some Helpful Vocabulary”</a> (Lecy, 2017)</li>
<li><a href="https://www.rstudio.com/wp-content/uploads/2016/05/base-r.pdf">“Base R Cheat Sheet”</a> (RStudio, 2016)</li>
</ul>
<p><strong>IV) Videos:</strong></p>
<ul>
<li><a href="https://player.vimeo.com/video/180644880">“R in 60 Seconds”</a> (Lecy, 2018)</li>
<li><a href="https://www.youtube.com/watch?time_continue=135&v=jk9S3RTAl38">“John Chambers Interview [On the History of S & R]”</a> (Statistical Learning, 2013)</li>
</ul>
<p><br></p>
</div>
</div>
<div class="footer">
<div class="row" align="center">
Notes for the <a href=http://ds4ps.org/ms-prog-eval-data-analytics/ target="_blank">MS in Program Evaluation and Data Analytics</a><br>
A program at <a href=https://asuonline.asu.edu/online-degree-programs/graduate/program-evaluation-and-data-analytics-ms/ target="_blank">Arizona State University</a><br>
Website powered by <a href=https://rmarkdown.rstudio.com/ target="_blank">R Markdown</a> and <a href=http://jekyllrb.com target="_blank">Jekyll</a>
<br>
<br>
</div>
</div>
</div>
<script>
// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
$('tr.odd').parent('tbody').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
bootstrapStylePandocTables();
});
</script>
<!-- tabsets -->
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
$(document).ready(function () {
$('.tabset-dropdown > .nav-tabs > li').click(function () {
$(this).parent().toggleClass('nav-tabs-open');
});
});
</script>
<!-- code folding -->
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>