<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: R for reproducible scientific analysis</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<h1 class="title">R for reproducible scientific analysis</h1>
<h2 class="subtitle">Reference</h2>
<h2 id="introduction-to-r-and-rstudio"><a href="01-rstudio-intro.html">Introduction to R and RStudio</a></h2>
<ul>
<li>Use the escape key to cancel incomplete commands or running code (Ctrl+C if you’re using R from the shell).</li>
<li>Basic arithmetic operations follow standard order of precedence:
<ul>
<li>Brackets: <code>(</code>, <code>)</code></li>
<li>Exponents: <code>^</code> or <code>**</code></li>
<li>Divide: <code>/</code></li>
<li>Multiply: <code>*</code></li>
<li>Add: <code>+</code></li>
<li>Subtract: <code>-</code></li>
</ul></li>
<li>Scientific notation is available, e.g. <code>2e-3</code></li>
<li>Anything to the right of a <code>#</code> is a comment; R will ignore it.</li>
<li>Functions are denoted by <code>function_name()</code>. Expressions inside the brackets are evaluated before being passed to the function, and functions can be nested.</li>
<li>Mathematical functions: <code>exp</code>, <code>sin</code>, <code>log</code>, <code>log10</code>, <code>log2</code> etc.</li>
<li>Comparison operators: <code><</code>, <code><=</code>, <code>></code>, <code>>=</code>, <code>==</code>, <code>!=</code></li>
<li>Use <code>all.equal</code> rather than <code>==</code> to compare floating-point numbers, since it allows for small rounding differences.</li>
<li><code><-</code> is the assignment operator. Anything to the right is evaluated, then stored in the variable named on the left.</li>
<li><code>ls</code> lists all variables and functions you’ve created</li>
<li><code>rm</code> can be used to remove them</li>
<li>When assigning values to function arguments, you <em>must</em> use <code>=</code>.</li>
</ul>
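<p>A minimal sketch of a few of these points in use. The variable names are made up for illustration and are not part of the lesson:</p>
<pre><code class="r"># Exponents bind tighter than multiplication, which binds tighter than addition
precedence <- 2 + 3 * 2^2        # 2 + (3 * 4) = 14
bracketed  <- (2 + 3) * 2^2      # 5 * 4 = 20

# Scientific notation: 2e-3 is 0.002
small <- 2e-3

# Floating-point arithmetic is inexact, so == can surprise you
exact_match <- (0.1 + 0.2 == 0.3)                 # FALSE
near_match  <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# Function arguments are named with =, and calls can be nested
rounded <- round(log(exp(2.5)), digits = 1)       # 2.5

ls()         # lists the variables created above
rm(small)    # removes one of them
</code></pre>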
<h2 id="project-management-with-rstudio"><a href="02-project-intro.html">Project management with RStudio</a></h2>
<ul>
<li>To create a new project, go to File -> New Project</li>
<li>Install the <code>packrat</code> package to create self-contained projects</li>
<li><code>install.packages</code> to install packages from CRAN</li>
<li><code>library</code> to load a package into R</li>
<li><code>packrat::status</code> to check whether all packages referenced in your scripts have been installed.</li>
</ul>
<h2 id="reading-data"><a href="03-reading-data.html">Reading data</a></h2>
<ul>
<li><code>read.table</code> to read in data in a regular structure</li>
<li><code>sep</code> argument to specify the separator
<ul>
<li><code>","</code> for comma separated</li>
<li><code>"\t"</code> for tab separated</li>
</ul></li>
<li>Other arguments:
<ul>
<li><code>header=TRUE</code> if there is a header row</li>
</ul></li>
</ul>
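<p>As a rough illustration, the sketch below writes a tiny tab-separated file to a temporary location and reads it back. The file contents and the names <code>tmp</code> and <code>dat</code> are invented for this example:</p>
<pre><code class="r"># Write a small tab-separated file to a temporary location (made-up data)
tmp <- tempfile(fileext = ".tsv")
writeLines(c("country\tyear\tpop",
             "Norway\t2007\t4627926",
             "Chile\t2007\t16284741"), tmp)

# header = TRUE uses the first row as column names; sep = "\t" splits on tabs
dat <- read.table(tmp, sep = "\t", header = TRUE)
dat
</code></pre>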
<h2 id="seeking-help"><a href="04-seeking-help.html">Seeking help</a></h2>
<ul>
<li><code>?</code> or <code>help()</code> to seek help for a function.</li>
<li><code>??</code> to search the help pages by keyword when you don’t know a function’s exact name.</li>
<li>Wrap special operators in quotes when searching for help: <code>help("+")</code>.</li>
<li><a href="http://cran.at.r-project.org/web/views">CRAN Task Views</a>.</li>
<li><a href="http://stackoverflow.com/">stackoverflow</a>.</li>
</ul>
<h2 id="data-structures"><a href="05-data-structures.html">Data structures</a></h2>
<p><strong>Basic data structures in R:</strong></p>
<ul>
<li>atomic <code>?vector</code> (can only contain one type)</li>
<li><code>?list</code> (containers for other objects)</li>
<li><code>?data.frame</code> two dimensional objects whose columns can contain different types of data</li>
<li><code>?matrix</code> two dimensional objects that can contain only one type of data.</li>
<li><code>?factor</code> vectors that contain predefined categorical data.</li>
<li><code>?array</code> multi-dimensional objects that can only contain one type of data</li>
</ul>
<p>Remember that matrices are really atomic vectors underneath the hood, and that data.frames are really lists underneath the hood (this explains some of the weirder behaviour of R).</p>
<p><strong>Data types:</strong></p>
<ul>
<li><code>?numeric</code> real (decimal) numbers</li>
<li><code>?integer</code> whole numbers only</li>
<li><code>?character</code> text</li>
<li><code>?complex</code> complex numbers</li>
<li><code>?logical</code> TRUE or FALSE values</li>
</ul>
<p><strong>Special types:</strong></p>
<ul>
<li><code>?NA</code> missing values</li>
<li><code>?NaN</code> “not a number” for undefined values (e.g. <code>0/0</code>).</li>
<li><code>?Inf</code>, <code>-Inf</code> infinity.</li>
<li><code>?NULL</code> a data structure that doesn’t exist</li>
</ul>
<p><code>NA</code> can occur in any atomic vector. <code>NaN</code> and <code>Inf</code> can only occur in complex or numeric (double) vectors; integer vectors cannot hold them. Atomic vectors are the building blocks for all other data structures. A <code>NULL</code> value stands in for an entire data structure that is absent, although it can also appear as an element of a list.</p>
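<p>The special values can be explored with a short sketch like the one below; the object names are hypothetical:</p>
<pre><code class="r">x <- c(1.5, NA, 3)
is.na(x)                  # FALSE TRUE FALSE
mean(x)                   # NA, because one value is missing
mean(x, na.rm = TRUE)     # 2.25

0 / 0                     # NaN
1 / 0                     # Inf
typeof(c(1L, NA))         # "integer": NA fits in any atomic vector

# NULL marks the absence of a structure, but can sit inside a list
empty <- NULL
length(empty)             # 0
holder <- list(a = 1, b = NULL)
length(holder)            # 2
</code></pre>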
<p><strong>Useful functions for querying data structures:</strong></p>
<ul>
<li><code>?str</code> structure, prints out a summary of the whole data structure</li>
<li><code>?typeof</code> tells you the type inside an atomic vector</li>
<li><code>?class</code> what is the data structure?</li>
<li><code>?head</code> print the first <code>n</code> elements (rows for two-dimensional objects)</li>
<li><code>?tail</code> print the last <code>n</code> elements (rows for two-dimensional objects)</li>
<li><code>?rownames</code>, <code>?colnames</code>, <code>?dimnames</code> retrieve or modify the row names and column names of an object.</li>
<li><code>?names</code> retrieve or modify the names of an atomic vector or list (or columns of a data.frame).</li>
<li><code>?length</code> get the number of elements in an atomic vector or list</li>
<li><code>?nrow</code>, <code>?ncol</code>, <code>?dim</code> get the dimensions of an n-dimensional object (these return <code>NULL</code> for plain atomic vectors and lists).</li>
</ul>
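<p>A small sketch of these query functions applied to a toy data.frame and matrix (both invented here):</p>
<pre><code class="r">df <- data.frame(id = 1:3, name = c("a", "b", "c"))
str(df)           # compact summary of the whole structure
class(df)         # "data.frame"
typeof(df)        # "list": data.frames are lists underneath
typeof(df$id)     # "integer"
dim(df)           # 3 2
names(df)         # "id" "name"
head(df, n = 2)   # first two rows
length(df)        # 2: the length of a data.frame is its number of columns

m <- matrix(1:6, nrow = 2)
typeof(m)         # "integer"
length(m)         # 6: a matrix is an atomic vector with dimensions
dim(1:6)          # NULL: plain vectors have no dim attribute
</code></pre>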
<h2 id="data-subsetting"><a href="06-data-subsetting.html">Data subsetting</a></h2>
<ul>
<li>Elements can be accessed by:
<ul>
<li>Index</li>
<li>Name</li>
</ul></li>
<li><code>:</code> to generate a sequence of numbers to extract slices</li>
<li><code>[</code> single square brackets:
<ul>
<li><em>extract</em> single elements or <em>subset</em> vectors</li>
<li><em>extract</em> single elements of a list</li>
<li><em>extract</em> columns from a data.frame</li>
</ul></li>
<li><code>[</code> with two arguments to:
<ul>
<li><em>extract</em> rows and/or columns of
<ul>
<li>matrices</li>
<li>data.frames</li>
</ul></li>
</ul></li>
<li><code>[[</code> double square brackets to subset lists</li>
<li><code>$</code> to access columns or list elements by name</li>
<li>negative indices skip elements</li>
</ul>
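<p>The subsetting operators above might be used as follows. This is only a sketch on made-up objects:</p>
<pre><code class="r">v <- c(a = 10, b = 20, c = 30, d = 40)
v[2]            # by index: b = 20
v["c"]          # by name:  c = 30
v[2:3]          # slice with a sequence
v[-1]           # a negative index skips the first element

lst <- list(num = 1:3, txt = "hello")
lst["num"]      # single brackets return a list of length one
lst[["num"]]    # double brackets return the element itself: 1 2 3
lst$txt         # "hello"

df <- data.frame(x = 1:3, y = c("p", "q", "r"))
df["x"]         # one-column data.frame
df[2, ]         # second row
df[, "y"]       # column y as a vector
df$y[3]         # third value of column y
</code></pre>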
<h2 id="writing-data"><a href="07-writing-data.html">Writing data</a></h2>
<ul>
<li><code>write.table</code> to write out objects in regular format</li>
<li>set <code>quote=FALSE</code> so that text isn’t wrapped in <code>"</code> marks</li>
</ul>
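<p>For example, a data.frame could be written out as comma-separated text roughly like this. The data, the temporary file and the extra <code>row.names = FALSE</code> argument are choices made for this sketch rather than something the lesson prescribes:</p>
<pre><code class="r">df <- data.frame(country = c("Norway", "Chile"), year = c(2007, 2007))
out <- tempfile(fileext = ".csv")

# quote = FALSE leaves text unquoted; row.names = FALSE drops the row labels
write.table(df, file = out, sep = ",", quote = FALSE, row.names = FALSE)

written <- readLines(out)
written
</code></pre>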
<h2 id="vectorisation"><a href="08-vectorisation.html">Vectorisation</a></h2>
<ul>
<li>Most functions and operations apply to each element of a vector</li>
<li><code>*</code> applies element-wise to matrices</li>
<li><code>%*%</code> for true matrix multiplication</li>
<li><code>any()</code> will return <code>TRUE</code> if any element of a vector is <code>TRUE</code></li>
<li><code>all()</code> will return <code>TRUE</code> if <em>all</em> elements of a vector are <code>TRUE</code></li>
</ul>
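<p>A brief sketch of vectorised operations on a toy vector and matrix:</p>
<pre><code class="r">x <- 1:4
x * 2                 # 2 4 6 8
x > 2                 # FALSE FALSE TRUE TRUE
any(x > 2)            # TRUE
all(x > 2)            # FALSE

m <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
elementwise <- m * m         # each cell squared
product <- m %*% m           # true matrix multiplication
elementwise
product
</code></pre>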
<h2 id="control-flow"><a href="09-control-flow.html">Control flow</a></h2>
<ul>
<li>Use <code>if</code> condition to start a conditional statement, <code>else if</code> condition to provide additional tests, and <code>else</code> to provide a default</li>
<li>The bodies of the branches of conditional statements are delimited by curly braces <code>{ }</code>; indentation is not required by R but makes the code easier to read.</li>
<li>Use <code>==</code> to test for equality.</li>
<li><code>X && Y</code> is only true if both X and Y are <code>TRUE</code>.</li>
<li><code>X || Y</code> is true if either X or Y, or both, are <code>TRUE</code>.</li>
<li>Zero is considered <code>FALSE</code>; all other numbers are considered <code>TRUE</code></li>
<li>Nest loops to operate on multi-dimensional data.</li>
</ul>
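<p>A minimal sketch of conditionals and nested loops. The function <code>classify</code> and the other names are hypothetical:</p>
<pre><code class="r">classify <- function(n) {
  if (n > 0) {
    "positive"
  } else if (n == 0) {
    "zero"
  } else {
    "negative"
  }
}
classify(-3)   # "negative"

# Both sides must be TRUE for the combined test to be TRUE
x <- 5
is_positive_odd <- x > 0 && x %% 2 == 1   # TRUE

# Nested loops visit every cell of a matrix
m <- matrix(1:6, nrow = 2)
total <- 0
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    total <- total + m[i, j]
  }
}
total   # 21
</code></pre>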
<h2 id="functions"><a href="10-functions.html">Functions</a></h2>
<ul>
<li>Put code whose parameters change frequently in a function, then call it with different parameter values to customize its behavior.</li>
<li>The value of the last expression evaluated in a function is returned, or you can use <code>return</code> explicitly.</li>
<li>Variables created in the body of a function are local to that function while it runs.</li>
<li>Document Why, then What, then lastly How (if the code isn’t self explanatory)</li>
</ul>
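<p>The two ways of returning a value, and the isolation of variables inside a function, can be sketched as follows. The function names are illustrative only:</p>
<pre><code class="r"># Convert Fahrenheit to Kelvin; the last expression is the return value
fahr_to_kelvin <- function(temp) {
  kelvin <- (temp - 32) * (5 / 9) + 273.15
  kelvin
}

# Same idea with an explicit return()
kelvin_to_celsius <- function(temp) {
  return(temp - 273.15)
}

freezing <- fahr_to_kelvin(32)                      # 273.15
boiling  <- kelvin_to_celsius(fahr_to_kelvin(212))  # approximately 100

# The variable kelvin existed only inside the function call
leaked <- exists("kelvin")                          # FALSE
</code></pre>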
<h2 id="split-apply-combine"><a href="11-plyr.html">Split-apply-combine</a></h2>
<ul>
<li>Use the <code>xxply</code> family of functions to apply functions to groups within some data.
<ul>
<li>The first letter, <code>a</code>rray, <code>d</code>ata.frame or <code>l</code>ist, corresponds to the input data structure.</li>
<li>The second letter denotes the output data structure.</li>
</ul></li>
<li>Anonymous functions (those not assigned a name) are used inside the <code>plyr</code> family of functions on groups within data.</li>
</ul>
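<p>A small sketch of the naming scheme, assuming the <code>plyr</code> package is installed. The data and column names are invented:</p>
<pre><code class="r">library(plyr)   # assumes plyr has been installed from CRAN

df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

# d = data.frame in, d = data.frame out; the anonymous function runs once per group
means <- ddply(df, "group", function(g) data.frame(mean_value = mean(g$value)))
means

# d = data.frame in, l = list out
ranges <- dlply(df, "group", function(g) range(g$value))
ranges
</code></pre>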
<h2 id="ggplot2"><a href="12-ggplot2.html">ggplot2</a></h2>
<ul>
<li>Figures can be created with the grammar of graphics:
<ul>
<li><code>ggplot</code> to create the base figure</li>
<li><code>aes</code>thetics specify the data axes, shape, color, and data size</li>
<li><code>geom</code>etry functions specify the type of plot, e.g. <code>point</code>, <code>line</code>, <code>density</code>, <code>box</code></li>
<li><code>geom</code>etry functions also add statistical transforms, e.g. <code>geom_smooth</code></li>
<li><code>scale</code> functions change the mapping from data to aesthetics</li>
<li><code>facet</code> functions stratify the figure into panels</li>
</ul></li>
<li><code>aes</code>thetics apply to individual layers, or can be set for the whole plot inside <code>ggplot</code>.</li>
<li><code>theme</code> functions change the overall look of the plot</li>
<li>order of layers matters!</li>
<li><code>ggsave</code> to save a figure.</li>
</ul>
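<p>Put together, the layers might look something like the sketch below. It assumes the <code>ggplot2</code> package is installed, and the data, column names and output file are made up:</p>
<pre><code class="r">library(ggplot2)   # assumes ggplot2 has been installed from CRAN

df <- data.frame(
  x   = c(1, 2, 3, 4, 5, 6),
  y   = c(2, 4, 5, 4, 6, 7),
  grp = rep(c("a", "b"), each = 3)
)

# Base figure with aesthetics, then geometry, scale, facet and theme layers
p <- ggplot(data = df, aes(x = x, y = y, color = grp)) +
  geom_point() +
  geom_line() +
  scale_x_log10() +
  facet_wrap(~ grp) +
  theme_bw()

# Save the figure to a temporary file
plot_file <- tempfile(fileext = ".png")
ggsave(plot_file, plot = p, width = 5, height = 3)
</code></pre>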
<h2 id="defensive-programming">Defensive Programming</h2>
<ul>
<li>Program defensively, i.e., assume that errors are going to arise, and write code to detect them when they do.</li>
<li>Write tests before writing code in order to help determine exactly what that code is supposed to do.</li>
<li>Know what code is supposed to do before trying to debug it.</li>
<li>Make it fail every time.</li>
<li>Make it fail fast.</li>
<li>Change one thing at a time, and for a reason.</li>
<li>Keep track of what you’ve done.</li>
<li>Be humble</li>
</ul>
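<p>One way to apply these ideas in R is with <code>stopifnot</code>, as in this sketch. The function <code>safe_mean</code> is hypothetical:</p>
<pre><code class="r"># Check the input before doing any work, so bad input fails fast
safe_mean <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0)
  mean(x)
}

# A small test written alongside the code
stopifnot(safe_mean(c(1, 2, 3)) == 2)

# Bad input is caught right away instead of producing a confusing result later
result <- tryCatch(safe_mean("oops"), error = function(e) "caught")
result   # "caught"
</code></pre>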
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/lesson-template">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
</body>
</html>