-
Notifications
You must be signed in to change notification settings - Fork 6
/
syllabus.qmd
407 lines (263 loc) · 24.5 KB
/
syllabus.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
---
title: "{{< fa user-shield >}} Syllabus"
---
::: {.callout-note appearance="minimal"}
## Term 2024 Winter 1: 03 Sep - 05 Dec 2024
:::
## Course info
__Instructors__:\
Geoff Pleiss\
Website: <https://geoffpleiss.com/>\
Email: <[email protected]>\
Slack: [@geoff](https://stat-406-2024.slack.com/)
Trevor Campbell\
Website: <https://trevorcampbell.me/>\
Email: <[email protected]>\
Slack: [@trevor](https://stat-406-2024.slack.com/)
__Office hours__:\
See [Canvas](https://canvas.ubc.ca/courses/147492) for times and locations.
__Course webpage__:\
WWW: <https://ubc-stat.github.io/stat-406/>\
Github: <https://github.com/stat-406-2024/>\
Canvas: <https://canvas.ubc.ca/courses/147492/>
__Lectures/Labs__:\
See [Canvas](https://canvas.ubc.ca/courses/147492) for times and locations.
__Textbooks__:\
[\[ISLR\]](https://www.statlearning.com)
[\[ESL\]](https://web.stanford.edu/~hastie/ElemStatLearn/)
__Prerequisite__:\
STAT 306 or CPSC 340
## Course objectives
This is a course in statistical learning methods. Based on the theory of
linear models covered in Stat 306, this course will focus on applying many
techniques of data analysis to interesting datasets.
The course combines analysis with methodology and
computational aspects. It treats both the "art" of understanding
unfamiliar data and the "science" of analyzing that data in terms of
statistical properties. The focus will be on practical aspects of methodology and intuition
to help students develop tools for selecting appropriate methods and
approaches to problems in their own lives.
This is not a "how to program" course, nor a "tour of machine learning methods".
Rather, this course is about how to *understand* some ML methods. STAT 306 tends
to give background in many of the tools of understanding as well as working with
already-written `R` packages. On the other hand, CPSC 340 introduces
many methods with a focus on "from-scratch" implementation (in Julia or Python).
This course will try to bridge the gap between these approaches. Depending on which course you took, you
may be more or less skilled in some aspects than in others. That's OK and
expected.
### Learning outcomes
1. Assess the prediction properties of the supervised learning methods covered in class;
1. Correctly use regularization to improve predictions from linear models, and also to identify important explanatory variables;
1. Explain the practical
difference between predictions obtained with parametric and non-parametric methods, and decide in specific applications which approach should be used;
1. Select and construct appropriate ensembles to obtain improved predictions in different contexts;
1. Use and interpret principal components and other dimension reduction techniques;
1. Employ reasonable coding practices and understand basic R syntax and function.
1. Write reports and use proper version control; engage with standard software.
## Textbooks
#### Required:
*An Introduction to Statistical Learning*, James, Witten, Hastie, Tibshirani, 2013, Springer, New York. (denoted \[ISLR\])
Available **free** online: <https://www.statlearning.com>
#### Optional (but excellent):
*The Elements of Statistical Learning*, Hastie, Tibshirani, Friedman, 2009, Second Edition, Springer, New York. (denoted [ESL])
Also available **free** online: <https://web.stanford.edu/~hastie/ElemStatLearn/>
This second book is a more advanced treatment of a superset of the topics we
will cover. If you want to learn more and understand the material more deeply,
this is the book for you. All readings from
[ESL] are optional.
## Course assessment opportunities
### Effort-based component
Labs: [0, 20]
Homework assignments: [0, 50]
Clickers: [0, 10]
Total: `min(65, Labs + Homework + Clickers)`
---
#### Labs
These are intended to keep you on track. They are to be
submitted via pull requests in your personal `labs-<username>` repo (see
the [computing tab](/computing/) for descriptions on how to do this).
Labs typically have a few questions for you to
answer or code to implement. These are to be done _during lab_
periods. But you can do them on your own as well. These are worth 2 points each up to a maximum of 20 points. They are due at 2300 on the day of your assigned lab section.
If you attend lab, you may share a submission with another student (with acknowledgement on the PR). If you do not attend lab, you must work on your own (subject to the collaboration instructions for Assignments below).
##### Rules.
You must submit via PR by the deadline. Your PR must include **at least 3 commits**. After lab 2, failure to include at least 3 commits will result in a maximum score of 1.
::: {.callout-tip}
If you attend your lab section, you may work in pairs, submitting a single document to one of your Repos. Be sure to put both names on the document, and mention the collaboration on your PR. You still have until 11pm to submit.
:::
##### Marking.
The overriding theme here is "if you put in the effort, you'll get all the
points." Grading scheme:
* 2 if basically all correct
* 1 if complete but with some major errors, or mostly complete and mostly correct
* 0 otherwise
You may submit as many labs as you wish up to 20 total points.
There are no appeals on grades.
It's important here to recognize just how important active participation in
these activities is. You learn by doing, and this is your opportunity to learn
in a low-stakes environment. One thing you'll learn, for example, is that all
animals urinate in 21 seconds.[^note]
[^note]: A careful reading of [this paper](https://arxiv.org/abs/1310.3737) with the provocative title "Law of Urination: all mammals empty their bladders over the same duration" reveals that the authors actually mean something far less precise. In fact, their claim is more accurately stated as "mammals over 3kg in body weight urinate in 21 seconds with a standard deviation of 13 seconds". But the accurate characterization is far less publicity-worthy.
#### Assignments
There will be 5 assignments. These are submitted via pull request similar to the labs but to the `homework-<username>` repo. Each assignment is worth up to 10 points. They are due by 2300 on the deadline. You must make **at least 5 commits**. Failure to have at least 5 commits will result in a 25% deduction on HW1 and a 50% deduction thereafter. **No exceptions.**
Assignments are typically lightly marked. The median last year was 8/10. But they are not easy. Nor are they short. They often involve a combination of coding, writing, description, and production of statistical graphics.
After receiving a mark and feedback, if you score less than 7, you may make corrections to bring your total to 7. This means, if you fix everything that you did wrong, you get 7. Not 10. The revision must be submitted within 1 week of getting your mark. Only 1 revision per assignment. The TA decision is final. Note that the TAs will only regrade parts you missed, but if you somehow make it worse, they can deduct more points.
The revision allowance applies only if you got 3 or more points of "content" deductions. If you missed 3 points for content and 2 more for "penalties" (like insufficient commits, code that runs off the side of the page, etc), then you are ineligible.
##### Policy on collaboration on assignments
Discussing assignments with your classmates is allowed and encouraged, but it is
important that every student get practice working on these problems. This means
that **all the work you turn in must be your own**. The general policy on
homework collaboration is:
1. You must first make a serious effort to solve the problem.
2. If you are stuck after doing so, you may ask for help from another student.
You may discuss strategies to solve the problem, but you may not look at
their code, nor may they spell out the solution to you step-by-step.
3. Once you have gotten help, you must write your own solution individually. You
must disclose, in your GitHub pull request, the names of anyone from whom you
got help.
4. This also applies in reverse: if someone approaches you for help, you must
not provide it unless they have already attempted to solve the problem, and
you may not share your code or spell out the solution step-by-step.
::: {.callout-warning}
Adherence to the above policy means that identical answers, or nearly identical answers, cannot occur. Thus, such occurrences are violations of the Course's [Academic honesty policy](#academic-honesty-and-standards).
:::
These rules also apply to getting help from other people such as friends not in
the course (try the problem first, discuss strategies, not step-by-step
solutions, acknowledge those from whom you received help).
You may not use homework help websites under any circumstances.
You are also very strongly discouraged from using ChatGPT, Stack Overflow, and so on.
The purpose here is to learn. Good faith efforts toward learning are rewarded.
You can always, of course, ask me for help on Slack. And public Slack
**questions** are allowed and encouraged.
You may also use external sources (books, websites, papers, ...) to
* Look up programming language documentation, find useful packages, find
explanations for error messages, or remind yourself about the syntax for some feature. I do this all the time in the real world. Wikipedia is your friend.
* Read about general approaches to solving specific problems (e.g. a guide to dynamic programming or a tutorial on unit testing in your programming language), or
* Clarify material from the course notes or assignments.
But external sources must be used to **support** your solution, not to
**obtain** your solution. You may **not** use them to
* Find solutions to the specific problems assigned as homework (in words or in code)—you must independently solve the problem assigned, not translate a solution presented online or elsewhere.
* Find course materials or solutions from this or similar courses from previous years, or
* Copy text or code to use in your submissions without attribution.
If you use code from online or other sources (including generative AI),
you must include code comments identifying the source.
It must be clear what code you wrote and what code is
from other sources. This rule also applies to text, images, and any other
material you submit.
Please talk to me if you have any questions about this policy. Any form of
plagiarism or cheating will result in sanctions to be determined by me,
including grade penalties (such as negative points for the assignment or
reductions in letter grade) or course failure. I am obliged to report violations
to the appropriate University authorities. See also the text below.
#### Clickers
These are short multiple choice and True / False questions. They happen in class. For each question, correct answers are worth `4`, incorrect answers are worth `2`. You get `0` points for not answering.
Suppose there are `N` total clicker questions, and you have `x` points. Your final score for this component is
`max(0, min(5 * x / N - 5, 10))`.
Note that if your average is less than 1, you get 0 points in this component.
::: {.callout-important}
In addition, your final grade in this course will be reduced by 1 full letter grade.
:::
This means that if you did everything else and get a perfect score on the final exam, you will get a 79. Two people did this last year. They were sad.
::: {.callout-warning}
DON'T DO THIS!!
:::
This may sound harsh, but think about what is required for such a penalty. You'd have to skip more than 50% of class meetings and get every question wrong when you are in class. This is an in-person course. It is not possible to get an A without attending class on a regular basis.
To compensate, I will do my best to post recordings of lectures. Past experience has shown 2 things:
1. You learn better by attending class than by skipping and "watching".
2. Sometimes the technology messes up. So there's no guarantee that these will be available.
The purpose is to let you occasionally miss class for any reason with minimal consequences. See also [below](#your-personal-health). If for some reason you need to miss longer streches of time, please contact me or discuss your situation with your Academic Advisor as soon as possible. Don't wait until December.
---
#### Your score on HW, Labs, and Clickers
The total you can accumulate across these 3 components is 65 points. But you can get there however you want. The total available is 80 points. The rest is up to you. But with choice, comes responsibility.
Rules:
* Nothing dropped.
* No extensions.
* If you miss a lab or a HW deadline, then you miss it.
* Make up for missed work somewhere else.
* If you isolate due to Covid, fine. You miss a few clickers and maybe a lab (though you can do it remotely).
* If you have a job interview and can't complete an assignment on time, then skip it.
We're not going to police this stuff. You don't need to let me know. There is no reason that every single person enrolled in this course shouldn't get > 65 in this class.
Illustrative scenarios:
* Doing 80% on 5 homeworks, coming to class and getting 50% correct, get 2 points on 8 labs gets you 65 points.
* Doing 90% on 5 homeworks, getting 50% correct on all the clickers, averaging 1/2 on all the labs gets you 65 points.
* Going to all the labs and getting 100%, 100% on 4 homeworks, plus being wrong on every clicker gets you 65 points
Choose your own adventure. Note that the biggest barrier to getting to 65 is skipping the assignments.
#### Late policy
Late lab/homework submissions will not be accepted.
More specifically, any submission that we receive after grading has commenced will receive a 0.
We likely won't start grading at 11:01pm on the due date, so don't worry if you're a few minutes late.
On the other hand, don't even bother submitting if you've missed the deadline by a few days.
This policy may seem harsh, but remember that there are many paths to a full 65 on the effort based grade.
If you miss one assignment, focus on doing well on the other labs/assignments
and you might still end up with an A in the course.
---
### Final exam
35 points
---
* All multiple choice, T/F, matching.
* The clickers are the best preparation.
* Questions may ask you to understand or find mistakes in code.
* No writing code.
The Final is very hard. By definition, it cannot be effort-based.
It is intended to separate those who really understand the material from those who don't. Last year, the median was 50%. But if you put in the work (do all the effort points) and get 50%, you get an 83 (an A-). **If you put in the work (do all the effort points) and skip the final, you get a 65.** You do not have to pass the final to pass the course. You don't even have to take the final.
The point of this scheme is for those who work hard to do well. But only those who really understand the material will get 90+.
## Health issues and considerations
### Covid Safety in the Classroom
::: {.callout-important}
If you think you're sick, stay home no matter what.
:::
**Masks.** For our in-person meetings in this class, it is important that all of us feel as comfortable as possible engaging in class activities while sharing an indoor space. Masks are a primary tool to make it harder for Covid-19 to find a new host. Please feel free to wear one or not given your own personal circumstances. Note that there are some people who cannot wear a mask. These individuals are equally welcome in our class.
**Vaccination.** If you have not yet had a chance to get vaccinated against Covid-19, vaccines are available to you, free. See <http://www.vch.ca/covid-19/covid-19-vaccine> for help finding an appointment. Boosters will be available later this term. The higher the rate of vaccination in our community overall, the lower the chance of spreading this virus. You are an important part of the UBC community. Please arrange to get vaccinated if you have not already done so. The same goes for Flu.
### Your personal health
::: {.callout-warning}
If you are sick, it’s important that you stay home – no matter what you think you may be sick with (e.g., cold, flu, other).
:::
* Do not come to class if you have Covid symptoms, have recently tested positive for Covid, or are required to quarantine. You can check this website to find out if you should self-isolate or self-monitor:
<http://www.bccdc.ca/health-info/diseases-conditions/covid-19/self-isolation#Who>.
* Your precautions will help reduce risk and keep everyone safer. In this class, the marking scheme is intended to provide flexibility so that you can prioritize your health and still be able to succeed. All work can be completed outside of class with reasonable time allowances.
* **If you do miss class because of illness:**
* Make a connection early in the term to another student or a group of students in the class. You can help each other by sharing notes. If you don’t yet know anyone in the class, post on the discussion forum to connect with other students.
* Consult the class resources on here and on Canvas. We will post all the slides, readings, and recordings for each class day.
* Use Slack for help.
* Come to virtual office hours.
* See the marking scheme for reassurance about what flexibility you have. No part of your final grade will be directly impacted by missing class.
* **If you are sick on final exam day, do not attend the exam.** You must follow up with your home faculty's advising office to apply for [deferred standing](https://students.ubc.ca/enrolment/academic-learning-resources/academic-advising). Students who are granted deferred standing write the final exam at a later date. If you're a Science student, you must apply for deferred standing (an academic concession) through Science Advising no later than 48 hours after the missed final exam/assignment. Learn more and find the application [online](https://science.ubc.ca/students/advising/concession). For additional information about academic concessions, see the UBC policy [here](http://www.calendar.ubc.ca/vancouver/index.cfm?tree=3,329,0,0).
::: {.callout-note}
Please talk with me if you have any concerns or ask me if you are worried about falling behind.
:::
## University policies
UBC provides resources to support student learning and to maintain healthy lifestyles but recognizes that sometimes crises arise and so there are additional resources to access including those for survivors of sexual violence. UBC values respect for the person and ideas of all members of the academic community. Harassment and discrimination are not tolerated nor is suppression of academic freedom. UBC provides appropriate accommodation for students with disabilities and for religious, spiritual and cultural observances. UBC values academic honesty and students are expected to acknowledge the ideas generated by others and to uphold the highest academic standards in all of their actions. Details of the policies and how to access support are available
[here](http://senate.ubc.ca/policies-resources-support-student-success).
### Academic honesty and standards
__UBC Vancouver Statement__
Academic honesty is essential to the continued functioning of the University of British Columbia as an institution of higher learning and research. All UBC students are expected to behave as honest and responsible members of an academic community. Breach of those expectations or failure to follow the appropriate policies, principles, rules, and guidelines of the University with respect to academic honesty may result in disciplinary action.
For the full statement, please see the [2022/23 Vancouver Academic Calendar](http://www.calendar.ubc.ca/vancouver/index.cfm?tree=3,286,0,0#15620)
__Course specific__
Several commercial services have approached students regarding selling class notes/study guides to their classmates. Please be advised that selling a faculty member’s notes/study guides individually or on behalf of one of these services using UBC email or Canvas, violates both UBC information technology and UBC intellectual property policy. Selling the faculty member’s notes/study guides to fellow students in this course is not permitted. Violations of this policy will be considered violations of UBC Academic Honesty and Standards and will be reported to the Dean of Science as a violation of course rules. Sanctions for academic misconduct may include a failing grade on the assignment for which the notes/study guides are being sold, a reduction in your final course grade, a failing grade in the course, among other possibilities. Similarly, contracting with any service that results in an individual other than the enrolled student providing assistance on quizzes or exams or posing as an enrolled student is considered a violation of UBC's academic honesty standards.
Some of the problems that are assigned are similar or identical to those assigned in previous years by me or other instructors for this or other courses. Using proofs or code from anywhere other than the textbooks, this year's course notes, or the course website is not only considered cheating (as described above), it is easily detectable cheating. Such behavior is strictly forbidden.
In previous years, I have caught students cheating on the exams or assignments. I did not enforce any penalty because the action did not help. Cheating, in my experience, occurs because students don't understand the material, so the result is usually a failing grade even before I impose any penalty and report the incident to the Dean's office. I carefully structure exams and assignments to make it so that I can catch these issues. I __will__ catch you, and it does not help. Do your own work, and use the TAs and me as resources. If you are struggling, we are here to help.
::: {.callout-caution}
If I suspect cheating, your case will be forwarded to the Dean's office. No questions asked.
:::
__Generative AI__
Tools to help you code more quickly are rapidly becoming more prevalent. I use them regularly myself. The point
of this course is not to "complete assignments" but to learn coding (and other things). With that goal in mind,
I recommend you avoid the use of Generative AI. It is unlikely to contribute directly to your understanding of
the material. Furthermore, I have experimented with certain tools on the assignments for this course and have found
the results underwhelming.
The material in this course is best learned through trial and error. Avoiding this mechanism (with generative
AI or by copying your friend) is a short-term solution at best. I have tried to structure this course to
discourage these types of short cuts, and minimize the pressure you may feel to take them.
### Academic Concessions
These are handled according to UBC policy. Please see
* [UBC student services](https://students.ubc.ca/enrolment/academic-learning-resources/academic-concessions)
* [UBC Vancouver Academic Calendar](http://www.calendar.ubc.ca/vancouver/index.cfm?tree=3,0,0,0)
* [Faculty of Science Concessions](https://science.ubc.ca/students/advising/concession)
### Missed final exam
Students who miss the final exam must report to their Faculty advising office within 72 hours of the missed exam, and must supply supporting documentation. Only your Faculty Advising office can grant deferred standing in a course. You must also notify your instructor prior to (if possible) or immediately after the exam. Your instructor will let you know when you are expected to write your deferred exam. Deferred exams will ONLY be provided to students who have applied for and received deferred standing from their Faculty.
### Take care of yourself
Course work at this level can be intense, and I encourage you to take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress. I struggle with these issues too, and I try hard to set aside time for things that make me happy (cooking, playing/listening to music, exercise, going for walks).
All of us benefit from support during times of struggle. If you are having any problems or concerns, do not hesitate to speak with me. There are also many resources available on campus that can provide help and support. Asking for support sooner rather than later is almost always a good idea.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, I strongly encourage you to seek support. UBC Counseling Services is here to help: call 604 822 3811 or visit their [website](https://students.ubc.ca/health/counselling-services). Consider also reaching out to a friend, faculty member, or family member you trust to help get you the support you need.
<br>
[A dated PDF is available at this link.](syllabus.pdf)