Stats 412: Advanced Regression and Predictive Modeling

Stats 412 is a graduate-level statistics course restricted to UCLA Masters in Applied Statistics students. The course will introduce the generalized linear model and maximum likelihood methods as essential tools all statistics students should understand. Subsequently, the course will shift gears to explore regression and classification techniques that have been ubiquitous in machine learning in recent years.

Course Description

Often we are interested in making inferences and predictions from data, either by (a) estimating particular, meaningful parameters of a model, or (b) finding a best-fitting model which we can then manipulate to produce useful outputs such as predictions or counterfactual estimates. In previous courses, you will have focused on the use of linear models for these tasks. In this class, we will focus on what is to be done when linear models are not appropriate and may thus produce misleading estimates. The first part of the course will focus on the generalized linear model and maximum likelihood methods, as these remain essential tools every statistics student should understand. In the second part of the course, we shift gears to examine regression and classification techniques that have been ubiquitous in machine learning in recent years.

The class will generally emphasize depth over breadth, in the belief that if you deeply understand something, you will better remember it, and more readily understand future generalizations and variations of the same themes. We will assume a working knowledge of basic probability theory and linear regression models. Students are expected to be familiar enough with R to use it for their problem sets and final projects (see below).

Key Dates

Problem set due dates will be announced when each problem set is distributed.

Other important deadlines and dates during the term are:

  • 6/4/2018 Final model submissions
  • 6/6/2018 Final slide submissions
  • 6/7/2018 Student presentations (Week 10)

Resources

The course will not strictly follow a textbook, but here are some great ones that you should all have and read. Posted readings will largely come from these books. These are great books. Seriously! Do the readings. :)

Additional papers will be posted in the corresponding weekly directory.

Evaluation

Problem Sets

Working on actual problems is central to learning. Four problem sets will be assigned, on alternating weeks. These assignments will consist of analytical problems, computer simulations, and data analysis. Late submissions will not be accepted. Assignments will generally be made available by a Thursday and are due two Thursdays later, before lecture. All sufficiently attempted homework (i.e., a typed and well-organized write-up with all problems attempted) will be graded on a (+,✓,-) scale. Students are encouraged to discuss the problems together, but must independently produce and submit solutions. Work should be done as an R Markdown file and committed to GitHub.

Final Project

A final project will be completed as individuals. The project will encourage collaboration, test your predictive modeling skills, promote presentation skills, and encompass a bit of competition. A classroom Kaggle competition will be held on a provided dataset to test skills learned in this course. In addition, each pair will present their work to the class during the final week of the quarter. Effective verbal communication is a critical skill for statisticians, and it requires practice and feedback to develop. Additional information about the final project will be given as the course progresses.