# machine-learning

Machine Learning Study Notes

This repository holds my solutions to the exercises and problems in the book "Learning from Data: A Short Course" by Yaser Abu-Mostafa et al.

1. Solutions to the exercises and problems in the book, Learning from Data: A Short Course.
# Questions

## Chapter 1

* What are the components of a learning algorithm? (See the perceptron sketch after this list.)
* What are the different types of learning problems?
* Why are we able to learn at all?
* What are the meanings of the various probability quantities?
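As a companion to the first question, here is a minimal sketch of the perceptron learning algorithm (PLA), the book's first concrete example of a learning algorithm. The synthetic target function, sample size, and iteration cap are my own illustrative assumptions, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, linearly separable toy data (assumed target: sign(x1 + x2 - 0.2)).
X = rng.uniform(-1, 1, size=(100, 2))
y = np.sign(X[:, 0] + X[:, 1] - 0.2)
y[y == 0] = 1
X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend x0 = 1 for the bias term

w = np.zeros(3)  # hypothesis: h(x) = sign(w . x)
for _ in range(1000):
    misclassified = np.flatnonzero(np.sign(X @ w) != y)
    if misclassified.size == 0:
        break  # PLA stops once every point is classified correctly
    i = rng.choice(misclassified)
    w = w + y[i] * X[i]  # PLA update: nudge w toward the misclassified point

print("final weights:", w)
```

For linearly separable data, PLA is guaranteed to terminate with all training points classified correctly.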
## Chapter 2

* How do we generalize from training data?
* How do we understand the VC dimension? What is the VC dimension of linear models, e.g. the perceptron? What do we use it for?
* How do we understand the bounds? (A numerical sketch of the VC bound follows this list.)
* How do we understand the two competing forces, approximation and generalization?
* How do we understand the bias-variance trade-off? How do we derive it?
* Why do we need a test set? What's the difference between the training and test sets? What are the advantages of using a test set?
* What is a learning curve? How do we interpret it? What are the typical learning curves for linear regression?
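A quick way to get a feel for the VC bound is to plug the polynomial bound on the growth function, m_H(N) <= N^{d_vc} + 1, into the generalization bound E_out <= E_in + sqrt((8/N) ln(4 m_H(2N) / delta)) and watch the error bar shrink with N. The sample sizes and confidence parameter below are my own choices.

```python
import numpy as np

def vc_error_bar(n, d_vc, delta=0.05):
    """Generalization error bar from the VC bound, using m_H(N) <= N^d_vc + 1."""
    growth = (2 * n) ** d_vc + 1          # polynomial bound on m_H(2N)
    return np.sqrt(8.0 / n * np.log(4.0 * growth / delta))

# The 2-D perceptron has d_vc = 3; watch the bound tighten as N grows.
for n in [100, 1_000, 10_000, 100_000]:
    print(f"N = {n:>7}: E_out <= E_in + {vc_error_bar(n, d_vc=3):.3f}")
```

Even for d_vc = 3 the bound is loose at small N, which is why it is best read as a guide to trends rather than a tight numerical guarantee.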
## Chapter 3

* What are the linear models? Linear classification, linear regression, and logistic regression.
* What's the role of the approximation-generalization trade-off in linear models?
* Why does minimizing the in-sample error of the perceptron require a combinatorial search, while linear regression has an analytic solution and logistic regression needs gradient descent? (See the sketch after this list.)
* When can gradient descent (GD) be used? Which algorithms use gradient descent, and which use sub-gradient methods? What are the requirements for GD? What are the advantages of using SGD? Why does SGD work at all? How do the convergence speeds of GD and SGD compare?
* Why does GD with a fixed learning rate work?
* How does a feature transformation affect the VC dimension?
* What are the advantages and disadvantages of feature transformations?
* What is a projection matrix? Give an example of a projection in 1-D or 2-D. Where does it project to?
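To make the contrast between the linear models concrete, here is a small sketch: linear regression is solved in one step with the pseudo-inverse, while logistic regression minimizes the cross-entropy error by batch gradient descent. The synthetic data, learning rate, and iteration count are assumptions I made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, 2))])   # x0 = 1 plus two features
y = np.sign(X @ np.array([0.5, 1.0, -2.0]) + 0.1 * rng.normal(size=N))

# Linear regression: analytic solution w = pinv(X) y, i.e. (X^T X)^{-1} X^T y.
w_lin = np.linalg.pinv(X) @ y

# Logistic regression: minimize E_in(w) = (1/N) sum ln(1 + exp(-y_n w.x_n)) by GD.
w_log = np.zeros(3)
eta = 0.1                                  # fixed learning rate (assumed)
for _ in range(2000):
    grad = -(y[:, None] * X / (1 + np.exp(y * (X @ w_log)))[:, None]).mean(axis=0)
    w_log -= eta * grad

print("linear regression weights  :", w_lin)
print("logistic regression weights:", w_log)
```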
## Chapter 4

* What is overfitting? What causes it, and when does it happen? How do we measure it, and why is it important? What can we do to reduce it? How is overfitting related to the in-sample and out-of-sample errors? How do model complexity, the number of data points, and the hypothesis set affect overfitting?
* What are stochastic noise and deterministic noise? What are the differences between them? How do they show up in a learning algorithm?
* What is regularization? How does regularization affect the VC bound? What is its impact on the generalization error? What are the different ways to regularize, and what are the components of regularization (the regularizer, weight decay, etc.)? (See the weight-decay sketch after this list.)
* What is Lagrange optimization? How do we understand it intuitively?
* What is validation? What do we use it for? What's the relationship between validation and regularization? How are they related to the generalization error?
* What is a validation set? Why do we need one, and why do we re-train on all the training data after model selection on the validation set?
* What is cross-validation? When do we use it? What's the procedure for applying it?
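For the regularization and cross-validation questions, this sketch fits a degree-5 polynomial with weight decay, w_reg = (Z^T Z + lambda I)^{-1} Z^T y, and uses k-fold cross-validation to compare values of lambda. The feature transform, lambda grid, noise level, and fold count are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy 1-D target, fitted with a degree-5 polynomial transform (prone to overfit).
x = rng.uniform(-1, 1, size=40)
y = np.sin(np.pi * x) + 0.3 * rng.normal(size=x.size)
Z = np.vander(x, N=6, increasing=True)          # features 1, x, ..., x^5

def fit_weight_decay(Z, y, lam):
    """Closed-form weight-decay solution w = (Z^T Z + lam*I)^{-1} Z^T y."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

def cv_error(Z, y, lam, folds=5):
    """Simple k-fold cross-validation estimate of out-of-sample squared error."""
    idx = np.array_split(np.arange(len(y)), folds)
    errs = []
    for val in idx:
        train = np.setdiff1d(np.arange(len(y)), val)
        w = fit_weight_decay(Z[train], y[train], lam)
        errs.append(np.mean((Z[val] @ w - y[val]) ** 2))
    return np.mean(errs)

for lam in [0.0, 0.01, 0.1, 1.0]:
    print(f"lambda = {lam:<5}: CV error = {cv_error(Z, y, lam):.3f}")
```

The lambda with the lowest cross-validation error would then be used to re-fit on all the data, which mirrors the re-training step asked about above.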
## Chapter 5

* What is Occam's razor?
* How do we measure complexity? What are the two recurring themes in measuring the complexity of an object?
* What is the axiom of non-falsifiability?
* What is sampling bias? What are its implications for the VC and Hoeffding bounds? What other biases do we see in learning?
* What is data snooping? How do we deal with it?