
This repository holds my solutions to the exercises and problems in the book "Learning From Data: A Short Course" by Yaser Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin.

Chapter 1: The Learning Problem

Missing:

  • Problems: 1.7 (b)

Chapter 2: Training versus Testing

Missing:

  • Exercises: 2.4
  • Problems: 2.4, 2.9, 2.10, 2.14, 2.15, 2.19

Chapter 3: The Linear Model

Missing:

  • Exercises: 3.12, 3.15
  • Problems: 3.15 (c), 3.18

Chapter 4: Overfitting

Missing:

  • Exercises: 4.9, 4.10
  • Problems: 4.4 (f), 4.21

Chapter 6: Similarity-Based Methods

Missing:

  • Exercises: 6.3
  • Problems: 6.5, 6.9, 6.11 (b), 6.15

Chapter 7: Neural Networks

Missing:

  • Exercises: none
  • Problems: 7.2, 7.5, 7.9, 7.12, 7.16

Chapter 8: Support Vector Machines

Missing:

  • Exercises: none
  • Problems: 8.9 (c)

Chapter 9: Learning Aides

Missing:

  • Exercises: 9.17
  • Problems: 9.11, 9.16, 9.17, 9.26, 9.27, 9.28

Appendix B: Linear Algebra

Appendix C: The E-M Algorithm

Questions

Chapter 1

  • What are the components of learning? (See the PLA sketch after this list.)
  • What are the different types of learning problems?
  • Why are we able to learn at all?
  • What are the meanings of various probability quantities?
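
The first question above maps directly onto code. Here is a minimal sketch, in the spirit of Chapter 1, of the perceptron learning algorithm (PLA): the hypothesis set is the linear separators h(x) = sign(wᵀx), and the learning algorithm is the PLA update. The toy data set is hypothetical, chosen only to be linearly separable.

```python
import numpy as np

def pla(X, y, max_iters=1000):
    """Perceptron learning algorithm: repeatedly fix one misclassified point."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend the bias coordinate x0 = 1
    w = np.zeros(X.shape[1])                   # start from the zero weight vector
    for _ in range(max_iters):
        misclassified = np.flatnonzero(np.sign(X @ w) != y)
        if misclassified.size == 0:
            return w                           # separable data: PLA has converged
        i = misclassified[0]
        w += y[i] * X[i]                       # the PLA update: w <- w + y_n * x_n
    return w

# Hypothetical linearly separable toy data, two points per class.
X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print(pla(X, y))  # a weight vector that separates the two classes
```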

Chapter 2

  • How do we generalize from training data?
  • How to understand the VC dimension? What is the VC dimension of linear models, e.g., the perceptron? What do we use it for?
  • How to understand the bounds?
  • How to understand the two competing forces: approximation and generalization?
  • How to understand the bias-variance trade-off? How to derive it? (See the simulation sketch after this list.)
  • Why do we need a test set? What's the difference between the training set and the test set? What are the advantages of using a test set?
  • What is a learning curve? How to interpret it? What are the typical learning curves for linear regression?
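
As a companion to the bias-variance question above, here is a minimal Monte Carlo sketch in the spirit of the book's sin(πx) example: fit a line to two random points from f(x) = sin(πx) many times, then estimate the bias of the average hypothesis and the variance around it. The trial count and evaluation grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)
grid = np.linspace(-1, 1, 201)              # points at which to measure the errors

n_trials = 10000
coeffs = np.empty((n_trials, 2))            # (slope, intercept) fitted on each data set
for t in range(n_trials):
    x = rng.uniform(-1, 1, size=2)          # a data set of N = 2 points
    coeffs[t] = np.polyfit(x, f(x), deg=1)  # least-squares line through them

preds = coeffs[:, 0, None] * grid + coeffs[:, 1, None]  # h_D(x) for every data set D
gbar = preds.mean(axis=0)                   # the average hypothesis g-bar(x)
bias = np.mean((gbar - f(grid)) ** 2)       # E_x[(g-bar(x) - f(x))^2]
var = np.mean(preds.var(axis=0))            # E_x[E_D[(h_D(x) - g-bar(x))^2]]
print(f"bias ~ {bias:.2f}, var ~ {var:.2f}")  # the book reports about 0.21 and 1.69
```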

Chapter 3

  • What are the linear models? Linear classification, linear regression and logistic regression.
  • How does the approximation-generalization trade-off play out in linear models?
  • Why does minimizing the perceptron's in-sample error require combinatorial effort, while linear regression has a closed-form analytic solution and logistic regression needs gradient descent?
  • When can GD be used? What algorithms use gradient descent? Which use sub-gradient methods? What are the requirements for GD? What are the advantages of using SGD? Why does SGD work at all? How do the convergence speeds of GD and SGD compare?
  • Why does GD with a fixed learning rate work?
  • How does feature transformation affect the VC dimension?
  • What are the advantages and disadvantages of feature transformation?
  • What is a projection matrix? Give an example of a projection in 1-D or 2-D. What does it project onto? (See the sketch after this list.)
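
A minimal sketch, on synthetic data, of two Chapter 3 facts asked about above: linear regression's one-step analytic solution w = X⁺y via the pseudo-inverse, and the hat matrix H = XX⁺, which projects y onto the column space of X. The data set and dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])  # bias column + 2 features
y = X @ np.array([0.5, 2.0, -1.0]) + rng.normal(scale=0.1, size=20)

w = np.linalg.pinv(X) @ y         # analytic least-squares solution, no iteration needed
H = X @ np.linalg.pinv(X)         # hat matrix: y_hat = H y projects y onto span(X)
y_hat = H @ y

# The two defining properties of an orthogonal projection matrix:
assert np.allclose(H @ H, H)      # idempotent: projecting twice changes nothing
assert np.allclose(H, H.T)        # symmetric
print("trace(H) =", np.trace(H))  # equals d + 1 = 3, the dimension it projects onto
```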

Chapter 4

  • What is overfitting? What causes it, and when does it happen? How do we measure it, and why is it important? What can we do to reduce it? How is overfitting related to the in-sample and out-of-sample errors? How do model complexity, the number of data points, and the hypothesis set affect overfitting?
  • What are stochastic noise and deterministic noise? What are the differences between them? How do they show up in learning?
  • What is regularization? How does regularization affect the VC bound? What are the impacts of regularization on the generalization error? What are different ways to do regularization, and what are its components (regularizer, weight decay, etc.)?
  • What is Lagrange optimization? How to understand it intuitively?
  • What is validation? What do we use it for? What's the relationship between validation and regularization? How are they related to the generalization error?
  • What is a validation set, and why do we need one? Why do we re-train on all the training data after model selection on the validation set?
  • What is cross-validation? When do we use it, and what's the procedure to apply it? (See the weight-decay and cross-validation sketch after this list.)
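
A minimal sketch, on synthetic data, tying together the weight-decay and cross-validation questions above: the augmented-error solution w_reg = (XᵀX + λI)⁻¹Xᵀy, with λ chosen by leave-one-out cross-validation. The polynomial degree, noise level, and λ grid are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=30)
y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=30)  # noisy target
X = np.vander(x, N=8, increasing=True)                  # degree-7 polynomial features

def ridge(X, y, lam):
    """Weight decay: minimize ||Xw - y||^2 + lam * ||w||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def loo_cv_error(X, y, lam):
    """Leave-one-out cross-validation error for a given lambda."""
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        w = ridge(X[mask], y[mask], lam)     # train without point i
        errs.append((X[i] @ w - y[i]) ** 2)  # validate on point i
    return np.mean(errs)

for lam in [0.0, 1e-3, 1e-1, 1.0]:
    print(f"lambda = {lam:g}: E_cv = {loo_cv_error(X, y, lam):.3f}")
# Pick the lambda with the smallest E_cv, then re-train on ALL the data --
# the "re-train with all training data" step asked about above.
```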

Chapter 5

  • What is Occam's razor?
  • How do we measure complexity? What are the two recurring themes in measuring the complexity of an object?
  • What is the axiom of non-falsifiability?
  • What is sampling bias? What are its implications for the VC and Hoeffding bounds? What other biases do we see in learning?
  • What is data snooping? How to deal with data snooping?
