Master Machine Learning topics
- Get an introduction to the exciting, high-demand field of Machine Learning
- Gain applied experience in major areas of Machine Learning, including Prediction, Classification, Clustering, and Information Retrieval
- Learn to analyze large and complex datasets and create systems that adapt and improve over time
- Build intelligent applications that can make predictions from data
-
Week 2 - Regression: Predicting House Prices
-
Learning Objectives
- Describe the input (features) and output (real-valued predictions) of a regression model
- Calculate a goodness-of-fit metric (e.g., RSS)
- Estimate model parameters by minimizing RSS (algorithms to come...)
- Exploit the estimated model to form predictions
- Perform a training/test split of the data
- Analyze performance of various regression models in terms of test error
- Use test error to avoid overfitting when selecting amongst candidate models
- Describe a regression model using multiple features
- Describe other applications where regression is useful
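As a rough illustration of these objectives, here is a minimal NumPy sketch of fitting a simple model, splitting the data, and computing RSS. The `sqft`/`price` values are synthetic placeholders, not course data:

```python
import numpy as np

# Synthetic data: square footage vs. sale price (illustrative values only).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3500, size=200)
price = 150 * sqft + 25_000 + rng.normal(0, 20_000, size=200)

# 80/20 train/test split.
idx = rng.permutation(len(sqft))
train, test = idx[:160], idx[160:]

# Fit a simple linear model (slope + intercept) on the training set.
w1, w0 = np.polyfit(sqft[train], price[train], deg=1)

def rss(x, y):
    """Residual sum of squares: sum of squared prediction errors."""
    return np.sum((y - (w0 + w1 * x)) ** 2)

print("train RSS:", rss(sqft[train], price[train]))
print("test  RSS:", rss(sqft[test], price[test]))
```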
-
-
Week 3 - Classification: Analyzing Sentiment
-
Learning Objectives
- Identify a classification problem and some common applications
- Describe decision boundaries and linear classifiers
- Train a classifier
- Measure its error
- State some rules of thumb for what counts as good accuracy
- Interpret the types of error associated with classification
- Describe the tradeoffs between model bias and data set size
- Use class probability to express degree of confidence in prediction
-
-
Week 4 - Clustering and Similarity: Retrieving Documents
-
Learning Objectives
- Describe ways to represent a document (e.g., raw word counts, tf-idf,...)
- Measure the similarity between two documents
- Discuss issues related to using raw word counts
- Normalize counts to adjust for document length
- Emphasize important words using tf-idf
- Implement a nearest neighbor search for document retrieval
- Describe the input (unlabeled observations) and output (cluster labels) of a clustering algorithm
- Determine whether a task is supervised or unsupervised
- Cluster documents using k-means (algorithmic details to come...)
- Describe other applications of clustering
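As a rough illustration, here is a tiny NumPy sketch of tf-idf weighting and cosine-similarity nearest-neighbor retrieval over a made-up three-document corpus:

```python
import numpy as np
from collections import Counter

# Toy corpus (illustrative documents only).
docs = ["the cat sat on the mat",
        "the dog chased the cat",
        "dogs and cats are pets"]

vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[Counter(d.split())[w] for w in vocab] for d in docs], float)

# tf-idf: raw counts weighted by inverse document frequency.
df = (counts > 0).sum(axis=0)
idf = np.log(len(docs) / df)
tfidf = counts * idf

def nearest(query_idx, X):
    """Return the most similar other document by cosine similarity."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X @ X[query_idx]
    sims[query_idx] = -np.inf          # exclude the query itself
    return int(np.argmax(sims))

print("most similar to doc 0:", nearest(0, tfidf))
```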
-
-
Week 5 - Recommending Products
-
Learning Objectives
- Describe the goal of a recommender system
- Provide examples of applications where recommender systems are useful
- Implement a co-occurrence based recommender system
- Describe the input (observations, number of “topics”) and output (“topic” vectors, predicted values) of a matrix factorization model
- Exploit estimated “topic” vectors (algorithms to come...) to make recommendations
- Describe the cold-start problem and ways to handle it (e.g., incorporating features)
- Analyze performance of various recommender systems in terms of precision and recall
- Use AUC or precision-at-k to select amongst candidate algorithms
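A minimal sketch of a co-occurrence based recommender on a made-up user-item purchase matrix: score unseen items by how often they co-occur with items the user already has.

```python
import numpy as np

# Toy purchase histories (illustrative): rows = users, columns = items.
purchases = np.array([[1, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 1, 1],
                      [1, 1, 1, 0]])

# Co-occurrence matrix: C[i, j] = number of users who bought both i and j.
C = purchases.T @ purchases
np.fill_diagonal(C, 0)

def recommend(user, k=2):
    """Score unseen items by summing co-occurrence with items the user owns."""
    owned = purchases[user].astype(bool)
    scores = C[owned].sum(axis=0).astype(float)
    scores[owned] = -np.inf            # never recommend something already owned
    return np.argsort(scores)[::-1][:k]

print("recommendations for user 0:", recommend(0))
```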
-
-
Week 6 - Deep Learning: Searching for Images
-
Learning Objectives
- Describe multi-layer neural network models
- Interpret the role of features as local detectors in computer vision
- Relate neural networks to hand-crafted image features
- Describe some settings where deep learning achieves significant performance boosts
- State the pros & cons of deep learning models
- Apply the notion of transfer learning
- Use neural network models trained in one domain as features for building a model in another domain
- Build an image retrieval tool using deep features
-
-
Week 1 - Simple Linear Regression
-
Learning Objectives
- Describe the input (features) and output (real-valued predictions) of a regression model
- Calculate a goodness-of-fit metric (e.g., RSS)
- Estimate model parameters to minimize RSS using gradient descent
- Interpret estimated model parameters
- Exploit the estimated model to form predictions
- Discuss the possible influence of high leverage points
- Describe intuitively how fitted line might change when assuming different goodness-of-fit metrics
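A minimal sketch of estimating the two parameters of a simple linear regression by gradient descent on RSS; the data and step size are illustrative assumptions:

```python
import numpy as np

# Illustrative 1-D regression data.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 1.0, size=100)

# Gradient descent on RSS(w0, w1) = sum_i (y_i - (w0 + w1 * x_i))^2.
w0, w1 = 0.0, 0.0
step = 1e-4
for _ in range(10_000):
    residual = y - (w0 + w1 * x)
    w0 += step * 2 * residual.sum()        # -dRSS/dw0 = 2 * sum(residual)
    w1 += step * 2 * (residual * x).sum()  # -dRSS/dw1 = 2 * sum(residual * x)

print(f"intercept ~ {w0:.2f}, slope ~ {w1:.2f}")  # should approach 5 and 3
```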
-
-
Week 2 - Multiple Regression
-
Learning Objectives
- Describe polynomial regression
- Detrend a time series using trend and seasonal components
- Write a regression model using multiple inputs or features thereof
- Cast both polynomial regression and regression with multiple inputs as regression with multiple features
- Calculate a goodness-of-fit metric (e.g., RSS)
- Estimate model parameters of a general multiple regression model to minimize RSS:
- In closed form
- Using an iterative gradient descent algorithm
- Interpret the coefficients of a non-featurized multiple regression fit
- Exploit the estimated model to form predictions
- Explain applications of multiple regression beyond house price modeling
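One way to see "polynomial regression as multiple regression" in code: build one feature column per power of the input and solve the least-squares normal equations in closed form. The data below is synthetic:

```python
import numpy as np

# Illustrative data: quadratic relationship between input and output.
rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, size=200)

# Cast polynomial regression as multiple regression: one column per feature.
H = np.column_stack([np.ones_like(x), x, x**2])

# Closed-form least squares: w = (H^T H)^{-1} H^T y
# (solved with a linear solver rather than an explicit inverse).
w = np.linalg.solve(H.T @ H, H.T @ y)
print("estimated coefficients:", w)   # close to [1.0, 2.0, -0.5]

# Predictions from the estimated model.
y_hat = H @ w
print("RSS:", np.sum((y - y_hat) ** 2))
```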
-
-
Week 3 - Assessing Performance
-
Learning Objectives
- Describe what a loss function is and give examples
- Contrast training, generalization, and test error
- Compute training and test error given a loss function
- Discuss issue of assessing performance on training set
- Describe tradeoffs in forming training/test splits
- List and interpret the 3 sources of average prediction error
- Irreducible error, bias, and variance
- Discuss the issue of selecting model complexity on test data and then using test error to assess generalization error
- Motivate the use of a validation set for selecting tuning parameters (e.g., model complexity)
- Describe overall regression workflow
-
-
Week 4 - Ridge Regression
-
Learning Objectives
- Describe what happens to magnitude of estimated coefficients when model is overfit
- Motivate form of ridge regression cost function
- Describe what happens to estimated coefficients of ridge regression as tuning parameter λ is varied
- Interpret coefficient path plot
- Estimate ridge regression parameters:
- In closed form
- Using an iterative gradient descent algorithm
- Implement K-fold cross validation to select the ridge regression tuning parameter λ
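A sketch of the closed-form ridge estimate and of how the coefficients shrink as the tuning parameter λ grows. The near-collinear features are made up so the shrinkage is visible:

```python
import numpy as np

def ridge_closed_form(H, y, lam):
    """Closed-form ridge estimate: (H^T H + lam * I)^{-1} H^T y.
    The identity is zeroed in the first slot so the intercept is not penalized."""
    I = np.eye(H.shape[1])
    I[0, 0] = 0.0
    return np.linalg.solve(H.T @ H + lam * I, H.T @ y)

# Illustrative data and feature matrix (intercept + two correlated features).
rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)   # nearly collinear with x1
y = 2.0 * x1 + rng.normal(scale=0.5, size=100)
H = np.column_stack([np.ones_like(x1), x1, x2])

# Coefficient path: larger lambda shrinks the estimates toward zero.
for lam in [0.0, 0.1, 1.0, 10.0, 100.0]:
    print(f"lambda={lam:6.1f}  coefficients={ridge_closed_form(H, y, lam)}")
```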
-
-
Week 5 - Feature Selection & Lasso
-
Learning Objectives
- Perform feature selection using “all subsets” and “forward stepwise” algorithms
- Analyze computational costs of these algorithms
- Contrast greedy and optimal algorithms
- Formulate lasso objective
- Describe what happens to estimated lasso coefficients as tuning parameter λ is varied
- Interpret lasso coefficient path plot
- Contrast ridge and lasso regression
- Describe geometrically why L1 penalty leads to sparsity
- Estimate lasso regression parameters using an iterative coordinate descent algorithm
- Implement K-fold cross validation to select lasso tuning parameter λ
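A sketch of lasso via cyclic coordinate descent with soft thresholding, for the objective RSS(w) + λ‖w‖₁ with an unpenalized intercept; the data and λ grid are illustrative:

```python
import numpy as np

def lasso_coordinate_descent(H, y, lam, n_iter=200):
    """Cyclic coordinate descent for RSS(w) + lam * ||w[1:]||_1.
    Feature 0 is the intercept and is not penalized."""
    n, d = H.shape
    w = np.zeros(d)
    z = (H ** 2).sum(axis=0)                 # per-feature normalizers
    for _ in range(n_iter):
        for j in range(d):
            # Residual ignoring feature j's current contribution.
            r_j = y - H @ w + H[:, j] * w[j]
            rho = H[:, j] @ r_j
            if j == 0:                       # intercept: plain least-squares step
                w[j] = rho / z[j]
            elif rho < -lam / 2:             # soft-thresholding update
                w[j] = (rho + lam / 2) / z[j]
            elif rho > lam / 2:
                w[j] = (rho - lam / 2) / z[j]
            else:
                w[j] = 0.0                   # coefficient snaps exactly to zero
    return w

# Illustrative data: only two of five features actually matter.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)
H = np.column_stack([np.ones(200), X])

for lam in [0.0, 10.0, 100.0, 1000.0]:
    print(f"lambda={lam:7.1f}  w={np.round(lasso_coordinate_descent(H, y, lam), 2)}")
```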
-
-
Week 6 - Nearest Neighbors & Kernel Regression
-
Learning Objectives
- Motivate the use of nearest neighbor (NN) regression
- Define distance metrics in 1D and multiple dimensions
- Perform NN and k-NN regression
- Analyze computational costs of these algorithms
- Discuss sensitivity of NN to lack of data, dimensionality, and noise
- Perform weighted k-NN and define weights using a kernel
- Define and implement kernel regression
- Describe the effect of varying the kernel bandwidth λ or # of nearest neighbors k
- Select λ or k using cross validation
- Compare and contrast kernel regression with a global average fit
- Define what makes an approach nonparametric and why NN and kernel regression are considered nonparametric methods
- Analyze the limiting behavior of NN regression
- Use NN for classification
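A compact sketch of k-NN regression and Gaussian-kernel (Nadaraya-Watson) regression on synthetic 1-D data; the bandwidth and k values are arbitrary choices for illustration:

```python
import numpy as np

# Illustrative 1-D data with a nonlinear trend.
rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 10, size=200))
y = np.sin(x) + rng.normal(scale=0.2, size=200)

def knn_regress(x0, k=10):
    """Average the targets of the k nearest training points."""
    nearest = np.argsort(np.abs(x - x0))[:k]
    return y[nearest].mean()

def kernel_regress(x0, bandwidth=0.5):
    """Nadaraya-Watson estimate with a Gaussian kernel of the given bandwidth."""
    weights = np.exp(-((x - x0) ** 2) / (2 * bandwidth ** 2))
    return (weights @ y) / weights.sum()

for x0 in [2.0, 5.0, 8.0]:
    print(f"x0={x0}: k-NN={knn_regress(x0):.2f}  "
          f"kernel={kernel_regress(x0):.2f}  true={np.sin(x0):.2f}")
```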
-
-
Week 1 - Linear Classifiers & Logistic Regression
-
Learning Objectives
- Describe decision boundaries and linear classifiers
- Use class probability to express degree of confidence in prediction
- Define a logistic regression model
- Interpret logistic regression outputs as class probabilities
- Describe impact of coefficient values on logistic regression output
- Use 1-hot encoding to represent categorical inputs
- Perform multiclass classification using the 1-versus-all approach
-
-
Week 2 - Learning Linear Classifiers
-
Learning Objectives
- Identify when overfitting is happening
- Relate large learned coefficients to overfitting
- Describe the impact of overfitting on decision boundaries and predicted probabilities of linear classifiers
- Motivate the form of the L2-regularized logistic regression quality metric
- Describe what happens to estimated coefficients as tuning parameter λ is varied
- Interpret coefficient path plot
- Estimate L2-regularized logistic regression coefficients using gradient ascent
- Describe the use of L1 regularization to obtain sparse logistic regression solutions
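A sketch of gradient ascent on the L2-regularized log likelihood for logistic regression (0/1 labels, unpenalized intercept); the data, step size, and λ are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_l2(H, y, lam=1.0, step=0.5, n_iter=5000):
    """Gradient ascent on the L2-regularized log likelihood.
    Labels y are 0/1; the intercept (column 0) is not penalized."""
    w = np.zeros(H.shape[1])
    for _ in range(n_iter):
        gradient = H.T @ (y - sigmoid(H @ w))  # gradient of the log likelihood
        gradient[1:] -= 2 * lam * w[1:]        # gradient of the -lam * ||w||^2 penalty
        w += (step / len(y)) * gradient        # step scaled by n for stability
    return w

# Illustrative two-class data in two dimensions.
rng = np.random.default_rng(6)
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 2)),
               rng.normal(1.0, 1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)
H = np.column_stack([np.ones(len(y)), X])

w = fit_logistic_l2(H, y)
print("coefficients:", np.round(w, 2))
print("training accuracy:", np.mean((sigmoid(H @ w) >= 0.5) == y))
```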
-
-
Week 3 - Decision Trees
-
Learning Objectives
- Define a decision tree classifier
- Interpret the output of a decision tree
- Learn a decision tree classifier using a greedy algorithm
- Traverse a decision tree to make predictions
- Majority class predictions
- Probability predictions
- Multiclass classification
-
-
Week 4 - Preventing Overfitting in Decision Trees
-
Learning Objectives
- Identify when overfitting occurs in decision trees
- Prevent overfitting with early stopping
- Limit tree depth
- Do not consider splits that do not reduce classification error
- Do not split intermediate nodes with only a few points
- Prevent overfitting by pruning complex trees
- Use a total cost formula that balances classification error and tree complexity
- Use total cost to merge potentially complex trees into simpler ones
- Describe common ways of handling missing data:
- Skip all rows with any missing values
- Skip features with many missing values
- Impute missing values using other data points
- Modify learning algorithm (decision trees) to handle missing data:
- Missing values get added to one branch of split
- Use classification error to determine where missing values go
-
-
Week 5 - Boosting
-
Learning Objectives
- Identify the notion of ensemble classifiers
- Formalize ensembles as the weighted combination of simpler classifiers
- Outline the boosting framework – sequentially learn classifiers on weighted data
- Describe the AdaBoost algorithm
- Learn each classifier on weighted data
- Compute coefficient of classifier
- Recompute data weights
- Normalize weights
- Implement AdaBoost to create an ensemble of decision stumps
- Discuss convergence properties of AdaBoost & how to pick the maximum number of iterations T
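A sketch of AdaBoost over decision stumps: each round fits a stump to the weighted data, computes its coefficient, and reweights the examples. The blob data is synthetic and the stump search is brute force for clarity:

```python
import numpy as np

def best_stump(X, y, weights):
    """Greedy decision stump: pick the (feature, threshold, sign) with the
    lowest weighted classification error. Labels are +1/-1."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= t, sign, -sign)
                err = weights[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def adaboost(X, y, n_rounds=20):
    """AdaBoost: reweight the data each round and learn a stump on the weights."""
    weights = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        err, j, t, sign = best_stump(X, y, weights)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)       # coefficient of this classifier
        pred = np.where(X[:, j] <= t, sign, -sign)
        weights *= np.exp(-alpha * y * pred)        # up-weight the mistakes
        weights /= weights.sum()                    # normalize the weights
        ensemble.append((alpha, j, t, sign))
    return ensemble

def predict(ensemble, X):
    scores = sum(alpha * np.where(X[:, j] <= t, sign, -sign)
                 for alpha, j, t, sign in ensemble)
    return np.sign(scores)

# Illustrative data: two overlapping blobs, labels in {-1, +1}.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(-0.5, 1, (100, 2)), rng.normal(0.5, 1, (100, 2))])
y = np.repeat([-1, 1], 100)

model = adaboost(X, y, n_rounds=20)
print("training accuracy:", np.mean(predict(model, X) == y))
```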
-
-
Week 6 - Precision-Recall
-
Learning Objectives
- Explain why classification accuracy/error are not always the right metrics
- Precision captures fraction of positive predictions that are correct
- Recall captures fraction of positive data correctly identified by the model
- Trade-off precision & recall by setting probability thresholds
- Plot precision-recall curves
- Compare models by computing precision at k
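A small sketch of precision, recall, and precision-at-k computed from made-up probabilities and labels; sweeping the threshold shows the precision/recall trade-off:

```python
import numpy as np

# Illustrative predicted probabilities and true labels (1 = positive).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_prob = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

def precision_recall(threshold):
    """Trade off precision and recall by moving the probability threshold."""
    y_pred = (y_prob >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    precision = tp / max(y_pred.sum(), 1)  # fraction of positive predictions that are correct
    recall = tp / y_true.sum()             # fraction of true positives recovered
    return precision, recall

for threshold in [0.3, 0.5, 0.7]:
    p, r = precision_recall(threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")

# Precision at k: precision among the k highest-scoring predictions.
k = 3
top_k = np.argsort(y_prob)[::-1][:k]
print("precision@3:", y_true[top_k].mean())
```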
-
-
Week 7 - Scaling to Huge Datasets & Online Learning
-
Learning Objectives
- Significantly speed up a learning algorithm using stochastic gradient
- Describe intuition behind why stochastic gradient works
- Apply stochastic gradient in practice
- Describe online learning problems
- Relate stochastic gradient to online learning
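A sketch of stochastic gradient ascent for logistic regression: each update uses one randomly chosen example rather than the full-data gradient. The data, step size, and decay schedule are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative two-class data (0/1 labels) with an intercept column.
rng = np.random.default_rng(8)
X = np.vstack([rng.normal(-1, 1, (500, 2)), rng.normal(1, 1, (500, 2))])
y = np.repeat([0, 1], 500)
H = np.column_stack([np.ones(len(y)), X])

# Stochastic gradient ascent: update on one randomly chosen example at a time
# instead of summing the gradient over the full dataset each step.
w = np.zeros(H.shape[1])
step = 0.1
for t in range(20_000):
    i = rng.integers(len(y))
    w += step * (y[i] - sigmoid(H[i] @ w)) * H[i]
    step *= 0.9999                      # slowly decaying step size

print("training accuracy:", np.mean((sigmoid(H @ w) >= 0.5) == y))
```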
-
-
Week 1 - Welcome
-
Week 2 - Nearest Neighbor Search
-
Learning Objectives
- Implement nearest neighbor search for retrieval tasks
- Contrast document representations (e.g., raw word counts, tf-idf,...)
- Emphasize important words using tf-idf
- Contrast methods for measuring similarity between two documents
- Euclidean vs. weighted Euclidean
- Cosine similarity vs. similarity via unnormalized inner product
- Describe complexity of brute force search
- Implement KD-trees for nearest neighbor search
- Implement LSH for approximate nearest neighbor search
- Compare the pros and cons of KD-trees and LSH, and decide which is more appropriate for a given dataset
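A rough sketch of the LSH half of that comparison only: random hyperplanes hash each vector into a bucket, and the search looks just inside the query's bucket. The "document" vectors here are random placeholders, and a KD-tree sketch is omitted for brevity:

```python
import numpy as np

# Illustrative document vectors (e.g., tf-idf rows); values are placeholders.
rng = np.random.default_rng(9)
docs = rng.normal(size=(1000, 50))

# Random-hyperplane LSH: a document's bucket is the sign pattern of its
# projections onto a handful of random directions.
n_planes = 8
planes = rng.normal(size=(50, n_planes))

def bucket(v):
    return tuple((v @ planes) > 0)

table = {}
for i, d in enumerate(docs):
    table.setdefault(bucket(d), []).append(i)

def approx_nearest(query):
    """Search only the query's bucket, then rank candidates by cosine similarity."""
    candidates = table.get(bucket(query), [])
    if not candidates:
        return None
    sims = docs[candidates] @ query / (
        np.linalg.norm(docs[candidates], axis=1) * np.linalg.norm(query))
    return candidates[int(np.argmax(sims))]

# Sanity check: a query identical to doc 0 should retrieve doc 0.
print("approximate neighbor of doc 0:", approx_nearest(docs[0]))
```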
-
-
Week 3 - Clustering with k-means
-
Learning Objectives
- Describe potential applications of clustering
- Describe the input (unlabeled observations) and output (cluster labels) of a clustering algorithm
- Determine whether a task is supervised or unsupervised
- Cluster documents using k-means
- Interpret k-means as a coordinate descent algorithm
- Define data parallel problems
- Explain Map and Reduce steps of MapReduce framework
- Use existing MapReduce implementations to parallelize k-means, understanding what’s being done under the hood
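A single-machine sketch of k-means viewed as coordinate descent (assignment step, then centroid-update step) on synthetic blobs; the MapReduce parallelization mentioned above is not shown:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """k-means as coordinate descent: alternate between assigning points to the
    closest centroid and moving each centroid to the mean of its points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid as the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Illustrative data: three well-separated blobs.
rng = np.random.default_rng(10)
X = np.vstack([rng.normal(c, 0.3, size=(100, 2)) for c in (-3, 0, 3)])
labels, centroids = kmeans(X, k=3)
print("centroids:\n", np.round(centroids, 2))
```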
-
-
Week 4 - Mixture Models
-
Learning Objectives
- Interpret a probabilistic model-based approach to clustering using mixture models
- Describe model parameters
- Motivate the utility of soft assignments and describe what they represent
- Discuss issues related to how the number of parameters grows with the number of dimensions
- Interpret diagonal covariance versions of mixtures of Gaussians
- Compare and contrast mixtures of Gaussians and k-means
- Implement an EM algorithm for inferring soft assignments and cluster parameters
- Determine an initialization strategy
- Implement a variant that helps avoid overfitting issues
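A sketch of EM for a 1-D mixture of Gaussians with soft assignments. The variance floor at the end of the M-step is one simple way to keep a component from collapsing onto a single point, standing in for the "avoid overfitting" variant; all data here is synthetic:

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=100):
    """EM for a 1-D mixture of Gaussians: the E-step computes soft assignments
    (responsibilities), the M-step re-estimates weights, means, and variances."""
    means = np.quantile(x, np.linspace(0.1, 0.9, k))   # spread-out initialization
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        dens = (weights / np.sqrt(2 * np.pi * variances)
                * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)  # floor variances so no component
                                                 # collapses onto a single point
    return weights, means, variances

# Illustrative data drawn from two Gaussians.
rng = np.random.default_rng(11)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 700)])
print(em_gmm_1d(x, k=2))
```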
-
-
Week 5 - Mixed Membership Modeling via Latent Dirichlet Allocation
-
Learning Objectives
- Compare and contrast clustering and mixed membership models
- Describe a document clustering model for the bag-of-words document representation
- Interpret the components of the LDA mixed membership model
- Analyze a learned LDA model
- Topics in the corpus
- Topics per document
- Describe Gibbs sampling steps at a high level
- Utilize Gibbs sampling output to form predictions or estimate model parameters
- Implement collapsed Gibbs sampling for LDA
-
These assignments are released under the MIT license, Copyright (c) 2018 Mohamed el-Maghraby. See LICENSE.md for details.