Coursera ML MOOC

Andrew Ng's class is arguably the shared baseline among ML practitioners.

I don't want to fool myself.
Even though I have read some of the sklearn API docs and know how to call the functions, I don't understand the soul of machine learning. I have to get the basics right, so I implemented every exercise of the Coursera ML class using numpy, scipy and tensorflow.

I chose Python over Matlab for purely practical reasons. The cs224d Intro to TensorFlow (video) gives a very good explanation of why Python may be the right choice for ML.

All this learning of theory and coding is preparation for real-world application. Although learning is its own reward, I also want to create useful applications that solve real-world problems and bring value to the community. This project is a tiny step toward that goal. I learned so much.

The more I learn, the more I respect the great scientific adventurers before me who paved the way I walk now. Andrew's class is a very good overview of general ML. Its hands-on approach encourages newcomers like me to keep moving, even though some details are purposefully left out. On the other hand, I found it very useful to pick up theory while doing these exercises. The book Learning from Data gave me many aha moments about learning theory. It is my feeble foundation in ML theory.

Generally, Andrew's class shows me mostly what to do and how to do it. The book shows me why. Theory and practice go hand in hand. I can't express how happy I am when I read something in the book and suddenly understand the reason behind what I was coding last night. Eureka!

Project structure

Each exercise has its own folder. In each folder you will find:
1. a pdf that guides you through the project
2. a series of Jupyter notebooks
3. data
Each notebook basically follows the logic flow of the project pdf. I didn't put all the code in the notebooks because I personally think that gets very messy. So you will only see visualizations, project logic flow, simple experiments, equations and results in the notebooks.
The helper folder has modules for different topics. This is where you can find the details of model implementations, learning algorithms, and supporting functions.

Go solo with python or go with built-in Matlab project?

The Matlab project guides students toward the overall project goal, be it implementing logistic regression or a backprop NN. It includes many supporting functions that help with visualization, gradient checking, and so on.
The way I do it is to focus on the pdf that explains what the project is about, then figure out how to achieve those objectives using the SciPy stack. Most of the time I don't even bother looking into the original .m files. I just need their data.

Without that support, I have to do the following myself:

visualization: seaborn and matplotlib are very handy
vectorized implementation of the ML model and gradient function: use numpy's power to manipulate ndarrays
optimization: figure out how to use a scipy optimizer to fit your parameters
support functions: nobody is loading, parsing, or normalizing data for you now, so DIY

By doing those, I learn more, which is even better.
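
To make the list above concrete, here is a minimal sketch of what the vectorized model, the gradient function, the scipy optimizer call and a small support function can look like for logistic regression. The function names (sigmoid, cost, gradient, normalize) and the synthetic data are assumptions made for illustration only. They are not necessarily the names or data used in the helper folder or the exercises.

```python
import numpy as np
import scipy.optimize as opt


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta, X, y):
    """Mean cross-entropy cost of logistic regression (vectorized)."""
    h = sigmoid(X @ theta)
    eps = 1e-12  # guard against log(0)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))


def gradient(theta, X, y):
    """Vectorized gradient of the cost with respect to theta."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)


def normalize(features):
    """Support function: zero mean, unit standard deviation per column."""
    return (features - features.mean(axis=0)) / features.std(axis=0)


# Synthetic, made-up data (not the course data): the label is 1 when the
# noisy sum of the two features is positive.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 2))
y = (raw[:, 0] + raw[:, 1] + 0.5 * rng.normal(size=200) > 0).astype(float)

# Prepend a column of ones for the intercept term.
X = np.column_stack([np.ones(len(raw)), normalize(raw)])

# Let scipy fit the parameters, passing our own gradient as the jacobian.
res = opt.minimize(fun=cost, x0=np.zeros(X.shape[1]), args=(X, y),
                   jac=gradient, method="BFGS")
theta = res.x
accuracy = np.mean((sigmoid(X @ theta) >= 0.5) == y)
print("theta:", theta, "training accuracy:", accuracy)
```

Since the Matlab gradient checking helper is gone too, scipy.optimize.check_grad(cost, gradient, theta, X, y) is one way to compare the analytic gradient against a finite-difference estimate.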

Supporting materials

I am learning by doing, not hoarding tools. Here is the list of materials that helped me along the way.

Intuitions of linear algebra: Essence of linear algebra. To my knowledge, this is the best source for intuition.
Python: numpy tutorial
More math behind the scenes: CS 229 Machine Learning Course Materials. Coursera ML is basically a watered-down version of cs229. The link has very good reviews of linear algebra and probability theory.
Quoc Le’s Lectures on Deep Learning: 4k videos with perfect lecture notes.
Learning from Data: learning theory in less than 200 pages, God.

Run locally

If you find bugs, faulty logic, or anything that could be better, please do me a favor and create an issue. I would love to see constructive criticism.

Acknowledgement: Thank you, John Wittenauer! I shamelessly stole lots of your code and ideas. here
If you want to run the notebooks locally, refer to requirement.txt for the libraries I've been using.
I'm using Python 3.5.2 for these notebooks. You will need Python 3.5 or later because I use the @ operator for matrix multiplication extensively.
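
For readers who have not met the @ operator yet, here is a small sketch of what it does with numpy arrays. The arrays are made-up values for illustration. On older Python versions the same products have to be written with np.dot.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
theta = np.array([0.5, -1.0])

# Matrix-vector product: one prediction per row of X.
pred = X @ theta

# The same product spelled with np.dot, which also works before Python 3.5.
pred_dot = np.dot(X, theta)

# Matrix-matrix product, e.g. the X^T X term of the normal equation.
gram = X.T @ X

print(pred)
print(gram)
```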

tensorflow (64-bit linux only) is now available on https://t.co/292ZKEfpjQ Use conda install tensorflow to get it!
— Continuum Analytics (@ContinuumIO) September 19, 2016

Read notebook with nbviewer, and references for each exercise

ex1-linear regression

The special thing I did in this project is implementing the linear regression model in TensorFlow. This is my first tf experience. I'm looking forward to learning more when I move into Deep Learning. code: linear_regression.py

Deep Learning ch5.4 has a decent treatment of bias vs variance.

ex6-SVM

ex7-kmeans and PCA

The Elements of Statistical Learning pg.64 has a very good explanation of singular value decomposition, which is used to find the principal components in our PCA. The book is free to download.
Deep Learning (ch2.7, 2.8) briefly discusses eigendecomposition and SVD.
Deep Learning (ch5.8.1) clearly describes the relationship between PCA and SVD.
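
The relationship between PCA and SVD mentioned in these references can be sketched in a few lines of numpy. This is a minimal illustration under my own assumptions, not necessarily the code in the ex7 notebooks: the function name pca and the synthetic data are hypothetical, and the SVD is applied directly to the centered data matrix rather than to its covariance matrix.

```python
import numpy as np


def pca(X, k):
    """Project X onto its first k principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]
    Z = X_centered @ components.T
    # Squared singular values, scaled by n - 1, are the variances
    # along each principal direction.
    explained_variance = S[:k] ** 2 / (len(X) - 1)
    return Z, components, explained_variance


# Synthetic, made-up data lying close to the line y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 2 * t + 0.1 * rng.normal(size=100)])

Z, components, var = pca(X, k=1)
print("first principal direction:", components[0])
print("variance along it:", var[0])
```

The variance along the first direction equals the largest eigenvalue of the sample covariance matrix of X, which is the link between the SVD view and the eigendecomposition view of PCA.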

pratikchhapolika/Coursera-ML-AndrewNg
