This page has moved here; check that repo for updates.
- Have Anaconda (conda) installed
- git-clone this repository
- Open and run the Lecture_1.ipynb notebook
- Learn numpy
- Read this website about the diabetes dataset
- and this website about sklearn's LinearRegression
- Add one or more new features from the dataset, and solve using sklearn's LinearRegression
- Set up the matrix equation for linear regression with two features
- Solve the linear system and make sure the results are the same as those you got with sklearn
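A minimal sketch of these steps, assuming the `bmi` and `bp` columns of the diabetes dataset as the two features (any pair works); it fits sklearn's LinearRegression and then solves the same least-squares system directly:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Load the diabetes dataset; columns 2 and 3 are 'bmi' and 'bp'
X_all, y = load_diabetes(return_X_y=True)
X = X_all[:, [2, 3]]

# Fit with sklearn
model = LinearRegression().fit(X, y)

# Set up the matrix equation A w = y, with a column of ones for the intercept
A = np.hstack([np.ones((X.shape[0], 1)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

print(w[0], model.intercept_)  # intercepts should match
print(w[1:], model.coef_)      # coefficients should match
```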
- Run the document classification Naive Bayes example.
- Play around with it: try it with more than the two classes used in the lecture.
- Make up a simple classification problem, write code to generate synthetic data, and train a Naive Bayes classifier on your data with sklearn.
- Make up a different classification problem, and design it so that a Naive Bayes classifier performs poorly. (Hint: generate data that violates the conditional-independence assumption that Naive Bayes makes.)
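A sketch of both exercises using scikit-learn's GaussianNB; the blob and XOR-style datasets below are made-up illustrations, not the lecture's data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Easy problem: two Gaussian blobs with well-separated means
X0 = rng.normal(loc=[-2, -2], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[2, 2], scale=1.0, size=(200, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

clf = GaussianNB().fit(X, y)
print(clf.score(X, y))  # near 1.0: the classes are easy to separate

# Hard problem: XOR-style labels violate conditional independence --
# each feature's marginal looks identical for both classes
Xa = rng.uniform(-1, 1, size=(400, 2))
ya = (Xa[:, 0] * Xa[:, 1] > 0).astype(int)
print(GaussianNB().fit(Xa, ya).score(Xa, ya))  # near chance level
```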
- Derive Bayes rule. (Hint: start with the equation relating joint distribution to a conditional distribution)
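Following the hint, one way the derivation can go: the joint distribution factors in two ways,

$$P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A),$$

and dividing through by $P(B)$ (assuming $P(B) > 0$) gives Bayes' rule:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$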
- Spot and understand the "mistake" on the Intuition -> math -> stats slide
- Work through a few examples of 2x2 matrix-vector multiplications by hand. Compare to the results you get in code.
- If AB = I, is it the case that BA = I?
- What is the solution to the 2x2 matrix equation in the slides?
- Go to the end of Lecture_1.ipynb and find the slides / cells that we didn't go over in the lecture.
- Run the sklearn code that uses LinearRegression to fit a degree-3 polynomial.
- Write down the matrix equation (linear system) for the degree-3 polynomial fit.
- Write code to generate the matrices and vectors for the equation you wrote down in the previous step.
- Solve it yourself in code with `np.linalg.lstsq` and make sure you get the same answer as sklearn.
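One possible sketch of the comparison, using synthetic data since the notebook's dataset isn't reproduced here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 1 + 2 * x - 3 * x**2 + 0.5 * x**3 + rng.normal(scale=0.1, size=50)

# Vandermonde-style design matrix with columns 1, x, x^2, x^3
A = np.vander(x, N=4, increasing=True)

# Direct least-squares solve of A w = y
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# sklearn on the same design matrix (intercept is already a column of A)
model = LinearRegression(fit_intercept=False).fit(A, y)
print(w)
print(model.coef_)  # should match w
```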