Intro to Machine Learning

Brought to you by Galvanize. Learn more about the way we teach at galvanize.com.

FAQs:

WIFI: g|Events | Password is learningcommunity

Setting up your computer

A web browser to see what we're working on as others see it (Recommend Google Chrome: [chrome.google.com](http://chrome.google.com))
We will be using Google Colab for this workshop so make a Google account if you don't already have one.
Open this GitHub repo to follow along

What this workshop is

A super friendly introduction to Machine Learning. No previous experience expected, but knowing some Python will help!

You can't learn EVERYTHING in ~2 hours, especially when it comes to Machine Learning! But you can learn enough to get excited and comfortable to keep working and learning on your own!

This course is for absolute beginners
Ask Questions!
Answer Questions!
Help others when you can
It's OK to get stuck, just ask for help!
Feel free to move ahead
Be patient and nice

We're not going to focus on the math behind the models. We're going to focus more on when and how to use a model. If you would like to go into the math and more about each model I encourage you to do so!

About me:

Hello I'm Keenan Olsen. I'm a Technology Evangelist here at Galvanize!

I originally got into Machine Learning by solving a manufacturing problem at my last job with computer vision, and I think it's one of the coolest fields!

Note: I'm not a Galvanize Instructor

Twitter: @KeenanOlsen
LinkedIn: Keenan Olsen
Email: [email protected]

Reach out to me if interested in:

breaking into the tech industry
learning resources
meetup recommendations
learning more about Galvanize
giving me suggestions for events!
being friends

About you!

Give a quick Intro!

What's your name?
What's your background?
Why are you interested in Machine Learning?

FAQs Again for anyone who just came in:

WIFI: g|Events | Password is learningcommunity

Setup

Modern web browser
Google account

Regression Project

Boston House Dataset

Looking at this data how do we know that regression will be a good choice? Why not Classification?

>>> Boston House price Linear Regression Notebook <<<

Classification Project

Iris Flower Dataset KNN Workbook

Looking at this data how do we know that Classification will be a good choice? Why not Regression?

>>> Iris K-Nearest Neighbors <<<

What is Machine Learning:

Put very simply, Machine Learning can usually be thought of as using a statistical model, built from a dataset, to solve a problem.

Instead of explicitly programming an algorithm to do a specific task, we let it "learn" from data to find patterns and make inferences.

We'll see examples of this soon!

Who uses Machine Learning?

More and more companies that make decisions with data are using machine learning. Here are just a few examples that you've probably experienced as a customer.

Amazon

Product Recommendations
Amazon GO Computer Vision
Alexa
Delivery Robots

Netflix

Show & Movie Recommendations

Google

Gmail Spam Filtering
Google Assistant
YouTube Content filtering & Recommendations
Self Driving Cars

Apple

Siri
App Store Recommendations

Facebook

Face Tagging Detection

Tesla

Self Driving Cars

These companies use Machine Learning in many other ways!

Machine Learning Applications

We talked about some examples above from big companies we probably all know. Here are several more types of applications where machine learning has become popular.

Healthcare

Cancer Detection
X-ray diagnostics

Smart Home Devices

Smart doorbell
Smart Lights
Security

Image generation

NVIDIA’s Hyperrealistic Face Generator
video game Character or level generation
art generation

Agriculture

Crop monitoring & planning

Supply Chain

Sourcing and Shipping Automation

Manufacturing

Quality Assurance
Design

Fraud Detection

Credit cards
Product listings

You can see how all of these applications revolve around finding patterns in data!

Types of Machine Learning:

Supervised Learning

Supervised Learning uses a dataset that is labeled. In this context, imagine having a list of features and a label (group) that those features belong to.

Here we have features (sepal length (cm), etc.) and a label (flower species).

sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	Species
5.1	3.5	1.4	0.2	setosa
5.7	2.9	4.2	1.3	versicolor
7.7	3.0	6.1	2.3	virginica

We could use a full dataset with data like the rows above to predict the flower species given only the petal and sepal measurements.

Another good example of supervised learning is an email spam filter.

Say we have a bunch of emails in our dataset and they all have a label of either spam or not_spam. We could then train a supervised learning model to look at all of those emails and pick up patterns that show up in the spam emails. There are probably certain words or formatting that repeat themselves. If you've ever looked in your email spam folder you can probably pick out some of those things yourself!
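To make the spam filter idea concrete, here is a tiny sketch in plain Python. The six emails and the scoring rule (count how often each word appeared under each label) are made up for illustration. A real filter would use far more data and a proper model, but the shape is the same: learn from labeled examples, then label something new.

```python
from collections import Counter

# Hypothetical labeled emails: (text, label).
training_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("lunch meeting moved to noon", "not_spam"),
    ("notes from the project meeting", "not_spam"),
    ("are you free for lunch", "not_spam"),
]


def train(emails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not_spam": Counter()}
    for text, label in emails:
        word_counts[label].update(text.split())
    return word_counts


def predict(word_counts, text):
    """Label new text by which label's training words it matches more."""
    words = text.split()
    spam_score = sum(word_counts["spam"][w] for w in words)
    not_spam_score = sum(word_counts["not_spam"][w] for w in words)
    return "spam" if spam_score > not_spam_score else "not_spam"


word_counts = train(training_emails)
print(predict(word_counts, "free prize inside"))
print(predict(word_counts, "project lunch meeting"))
```

Notice that nobody wrote a rule saying "free" is suspicious. The counts came from the labeled data.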

There are 2 main types of supervised learning: Classification and Regression.

Classification

Classification tries to assign the correct label to a new piece of data not containing a label. Both examples above are good examples of classification problems.

A spam filter would look at an email and decide if it should be labeled as spam or not_spam.

We could be given a new flower measurement and we want to try to label it with the correct Species: setosa, versicolor, virginica

sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
5.1	3.5	1.4	0.2

A model trained on the full dataset should label this as setosa, since these measurements match the setosa row in the table above.
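Here is a minimal sketch of labeling that new flower with scikit-learn. It assumes scikit-learn is installed and uses a K-Nearest Neighbors model with 5 neighbors, which is just one reasonable choice. The species it prints depends on the model and the data it was trained on.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Load the labeled Iris dataset that ships with scikit-learn.
iris = load_iris()

# Train (fit) a classifier on the features and their labels.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(iris.data, iris.target)

# A new measurement with no label:
# sepal length, sepal width, petal length, petal width (cm).
new_flower = [[5.1, 3.5, 1.4, 0.2]]

# predict() returns a class index; target_names maps it to a species name.
predicted_index = model.predict(new_flower)[0]
predicted_species = iris.target_names[predicted_index]
print(predicted_species)
```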

Regression

Instead of predicting a label like classification, Regression predicts a value.

This example has the features crime rate, zoning, rooms, and square footage, and a value to predict: price.

crime rate	Zoning	rooms	square footage	price
.5	3.5	5	1400	100000
.2	2	3	3000	50000
.3	4	7	1800	150000

Unlike the classification example where we tried to predict what group the features belonged to, we want to predict what value the features would have. This could be any number on a continuous range!

Given a list of new features from a house like below, we would then want to find out how much that house is worth by predicting a number value.

crime rate	Zoning	rooms	square footage
.7	4	2	1000

Some other values to think about predicting:

Stock price
Age
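As a rough sketch of what "predicting a value" looks like, the code below fits a straight line to a single feature (square footage) in plain Python. The four houses are made-up numbers chosen to fall exactly on a line so the result is easy to check, and `fit_line` is a helper name invented for this example. Real data is noisier and usually has more features.

```python
# Hypothetical training data: square footage and sale price.
square_feet = [1000, 1500, 2000, 2500]
prices = [120000, 170000, 220000, 270000]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(square_feet, prices)

# Predict the price of a house the model has never seen.
predicted_price = slope * 1800 + intercept
print(f"price per extra square foot: {slope:.2f}")
print(f"predicted price for 1800 sq ft: {predicted_price:.0f}")
```

The output is a number, not a label, which is the difference between regression and classification.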

This workshop is going to focus on supervised Machine Learning, but we'll talk briefly about some of the other types!

Unsupervised Learning

Unsupervised Learning uses a dataset that is not labeled and gains insight about its patterns.

Clustering

A common way of using unsupervised learning is clustering.

This picture shows an example of visualizing the Iris Dataset we talked about before. We can see that there are features that relate to each species. If we didn't have those labels we could use unsupervised learning to create clusters separating the groups out that would probably look pretty similar to this. We could then add a label to those clusters.

An example to think about is if you have a large dataset of customers. Maybe you would like to segment them out to cluster similar customers.
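To give a feel for clustering, here is a small k-means style sketch in plain Python that splits customers into two groups by monthly spend. The spend values, the choice of two clusters, and the starting centers (lowest and highest spender) are all assumptions for illustration. Note that there are no labels anywhere in the data.

```python
def k_means_1d(values, centers, rounds=10):
    """Repeatedly assign each value to its nearest center, then re-center."""
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer, in dollars.
monthly_spend = [12, 15, 18, 20, 95, 100, 110, 120]

centers, clusters = k_means_1d(
    monthly_spend, centers=[min(monthly_spend), max(monthly_spend)]
)
print(centers)
print(clusters)
```

Once the groups exist, we could give them names like "occasional" and "frequent" customers ourselves.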

Semi-Supervised Machine Learning

Uses a mixed dataset of labeled and unlabeled data to train the model, combining supervised and unsupervised machine learning.

Semi-Supervised Machine Learning can be important to look into if you don't have enough labeled data to create a good model. Labeling data and acquiring labeled data can be extremely expensive and time consuming, so developing a model that can use both types of data is super intriguing!

Imagine trying to label every piece of information you get from a self driving car! You have a constant video feed, Lidar, and other sensors.

Reinforcement Learning

Reinforcement Learning is often used in a situation where an algorithm can take an action in an environment and receive a reward for making a good decision.

You see a lot of examples of this type of machine learning used to make computers excellent gamers!

A couple examples:

OpenAI Gym

Flappy Bird

Deep Learning

Deep Learning is a subset of Machine Learning.

It uses Artificial Neural Networks with many layers and learns from data by adjusting the weights of the connections between neurons.

A Neural Network Playground - TensorFlow is a great place to start tinkering around and learning more about Artificial Neural Networks!

Deep Learning is killing it at recognizing and generating complicated patterns.

Computer vision (CV)
- Self Driving cars
- Amazon Go
Natural Language Processing (NLP)
- Alexa
- Siri
Generative Adversarial Networks (GANs)
- NVIDIA’s Hyperrealistic Face Generator
- video game Character or level generation
- art generation

For this class we're going to stay focused on Supervised Machine learning. It's a great place to start!

But out of all these what would you like to see a class on next?

Supervised Learning Models

Here are some of the common models. Having an idea of what these do and which applications they should be used for is important! I will only briefly go over them, so please read more about them!

Linear Regression

Typically used for regression

Generally regression problems predict a value on a continuous spectrum

Logistic Regression

Typically used for classification

NOT used for regression problems! It has regression in the name due to the statistics behind the model.

Used to predict binary outputs (yes, no | true, false | Pass, fail)

It looks for a probability above a certain threshold, commonly .5.
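Here is a small sketch of the threshold idea in plain Python. The weight and bias are hand-picked for illustration rather than learned from data, and the "hours studied" feature is a made-up example. A trained logistic regression would find those numbers during fitting.

```python
import math

# Hypothetical coefficients, chosen by hand for this example.
WEIGHT = 1.5
BIAS = -6.0
THRESHOLD = 0.5


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))


def pass_probability(hours_studied):
    """Probability of passing, according to the hand-picked coefficients."""
    return sigmoid(WEIGHT * hours_studied + BIAS)


def predict_pass(hours_studied):
    """Turn the probability into a yes/no answer using the threshold."""
    return pass_probability(hours_studied) >= THRESHOLD


for hours in (2, 4, 6):
    print(hours, round(pass_probability(hours), 3), predict_pass(hours))
```

The model outputs a probability, and the threshold turns it into a binary answer like pass or fail.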

K-Nearest Neighbors

Typically used for classification

k-NN finds the k nearest data points and makes an educated guess based on the classes of those nearest data points.
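The idea fits in a few lines of plain Python. This sketch uses a handful of Iris-style rows as the labeled data (a tiny sample assumed for illustration, not the full dataset), measures straight-line distance, and takes a majority vote among the k = 3 closest rows. The function name is made up for this example.

```python
import math
from collections import Counter

# A small labeled sample: (sepal length, sepal width, petal length, petal width), species.
labeled_flowers = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((5.7, 2.9, 4.2, 1.3), "versicolor"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((7.7, 3.0, 6.1, 2.3), "virginica"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def k_nearest_neighbors(query, data, k=3):
    """Return the most common label among the k points closest to query."""
    by_distance = sorted(data, key=lambda item: math.dist(query, item[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


print(k_nearest_neighbors((6.0, 3.0, 4.5, 1.5), labeled_flowers))
print(k_nearest_neighbors((5.0, 3.4, 1.5, 0.2), labeled_flowers))
```

Try changing k. With a small k the guess follows the closest few points, and with a large k it drifts toward whatever class is most common overall.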

Decision Trees

Typically used for classification

Maybe an oversimplification, but a Decision Tree can be thought of as a bunch of if statements.

You've probably seen flowcharts before with different paths to take depending on the data.
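To show the "bunch of if statements" idea, here is a hand-written tree for the Iris flowers. The cutoff values are assumptions picked by eye for this sketch. A real decision tree learns its own cutoffs from the training data.

```python
def classify_iris(petal_length_cm, petal_width_cm):
    """A tiny hand-written decision tree with assumed cutoffs."""
    if petal_length_cm < 2.5:
        return "setosa"
    if petal_width_cm < 1.75:
        return "versicolor"
    return "virginica"


# Petal length and width from three example flowers.
print(classify_iris(1.4, 0.2))
print(classify_iris(4.2, 1.3))
print(classify_iris(6.1, 2.3))
```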

More

There are of course more than these 4 models. A few more popular ones you should look into are Support Vector Machines, Random Forests, and Naive Bayes.

How do you choose the algorithm?

There can be a lot of factors to consider, like the size of the data, labels, accuracy, scalability, etc. A lot of these are out of scope for this workshop.

But when you're first starting out, it's important to think about your desired outcome (the output of the model).

Is it a number? It's probably a regression problem.
Is it a class / label? It's probably a classification problem.
Are you separating unlabeled data into groups? It’s probably a clustering problem.

https://scikit-learn.org/stable/tutorial/machine_learning_map/

Some Basic Terms:

We can only scratch the surface of Machine Learning tonight in this workshop, so this is by no means everything you need to know, but it should help you get started!

Fitting

Training your model on your dataset. You'll see terms like fit and train used interchangeably.

Overfitting

Relies too much on the relationships in the training data and fails to work correctly on new data.

Underfitting

Fails to learn the relationships in the training data, so it performs poorly on both the training data and new data.

Cross Validation

Validate that your machine learning model works well on data that it was not trained on.

We trained the model, but we need to validate that it's working as expected. A common way is to split the dataset into training and testing sets (we'll do this soon in Python). Cross-validation repeats this split several times and averages the results.
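Here is a minimal sketch of both ideas with scikit-learn, assuming it is installed. The 80/20 split, the `random_state` value, the 5 folds, and the choice of a 5-neighbor k-NN model are all arbitrary choices for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Accuracy on data the model was not trained on.
test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.2f}")

# Cross-validation: repeat the train/test idea on 5 different splits.
cv_scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print(f"cross-validation accuracy: {cv_scores.mean():.2f}")
```

If the accuracy on the training data is much higher than on the held-out data, that is a hint the model is overfitting.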

Machine Learning with Python:

Popular Python Data & Machine Learning Libraries

Again, this is just some of them. There are soooooo many.....

Regression Project

Boston House Dataset

Looking at this data how do we know that regression will be a good choice? Why not Classification?

>>> Boston House price Linear Regression Notebook <<<

Classification Project

Iris Dataset

Looking at this data how do we know that Classification will be a good choice? Why not Regression?

>>> Iris K-Nearest Neighbors <<<

YOU MADE IT THROUGH!

Did you learn something new?

Do you feel more comfortable with the ideas of Machine Learning?

Do you have an awesome idea you want to try using machine learning on? What is it?

KEEP LEARNING!

The best way to learn is by solving a problem you're excited about!

Use an "ugly" dataset. Understanding how to make a good dataset is important.

scikit-learn has more built-in datasets. Use them and apply what you learned today!
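As a starting point, this sketch loads three of the datasets bundled with scikit-learn and prints their sizes. It assumes scikit-learn is installed, and these three are just a sample of what is available.

```python
from sklearn.datasets import load_breast_cancer, load_digits, load_wine

# Each loader returns an object with .data (features) and .target (labels).
loaders = {
    "wine": load_wine,
    "digits": load_digits,
    "breast_cancer": load_breast_cancer,
}

shapes = {}
for name, loader in loaders.items():
    dataset = loader()
    shapes[name] = dataset.data.shape
    print(name, "rows x features:", dataset.data.shape)
```

All three have class labels, so they are a good fit for practicing classification.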

Hack Reactor Software Engineer Prep FREE | study at your own pace
Galvanize Data Science Prep Course - FREE | study at your own pace

What is Galvanize?

We create a technology ecosystem for learners, entrepreneurs, startups and established companies to meet the needs of the rapidly changing digital world.

Education
Co-Working
Events
Enterprise

Galvanize Classes

Immersive Bootcamp

Transform your career with our 13 week immersive programs

Software Engineer - 6/3/19 - 10/11/19
Data Science - 8/19/19 - 11/15/19

Part-Time Courses

Learn while working with our evening part-time classes

Python Fundamentals - 6/4/19 - 7/11/19
Data Analytics - 6/3/19 - 8/21/19

Co-working Space

Work in our building!

Questions

Please feel free to reach out to me with any questions! Let me know what you're planning to do next and how I can help!
