amoakopoku/intro-machine-learning

Intro to Machine Learning

Brought to you by Galvanize. Learn more about the way we teach at galvanize.com.

FAQs:

  • WIFI: g|Events | Password is learningcommunity

Setting up your computer

  • A web browser to see what we're working on as others see it (recommended: Google Chrome, http://chrome.google.com)
  • We will be using Google Colab for this workshop, so make a Google account if you don't already have one.
  • Open this GitHub repo to follow along

What this workshop is

A super friendly introduction to Machine Learning. No previous experience expected, but knowing some Python will help!

You can't learn EVERYTHING in ~2 hours, especially when it comes to Machine Learning! But you can learn enough to get excited and comfortable to keep working and learning on your own!

  • This course is for absolute beginners
  • Ask Questions!
  • Answer Questions!
  • Help others when you can
  • It's OK to get stuck, just ask for help!
  • Feel free to move ahead
  • Be patient and nice

We're not going to focus on the math behind the models. We're going to focus more on when and how to use a model. If you would like to go into the math and more about each model I encourage you to do so!

About me:

Hello, I'm Keenan Olsen. I'm a Technology Evangelist here at Galvanize!

I originally got into Machine Learning by solving a manufacturing problem at my last job with computer vision, and I think it's one of the coolest fields!

Note: I'm not a Galvanize Instructor

Reach out to me if interested in:

  • breaking into the tech industry
  • learning resources
  • meetup recommendations
  • learning more about Galvanize
  • giving me suggestions for events!
  • being friends

About you!

Give a quick intro!

  • What's your name?
  • What's your background?
  • Why are you interested in Machine Learning?

FAQs Again for anyone who just came in:

  • WIFI: g|Events | Password is learningcommunity
Setup
  • Modern web browser
  • Google account

Regression Project

Boston House Dataset

Looking at this data how do we know that regression will be a good choice? Why not Classification?

Classification Project

Iris Flower Dataset KNN Workbook

Looking at this data how do we know that Classification will be a good choice? Why not Regression?

What is Machine Learning:

To put it very simply, Machine Learning can usually be thought of as using a statistical model, built from a dataset, to solve a problem.

Instead of explicitly programming an algorithm to do a specific task, we let it "learn" from data to find patterns and make inferences.

We'll see examples of this soon!

Who uses Machine Learning?

More and more companies that make decisions with data are using machine learning. Here are just a few examples that you've probably experienced as a customer.

Amazon

  • Product Recommendations
  • Amazon GO Computer Vision
  • Alexa
  • Delivery Robots

Netflix

  • Show & Movie Recommendations

Google

  • Gmail Spam Filtering
  • Google Assistant
  • YouTube Content Filtering & Recommendations
  • Self Driving Cars

Apple

  • Siri
  • App Store Recommendations

Facebook

  • Face Tagging Detection

Tesla

  • Self Driving Cars

These companies use Machine Learning in many other ways!

Machine Learning Applications

We talked about some examples above from big companies we probably all know. Here are several more types of applications where machine learning has become popular.

Healthcare

  • Cancer Detection
  • X-ray diagnostics

Smart Home Devices

  • Smart doorbell
  • Smart Lights
  • Security

Image generation

Agriculture

  • Crop monitoring & planning

Supply Chain

  • Sourcing and Shipping Automation

Manufacturing

  • Quality Assurance
  • Design

Fraud Detection

  • Credit cards
  • Product listings

You can see how all of these applications revolve around finding patterns in data!

Types of Machine Learning:

Supervised Learning

Supervised Learning uses a dataset that is labeled. In this context, imagine having a list of features and a label (group) that those features belong to.

Here we have features (sepal length (cm), etc.) and a label (flower species):

sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | Species
5.1 | 3.5 | 1.4 | 0.2 | setosa
5.7 | 2.9 | 4.2 | 1.3 | versicolor
7.7 | 3.0 | 6.1 | 2.3 | virginica

We could use a full dataset with data like the above to predict the flower species given only the petal and sepal measurements.

Another good example of supervised learning is an email spam filter.

Say we have a bunch of emails in our dataset and they all have a label of either spam or not_spam. We could then train a supervised learning model to look at all of those emails and pick up patterns that show up in the spam emails. There are probably certain words or formatting that repeat themselves. If you've ever looked in your email spam folder you can probably pick out some of those things yourself!
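To make the spam filter idea concrete, here is a minimal sketch in plain Python. The four training emails are made up for illustration, and counting words per label is a deliberately simple stand-in for a real model, not how a production spam filter works.

```python
from collections import Counter

# Hypothetical, made-up training emails with their labels.
training_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "not_spam"),
    ("lunch on monday with the team", "not_spam"),
]

def train(emails):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not_spam": Counter()}
    for text, label in emails:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label new text by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    not_spam_score = sum(counts["not_spam"][w] for w in words)
    return "spam" if spam_score > not_spam_score else "not_spam"

word_counts = train(training_emails)
print(predict(word_counts, "claim your free prize"))   # spam
print(predict(word_counts, "notes from the meeting"))  # not_spam
```

The "learning" here is just the word counts: nobody wrote a rule saying "free" is suspicious, the labeled examples did that.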

There are 2 main types of supervised learning: Classification and Regression.

Classification

Classification tries to assign the correct label to a new piece of data not containing a label. Both examples above are good examples of classification problems.

A spam filter would look at an email and decide if it should be labeled as spam or not_spam.

We could be given a new flower measurement and we want to try to label it with the correct Species: setosa, versicolor, virginica

sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm)
5.1 | 3.5 | 1.4 | 0.2

These measurements match the setosa row in the table above, so a well-trained model should label this flower as setosa.

Regression

Instead of predicting a label like classification, Regression predicts a value.

This example has features crime rate, Zoning, rooms, square footage and a value price.

crime rate | Zoning | rooms | square footage | price
.5 | 3.5 | 5 | 1400 | 100000
.2 | 2 | 3 | 3000 | 50000
.3 | 4 | 7 | 1800 | 150000

Unlike the classification example where we tried to predict what group the features belonged to, we want to predict what value the features would have. This could be any number in a continuous range!

Given a list of new features from a house like below, we would then want to find out how much that house is worth by predicting a number value.

crime rate | Zoning | rooms | square footage
.7 | 4 | 2 | 1000
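As a rough sketch of what "predicting a number value" means, the snippet below fits a straight line through the three example houses using only the rooms column. Ignoring the other features is a simplifying assumption made here to keep the arithmetic visible. A real model would use all the features and far more rows.

```python
# Toy rows from the example table, using only rooms -> price.
rooms = [5, 3, 7]
prices = [100000, 50000, 150000]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(rooms, prices)

# Predict the price of the new house, which has 2 rooms.
predicted_price = slope * 2 + intercept
print(predicted_price)  # 25000.0
```

The output is a number on a continuous scale, not one of a fixed set of labels, which is the key difference from classification.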

Some other values you could predict with regression:

  • Stock price
  • Age

This workshop is going to focus on supervised Machine Learning, but we'll talk briefly about some of the other types!

Unsupervised Learning

Unsupervised Learning uses a dataset that is not labeled and gains insight about its patterns.

Clustering

A common way of using unsupervised learning is clustering.

[Figure: visualization of the Iris dataset]

This picture shows an example of visualizing the Iris Dataset we talked about before. We can see that there are features that relate to each species. If we didn't have those labels we could use unsupervised learning to create clusters separating the groups out that would probably look pretty similar to this. We could then add a label to those clusters.

An example to think about is if you have a large dataset of customers. Maybe you would like to segment them out to cluster similar customers.
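Here is a small sketch of clustering customers with a hand-written k-means loop. The customer numbers, the two features, and the choice of starting centroids are all made up for illustration. In practice you would likely scale the features first and use a library implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Very small k-means for 2D points, starting from the given centroids."""
    for _ in range(iterations):
        # Step 1: assign each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Step 2: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, monthly spend). No labels!
customers = [(2, 100), (3, 120), (2, 90), (20, 900), (22, 950), (19, 870)]

centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
print(clusters[0])  # the occasional shoppers
print(clusters[1])  # the frequent, high-spend shoppers
```

Notice that the algorithm never sees a label. It only groups similar rows, and it is up to us to name the groups afterwards.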

Semi-Supervised Machine Learning

Uses a mixed dataset of labeled and unlabeled data to train the model, combining supervised and unsupervised machine learning.

Semi-Supervised Machine Learning can be important to look into if you don't have enough labeled data to create a good model. Labeling data, or acquiring labeled data, can be extremely expensive and time consuming, so developing a model that can use both types of data is super intriguing!

Imagine trying to label every piece of information you get from a self-driving car! You have a constant video feed, Lidar, and other sensors.

Reinforcement Learning

Reinforcement Learning is often used in a situation where an algorithm can take an action in an environment and receive a reward for making a good decision.

You see a lot of examples of this type of machine learning used to make computers excellent gamers!

A couple examples:

OpenAI Gym

Flappy Bird
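To give a feel for the action-and-reward loop, here is a minimal sketch of an agent choosing between two actions. The environment is a made-up one where action 1 always pays a reward of 1 and action 0 pays nothing. The epsilon-greedy strategy used here is just one simple approach, chosen for illustration.

```python
import random

def run_agent(rewards, steps=200, epsilon=0.2, seed=0):
    """Learn the average reward of each action by trial and error."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)  # current estimate of each action's reward
    counts = [0] * len(rewards)    # how many times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))  # explore: try a random action
        else:
            action = values.index(max(values))    # exploit: best action so far
        reward = rewards[action]
        counts[action] += 1
        # Update the running average reward for the chosen action.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical environment: action 0 pays 0, action 1 pays 1.
values = run_agent(rewards=[0.0, 1.0])
best_action = values.index(max(values))
print(best_action)
```

Nobody tells the agent which action is better. It finds out by occasionally exploring and remembering what paid off.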

Deep Learning

Deep Learning is a subset of Machine Learning.

It uses layers of Artificial Neural Networks and can learn from data to change the weights of the neurons.
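As a sketch of what "changing the weights" looks like, the snippet below trains a single artificial neuron for one step. The inputs, starting weights, target, and learning rate are arbitrary values picked for illustration. Real networks stack many of these neurons in layers and repeat this step many times.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0..1."""
    return 1 / (1 + math.exp(-z))

def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs, passed through the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_step(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient descent step on the squared error (output - target)**2."""
    output = neuron_output(weights, bias, inputs)
    grad = 2 * (output - target) * output * (1 - output)
    new_weights = [w - learning_rate * grad * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * grad
    return new_weights, new_bias

# Arbitrary illustrative values.
inputs, target = [1.0, 2.0], 1.0
weights, bias = [0.5, -0.5], 0.0

loss_before = (neuron_output(weights, bias, inputs) - target) ** 2
weights, bias = train_step(weights, bias, inputs, target)
loss_after = (neuron_output(weights, bias, inputs) - target) ** 2

print(loss_before, loss_after)  # the error shrinks after the update
```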

A Neural Network Playground - TensorFlow is a great place to start tinkering around and learning more about Artificial Neural Networks!

Deep Learning is killing it at recognizing and generating complicated patterns.


For this class we're going to stay focused on Supervised Machine learning. It's a great place to start!

But out of all these what would you like to see a class on next?

Supervised Learning Models

Here are some of the common models. Having an idea of what these do and which applications they should be used for is important! I will only briefly go over them, so please read more about them!

Linear Regression

Typically used for regression

Generally regression problems predict a value on a continuous spectrum

Logistic Regression

Typically used for classification

NOT used for regression problems! It has "regression" in the name because of the statistics behind the model.

Used to predict binary outputs (yes, no | true, false | pass, fail)

It looks for a probability above a certain threshold, for example .5: above the threshold is one class, and anything else is the other.
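The threshold idea can be sketched in a few lines. The weight and bias below are hypothetical numbers standing in for what a trained model would have learned, and "hours studied" is a made-up feature for a pass/fail example.

```python
import math

def sigmoid(z):
    """Turn any number into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

def predict_pass(hours_studied, weight=1.5, bias=-6.0, threshold=0.5):
    """Return (probability, decision). weight and bias are made-up values."""
    probability = sigmoid(weight * hours_studied + bias)
    return probability, probability > threshold

for hours in (2, 6):
    probability, passed = predict_pass(hours)
    print(hours, round(probability, 3), "pass" if passed else "fail")
```

The model itself outputs a probability. The yes/no answer only appears once we compare that probability to the threshold.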

K-Nearest Neighbors (k-NN)

Typically used for classification

k-NN finds the k nearest data points and makes an educated guess based on the classes of those nearest data points.

[Figure: Iris dataset]
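A bare-bones version of k-NN fits in a few lines of plain Python. The nine training rows are a small hand-picked sample in the style of the Iris table shown earlier, and k=3 is an arbitrary choice for this sketch.

```python
import math
from collections import Counter

# (sepal length, sepal width, petal length, petal width) -> species
training_data = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((5.7, 2.9, 4.2, 1.3), "versicolor"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((7.7, 3.0, 6.1, 2.3), "virginica"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]

def knn_predict(data, new_point, k=3):
    """Vote among the k training rows closest to new_point."""
    by_distance = sorted(data, key=lambda row: math.dist(row[0], new_point))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

print(knn_predict(training_data, (5.0, 3.4, 1.5, 0.2)))  # setosa
print(knn_predict(training_data, (6.0, 3.0, 4.4, 1.4)))  # versicolor
```

There is no real "training" step here: the model is just the stored data plus a distance measure.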

Decision Tree

Typically used for classification

Maybe an oversimplification, but a decision tree can be thought of as a bunch of if statements.

You've probably seen flowcharts before with different paths to take depending on the data.
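To show the "bunch of if statements" idea, here is a hand-written tree for the flower example. The thresholds are picked by eye for illustration. A real decision tree algorithm would learn its own split points from the training data.

```python
def classify_flower(petal_length, petal_width):
    """A tiny hand-written 'tree'. Thresholds are illustrative, not learned."""
    if petal_length < 2.5:
        return "setosa"
    if petal_width < 1.8:
        return "versicolor"
    return "virginica"

# The three example rows from the Iris table shown earlier.
print(classify_flower(1.4, 0.2))  # setosa
print(classify_flower(4.2, 1.3))  # versicolor
print(classify_flower(6.1, 2.3))  # virginica
```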

More

There are of course more than these 4 models. A few more popular ones you should look into are Support Vector Machines, Random Forests, and Naive Bayes.

How do you choose the algorithm?

There can be a lot of factors to consider, like the size of the data, labels, accuracy, scalability, etc. A lot of these are out of scope for this workshop.

But when you're first starting out, it's important to think about your desired outcome (the output of the model).

  • Is it a number? It's probably a regression problem.
  • Is it a class / label? It's probably a classification problem.
  • Are you separating unlabeled data into groups? It's probably a clustering problem.

https://scikit-learn.org/stable/tutorial/machine_learning_map/

Some Basic Terms:

We can only scratch the surface of Machine Learning tonight in this workshop, so this is by no means everything you need to know, but it should help you get started!

Fitting

Training your model on your dataset. You'll see terms like fit and train used interchangeably.

Overfitting

The model relies too much on the relationships in the training data and fails to work correctly on new data.

Underfitting

The model fails to learn the relationships in the training data, so it cannot be used reliably on new data.

Validation

Validate that your machine learning model works well on data that it was not trained on.

We trained the model, but need to validate that it's working as expected. A common way is to split the dataset into training and testing sets (we'll do this soon in Python).
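The split itself is simple enough to sketch by hand. The helper name `split_rows`, the 80/20 ratio, and the seed are choices made for this illustration. scikit-learn ships a ready-made `train_test_split` function that you would normally use instead.

```python
import random

def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows, then hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]

# Stand-in dataset: 10 rows, represented here by their row numbers.
rows = list(range(10))
train_rows, test_rows = split_rows(rows)

print(len(train_rows), len(test_rows))  # 8 2
```

The model is fit only on `train_rows`. Scoring it on `test_rows`, which it has never seen, is what reveals overfitting.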

Machine Learning with Python:

Popular Python Data & Machine Learning Libraries

Again, this is just some of them. There are soooooo many...

Pandas is often used to explore, clean, and visualize your data.

NumPy is often used for multi-dimensional array manipulation.

matplotlib is often used for visualizing your data in chart-like formats.

Scikit-learn, a.k.a. sklearn, is a powerful open-source machine learning library.

TensorFlow is a library from Google for machine learning. Popular in deep learning.

PyTorch is a library from Facebook for machine learning. Popular in deep learning.

Keras is a higher-level wrapper that can be used with TensorFlow to make writing deep learning projects easier.

An open-source library to get started with NLP.

An open-source library to get started with computer vision and image manipulation.

Note: if you're thinking of exploring data science with Python locally on your computer, look into using Anaconda to manage your Python and data libraries. I'd go crazy without it!

Regression Project

Boston House Dataset

Looking at this data how do we know that regression will be a good choice? Why not Classification?

Classification Project

Iris Dataset

Looking at this data how do we know that Classification will be a good choice? Why not Regression?

YOU MADE IT THROUGH!

Did you learn something new?

Do you feel more comfortable with the ideas of Machine Learning?

Do you have an awesome idea you want to try using machine learning on? What is it?

KEEP LEARNING!

The best way to learn is by solving a problem you're excited about!

Use an "ugly" dataset. Understanding how to make a good dataset is important.

Scikit-learn has more built-in datasets. Use them and apply what you learned today!

What is Galvanize?

We create a technology ecosystem for learners, entrepreneurs, startups and established companies to meet the needs of the rapidly changing digital world.

  • Education
  • Co-Working
  • Events
  • Enterprise

Galvanize Classes

Immersive Bootcamp

Transform your career with our 13-week immersive programs

Part-Time Courses

Learn while working with our evening part-time classes

Co-working Space

Work in our building!

Questions

Please feel free to reach out to me with any questions! Let me know what you're planning to do next and how I can help!

About

An introduction to machine learning with Python
