Students will learn the foundational concepts and techniques of machine learning and how to apply those techniques to data science. Principles of data science and machine learning will be examined and applied to problem solving. Students will master data science processes and their applications, including how to wrangle and use data to train classification or prediction models. To demonstrate mastery, students will apply these techniques to develop and train models on data sets using industry-standard modern software libraries and tools. Students will develop “sharp” data science questions, select a data set, and apply a variety of methods to explore those questions and find relevant answers.
"Machine Learning is the new electricity" according to Stanford Professor Andrew Ng. In this course you will learn the fundamental techniques of Machine Learning, the science of autonomously making decisions and predictions from tabular data, images, and text. Such knowledge is currently in extremely high demand and is the key to high-paying and interesting jobs in industry.
Course Delivery: Synchronous | 7 weeks | 19 sessions
Course Credits: 3 units | 37.5 Contact Hours/Term | 92 Non-Contact Hours/Term | 129.5 Total Hours/Term
By the end of the course, students will be able to:
- Identify a prediction problem and choose the appropriate regression model
- Identify a classification problem and choose the appropriate classifier
- Evaluate either a regression model or a classifier
- Cluster unlabeled datasets into groups
- Compare models and choose the best model for the task, while continuing to tune the model's hyperparameters
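As an informal illustration of the regression and evaluation outcomes above, here is a minimal sketch in plain Python. It is not course material: the helper names `fit_line` and `r_squared` are hypothetical, the toy data is invented for this example, and in practice a library would usually handle both steps. It fits a straight line by ordinary least squares on a training split and scores it with R² on held-out points.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented toy data: roughly y = 2x + 1 with small noise.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2]

# Hold out every second point so the score reflects unseen data.
train_x, test_x = xs[::2], xs[1::2]
train_y, test_y = ys[::2], ys[1::2]

slope, intercept = fit_line(train_x, train_y)
test_preds = [slope * x + intercept for x in test_x]
score = r_squared(test_y, test_preds)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={score:.4f}")
```

Scoring on held-out points rather than on the training points is the key design choice here: it is what makes the number a fair basis for comparing models.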
Course Dates: Monday, May 31 – Friday, July 16, 2021 (7 weeks)
Class Times: Monday, Wednesday, Friday at 2:15pm – 4:00pm (19 class sessions)
Class | Date | Topics |
---|---|---|
- | Mon, May 31 | No Class - Memorial Day |
1 | Wed, June 2 | Introduction to Machine Learning |
2 | Fri, June 4 | Logistic Regression |
3 | Mon, June 7 | Lab |
4 | Wed, June 9 | Linear Regression |
5 | Fri, June 11 | Model Evaluation |
6 | Mon, June 14 | Lab |
7 | Wed, June 16 | Support Vector Machines |
8 | Fri, June 18 | Decision Trees |
9 | Mon, June 21 | Lab |
10 | Wed, June 23 | Ensemble Methods |
11 | Fri, June 25 | Principal Component Analysis |
12 | Mon, June 28 | Lab |
13 | Wed, June 30 | Clustering |
14 | Fri, July 2 | Anomaly Detection |
- | Mon, July 5 | No Class - Independence Day Observed |
15 | Wed, July 7 | Naive Bayes |
16 | Fri, July 9 | TFIDF and its Application |
17 | Mon, July 12 | Lab |
18 | Wed, July 14 | Lab |
19 | Fri, July 16 | Project Presentations |
Assignment | Date Assigned | Due Date | Gradescope Link |
---|---|---|---|
Homework 1 - Linear Regression for Boston Housing Dataset | Wed, June 9 | Wed, June 16 | Homework 1 |
Homework 2 - SVM for Breast Cancer Dataset | Wed, June 16 | Wed, June 23 | Homework 2 |
Homework 3 - PCA, K-Means Clustering, Anomaly Detection | Fri, June 25 | Wed, July 7 | Homework 3 |
Homework 4 - Naive Bayes and TFIDF | Wed, July 7 | Wed, July 14 | Homework 4 |
Final Project | Mon, June 7 | Fri, July 16 | Homework 5 |
We will be using Gradescope, which allows us to provide fast and accurate feedback on your work. All assigned work will be submitted through Gradescope, and assignment and exam grades will be returned through Gradescope.
As soon as grades are posted, you will be notified so that you can log in and see your feedback. You may also submit regrade requests if you feel we have made a mistake.
Your Gradescope login is your Make School email, and your password can be changed at https://gradescope.com/reset_password. The same link can be used if you need to set your password for the first time.
- We'll explore basic machine learning algorithms applied to simple datasets with an emphasis on getting basic model functionality
- Classification Tutorial
- Apply Linear Regression for Boston Housing Dataset
- Apply SVM for Breast Cancer Dataset
- Apply PCA, K-Means Clustering and Anomaly Detection
- Apply Naive Bayes and TFIDF
- Final Project: You will choose a dataset to clean and investigate, then use it for prediction, classification, or clustering
- All projects will be submitted on Gradescope
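To give a flavor of one technique from the list above, the following is a minimal sketch of TFIDF weighting in plain Python. It is an assumption-laden illustration, not the homework solution: the function name `tf_idf` is hypothetical, the three documents are invented, tokenization is a bare whitespace split, and the unsmoothed `log(N / df)` form of the inverse document frequency is only one of several common variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented toy corpus.
docs = ["the cat sat", "the dog barked", "the cat purred"]
weights = tf_idf(docs)
for doc, w in zip(docs, weights):
    print(doc, {t: round(v, 3) for t, v in w.items()})
```

A word that appears in every document ("the") gets weight zero, while a word unique to one document ("sat") is weighted more heavily than one shared by two ("cat").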
All projects require a minimum of 10 commits, and those commits must be spread throughout the entirety of the course.
- Good Example: 40+ commits throughout the length of the course, looking for a healthy spattering of commits each week (such as 3-5 per day).
- Bad Example: 10 commits on one day during the course and no others. Students who do this will be at severe risk of not passing the class.
- Unacceptable Example: 2 commits the day before a project is due. Students who do this should not expect to pass the class.
To pass this course you must meet the following requirements:
- Actively participate in class and abide by the Attendance Policy
- Complete and pass all Assignments and Projects with a score above 70%
- Make up classwork from all absences
- Awesome Data Science
- make.sc/library Data Science
- DS-2.1 Study Guide
- Example Machine Learning Project
- Data visualization libraries
- Data manipulation, transformation and preprocessing libraries
- Data modeling libraries
- Model Training
- Model Evaluation
- Hyperparameter Tuning
- Supervised learning -- Regression
- Supervised learning -- Classification
- Unsupervised learning and dimensionality reduction
Students are expected to practice academic integrity in all of its forms, including abstaining from plagiarism, cheating, and other forms of academic misconduct. Make School reserves the right to determine in any given instance what action constitutes a violation of academic honesty and integrity. Plagiarism, defined as the practice of presenting another's work or ideas as one’s own, is an act of academic dishonesty and is a serious ethical and scholarly violation.
Copying text or ideas, whether verbatim or by paraphrasing from a source without using proper citation, is not accepted at Make School. Any materials incorporated into your work, regardless of format, must be properly acknowledged using a citation style appropriate for the discipline of the course.
Though plagiarism may be the most common form, other violations of scholarly integrity also constitute cheating, including:
- Using or copying information from another student’s code or written work;
- Copying information from another student’s test or using unauthorized materials during an examination, whether an in-class or take-home exam;
- Buying, selling, or stealing test questions, answers, or term papers;
- Doing work or taking tests on behalf of another student or submitting work done by another person;
- Falsifying data or laboratory results; and
- Submitting the same work for more than one course without explicit instructor approval.
If an incident of plagiarism or cheating occurs, the instructor will investigate the incident and consult with the Dean. If the student has been found to have committed an act of academic dishonesty, an Academic Misconduct Report will be filed and the student will be placed on a Participation Improvement Plan (PIP). A student who believes they have been wrongly accused of plagiarism or cheating, or that the instructor’s resolution of the alleged incident is unjust, may file a Request for Appeal of Disciplinary Action.
- Program Learning Outcomes - What you will achieve after finishing Make School; all courses are designed around these outcomes
- Grading System - How grading is done at Make School
- Code of Conduct, Equity, and Inclusion - Learn about Diversity and Inclusion at Make School
- Academic Honesty - Our policies around plagiarism, cheating, and other forms of academic misconduct
- Attendance Policy - What we expect from you in terms of attendance for all classes at Make School
- Course Credit Policy - Our policy for how you obtain credit for your courses
- Disability Services (Academic Accommodations) - Services and accommodations we provide for students
- Online Learning Tutorial - How to succeed in online learning at Make School
- Student Handbook - Guidelines, policies, and resources for all Make School students