Make-School-Courses/DS-2.1-Machine-Learning

DS 2.1 Machine Learning

Course Description

Students will learn the foundational concepts and techniques of machine learning and how to apply those techniques to data science. Principles of data science and machine learning will be examined and applied to problem solving. Students will master data science processes and their applications, including how to wrangle and use data to train classification or prediction models. To demonstrate mastery, students will apply these techniques to develop and train models on data sets using industry-standard modern software libraries and tools. Students will develop “sharp” data science questions, select a data set, and apply a variety of methods to explore those questions and find relevant answers.

Why should you take this class?

"Machine Learning is the new electricity" according to Stanford Professor Andrew Ng. In this course you will learn the fundamental techniques of Machine Learning, the science of autonomously making decisions and predictions from tabular data, images, and text. Such knowledge is currently in extremely high demand and is the key to high-paying and interesting jobs in industry.

Prerequisites:

Course Specifics

Course Delivery: Synchronous | 7 weeks | 19 sessions

Course Credits: 3 units | 37.5 Contact Hours/Term | 92 Non-Contact Hours/Term | 129.5 Total Hours/

Learning Outcomes

By the end of the course, students will be able to:

  1. Identify a prediction problem and choose the appropriate regression model
  2. Identify a classification problem and choose the appropriate classifier
  3. Evaluate either a regression model or a classifier
  4. Cluster unlabeled datasets into groups
  5. Compare models and choose the best model for the task, while continuing to tune the model's hyperparameters
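
As a rough illustration of outcome 3, the sketch below computes a few common evaluation metrics by hand: accuracy, precision, and recall for a classifier, and mean squared error and R² for a regression model. The function names and the toy labels are hypothetical and are not taken from the course materials; in practice a library such as scikit-learn provides equivalent metrics.

```python
from math import fsum


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mse(y_true, y_pred):
    """Mean squared error of a regression model's predictions."""
    return fsum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy data (made up for illustration only).
labels_true = [1, 0, 1, 1, 0, 1]
labels_pred = [1, 0, 0, 1, 1, 1]
values_true = [3.0, 5.0, 7.0]
values_pred = [2.0, 5.0, 8.0]

print(accuracy(labels_true, labels_pred))
print(precision_recall(labels_true, labels_pred))
print(mse(values_true, values_pred))
print(r_squared(values_true, values_pred))
```

Comparing two candidate models (outcome 5) then amounts to computing the same metric for each on held-out data and preferring the one with the better score.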

Schedule

Course Dates: Monday, May 31 – Friday, July 16, 2021 (7 weeks)

Class Times: Monday, Wednesday, Friday at 2:15pm – 4:00pm (19 class sessions)

Class | Date | Topics
- | Mon, May 31 | No Class - Memorial Day
1 | Wed, June 2 | Introduction to Machine Learning
2 | Fri, June 4 | Logistic Regression
3 | Mon, June 7 | Lab
4 | Wed, June 9 | Linear Regression
5 | Fri, June 11 | Model Evaluation
6 | Mon, June 14 | Lab
7 | Wed, June 16 | Support Vector Machines
8 | Fri, June 18 | Decision Trees
9 | Mon, June 21 | Lab
10 | Wed, June 23 | Ensemble Methods
11 | Fri, June 25 | Principal Component Analysis
12 | Mon, June 28 | Lab
13 | Wed, June 30 | Clustering
14 | Fri, July 2 | Anomaly Detection
- | Mon, July 5 | No Class - Independence Day Observed
15 | Wed, July 7 | Naive Bayes
16 | Fri, July 9 | TFIDF and its Application
17 | Mon, July 12 | Lab
18 | Wed, July 14 | Lab
19 | Fri, July 16 | Project Presentations

Assignment Schedule

Assignment | Date Assigned | Due Date | Gradescope Link
Homework 1 - Linear Regression for Boston Housing Dataset | Wed, June 9 | Wed, June 16 | Homework 1
Homework 2 - SVM for Breast Cancer Dataset | Wed, June 16 | Wed, June 23 | Homework 2
Homework 3 - PCA, K-Means Clustering, Anomaly Detection | Fri, June 25 | Wed, July 7 | Homework 3
Homework 4 - Naive Bayes and TFIDF | Wed, July 7 | Wed, July 14 | Homework 4
Final Project | Mon, June 7 | Fri, July 16 | Homework 5

Class Assignments

We will be using Gradescope, which allows us to provide fast and accurate feedback on your work. All assigned work will be submitted through Gradescope, and assignment and exam grades will be returned through Gradescope.

You will be notified as soon as grades are posted so that you can log in and see your feedback. You may also submit a regrade request if you feel we have made a mistake.

Your Gradescope login is your Make School email, and your password can be changed at https://gradescope.com/reset_password. The same link can be used if you need to set your password for the first time.

Tutorials

  • We'll explore basic machine learning algorithms applied to simple datasets, with an emphasis on getting a basic model working
  • Classification Tutorial

Projects

  • Apply Linear Regression for Boston Housing Dataset
  • Apply SVM for Breast Cancer Dataset
  • Apply PCA, K-Means Clustering and Anomaly Detection
  • Apply Naive Bayes and TFIDF
  • Final Project: You will choose a dataset to clean and investigate, then apply prediction, classification, or clustering to it
  • All projects will be submitted on Gradescope

All projects require a minimum of 10 commits, and those commits must be spread throughout the entirety of the course.

  • Good Example: 40+ commits throughout the length of the course, with a healthy spread of commits each week (such as 3-5 per day).

  • Bad Example: 10 commits on one day during the course and no others. Students who do this will be at severe risk of not passing the class.

  • Unacceptable Example: 2 commits the day before a project is due. Students who do this should not expect to pass the class.

  • The Final Project Guideline for DS 2.1

  • The Rubric for Final Project

Evaluation

To pass this course you must meet the following requirements:

  • Actively participate in class and abide by the Attendance Policy
  • Complete and pass all Assignments and Projects with a score above 70%
  • Make up classwork from all absences

Information Resources

Data Science Technical Interview Topics Covered in this course

Data Science Tools

  • Data visualization libraries
  • Data manipulation, transformation and preprocessing libraries
  • Data modeling libraries

Data Science Techniques

  • Model Training
  • Model Evaluation
  • Hyperparameter Tuning

Machine learning models

  • Supervised learning -- Regression
  • Supervised learning -- Classification
  • Unsupervised learning and dimensionality reduction

Students are expected to practice academic integrity in all of its forms, including abstaining from plagiarism, cheating, and other forms of academic misconduct. Make School reserves the right to determine in any given instance what action constitutes a violation of academic honesty and integrity. Plagiarism, defined as the practice of presenting another's work or ideas as one’s own, is an act of academic dishonesty and is a serious ethical and scholarly violation.

Copying text or ideas, whether verbatim or by paraphrasing from a source without using proper citation, is not accepted at Make School. Any materials incorporated into your work, regardless of format, must be properly acknowledged using a citation style appropriate for the discipline of the course.

Though plagiarism may be the most common form, other violations of scholarly integrity also constitute cheating, including:

  • Using or copying information from another student’s code or written work;
  • Copying information from another student’s test or using unauthorized materials during an examination, whether an in-class or take-home exam;
  • Buying, selling, or stealing test questions, answers, or term papers;
  • Doing work or taking tests on behalf of another student or submitting work done by another person;
  • Falsifying data or laboratory results; and
  • Submitting the same work for more than one course without explicit instructor approval.

If an incident of plagiarism or cheating occurs, the instructor will investigate the incident and consult with the Dean. If the student has been found to have committed an act of academic dishonesty, an Academic Misconduct Report will be filed and the student will be placed on a Participation Improvement Plan (PIP). A student who believes they have been wrongly accused of plagiarism or cheating, or that the instructor’s resolution of the alleged incident is unjust, may file a Request for Appeal of Disciplinary Action.

Make School Course Policies

Robot icon on this README is by Icons8
