This is to facilitate the “Machine Learning in Physics” course that I am teaching at Sharif University of Technology for the winter-19 semester. For more information, see the course page at
- Decent understanding of programming and Python, and of the following libraries:
  - NumPy
  - Pandas
  - Plotting and graphical presentation tools in Python
- Git and GitHub (if you are not familiar with them, let me know.)
- Basic understanding of quantum mechanics and statistics.
- Basic understanding of machine learning
This is a tentative plan and we may change it as we move on.
- Course Project: 40%
- Assignments: 30%
- In-class exercises: 10%
- Final exam (set for Thursday, June 20th, 9AM): 30%

These add up to 110%, which includes the bonus as well.
This is a group project and counts towards 40% of the final grade.
The idea is that each group decides on a project at the beginning of the course and applies everything that we cover to that project. Here are some of the expectations for the course project:
- Initial proposal: Clear statement of the problem and some primary assessment of why using ML could help answer this problem. (Due Feb 28th)
- Data collection/generation and preparation: (Due March 15th => Extended to March 20th)
  - Create a folder for this part
  - Have a description (readme file) for the data
  - Describe your data: Where it comes from, different features and their physical significance, your target value(s)
  - Create a notebook and implement the following in different sections:
    - Clean up the data (remove the missing data and convert everything to numerical values)
    - Scale your data
    - Analysis of features and target (Histograms and )
    - Feature selection (Try different techniques and assess how well they work on your data)
    - Feature extraction (Try different techniques and assess how well they work on your data)
- Application of the basic ML techniques: (Due April 15th)
  - A table of assessment (Will give an example later.)
  - Investigation of variance and bias of the techniques investigated.
  - Learning and validation curves
- Application of NN and setting the hyperparameters (Due April 30th)
- Oral presentation (See me to set up the time; it should be before June 24th.)
- Written term paper (It should be submitted by July 5th.)
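As a rough illustration of the data-preparation steps listed above (clean-up, scaling, feature selection, feature extraction), here is a minimal sketch using only NumPy. The toy array, the variance threshold of `1e-12`, and the choice of two principal components are assumptions made for illustration; they are not part of the course material. In a real project notebook, scikit-learn's `StandardScaler`, `VarianceThreshold`, and `PCA` cover the same steps.

```python
import numpy as np

# Hypothetical toy data: 6 samples, 4 features (not from the course material).
X_raw = np.array([
    [1.0, 10.0, 5.0, 0.5],
    [2.0, 20.0, 5.0, 1.0],
    [3.0, np.nan, 5.0, 1.5],   # this row has a missing value
    [4.0, 40.0, 5.0, 2.0],
    [5.0, 50.0, 5.0, 2.5],
    [6.0, 60.0, 5.0, 3.5],
])

# 1. Clean-up: drop every row that contains a NaN.
X = X_raw[~np.isnan(X_raw).any(axis=1)]

# 2. Feature selection with a variance threshold: drop constant columns.
keep = X.var(axis=0) > 1e-12
X = X[:, keep]

# 3. Scaling: standardize each remaining feature to zero mean, unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# 4. Feature extraction with PCA, computed from the SVD of the centered data.
U, S, Vt = np.linalg.svd(X_scaled, full_matrices=False)
explained_variance_ratio = S**2 / np.sum(S**2)
n_components = 2  # assumed value, chosen only for this example
X_pca = X_scaled @ Vt[:n_components].T

print("shape after clean-up and selection:", X.shape)
print("explained variance ratio:", np.round(explained_variance_ratio, 4))
print("shape after PCA:", X_pca.shape)
```

Dropping rows is only one way to handle missing values; whether it is appropriate depends on how much data is lost and why the values are missing.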
Some notes:
- Make sure you include citations to all the resources you use!
- You should submit your work as a group rather than separate individual submissions.
- Scripts, notebooks and figures without description will not count toward your grade.
- Your code should include enough comments and information to be easily followed.
- It is essential that all group members contribute (make commits) to their repositories; this is the only way I can make sure that everyone participated in their project.
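For the "investigation of variance and bias" and "learning and validation curves" items in the project expectations, the following is a minimal sketch of a validation curve: training and validation error as a function of model complexity. The noisy sine data, the 40/20 split, and the polynomial degrees are hypothetical choices for illustration, and plain polynomial fitting with NumPy stands in for whichever technique a group actually studies.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical 1D regression problem: noisy samples of a sine curve.
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Hold out the last third of the points for validation.
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]


def mse(coeffs, x_pts, y_pts):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x_pts) - y_pts) ** 2))


degrees = list(range(0, 10))
train_err, val_err = [], []
for d in degrees:
    # Least-squares polynomial fit of degree d on the training set only.
    coeffs = np.polyfit(x_train, y_train, deg=d)
    train_err.append(mse(coeffs, x_train, y_train))
    val_err.append(mse(coeffs, x_val, y_val))

best_degree = degrees[int(np.argmin(val_err))]
for d, tr, va in zip(degrees, train_err, val_err):
    print(f"degree={d}  train MSE={tr:.4f}  validation MSE={va:.4f}")
print("degree with the lowest validation error:", best_degree)
```

The training error can only decrease as the degree grows, because each model contains the previous one. A validation error that is high at low degree suggests bias (underfitting), while a validation error that rises again at high degree suggests variance (overfitting). A learning curve is the complementary plot: fix the model and vary the number of training points instead.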
Assignment | Deadline and Submission link | Solutions |
---|---|---|
Assignment 1 | Submit it here | Solution 1 |
Assignment 2 | Deadline: extended to March 22nd | Solution 2 |
Assignment 3 | Deadline: April 18th | |
Assignment 4 | Deadline: May 9th | |
Assignment 5 | Deadline: May 26th | |
- Mehta, Pankaj, et al. "A high-bias, low-variance introduction to machine learning for physicists." Physics Reports (2019).
- Nielsen, Michael A. Neural networks and deep learning. Vol. 25. San Francisco, CA, USA: Determination Press, 2015. (Available online)
- Chollet, François. "Deep Learning with Python." (2018).
The course material is posted here and you can use either Google Colab or Mybinder to work with these Jupyter notebooks.
Topic | Contents of the Lectures | Notebook(s) |
---|---|---|
Basics of machine learning | Lecture 1: Introduction to Machine Learning (notation; regression, logistic regression and classification). Lecture 1: Noise (ML beyond simple examples) | |
Basic Techniques | Lecture 2: Basic Techniques (overview of some of the most common techniques). Lecture 2: Kernels | |
Model Selection | Lecture 3: Concepts from Statistical Learning (variance and bias; learning and validation curves; Bayesian inference). Lecture 3: Model Complexity (practical model selection with scikit-learn). Lecture 3: Model Evaluation (confusion matrix; recall, precision, F-score; precision-recall and ROC curve, ROC_AUC) | |
Data Preparation | Lecture 4: Data Preparation (standardization; clean-up: NaN and outliers; feature selection: feature importance, variance threshold). Lecture 4: Data Reduction (feature extraction: PCA, manifold learning) | |
Ensemble Techniques | Lecture 5: Ensemble Techniques (aggregation, stacking, bagging, boosting) | |
Neural Networks | Feedforward (model geometry and formulation; universality; non-linearity: activation function). Back propagation (details; initialization; optimizations; batch and epoch; a couple of examples). Practical implementations (TensorFlow and Keras) | |
Convolutional Neural Networks | Basic idea of convolution; simple implementation of a convnet with Keras; well-known models; some examples | |
Recurrent Neural Network | Basic idea; example | Note |
Reinforcement Learning | Basic idea and details; example(s) | |
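To make the model-evaluation quantities from Lecture 3 concrete (confusion matrix, precision, recall, F-score, ROC AUC), here is a minimal sketch that computes them from their definitions with NumPy. The labels, scores, and the 0.5 decision threshold are made-up values for illustration only. In practice, `sklearn.metrics` offers `confusion_matrix`, `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` for the same quantities.

```python
import numpy as np

# Hypothetical labels and classifier scores (1 = positive class).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05])

# Turn scores into hard predictions with an assumed threshold of 0.5.
y_pred = (y_score >= 0.5).astype(int)

# Entries of the confusion matrix.
tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives

# Rows are true classes, columns are predicted classes.
confusion = np.array([[tn, fp],
                      [fn, tp]])

precision = tp / (tp + fp)  # fraction of predicted positives that are correct
recall = tp / (tp + fn)     # fraction of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

# ROC AUC as the probability that a randomly chosen positive is scored
# above a randomly chosen negative (ties count one half).
pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
diff = pos[:, None] - neg[None, :]
roc_auc = float(np.mean((diff > 0) + 0.5 * (diff == 0)))

print(confusion)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={roc_auc:.3f}")
```

For this toy data, precision, recall, and F1 all equal 0.75, and the AUC is 20/24 ≈ 0.833. Unlike the other three, the AUC does not depend on the chosen threshold, since it only uses the ranking of the scores.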
See the files in the CheatSheet folder.
Item | Description |
---|---|
Jupyter | Jupyter provides an interactive environment for programming. We will be mostly using the Python 3 kernel. |
Git and GitHub | Git provides a strong infrastructure for version control. GitHub is a web-based hosting service for version control, and it also provides services for collaboration. |
Python | It is the programming language that we will be mostly using for this course. |
NumPy | It’s a Python library that provides strong and efficient tools for manipulation of high-dimensional arrays. |
SciPy | It’s a Python library, built on NumPy, for mathematical and scientific computing. |
Pandas_basics, Pandas 2, Importing data | It’s a Python library, built on NumPy, that provides efficient tools for handling and analysis of data. |
Matplotlib, Seaborn | These are two of the most common Python libraries for visualization. |
Scikit-Learn | It’s a Python library that provides a nice and fairly efficient implementation of most of the machine learning techniques and ideas. |
Keras | It is a Python library that provides a high-level and easy-to-use interface for TensorFlow and some other deep learning libraries. |