Notes and exercise attempts for "An Introduction to Statistical Learning"
I have combined two repositories, ISLR-python and stat-learning.
- "(*)" means I am not sure about the answer
- Try out RStudio (www.RStudio.com) as an R IDE with the knitr package.
- Pull requests gladly accepted. If a pull request is too much effort, please at least file a new issue. :)
- Visit An Introduction to Statistical Learning Unofficial Solutions for an index of exercise solutions.
- This repository contains Python code for a selection of tables, figures and LAB sections from the book 'An Introduction to Statistical Learning with Applications in R' by James, Witten, Hastie, Tibshirani (2013).
<img src='http://www-bcf.usc.edu/%7Egareth/ISL/ISL%20Cover%202.jpg' height=20% width=20%>
Chapter 3 - Linear Regression
Chapter 4 - Classification
Chapter 5 - Resampling Methods
Chapter 6 - Linear Model Selection and Regularization
Chapter 7 - Moving Beyond Linearity
Chapter 8 - Tree-Based Methods
Chapter 9 - Support Vector Machines
Chapter 10 - Unsupervised Learning
Extra: Misclassification rate simulation - SVM and Logistic Regression
- This great book gives a thorough introduction to the field of Statistical/Machine Learning. The book is available for download (see link below), but I think it is definitely worth buying. It contains sections with applications in R, based on public datasets that are either available for download or included in the R package ISLR. Furthermore, there is a Stanford University online course based on this book and taught by the authors (see the course catalogue for the current schedule).
- Since Python is my language of choice for data analysis, I decided to try and do some of the calculations and plots in Jupyter Notebooks using:
- pandas
- numpy
- scipy
- scikit-learn
- matplotlib
- seaborn
- statsmodels
- patsy
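
As a rough illustration of the kind of calculation these libraries support, here is a minimal sketch of a simple linear regression (the topic of Chapter 3) using only numpy. The synthetic data, the variable names and the "true" coefficients are assumptions made up for this example; it is not taken from the notebooks, which work with the book's datasets.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not one of the book's datasets):
# y depends linearly on x with intercept 2 and slope 3, plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=200)

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# R^2 computed from the residual and total sums of squares.
y_hat = X @ coef
rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - rss / tss

print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")
```

The estimates should land close to the coefficients used to generate the data. statsmodels or scikit-learn would give the same fit along with standard errors and other diagnostics.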
- Creating these notebooks was a good way to learn more about Machine Learning in Python. I created some of the figures/tables of the chapters and worked through some LAB sections. I realize that at certain points it may look like I tried too hard to make the output identical to the tables and R plots in the book, but I did this to explore some details of the libraries mentioned above (mostly matplotlib and seaborn). Note that this repository is not a tutorial and that you should probably have a copy of the book to follow along. Suggestions for improvement and help with unsolved issues are welcome!
- For an advanced treatment of these topics, see Hastie et al. (2009).
- James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R, Springer Science+Business Media, New York.
- Hastie, T., Tibshirani, R., Friedman, J. (2009). The Elements of Statistical Learning, Second Edition, Springer Science+Business Media, New York.
- The free video resources provided by BigData School
- Free PDF download
- Official website
- http://www.statlearning.com
- http://statlearning.class.stanford.edu/