Projects from UTDallas STAT6340
Experiment 1 - Visualization of the KNN decision boundary for a small data set.
Experiment 2 - KNN applied to the CIFAR data set.
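As a rough illustration of the KNN classification used in the two experiments above, the sketch below classifies a query point by majority vote among its k nearest neighbours. It is not the code from these projects: the function name `knn_predict`, the Euclidean distance, and the toy points are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in the plane.
train_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.5, 0.5), k=3))
print(knn_predict(train_points, train_labels, (5.5, 5.5), k=3))
```

Evaluating `knn_predict` over a grid of query points and colouring each by its predicted label is one common way to draw a decision boundary like the one in Experiment 1.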
Experiment 1 - Multiple linear regression and model variable selection applied to the wine data set.
Experiment 2 - Comparison of LDA and QDA methods applied to the admission data set. Visualization of decision boundaries.
Experiment 3 - Selection of the classification probability cutoff via ROC curve and score for LDA and QDA, applied to the diabetes data set.
Experiment 1 - Logistic Regression and model variable selection applied to the diabetes data set.
Experiment 2 - Logistic Regression, LDA, QDA and KNN comparison via LOOCV estimated error.
Experiment 3 - Bootstrap parameter estimation for agreement of methods measuring oxygen saturation levels.
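The bootstrap idea behind the experiment above can be sketched as follows: resample the data with replacement many times, recompute the statistic on each resample, and use the spread of those values as a standard error estimate. This is a minimal sketch, not the project code. The helper `bootstrap_estimate`, the choice of the mean paired difference as the statistic, and the numbers in `differences` are all made up for illustration.

```python
import random
import statistics


def bootstrap_estimate(values, statistic, n_resamples=1000, seed=0):
    """Return the bootstrap mean and standard error of `statistic`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical paired differences between two measurement methods.
differences = [1.2, -0.5, 0.3, 2.1, -1.0, 0.8, 0.0, 1.5, -0.7, 0.4]

boot_mean, boot_se = bootstrap_estimate(differences, statistics.mean)
print(boot_mean, boot_se)
```

Any other statistic that takes a list of values, such as `statistics.median`, can be passed in place of `statistics.mean`.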
Experiment 1 - Best-Subset, Forward and Backward stepwise selection, Ridge Regression and LASSO applied to the wine data set.
Experiment 2 - Best-Subset, Forward and Backward stepwise selection, Ridge Regression and LASSO applied to the diabetes data set.
Note: The best tuning parameters for Ridge Regression and LASSO were obtained via LOOCV in Experiment 1 and 10-fold CV in Experiment 2. The number of predictors was selected via the best adjusted R^2 in Experiment 1 and via AIC in Experiment 2.
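To make the tuning step in the note concrete, here is a small sketch of choosing a ridge penalty by LOOCV. It assumes a single centered predictor with no intercept, so the ridge slope has the closed form sum(x*y) / (sum(x*x) + penalty). The function names, the candidate penalties, and the toy data are hypothetical and not taken from the projects.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form ridge slope for one centered predictor and no intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def loocv_error(xs, ys, penalty):
    """Mean squared prediction error with each observation held out in turn."""
    total = 0.0
    for i in range(len(xs)):
        # Fit on every observation except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope = ridge_slope(train_x, train_y, penalty)
        # Score the fit on the held-out observation.
        total += (ys[i] - slope * xs[i]) ** 2
    return total / len(xs)


def select_penalty(xs, ys, candidates):
    """Return the candidate penalty with the smallest LOOCV error."""
    return min(candidates, key=lambda p: loocv_error(xs, ys, p))


# Hypothetical centered toy data.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.1, 2.1, 3.8]
candidates = [0.0, 0.1, 1.0, 10.0]

best = select_penalty(xs, ys, candidates)
print(best, loocv_error(xs, ys, best))
```

Replacing the leave-one-out loop with a split into 10 folds gives the 10-fold CV variant.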
Experiment 1 - Principal Components Analysis of the Hitters data set.
Experiment 2 - Cluster Analysis via Complete Linkage and K-means of the Hitters data set.
Experiment 3 - Linear Regression, PCR and PLS to predict the log(salary) of Hitters. Comparison via LOOCV estimated error.
Experiment 1 - Comparison of Decision Tree, Bagging, Random Forest and Boosting applied to the Hitters data set.
Experiment 2 - Comparison of a support vector classifier (SVC) and SVMs with radial and polynomial kernels.
Experiment 1 - Comparison of networks with different numbers and sizes of layers, including L2 regularization and dropout, for the MNIST data set.
Experiment 2 - Comparison of networks with different numbers and sizes of layers, including L2 regularization and dropout, for the Boston Housing Price data set.