This repository contains the code used to solve the ML Higgs challenge for the first project of EPFL's ML course (full details: https://www.aicrowd.com/challenges/epfl-machine-learning-higgs).
The files in the root directory contain the final code used for the submission. The "Testing Files" directory contains scripts used to test multiple methods.
Contains functions to manipulate the data:
load_csv_data: Reads the data file
create_csv_submission: Writes the output to a file
standardize: Standardizes the features
build_model_data: Adds a column of 1s to the features
batch_iter: Generates a mini-batch iterator for a dataset
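As a rough illustration of what the last three helpers do, here is a minimal NumPy sketch. The signatures, return values, the choice to prepend (rather than append) the column of 1s, and the `seed` parameter are assumptions made for this example and may differ from the actual code in this repository.

```python
import numpy as np

def standardize(x):
    """Center each feature column and scale it to unit variance."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (x - mean) / std, mean, std

def build_model_data(x):
    """Prepend a column of 1s so the model can learn an intercept (assumed position)."""
    return np.c_[np.ones(x.shape[0]), x]

def batch_iter(y, tx, batch_size, seed=0):
    """Yield shuffled mini-batches (y_batch, tx_batch) covering the dataset once."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))
    for start in range(0, len(y), batch_size):
        batch = indices[start:start + batch_size]
        yield y[batch], tx[batch]

# Toy data: 4 samples, 2 features
x = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])

x_std, mean, std = standardize(x)
tx = build_model_data(x_std)
batches = list(batch_iter(y, tx, batch_size=3))
```

The mean and standard deviation are returned so that the same transformation can be applied to the test set.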
Contains the 6 required implementations for this project:
mean_squared_error_gd: Linear regression using gradient descent
mean_squared_error_sgd: Linear regression using stochastic gradient descent
least_squares: Least squares regression using normal equations
ridge_regression: Ridge regression using normal equations
logistic_regression: Logistic regression using stochastic gradient descent
reg_logistic_regression: Regularized logistic regression
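To make two of these methods concrete, the sketch below shows gradient descent on the mean squared error and ridge regression via the normal equations. It assumes the loss is defined as 1/(2N) times the sum of squared errors and that the ridge penalty is scaled by 2N; the signatures and the `compute_mse` helper are illustrative and not necessarily those used in this repository.

```python
import numpy as np

def compute_mse(y, tx, w):
    """MSE loss with the 1/(2N) convention (assumed)."""
    e = y - tx @ w
    return e @ e / (2 * len(y))

def mean_squared_error_gd(y, tx, initial_w, max_iters, gamma):
    """Linear regression by full-batch gradient descent on the MSE."""
    w = initial_w.copy()
    for _ in range(max_iters):
        grad = -tx.T @ (y - tx @ w) / len(y)  # gradient of the MSE loss
        w = w - gamma * grad
    return w, compute_mse(y, tx, w)

def ridge_regression(y, tx, lambda_):
    """Solve (X^T X + 2 N lambda I) w = X^T y for w."""
    n, d = tx.shape
    a = tx.T @ tx + 2 * n * lambda_ * np.eye(d)
    w = np.linalg.solve(a, tx.T @ y)
    return w, compute_mse(y, tx, w)

# Toy data generated from y = 1 + 2x, with a leading column of 1s
tx = np.c_[np.ones(5), np.arange(5.0)]
y = 1.0 + 2.0 * np.arange(5.0)

w_gd, loss_gd = mean_squared_error_gd(y, tx, np.zeros(2), max_iters=2000, gamma=0.1)
w_ridge, loss_ridge = ridge_regression(y, tx, lambda_=0.0)
w_shrunk, _ = ridge_regression(y, tx, lambda_=1.0)
```

With `lambda_=0.0` the ridge solution reduces to ordinary least squares; a positive `lambda_` shrinks the weights toward zero.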
Contains the main functions to train and predict:
train_model_least_squares, train_model_logistic_regression, train_Hessian: Methods to call specific ML training algorithms
train_model: Trains a model and returns the parameters
runModel: Runs the model on the testing data
Contains code for cross-validation and hyper-parameter optimization: