MLBas_Romanenko_exam

This is a student project for the PhD course on Applied Machine Learning (Basic) at the University of Bologna.

For this work the "Credit Card Fraud Detection" dataset was chosen: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud/data

The project is divided into two parts:

"Frauds_DataPreparation.ipynb", where the dataset was examined and prepared for training. The preparation includes checking the feature distributions, rescaling some of the features, and removing fake entries and outliers. After all transformations and cleaning, the resulting subset was written to a separate file for later training.
"Frauds_ModelsTrain.ipynb", where different classification algorithms were tested, resulting in the choice of the linear one. Then the importance of the features was estimated, which led to dropping the 8 least important ones to save computational resources. The resulting setup (linear model and the set of selected features) was then used for further tests and adjustments of the model configuration. The choice of the main metric for this particular case is also discussed.
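
The metric question matters because fraud data is usually heavily imbalanced. The sketch below is not taken from the notebooks: the toy labels, the class ratio, and the helper names `confusion_counts` and `scores` are illustrative assumptions, and this README does not state which metric was finally chosen. It only shows, in plain Python, why accuracy alone can be misleading in such a setting.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 marks a fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the fraud class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators when no fraud is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: 1000 transactions, only 2 of them fraudulent.
y_true = [1, 1] + [0] * 998

# A "model" that never flags anything still looks accurate.
always_legit = [0] * 1000
baseline = scores(y_true, always_legit)

# A model that catches one fraud and raises one false alarm.
detector_pred = [1, 0] + [1] + [0] * 997
detector = scores(y_true, detector_pred)

print("always legit:", baseline)
print("detector:    ", detector)
```

Both predictions reach the same accuracy here, yet only the second one detects any fraud. Class-aware metrics such as precision, recall, or F1 make that difference visible.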

The project was prepared by PhD student Gleb Romanenko for the final exam.
