Combined the finance, student progress, and static data (30 CSV files across different semesters) into training and testing sets, df_train and df_test.
- NA replaced with -1 where information is missing
- NA replaced with 0 in financial records
- NA replaced with "unknown" in string columns
- Conducted normality diagnostics and applied log transformations where necessary.
- Conducted Wilcoxon rank-sum tests on continuous variables.
- Conducted chi-square tests on categorical variables.
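
The preprocessing and testing steps listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the column names (`gpa`, `loan_amount`, `major`, `dropout`), the `impute` helper, and the choice of the Shapiro-Wilk test and `log1p` for the normality step are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for df_train; every column name here is hypothetical.
df_train = pd.DataFrame({
    "gpa": [3.1, np.nan, 2.4, 3.8, 2.9, np.nan, 3.5, 2.2],
    "loan_amount": [5000.0, np.nan, 12000.0, np.nan, 800.0, 3000.0, np.nan, 20000.0],
    "major": ["math", None, "bio", "math", None, "bio", "math", "bio"],
    "dropout": [0, 1, 1, 0, 1, 0, 0, 1],
})


def impute(df, financial_cols, string_cols):
    """Apply the three NA rules: 0 for financial, 'unknown' for strings, -1 otherwise."""
    out = df.copy()
    out[financial_cols] = out[financial_cols].fillna(0)
    out[string_cols] = out[string_cols].fillna("unknown")
    other_cols = out.columns.difference(financial_cols + string_cols)
    out[other_cols] = out[other_cols].fillna(-1)
    return out


clean = impute(df_train, financial_cols=["loan_amount"], string_cols=["major"])

# Normality diagnosis (assumed here: Shapiro-Wilk plus sample skewness),
# followed by a log transform. log1p is used because the column contains zeros.
shapiro_stat, shapiro_p = stats.shapiro(clean["loan_amount"])
skewness = stats.skew(clean["loan_amount"])
clean["loan_amount_log"] = np.log1p(clean["loan_amount"])

# Wilcoxon rank-sum test: compare a continuous variable between the two outcome groups.
group_1 = clean.loc[clean["dropout"] == 1, "loan_amount_log"]
group_0 = clean.loc[clean["dropout"] == 0, "loan_amount_log"]
rank_stat, p_rank = stats.ranksums(group_1, group_0)

# Chi-square test of independence: categorical variable vs. outcome.
table = pd.crosstab(clean["major"], clean["dropout"])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(f"Shapiro p={shapiro_p:.3f}, skew={skewness:.2f}")
print(f"rank-sum p={p_rank:.3f}, chi-square p={p_chi2:.3f} (dof={dof})")
```

With only eight synthetic rows the p-values carry no meaning; the sketch is only meant to show which call corresponds to which step.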
Implemented SVM, Naive Bayes, Logistic Regression, Decision Trees, Random Forest, and XGBoost.
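
As a rough sketch of how such a model comparison might be set up, the example below fits five of the listed classifiers with scikit-learn on synthetic data. The data, the train/test split, the hyperparameters, and the use of accuracy as the metric are all assumptions for illustration; the original project's settings are not stated. XGBoost is left out to keep the example dependent on scikit-learn only; `xgboost.XGBClassifier` exposes the same `fit`/`score` interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for df_train / df_test.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default or near-default hyperparameters; these are illustrative choices.
models = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and record test accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: accuracy={acc:.3f}")
```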