Predicting relapse in breast cancer patients:
Data description: We have collected expression levels of 4654 genes in 184 early-stage breast cancer samples, stored in xtrain.txt (each row is a gene, each column a sample). After surgical removal of the tumour, some patients relapsed within 5 years (label=+1), while others did not (label=-1). The labels of the 184 samples are available in the file ytrain.txt.
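Because the file stores genes as rows and samples as columns, most learning code will want it transposed so that each row is one sample. The sketch below shows one way to do that in plain Python. It assumes the files are whitespace-delimited numbers with no header line, which the description does not state; the helper names (read_matrix, genes_to_samples, read_labels) are hypothetical, and a tiny in-memory stand-in replaces the real files.

```python
import io

def read_matrix(handle):
    """Parse whitespace-delimited numeric rows (assumed format, no header)."""
    return [[float(v) for v in line.split()] for line in handle if line.strip()]

def genes_to_samples(gene_rows):
    """Transpose a genes x samples matrix into samples x genes."""
    return [list(col) for col in zip(*gene_rows)]

def read_labels(handle):
    """Read +1 / -1 labels, one or more per line (assumed format)."""
    return [int(float(tok)) for line in handle for tok in line.split()]

# Tiny in-memory stand-in for xtrain.txt / ytrain.txt: 3 genes, 4 samples.
demo_x = io.StringIO("0.1 0.2 0.3 0.4\n1.0 1.1 1.2 1.3\n-0.5 0.0 0.5 1.0\n")
demo_y = io.StringIO("1\n-1\n-1\n1\n")

X = genes_to_samples(read_matrix(demo_x))  # one row per sample
y = read_labels(demo_y)                    # one label per sample
print(len(X), "samples,", len(X[0]), "genes, labels:", y)
```

With the real data, the same functions would be called on open("xtrain.txt") and open("ytrain.txt"), and the result should have 184 rows of 4654 values if the format assumption holds.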
Project task:
- Propose and test different techniques to predict relapse from gene expression data. Check the effect of their parameters and estimate their performance.
- Predict relapse for the 92 samples in xtest.txt.
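As an illustration of the first task, the sketch below estimates performance by stratified k-fold cross-validation and varies one parameter. The classifier (nearest centroid on the n_genes genes with the largest difference between class means) is only one possible choice, not a prescribed method, and the data are synthetic stand-ins, not the real xtrain.txt. All function names are hypothetical. Gene selection is repeated inside each fold so that held-out samples never influence it.

```python
import random

def class_means(X, y, label):
    """Per-gene mean over the samples carrying the given label."""
    rows = [x for x, t in zip(X, y) if t == label]
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit(X, y, n_genes):
    """Keep the n_genes with the largest absolute difference between
    class means, and store both class centroids on those genes."""
    pos, neg = class_means(X, y, 1), class_means(X, y, -1)
    ranked = sorted(range(len(pos)), key=lambda j: abs(pos[j] - neg[j]), reverse=True)
    keep = ranked[:n_genes]
    return keep, [pos[j] for j in keep], [neg[j] for j in keep]

def predict(model, x):
    """Assign the label of the closer centroid (squared Euclidean distance)."""
    keep, pos, neg = model
    d_pos = sum((x[j] - p) ** 2 for j, p in zip(keep, pos))
    d_neg = sum((x[j] - n) ** 2 for j, n in zip(keep, neg))
    return 1 if d_pos < d_neg else -1

def cv_accuracy(X, y, n_genes, k=5, seed=0):
    """Stratified k-fold cross-validated accuracy."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (1, -1):
        members = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)  # deal each class evenly across folds
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train], n_genes)
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)

# Synthetic stand-in data (NOT the real files): 60 samples, 50 genes,
# of which only the first 5 differ between the two classes.
rng = random.Random(1)
y = [1 if i % 3 == 0 else -1 for i in range(60)]
X = [[rng.gauss(1.5 * t if j < 5 else 0.0, 1.0) for j in range(50)] for t in y]

for n_genes in (1, 5, 50):
    print(n_genes, "genes: CV accuracy =", round(cv_accuracy(X, y, n_genes), 3))
```

If the two classes are unbalanced in the real data, plain accuracy can look good for a model that mostly predicts the majority class, so a class-balanced measure may be worth reporting alongside it. Once a setting is chosen, the model can be refit on all training samples and applied to the test samples with predict.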