
Classifier comparison with noise dimensions #17

Merged (4 commits) Dec 19, 2019

Conversation


@sahanasrihari sahanasrihari commented Dec 11, 2019

A tutorial testing the performance of Random Forest, Support Vector Machine, and K-Nearest Neighbors classifiers when additional noise dimensions with different variance values are added to the data.

Reference Issues/PRs

This is in reference to the issue stated in neurodata#1

What does this implement/fix? Explain your changes.

This is a new tutorial demoing the effect of additional noise dimensions on the accuracy of three classifiers. It gives insight into one setting: which classification algorithm performs best amidst noise dimensions.
Here is a link to the code: https://github.com/sahanasrihari/scikit-learn/blob/master/examples/classification/CLASSIFIER_COMPARISON_PR.ipynb

Any other comments?

Random Forest is known to be robust to additional noise dimensions, especially with respect to their variance. It outperforms both SVM and KNN in the experiments run here.
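The setup described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the dataset, classifier settings, noise dimension count, and variance grid are all placeholder assumptions.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def add_noise_dims(X, n_dims, variance, rng):
    """Append n_dims Gaussian noise columns with the given variance."""
    noise = rng.normal(0.0, np.sqrt(variance), size=(X.shape[0], n_dims))
    return np.hstack([X, noise])


# Illustrative two-dimensional dataset (the tutorial may use different data).
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

classifiers = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for variance in [0.5, 1.0, 2.0]:  # illustrative variance grid
    X_noisy = add_noise_dims(X, n_dims=20, variance=variance, rng=rng)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_noisy, y, test_size=0.3, random_state=0
    )
    for name, clf in classifiers.items():
        results[(variance, name)] = accuracy_score(
            y_te, clf.fit(X_tr, y_tr).predict(X_te)
        )
        print(f"variance={variance}  {name}: {results[(variance, name)]:.3f}")
```

As the variance of the appended noise grows, the informative signal is increasingly swamped, which is what makes the comparison between the three classifiers interesting.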


@bdpedigo

  • typo: "trails" should be "trials"
  • call fit_predict something else, since that is already a common sklearn method name; maybe just fit_models or something like that
  • typo: stray \ in "Computation of accuracy"
  • you have a bug where you are not passing a new variance into fit_predict on each call
  • compute is too vague; maybe run_classification_experiment? Open to better suggestions
  • make the file name not all caps
  • remove ticks and tick labels from the dataset visualizations
  • remove xtick labels for all accuracy plots besides the bottom row; you can keep the ticks themselves
  • add a "Noise dimensions" label to the bottom row of all 3 columns
  • add an "Accuracy" label to the leftmost column of all rows
  • after doing all of the above, see if you can increase the font size by 1.5x or 2x without it looking too cluttered
  • put everything on the same y-scale; then you can remove all ytick labels besides the leftmost ones. Could probably keep the ticks themselves, but see how it looks
  • I wonder whether it is worth pointing out where chance is, e.g. 0.5, with a line or something like that. You decide; maybe it would look bad.

Despite the comments above (which I think should actually be easy to address), the code is very clear and has come a long way. Nice work!
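The plotting suggestions above (shared y-scale, stripped inner tick labels, axis labels only on the outer row/column, an optional chance line) can be sketched with matplotlib. The accuracy curves here are fabricated placeholder data purely to show the layout:

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # headless backend for the sketch
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
noise_dims = np.arange(1, 11)
classifiers = ["RF", "SVM", "KNN"]

# sharex/sharey puts every panel on the same scale and automatically
# hides tick labels on all but the bottom row and leftmost column.
fig, axes = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(9, 9))

for i, ax_row in enumerate(axes):
    for j, ax in enumerate(ax_row):
        # Placeholder accuracy curve, not real experiment output.
        acc = 1.0 - 0.4 * noise_dims / 10 + rng.normal(0, 0.02, 10)
        ax.plot(noise_dims, acc)
        ax.axhline(0.5, color="gray", linestyle="--")  # chance, 2 classes
        if i == 0:
            ax.set_title(classifiers[j])
        if i == 2:
            ax.set_xlabel("Noise dimensions")
        if j == 0:
            ax.set_ylabel("Accuracy")

fig.tight_layout()
```

Keeping the chance line dashed and gray keeps it from competing visually with the accuracy curves.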

@bdpedigo

@sahanasrihari let me know the status of

  • you have a bug where you are not passing a new variance into fit_predict each time
    in particular, I am curious to see new results if I am right about this
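The bug described above is the classic pattern of a variance value being set once and silently reused on every call. The fix is to thread the variance through as an explicit parameter each iteration; a minimal sketch, using the reviewer's suggested name run_classification_experiment, with a placeholder dataset, classifier, and noise dimension count:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)


def run_classification_experiment(clf, X_tr, y_tr, X_te, variance, rng):
    """Append fresh noise columns with the *given* variance, then fit/predict."""
    noise_tr = rng.normal(0, np.sqrt(variance), (X_tr.shape[0], 10))
    noise_te = rng.normal(0, np.sqrt(variance), (X_te.shape[0], 10))
    clf.fit(np.hstack([X_tr, noise_tr]), y_tr)
    return clf.predict(np.hstack([X_te, noise_te]))


# The buggy version generated the noise with a variance fixed outside the
# loop, so every "different variance" run actually used the same one.
# Passing variance on every call makes each experiment use its own value.
for variance in [0.5, 1.0, 2.0]:
    preds = run_classification_experiment(
        KNeighborsClassifier(), X_tr, y_tr, X_te, variance, rng
    )
```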


@sahanasrihari sahanasrihari left a comment


@bdpedigo Files changed to address the feedback.

@bdpedigo bdpedigo merged commit 1a201fc into NeuroDataDesign:master Dec 19, 2019