Feature-Selection-and-Data-Cleaning-

Feature selection is used because it lets a machine learning algorithm train faster, reduces model complexity, and makes the model easier to interpret. Choosing the right subset of features can also improve accuracy and reduce overfitting. Filter methods are generally used as a preprocessing step: features are selected independently of any machine learning algorithm, on the basis of their scores in statistical tests that measure their relationship with the outcome variable. "Correlation" is used loosely here, since the appropriate test depends on the types of the feature and the outcome variable. In this file, a filter method is used for feature selection: select-k-best with the chi-squared (chi2) test.
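As a rough illustration of the select-k-best step described above, here is a minimal sketch that assumes scikit-learn's `SelectKBest` and `chi2` are the tools in question (the README lists only Pandas and Numpy, so this is an assumption). The toy data, column names, and the choice `k=2` are hypothetical and are not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy data. The chi2 score requires non-negative feature values.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)  # binary outcome variable

X = pd.DataFrame({
    # Two features whose values depend strongly on the class label
    "informative_a": y * 5 + rng.integers(0, 3, size=n),
    "informative_b": (1 - y) * 4 + rng.integers(0, 3, size=n),
    # Two features that are independent of the class label
    "noise_a": rng.integers(0, 10, size=n),
    "noise_b": rng.integers(0, 10, size=n),
})

# Score every feature against y with the chi-squared test and keep the top k.
selector = SelectKBest(score_func=chi2, k=2)
selector.fit(X, y)

# Per-feature scores, highest first, for inspection
scores = pd.Series(selector.scores_, index=X.columns).sort_values(ascending=False)

# Names of the columns that were kept, and the reduced feature table
selected_columns = X.columns[selector.get_support()].tolist()
X_selected = X[selected_columns]

print(scores)
print(selected_columns)
```

Because the scoring never involves a downstream model, the same reduced table `X_selected` can be passed to any classifier afterwards, which is what makes this a filter method rather than a wrapper method.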

Libraries Used:

pandas, NumPy

Programming Language

Python

IDE Used

Jupyter Notebook