Feature Engineering

A model can only be as good as its inputs, and in data analytics it is not always obvious what should be used as features, or inputs, to the model. Sometimes there are a large number of possible inputs, and the task is to find the best ones or the best combinations of them. In other cases, the structure of the data prevents it from being used directly as an input, for example categorical variables, images, or time series. There are numerous techniques for dealing with these challenges, and this module explores a few that can improve the efficiency and accuracy of analytics models by transforming their inputs.
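As a concrete illustration of turning a categorical input into usable features, the minimal sketch below one-hot encodes a color column with pandas; the column name and values are hypothetical, not taken from the course notebooks.

    import pandas as pd

    # Hypothetical data: a categorical column that a numeric model cannot use directly
    df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

    # One-hot encoding replaces the category with one binary indicator column per level
    features = pd.get_dummies(df["color"], prefix="color")
    print(features)  # yields indicator columns color_blue, color_green, color_red

Each indicator column can then be passed to a model in place of the original categorical variable.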

Recommended Reading:

  • "Elements of Statistical Learning" Sec. 3.5, 4.3

Associated Notebooks:

Lectures

  • Feature Transformations - One-hot encoding, partial least squares, linear discriminant analysis, and symbolic regression.
  • Time Series Analysis - Data importing and cleaning, smoothing and autocorrelation, stationarity, autoregressive models, and ARIMA modeling.
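To make the time-series material more concrete, here is a minimal sketch of fitting an ARIMA model and producing a short forecast with statsmodels; the synthetic monthly series and the (1, 1, 1) order are illustrative assumptions, not values from the notebook.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    # Hypothetical monthly series: a linear trend plus noise, indexed by date
    rng = np.random.default_rng(0)
    dates = pd.date_range("2020-01-01", periods=48, freq="MS")
    series = pd.Series(np.linspace(10, 30, 48) + rng.normal(0, 1, 48), index=dates)

    # ARIMA(p, d, q): p autoregressive lags, d differences for stationarity, q moving-average lags
    model = ARIMA(series, order=(1, 1, 1))
    fitted = model.fit()

    print(fitted.summary())
    print(fitted.forecast(steps=6))  # forecast the next six months

In practice the order (p, d, q) is chosen by examining stationarity and autocorrelation plots, which the Time Series Analysis notebook covers.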