
Feature Dependencies

Ben Fulcher edited this page Nov 8, 2021 · 6 revisions

(Analysis credit: Brendan Harris using the catch22 Julia package).

Although the selection framework used to generate the catch22 feature set included a step to reduce redundancy, it was not designed to generate an independent set of features.

Pairwise Spearman correlations

The plot below illustrates this non-independence. It shows the Spearman correlation coefficient between every pair of features, quantifying how similar their outputs are across a diverse set of 1000 empirical time series:

[Figure: Feature dependencies]

  • We find a large cluster of features sensitive to the autocorrelation of a time series.
  • We also find a small cluster of two highly correlated features, DN_HistogramMode_5 and DN_HistogramMode_10, which measure the mode of the z-scored time-series distribution using different numbers of bins.

This dependency structure should be kept in mind when interpreting the results of catch22 analyses: does your dataset exhibit any of these generic dependencies, or does it show dependencies of its own?
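To check the dependency structure of your own dataset, the pairwise Spearman matrix can be computed along the following lines. This is a minimal sketch, not the code used for the analysis above. It assumes you already have a feature matrix with one row per time series and one column per feature; the synthetic `features` array, and the helper names `average_ranks` and `spearman_matrix`, are hypothetical stand-ins for illustration.

```python
import numpy as np


def average_ranks(x):
    """Rank the values of a 1-D array (1 = smallest), averaging ranks over ties."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(x.size, dtype=float)
    ranks[order] = np.arange(1, x.size + 1)
    # Replace the ranks of tied values with their mean rank.
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def spearman_matrix(features):
    """Spearman correlation between all pairs of columns of `features`."""
    features = np.asarray(features, dtype=float)
    ranked = np.column_stack([average_ranks(col) for col in features.T])
    # Spearman correlation is the Pearson correlation of the ranks.
    return np.corrcoef(ranked, rowvar=False)


# Hypothetical stand-in for a real (n_series x n_features) feature matrix.
rng = np.random.default_rng(0)
base = rng.normal(size=1000)
features = np.column_stack([
    base,                   # feature A
    np.exp(base),           # monotonic transform of A: same ranks as A
    rng.normal(size=1000),  # feature unrelated to A
])

rho = spearman_matrix(features)
print(np.round(rho, 2))
```

Because Spearman correlation depends only on ranks, two features that are monotonic transformations of one another are perfectly correlated, which is why rank-based correlation is a natural choice for comparing features with very different output distributions. Note that a feature that is constant across the dataset has no defined correlation and will produce NaN entries.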

PC Loadings

Below is a similar plot, but with color overlaid according to each feature's loading on the first three principal components:

[Figure: Feature PC loadings]

Broadly,

  • The first two principal components capture different aspects of the autocorrelation structure.
  • The second principal component captures different aspects of the distribution asymmetry.
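Loadings of this kind can be obtained for your own feature matrix with a standard principal component analysis. The sketch below is one way to do it, under assumptions that the page does not confirm: features are z-scored across time series before the decomposition, and the loadings are taken as the right singular vectors. The function name `pc_loadings` and the synthetic `features` array are hypothetical.

```python
import numpy as np


def pc_loadings(features, n_components=3):
    """Return (loadings, explained variance ratios) for the leading components.

    `features` has one row per time series and one column per feature.
    """
    features = np.asarray(features, dtype=float)
    # Assumption: z-score each feature so that all features share one scale.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # Rows of `vt` are the principal axes in feature space.
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return vt[:n_components].T, explained[:n_components]


# Hypothetical stand-in data: two pairs of strongly dependent features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
features = np.column_stack([
    latent[:, 0],
    latent[:, 0] + 0.1 * rng.normal(size=1000),
    latent[:, 1],
    latent[:, 1] + 0.1 * rng.normal(size=1000),
])

loadings, explained = pc_loadings(features, n_components=3)
print(np.round(loadings, 2))   # one row per feature, one column per component
print(np.round(explained, 3))  # fraction of variance captured by each component
```

Each row of `loadings` gives one feature's weights on the leading components, which is the quantity that would be mapped to color in a plot like the one above. The sign of each component is arbitrary, so only the relative pattern of weights across features is meaningful.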