
scores_metrics


Machine Learning in Climate Sciences, University of Tübingen

Jakob Schloer

Time-series data

Keywords

  • Quantile score

  • Cross validation error

    • Leave one out error
  • Correlation coefficients

  • Wasserstein metric

Comparing Time series:

  • Correlations

  • Euclidean distance

  • Dynamic Time Warping: DTW finds an optimal match between two sequences of feature vectors which allows for stretched and compressed sections of the sequence (a minimal implementation sketch follows this list).

  • Mutual Information: Entropy-based metric introduced by Shannon. Applied to time series, there are quite a few papers by now, for example [this one](https://arxiv.org/abs/0904.4753).

  • iSAX: The final one I want to flag is the so-called “Motif Discovery” and the related [iSAX](http://www.cs.ucr.edu/~eamonn/iSAX.pdf) representation of time series (by Eamonn Keogh), which is very scalable.
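
A minimal dynamic-programming sketch of DTW for two 1-D sequences, using only numpy (all names are illustrative and the absolute difference is used as the local cost; this is a sketch, not a reference implementation):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance
    between two 1-D sequences, with absolute difference as local cost."""
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost to align a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # allow match, stretching a, or stretching b
            cost[i, j] = d + min(cost[i - 1, j - 1],  # match
                                 cost[i - 1, j],      # stretch b
                                 cost[i, j - 1])      # stretch a
    return cost[n, m]

# toy example: same shape, sampled at different speeds
x = np.sin(np.linspace(0, 2 * np.pi, 50))
y = np.sin(np.linspace(0, 2 * np.pi, 80))
print(dtw_distance(x, y))
```

For real applications, optimized implementations are available in libraries such as tslearn or dtaidistance.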

Scale-dependent errors

These errors are on the same scale as the data, i.e. they cannot be used to compare data sets with different units.

Mean absolute error (MAE)

$$ MAE = \mathrm{mean}(|y_i - \hat{y}_i|) = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| $$

Minimizing the MAE leads to predicting the median.

Mean square error (MSE)

Strongly penalizes large prediction errors.
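
For reference, using the same notation as for the MAE:

$$ MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 $$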

Root mean square error (RMSE)

Minimizing the RMSE leads to predicting the mean.
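
The RMSE is the square root of the MSE:

$$ RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} = \sqrt{MSE} $$

A minimal numpy sketch computing the three scale-dependent errors on toy arrays (variable names and values are illustrative; scikit-learn also provides `mean_absolute_error` and `mean_squared_error` in `sklearn.metrics`, see the link under References below):

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # observations (toy values)
y_pred = np.array([1.1, 1.9, 3.5, 3.0])   # model predictions (toy values)

mae = np.mean(np.abs(y_true - y_pred))    # mean absolute error
mse = np.mean((y_true - y_pred) ** 2)     # mean square error
rmse = np.sqrt(mse)                       # root mean square error
print(mae, mse, rmse)
```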

Percentage errors

Percentage errors are unit free and therefore allow comparison of different data sets.

Mean absolute percentage error (MAPE)
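
A common definition, in percent:

$$ MAPE = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| $$

The division by $y_i$ makes the problems below explicit.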

Problems:

  • cannot be used if there are zero values

  • puts more weight on negative errors

Symmetric mean absolute percentage error (sMAPE)

Used to overcome the problems of MAPE.
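
One common variant (several slightly different definitions exist in the literature):

$$ sMAPE = \frac{100\%}{N} \sum_{i=1}^{N} \frac{2\,|y_i - \hat{y}_i|}{|y_i| + |\hat{y}_i|} $$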

Scaled error

Alternative to percentage errors when comparing different datasets

Mean absolute scaled error (MASE)

  • scale invariance

  • symmetric

  • less than one if the forecast is better than the average naïve forecast, and greater than one if it is worse
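
A sketch of the non-seasonal definition (cf. the fpp2 link under References), where the forecast errors are scaled by the MAE of the one-step naïve forecast; in the fpp2 reference the denominator is computed over the training series:

$$ MASE = \frac{\frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|}{\frac{1}{N-1} \sum_{i=2}^{N} |y_i - y_{i-1}|} $$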

References:

  • https://otexts.com/fpp2/accuracy.html

  • https://scikit-learn.org/stable/modules/model_evaluation.html#

Goodness of Fit

Goodness-of-fit measures check how well a model (hypothesis) fits the data. They are often referred to as explained-variance scores.

Coefficient of determination ($R^2$)

Describes the fraction of the variance of $y$ that is explained by the model prediction.
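
For reference, the usual definition, with $\bar{y}$ the mean of the observations:

$$ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} $$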

F-test

The F-value expresses how much the model improves over the mean (the null hypothesis), given the variance of the model and the data. The F-statistic is computed from the data points $Y = \left[ y_{i} \right]$ and the predicted points $\hat{Y} = \left[ \hat{y}_{i} \right]$ with $i = 1, \dots, N$.
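
One common form of the statistic for a regression model with $p$ free parameters ($p$ and the data mean $\bar{y}$ are notation introduced here, they are not defined above):

$$ F = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2 / (p - 1)}{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 / (N - p)} $$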

Chi-square test

Cross-validation

Leave-one-out error (LOOE)

Correlation / Synchrony

Pearson correlation

Pearson correlation measures how two continuous signals co-vary over time. The linear relationship between the signals is given on a scale from -1 (anticorrelated) through 0 (uncorrelated) to 1 (perfectly correlated).

The Pearson correlation coefficient for two random variables $X_1$ and $X_2$ is the covariance normalized by the standard deviations:
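
$$ \rho_{X_1 X_2} = \frac{\mathrm{cov}(X_1, X_2)}{\sigma_{X_1} \sigma_{X_2}} $$

For finite samples, the covariance and standard deviations are replaced by their empirical estimates (e.g. `numpy.corrcoef` or `pandas.Series.corr`).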

For time series one can calculate a

  • global correlation coefficient: a single value

  • local correlation coefficient: determine correlation in a rolling window over time

Caution:

  1. outliers can skew the correlation

  2. the correlation assumes the data are homoscedastic, i.e. have constant variance

Time Lagged Cross Correlation (TLCC)

TLCC is a measure of similarity between two series as a function of displacement. It captures directionality between two signals, i.e. a leader-follower relationship. Idea: similar to the convolution of two signals, one signal is shifted with respect to the other while the correlation is repeatedly calculated.
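
A minimal sketch of TLCC with pandas, shifting one toy signal against the other and recording the Pearson correlation at each lag (the signal construction, column names, and lag range are illustrative):

```python
import numpy as np
import pandas as pd

# two toy signals where s2 lags s1 by 5 time steps
rng = np.random.default_rng(0)
t = np.arange(500)
s1 = np.sin(0.05 * t) + 0.1 * rng.standard_normal(t.size)
s2 = np.roll(s1, 5) + 0.1 * rng.standard_normal(t.size)
df = pd.DataFrame({"s1": s1, "s2": s2})

# Pearson correlation of s1 with s2 shifted by each lag; the lag with
# the maximum correlation hints at the leader-follower offset
lags = np.arange(-20, 21)
tlcc = [df["s1"].corr(df["s2"].shift(int(lag))) for lag in lags]
print("lag with maximum correlation:", lags[int(np.argmax(tlcc))])
```

For the windowed variant below, the same computation is repeated on rolling or non-overlapping slices of the data frame.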

Figures: cross correlation of f and g; windowed time lagged cross correlation.

Windowed time lagged cross correlation (WTLCC) is an extension of TLCC in which local correlation coefficients are computed per time window for each lag, and the result is plotted as a matrix.

Granger causality

Dynamic Time Warping (DTW)

Spatial Data

  • saliency maps: highlight which changes in the input would most affect the output.

  • heat maps: highlight which inputs are most important for the prediction.

Saliency maps
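
A minimal, model-agnostic sketch of a saliency map via finite differences, perturbing one input cell at a time; `model` is a stand-in for any scalar-output prediction function, and all names are illustrative. Gradient-based saliency with an autodiff framework follows the same idea but computes the derivative of the output with respect to the input directly.

```python
import numpy as np

def saliency_map(model, x, eps=1e-3):
    """Approximate |d model(x) / d x_ij| by central finite differences.
    model: callable mapping a 2-D field to a scalar prediction.
    x:     2-D numpy array (e.g. a gridded climate field)."""
    sal = np.zeros_like(x, dtype=float)
    for idx in np.ndindex(x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += eps
        xm[idx] -= eps
        sal[idx] = abs(model(xp) - model(xm)) / (2 * eps)
    return sal

# toy example: the "model" only looks at a small patch of the field,
# so only that patch lights up in the saliency map
def toy_model(field):
    return field[2:4, 2:4].sum()

x = np.random.rand(6, 6)
print(saliency_map(toy_model, x).round(2))
```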
