Finding Order in Chaos: A Novel Data Augmentation Method for Time Series in Contrastive Learning (NeurIPS 2023, Official Code)
Berken Utku Demirel, Christian Holz
In this paper, we propose a novel data augmentation method for quasi-periodic time-series tasks that aims to connect intra-class samples together, and thereby find order in the latent space. Our method builds upon the well-known mixup technique by incorporating a novel approach that accounts for the periodic nature of non-stationary time-series. Also, by controlling the degree of chaos created by data augmentation, our method leads to improved feature representations and performance on downstream tasks.
- Datasets
Activity recognition
UCIHAR, HHAR, USC.
Heart rate prediction
IEEE SPC12 and IEEE SPC22, DaLiA.
Cardiovascular disease (CVD) classification
CPSC2018, Chapman.
- After downloading the raw data, process it with the corresponding script, if one exists.
The command to run the whole process:
python runner_function.py --framework 'simclr' --backbone 'DCL' --dataset 'ucihar' --aug1 'na' --aug2 'resample' --n_epoch 120 --batch_size 256 --lr 3e-3 --lr_cls 0.03 --cuda 0 --cases 'subject_large' --VAE --mean 0.9 --std 0.1
If no trained VAE models are found in the corresponding folders (e.g., 'ucihar0' for domain 0 of the ucihar dataset), VAE training starts first.
The commands to run the ablation experiments:
without the proposed mixup with VAE:
python runner_function.py --framework 'simclr' --backbone 'DCL' --dataset 'ucihar' --aug1 'na' --aug2 'resample' --n_epoch 120 --batch_size 256 --lr 3e-3 --lr_cls 0.03 --cuda 0 --cases 'subject_large' --BestMixup
with the proposed mixup without VAE:
python runner_function.py --framework 'simclr' --backbone 'DCL' --dataset 'ucihar' --aug1 'na' --aug2 'resample' --n_epoch 120 --batch_size 256 --lr 3e-3 --lr_cls 0.03 --cuda 0 --cases 'subject_large' --ablation_2
The figures below summarize the idea. Two sinusoids are mixed with a constant mixup coefficient while their phase is varied between -π and π. The anchor has two frequency components [f1: 2Hz, f2: 10Hz] and the sample has only one, at 2Hz; i.e., the 2Hz component carries the information while the 10Hz component of the anchor is noise.
| Figure 1. | Sum of two sinusoids in the time and frequency domains using linear mixup |
Observe how the amplitude changes when linear mixup is used. The amplitude at 2Hz nearly vanishes at some phase values even though both samples have a 2Hz component; i.e., linear mixup can destroy information instead of interpolating it.
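The cancellation can be reproduced numerically. The snippet below is a minimal sketch, not code from this repository: the sampling rate, signal length, mixup coefficient of 0.5, and the helper names `amplitude_at`, `linear_mixup`, and `make_pair` are all assumptions chosen for illustration. It builds the anchor (2Hz + 10Hz) and the sample (2Hz only), mixes them linearly, and reads off the 2Hz amplitude for two phase offsets.

```python
import numpy as np

def amplitude_at(signal, fs, freq):
    """Amplitude of the `freq` Hz component, read from a one-sided FFT."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))  # closest frequency bin
    return 2.0 * np.abs(spectrum[k]) / n

def linear_mixup(anchor, sample, lam):
    """Standard mixup: convex combination in the time domain."""
    return lam * anchor + (1.0 - lam) * sample

fs = 100                    # assumed sampling rate in Hz
t = np.arange(200) / fs     # 2 seconds, so 2Hz and 10Hz fall on exact FFT bins
lam = 0.5                   # assumed constant mixup coefficient

def make_pair(phase):
    """Anchor with 2Hz + 10Hz, sample with 2Hz shifted by `phase`."""
    anchor = np.sin(2 * np.pi * 2 * t) + np.sin(2 * np.pi * 10 * t)
    sample = np.sin(2 * np.pi * 2 * t + phase)
    return anchor, sample

# Same phase: the 2Hz components add constructively.
amp_in_phase = amplitude_at(linear_mixup(*make_pair(0.0), lam), fs, 2.0)
# Opposite phase: the 2Hz components cancel each other.
amp_opposite = amplitude_at(linear_mixup(*make_pair(np.pi), lam), fs, 2.0)

print(f"2Hz amplitude, phase offset 0:  {amp_in_phase:.3f}")
print(f"2Hz amplitude, phase offset pi: {amp_opposite:.3f}")
```

With a phase offset of π and equal weights, the two 2Hz components are equal and opposite, so their weighted sum is zero at that frequency even though both inputs contain it.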
| Figure 2. | Sum of two sinusoids in the time and frequency domains using the proposed method |
The proposed method preserves frequency information without loss. Check the figures below to observe its behavior in polar coordinates.
| Figure 3. | Amplitude and phase of the 2Hz sinusoid in the generated sample using linear mixup, shown in polar coordinates |
| Figure 4. | Amplitude and phase of the 2Hz sinusoid in the generated sample using the proposed method, shown in polar coordinates |
When the proposed method is used, the amplitude of the critical frequency does not change regardless of the phase difference.
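One way to obtain this phase-independent behavior is to interpolate magnitude and phase separately in the frequency domain. The sketch below assumes that formulation for illustration only; it is not taken from this repository and omits details of the actual method (for example, how mixing coefficients are chosen). The function name `magnitude_phase_mix`, the sampling rate, and the coefficient of 0.5 are hypothetical.

```python
import numpy as np

def amplitude_at(signal, fs, freq):
    """Amplitude of the `freq` Hz component, read from a one-sided FFT."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    return 2.0 * np.abs(spectrum[k]) / n

def magnitude_phase_mix(anchor, sample, lam):
    """Illustrative mix: interpolate magnitude and phase separately per bin."""
    A = np.fft.rfft(anchor)
    S = np.fft.rfft(sample)
    # Magnitudes are interpolated linearly.
    mag = lam * np.abs(A) + (1.0 - lam) * np.abs(S)
    # Phase difference wrapped to [-pi, pi], i.e. the shortest angular path.
    dphi = np.angle(S * np.conj(A))
    phase = np.angle(A) + (1.0 - lam) * dphi
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(anchor))

fs = 100                    # assumed sampling rate in Hz
t = np.arange(200) / fs     # 2 seconds, so 2Hz and 10Hz fall on exact FFT bins
lam = 0.5                   # assumed constant mixing coefficient
anchor = np.sin(2 * np.pi * 2 * t) + np.sin(2 * np.pi * 10 * t)

# Sweep the phase of the sample and record the 2Hz amplitude of the mix.
phases = np.linspace(-np.pi, np.pi, 9)
amps = np.array([
    amplitude_at(
        magnitude_phase_mix(anchor, np.sin(2 * np.pi * 2 * t + p), lam), fs, 2.0
    )
    for p in phases
])
print(np.round(amps, 3))
```

Because the magnitude at 2Hz is interpolated between two equal values, it stays constant across the sweep; only the phase of the mixed component moves.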
If you find our paper or code useful, please cite our work:
@inproceedings{
demirel2023finding,
title={Finding Order in Chaos: A Novel Data Augmentation Method for Time Series in Contrastive Learning},
author={Berken Utku Demirel and Christian Holz},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=dbVRDk2wt7}
}
Scripts for training VAE models are mainly taken from Isolating Sources of Disentanglement in Variational Autoencoders
while the structure of contrastive learning comes from What Makes Good Contrastive Learning on Small-Scale Wearable-Based Tasks?
The IEEE SPC dataset is from Measuring Heart Rate During Physical Exercise by Subspace Decomposition and Kalman Smoothing and the GitHub implementation of that paper.