Working packages #50

vladdez · 2024-11-05T13:01:45Z

After much discussion, we came up with the following work plan:

1. Create two ground truth (GT) datasets that can be used to evaluate any pattern detection method for ERP images.
1.1. The first should be based on real data. Find about 10 existing EEG datasets and manually label each instance as "pattern" or "no pattern". Ideally there should be 100 instances for each pattern and (100 * number of patterns) noise (no-pattern) instances. Note that the dataset itself is numeric, so to label the patterns you will need to plot each entry.
1.2. The second should be simulated using existing simulation functions. For the noise instances, you can use real data or simulate them yourself.
1.3. Each GT dataset should not be too large: no more than 100 MiB.
2. Evaluate several detection methods by comparing their results with the GT datasets.
2.1. Train a CNN classifier on the simulated dataset. Use ERP images of size 50x50 pixels and train a small model in Julia.
2.2. Try some methods from the Magnostics paper.
2.3. Assess the entropy-based methods we have already used.
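
To make "comparing their results with the GT dataset" concrete, here is a minimal sketch of how any detector's pattern / no-pattern output could be scored against the manual labels. It is written in plain Python for illustration only (the plan itself targets Julia). The function names and the choice of sensitivity, specificity, and balanced accuracy as metrics are assumptions, not something decided in this thread.

```python
def confusion_counts(truth, predicted):
    """Count TP, FP, TN, FN for binary labels (True = pattern present)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, fp, tn, fn


def score_detector(truth, predicted):
    """Summarise agreement between GT labels and a detector's output."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    # Guard against a GT set that contains only one class.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical labels for five ERP images (True = "pattern").
gt_labels = [True, True, False, False, False]
detector_output = [True, False, False, True, False]
scores = score_detector(gt_labels, detector_output)
```

Because every method (CNN, Magnostics features, entropy) ends in the same binary decision, one scoring function like this would let all of them be compared on the same GT set.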

behinger · 2024-11-05T13:07:55Z

thanks for the write-up. Now I have more thoughts ;)

I'd change the order - no need for a GT dataset if we can't get it to work on the simulated data
I would skip the size requirement; it is an afterthought, especially as we are not going to include all (e.g. 128) channels
Whether or not we downsample to e.g. 50x50 (that was just a random number, please think about it) can be discussed with Victoria
"some methods from the Magnostics paper": is that really specific enough for a student to understand?
for the simulation data, it is important to nail the noise so we get realistic "no-pattern" stimuli. Any thoughts on this?

vladdez · 2024-11-12T10:40:10Z

Training dataset

To create training data, we will use UnfoldSim.jl to simulate EEG data with different patterns, varying the number of trials, sampling rate, duration, signal-to-noise ratio, continuous / non-linear predictors, overlap distribution, and noise type.
ERP images can have different dimensions, so we need to either resize them to e.g. 50x50 px or find other (CNN-based?) methods that account for different image sizes.
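
As a rough illustration of the resizing option, the sketch below brings a trials x time matrix to a fixed grid by averaging rectangular blocks of pixels, falling back to pixel repetition when the input is smaller than the target. It is plain Python for illustration only (the plan targets Julia). The function name `resize_erpimage` and the block-averaging scheme are assumptions, and 50x50 is just the placeholder size from the discussion.

```python
def resize_erpimage(image, out_rows=50, out_cols=50):
    """Resize a 2D list (trials x time) to out_rows x out_cols by block averaging."""
    n_rows, n_cols = len(image), len(image[0])

    def bins(n_in, n_out):
        # Split n_in input indices into n_out consecutive bins.
        # Each bin holds at least one index, so upsampling repeats pixels.
        edges = []
        for k in range(n_out):
            start = k * n_in // n_out
            stop = max((k + 1) * n_in // n_out, start + 1)
            edges.append((start, stop))
        return edges

    row_bins = bins(n_rows, out_rows)
    col_bins = bins(n_cols, out_cols)

    resized = []
    for r0, r1 in row_bins:
        out_row = []
        for c0, c1 in col_bins:
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            out_row.append(sum(block) / len(block))
        resized.append(out_row)
    return resized


# Downsample a 4x4 toy image to 2x2, then upsample a 2x2 image to 4x4.
small = resize_erpimage(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], 2, 2
)
large = resize_erpimage([[1, 2], [3, 4]], 4, 4)
```

Averaging over neighbouring trials also smooths the image, so the target size interacts with how visible a pattern remains. That is one reason the final size should be chosen deliberately rather than fixed at 50x50.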

Test dataset

Test set based on real data. Find about 10 existing (already preprocessed) EEG datasets, choose 2-3 subjects, choose ~10 channels distributed around the head, choose 5-10 features of each dataset, and manually label them: pattern or no pattern. Save the data, the labels, and the ERP images.

CNN

We train a small CNN classifier on the simulated dataset using Flux.jl or Lux.jl (for maintainability).
Evaluate this CNN on simulated validation data.

Explicit features

Look at the Magnostics paper and identify features we could alternatively use to classify into pattern / no pattern (e.g. we already used entropy)
Train XGBoost or find other classifiers to use these features (use MLJ.jl)
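
To show what an explicit feature could look like, here is a minimal sketch of one entropy-style feature: the Shannon entropy of the amplitude histogram of an ERP image. The thread does not say which entropy measure was used before, so this particular definition, the function name `histogram_entropy`, and the default of 16 bins are all assumptions. It is plain Python for illustration only (the plan targets Julia and MLJ.jl).

```python
import math


def histogram_entropy(image, n_bins=16):
    """Shannon entropy (in bits) of the amplitude histogram of a 2D list."""
    values = [v for row in image for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant image has a single occupied bin, hence zero entropy.
        return 0.0
    counts = [0] * n_bins
    for v in values:
        # Map the value to a bin index; the maximum falls into the last bin.
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


flat_entropy = histogram_entropy([[0.5, 0.5], [0.5, 0.5]])
two_level_entropy = histogram_entropy([[0, 1], [0, 1]])
```

A histogram-based entropy ignores where the pixels sit in the image, so on its own it may not separate a sorted pattern from shuffled noise with the same amplitudes. It would more plausibly be one column in a feature table that is passed to the classifier alongside spatially aware features.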
