Working packages #50

Open
vladdez opened this issue Nov 5, 2024 · 2 comments

Comments

@vladdez
Collaborator

vladdez commented Nov 5, 2024

After much discussion, we came up with the following work plan:

  1. Create two ground truth datasets that could be used to evaluate any possible pattern detection method in ERP images.
    1.1. The first should be based on real data. Find about 10 existing EEG datasets and manually label them as "pattern" or "no pattern". Ideally there should be 100 instances of each pattern and (100 * number of patterns) noise-only instances. Note that the dataset will be stored as numbers, but to label the patterns you will need to plot each entry as an ERP image.
    1.2. The second should be simulated using existing simulation functions. For the noise instances, you can either use real data or simulate the noise yourself.
    1.3. Each GT dataset should not be too large, no more than 100 MiB.
  2. Evaluate several detection methods by comparing their results with the GT dataset.
    2.1. Train a CNN classifier on the simulated dataset. Use ERP images of size 50x50 pixels and train a small model in Julia.
    2.2. Try some methods from the Magnostics paper.
    2.3. Assess the entropy-based methods we have used before.
@behinger
Member

behinger commented Nov 5, 2024

Thanks for the write-up. Now I have more thoughts ;)

  • I'd change the order: there is no need for a GT dataset if we can't get it to work on the simulated data
  • I would skip the size requirement; it is an afterthought, especially as we are not going to include all (e.g. 128) channels
  • Whether or not we downsample to e.g. 50x50 (that was just an arbitrary number, please think about it) could be discussed with Victoria
  • "Some methods from the Magnostics paper": is that really specified enough for a student to understand?
  • for the simulation data, it is important to nail the noise so we get realistic "no-pattern" stimuli. Any thoughts on this?

@vladdez
Collaborator Author

vladdez commented Nov 12, 2024

Training dataset

  • In order to create training data, we will use UnfoldSim.jl to simulate EEG data with different patterns and with varying numbers of trials, sampling rates, durations, signal-to-noise ratios, continuous / non-linear predictors, overlap distributions, and noise types
  • ERP images can have different dimensions, so we need to either resize the ERP images to e.g. 50x50 px or find other (CNN-based?) methods to account for different image sizes
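
One possible way to bring ERP images of different dimensions to a common size is block averaging. The sketch below is only an illustration in plain Python (the actual pipeline would be written in Julia); the function name `resize_erpimage`, the trials x samples layout, and the 50x50 target are assumptions, and the sketch only downsamples.

```python
def resize_erpimage(image, out_rows=50, out_cols=50):
    """Downsample a 2-D list (trials x time samples) by averaging bins.

    Hypothetical helper for illustration; not part of any package.
    """
    n_rows, n_cols = len(image), len(image[0])
    if n_rows < out_rows or n_cols < out_cols:
        raise ValueError("this sketch only downsamples")

    def edges(n_in, n_out):
        # Integer bin edges that cover 0..n_in as evenly as possible.
        return [i * n_in // n_out for i in range(n_out + 1)]

    r_edges = edges(n_rows, out_rows)
    c_edges = edges(n_cols, out_cols)
    resized = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Collect all input pixels that fall into this output pixel.
            block = [image[i][j]
                     for i in range(r_edges[r], r_edges[r + 1])
                     for j in range(c_edges[c], c_edges[c + 1])]
            row.append(sum(block) / len(block))
        resized.append(row)
    return resized


# Example: 120 trials x 300 samples, each row is a linear ramp over time.
image = [[float(t) for t in range(300)] for _ in range(120)]
small = resize_erpimage(image)
```

Averaging over trials within a bin also smooths the image, which is similar in spirit to the trial smoothing commonly applied to ERP images, but it changes the effective noise level, so the target size should be chosen with that in mind.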

Test dataset

The test set is based on real data. Find about 10 existing (already preprocessed) EEG datasets, choose 2-3 subjects, choose ~10 channels distributed around the head, choose 5-10 features of each dataset, and manually label the resulting ERP images: pattern or no pattern. Save the data, the labels, and the ERP images.
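
To get a feeling for the manual labelling effort, the numbers above can be multiplied out. This is a rough estimate that assumes the subjects, channels, and features are chosen per dataset and fully crossed, so that every combination yields one ERP image; the function name `n_images` is made up for this sketch.

```python
def n_images(datasets, subjects, channels, features):
    # One ERP image per dataset x subject x channel x feature combination.
    return datasets * subjects * channels * features


low = n_images(datasets=10, subjects=2, channels=10, features=5)
high = n_images(datasets=10, subjects=3, channels=10, features=10)
print(low, high)  # 1000 3000
```

Under these assumptions, the test set would contain somewhere between 1000 and 3000 images to label by hand.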

CNN

  • We train a small CNN classifier model on the simulated dataset via Flux.jl or Lux.jl (for maintenance)
  • Evaluate this CNN on simulated validation data

Explicit features

  • Look at the Magnostics paper and identify features we could alternatively use to classify into pattern / no pattern (e.g. we already used entropy)
  • Train XGBoost or find other classifiers to use these features (use MLJ.jl)
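
As an illustration of what an explicit feature could look like, the sketch below computes the Shannon entropy of the amplitude histogram of an ERP image. This is only one possible definition and not necessarily the entropy measure used before in this project; the function name `histogram_entropy` and the choice of 16 bins are assumptions.

```python
import math


def histogram_entropy(image, n_bins=16):
    """Shannon entropy (in bits) of the amplitude histogram of a 2-D list."""
    values = [v for row in image for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant image occupies a single bin and carries no information.
        return 0.0
    counts = [0] * n_bins
    for v in values:
        # Map the value to a bin; the maximum goes into the last bin.
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


flat = [[1.0, 1.0], [1.0, 1.0]]
two_level = [[0.0, 1.0], [0.0, 1.0]]
print(histogram_entropy(flat))                 # 0.0
print(histogram_entropy(two_level, n_bins=2))  # 1.0
```

A scalar like this could be one column of the feature table passed to a classifier. Note that a histogram-based entropy ignores where in the image the values occur, so on its own it may not separate a structured pattern from noise with the same amplitude distribution.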
