# Pre-Processing the Original Dataset

## 1. Download the data

|       | train    | validation | test     |
|-------|----------|------------|----------|
| 1 Mpx | download | download   | download |
| crc32 | d677488a | 72f13c3e   | 643e61ef |
| Gen1  | download | download   | download |
| crc32 | 3d23bd30 | cc802022   | cdd4fd69 |
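After downloading, the archives can be checked against the crc32 values in the table. A minimal sketch (the filename `gen1_train.tar` is a placeholder; substitute the actual archive name):

```python
import zlib

def file_crc32(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the crc32 of a file in streaming fashion, returned as 8 hex digits."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"

# Compare against the table above, e.g. (placeholder filename):
# assert file_crc32("gen1_train.tar") == "3d23bd30"
```

Streaming in chunks keeps memory usage flat even for multi-GB archives.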

## 2. Extract the tar files

The following directory structure is assumed:

```
data_dir
├── test
│   ├── ..._bbox.npy
│   ├── ..._td.dat.h5
│   ...
│
├── train
│   ├── ..._bbox.npy
│   ├── ..._td.dat.h5
│   ...
│
└── val
    ├── ..._bbox.npy
    ├── ..._td.dat.h5
    ...
```
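Before running the pre-processing script, it can help to sanity-check that all three splits are in place. A small sketch (`check_layout` is a hypothetical helper, not part of the repository):

```python
from pathlib import Path

def check_layout(data_dir: str) -> dict:
    """Count label (.npy) and event (.h5) files in each expected split directory."""
    counts = {}
    for split in ("train", "val", "test"):
        split_dir = Path(data_dir) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        counts[split] = {
            "labels": len(list(split_dir.glob("*.npy"))),
            "events": len(list(split_dir.glob("*.h5"))),
        }
    return counts
```

Each sequence should contribute one `_bbox.npy` label file and one `_td.dat.h5` event file, so the two counts per split should match.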

## 3. Run the pre-processing script

`${DATA_DIR}` should point to the root of the directory structure above. `${DEST_DIR}` should point to the directory to which the pre-processed data will be written.

For the 1 Mpx dataset:

```bash
NUM_PROCESSES=20  # set to the number of parallel processes to use
python preprocess_dataset.py ${DATA_DIR} ${DEST_DIR} conf_preprocess/representation/stacked_hist.yaml \
conf_preprocess/extraction/const_duration.yaml conf_preprocess/filter_gen4.yaml -ds gen4 -np ${NUM_PROCESSES}
```

For the Gen1 dataset:

```bash
NUM_PROCESSES=20  # set to the number of parallel processes to use
python preprocess_dataset.py ${DATA_DIR} ${DEST_DIR} conf_preprocess/representation/stacked_hist.yaml \
conf_preprocess/extraction/const_duration.yaml conf_preprocess/filter_gen1.yaml -ds gen1 -np ${NUM_PROCESSES}
```
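To spot-check the source labels before or after pre-processing, the `_bbox.npy` files can be loaded directly with NumPy. A sketch; the exact field names depend on the dataset's bounding-box format, so treat the ones printed by `labels.dtype.names` as authoritative rather than any list you expect:

```python
import numpy as np

def load_labels(path: str) -> np.ndarray:
    """Load a structured bounding-box label array and report its schema."""
    labels = np.load(path)
    print(f"{len(labels)} boxes, fields: {labels.dtype.names}")
    return labels
```

Structured label arrays like these load without `allow_pickle`, since they contain only plain numeric fields.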