Learning Data Manipulation

This repo contains preliminary code of the following paper:

Learning Data Manipulation for Augmentation and Weighting
Zhiting Hu*, Bowen Tan*, Ruslan Salakhutdinov, Tom Mitchell, Eric P. Xing
NeurIPS 2019 (equal contribution)

Requirements

python3.6
pytorch==1.0.1
pytorch_pretrained_bert==0.6.1
torchvision==0.2.2

Code

baseline_main.py: Vanilla BERT Classifier.
ren_main.py: Described in (Ren et al.).
weighting_main.py: Our weighting algorithm.
augmentation_main.py: Our augmentation algorithm.

Running

Running scripts for experiments are available in scripts/.

Results

All the detailed training logs are availble in results/.

(Note: The result numbers may be slightly different from those in the paper due to slightly different implementation details and random seeds, while the improvements over comparison methods are consistent.)

low data

SST-5

Base Model: BERT	Ren et al.	Weighting	Augmentation
33.32 ± 4.04	36.09 ± 2.26	36.51 ± 2.54	37.55 ± 2.63

CIFAR-10

	Pretrained	Not Pretrained
Base Model: ResNet	34.58 ± 4.13	24.68 ± 3.29
Ren et al.	23.29 ± 5.95	22.26 ± 2.80
Weighting	36.75 ± 3.09	26.47 ± 1.69

imbalanced data

SST-2

	20 : 1000	50 : 1000	100 : 1000
Base Model: BERT	54.91 ± 5.98	67.73 ± 9.20	75.04 ± 4.51
Ren et al.	74.61 ± 3.54	76.89 ± 5.07	80.73 ± 2.19
Weighting	75.08 ± 4.98	79.35 ± 2.59	81.82 ± 1.88

CIFAR-10

	20 : 1000	50 : 1000	100 : 1000
Base Model: ResNet	70.65 ± 4.98	79.52 ± 4.81	86.12 ± 3.37
Ren et al.	76.68 ± 5.35	77.34 ± 7.38	78.57 ± 5.61
Weighting	79.07 ± 5.02	82.65 ± 5.13	87.63 ± 3.72

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Learning Data Manipulation

Requirements

Code

Running

Results

low data

SST-5

CIFAR-10

imbalanced data

SST-2

CIFAR-10

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
augmentation		augmentation
baseline		baseline
data_utils		data_utils
results		results
scripts		scripts
weighting		weighting
README.md		README.md
augmentation_main.py		augmentation_main.py
baseline_main.py		baseline_main.py
magic_module.py		magic_module.py
ren_main.py		ren_main.py
weighting_main.py		weighting_main.py

tanyuqian/learning-data-manipulation

Folders and files

Latest commit

History

Repository files navigation

Learning Data Manipulation

Requirements

Code

Running

Results

low data

SST-5

CIFAR-10

imbalanced data

SST-2

CIFAR-10

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages