IS-ASGD

This is an implementation of importance sampling for ASGD, namely IS-ASGD. We recommend the following datasets: the first two are sufficiently large and sparse, while the last two are small-scale and relatively dense. IS-ASGD performs differently on them.

Data Preparation

Copy these datasets to the 'data' folder and unzip them.

Preparation

The cal_xnorm.py in the script folder calculates the Lipschitz constant of each data sample.
The cal_random.py in the script folder generates a random data segmentation.
The program reads in the norm file and generates the sampling distribution at the beginning of each epoch.
In fact, the sampling sequence can be generated only once and then randomly shuffled for each epoch.
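
The generate-once, shuffle-per-epoch idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this repository: it assumes the per-sample Lipschitz constant is proportional to the L2 norm of the feature vector, and the function names (`row_norms`, `sampling_distribution`, `draw_sequence`, `epoch_orders`) and the toy data are hypothetical.

```python
import math
import random


def row_norms(rows):
    """L2 norm of each sparse row, given as {feature_index: value}."""
    return [math.sqrt(sum(v * v for v in row.values())) for row in rows]


def sampling_distribution(norms):
    """Probabilities proportional to the per-sample norms (assumption)."""
    total = sum(norms)
    if total == 0:
        # Degenerate case: fall back to uniform sampling.
        return [1.0 / len(norms)] * len(norms)
    return [x / total for x in norms]


def draw_sequence(probs, length, rng):
    """Draw sample indices once, with replacement, according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=length)


def epoch_orders(sequence, epochs, rng):
    """Yield a reshuffled copy of the same sequence for each epoch."""
    for _ in range(epochs):
        order = list(sequence)
        rng.shuffle(order)
        yield order


# Toy data: three sparse samples (hypothetical values).
rows = [{0: 3.0, 2: 4.0}, {1: 1.0}, {0: 0.0}]
norms = row_norms(rows)
probs = sampling_distribution(norms)

rng = random.Random(0)
sequence = draw_sequence(probs, 12, rng)       # generated only once
orders = list(epoch_orders(sequence, 3, rng))  # reshuffled per epoch
```

Shuffling keeps the multiset of sampled indices fixed across epochs, so the cost of drawing from the distribution is paid once rather than at the start of every epoch.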

Run Command

Use the run scripts in the 'script' folder.

Example

To run kddb with lr=0.5 and threads=44

$2 -> threads_num

$3 -> lrate

$4 -> lr_decay

$5 -> epoch_count

$6 -> random_dis (0 = importance balanced)

$7 -> use_IS (whether to use importance sampling)

../bin/svm --splits $2 --stepinitial $3 --step_decay $4 --mu 0.00001 --epochs $5 --dis 1 --random_dis $6 --svrg 0 --CrossEntropy 1 --lip 1 --use_IS $7 --binary 0 ../data/kddb x

Visualize the result

Use the print scripts in the 'script' folder. On these large-scale sparse datasets, IS-ASGD shows a better absolute convergence curve than ASGD and SVRG-ASGD, owing to their sparsity.

Example

To print the results of kddb run with lr=0.5 and threads=44

                `dataset` `lr` `threads`

./print_iterative.py kddb 0.5 44

./print_absolute.py kddb 0.5 44

Testbed

The Intel Xeon series is preferred since it has many cores. Our testbed is a two-socket server with Xeon-2699 V4 CPUs.

Other Configuration

We recommend turning off hyper-threading, which is usually controlled in your BIOS.
Using Intel's C++ compiler, a.k.a. icpc, instead of g++ is highly recommended. With icpc, scalability increases by ~15%.
Check the OMP_THREADS_NUM in your environment.

Thanks

Jason.y.ye, Intel Asia Pacific R&D Ltd., Shanghai. Advanced Networking Lab, Shanghai Jiao Tong University, Shanghai.
