This repository contains the simulation study code and results for my paper, Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties (DOI link for publication or arXiv link for preprint).
- The paper-related material is under this directory; all the code and outputs (both intermediate and final) can be found here.
- All results were produced on Compute Canada. The job submission bash scripts contain commands that record the computing resource name and details in the `slurm` outputs; the `seff` outputs show the computing time.
- All studies using GPUs were run on Compute Canada Nvidia A100 GPU(s); the Compute Canada slurm outputs confirm this and show a CUDA compute capability of 8.0 (`cupy.cuda.Device.compute_capability` returns the string `'80'`, which stands for compute capability 8.0).
- All the `Python` scripts referenced in the `bash` job submission scripts were generated from the Jupyter (IPython) notebooks (i.e., `jupyter nbconvert *.ipynb --to python`, then moving the Python scripts to a separate sub-directory called `dist`).
- Compute Canada writes `slurm-[jobID].out` files containing the output from running the scripts; I also created `seff-[jobID].out` files with the command `seff [jobID] >> seff-[jobID].out` to record and report the wall-clock times of the computing-time comparison jobs.
- With identical simulation setups and the same ($\epsilon$-)convergence criteria, the `seff` files show that the computing-time simulations for the AG method finished within $20$ minutes on SCAD- or MCP-penalized logistic models, whereas the computing-time simulations for the coordinate descent method on the same models could not finish within the 7-day time limit imposed by the Compute Canada Narval cluster. Again, all of the above simulations were run on identical GPUs (Nvidia A100, CUDA compute capability 8.0).
- To ensure a fair comparison, we implemented coordinate descent in `Python`/`CuPy` and compared its computing time with AG; the implementation is based on the state-of-the-art pseudo-code for the coordinate descent method (Breheny & Huang, 2011).
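For intuition about the coordinate descent baseline, here is a minimal NumPy sketch of the cyclic coordinate descent update with MCP's firm-thresholding operator, following the pseudo-code of Breheny & Huang (2011). It is illustrative only: the repository's timing runs use a `CuPy` implementation on GPU (CuPy mirrors the NumPy API, so `np` could be swapped for `cupy`), and the function names and toy usage here are my own, not files from this repository.

```python
import numpy as np

def firm_threshold(z, lam, gamma):
    """MCP (firm) thresholding operator for a standardized covariate; gamma > 1."""
    if abs(z) <= gamma * lam:
        return np.sign(z) * max(abs(z) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return z  # outside the MCP region the coordinate is left unpenalized

def mcp_coordinate_descent(X, y, lam, gamma=3.0, n_sweeps=200):
    """Cyclic coordinate descent for an MCP-penalized linear model.

    Assumes the columns of X are standardized: mean 0 and (1/n) * x_j' x_j = 1,
    so the firm threshold is the exact coordinatewise minimizer.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta                              # current residual
    for _ in range(n_sweeps):
        for j in range(p):
            z = X[:, j] @ r / n + beta[j]         # partial-residual correlation
            b_new = firm_threshold(z, lam, gamma)
            if b_new != beta[j]:                  # rank-1 residual update
                r -= X[:, j] * (b_new - beta[j])
                beta[j] = b_new
    return beta
```

On toy data with a couple of strong signals, the fit is sparse and nearly unbiased on the active coordinates, which is the MCP behavior the paper's comparisons rely on.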
| Model | Penalty | Comparison | Optimization Method | Output Data | Jupyter Notebook/R code | Bash Script | slurm file | seff output |
|---|---|---|---|---|---|---|---|---|
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | coordinate descent (`ncvreg`); with strong rule | `R_results_SCAD_signal_recovery.npy` | `ncvreg_LM_sim.R` | `LM.sh` | `slurm-10933899.out` | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | coordinate descent (`ncvreg`); with strong rule | `R_results_MCP_signal_recovery.npy` | `ncvreg_LM_sim.R` | `LM.sh` | `slurm-10933899.out` | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | coordinate descent (`ncvreg`); with strong rule | `R_results_SCAD_signal_recovery.npy` | `ncvreg_logistic_sim.R` | `logistic.sh` | `slurm-10933900.out` | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | coordinate descent (`ncvreg`); with strong rule | `R_results_MCP_signal_recovery.npy` | `ncvreg_logistic_sim.R` | `logistic.sh` | `slurm-10933900.out` | |
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | `results_SCAD_signal_recovery.npy` | `task1.ipynb` | `task1.sh` | `slurm-10933901.out` | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | `results_MCP_signal_recovery.npy` | `task1.ipynb` | `task1.sh` | `slurm-10933901.out` | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | `results_SCAD_signal_recovery.npy` | `task2.ipynb` | `task2.sh` | `slurm-10933902.out` | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | `results_MCP_signal_recovery.npy` | `task2.ipynb` | `task2.sh` | `slurm-10933902.out` | |
| Penalized Linear Models (LM) | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | `SCAD_sim_results.npy` | `task1speed.ipynb` | `task1speed.sh` | `slurm-10933903.out` | `seff-10933903.out` |
| Penalized Linear Models (LM) | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in `Python`/`CuPy`) | `SCAD_sim_results.npy` | `task1speed.ipynb` | `task1speed.sh` | `slurm-10933903.out` | `seff-10933903.out` |
| Penalized Linear Models (LM) | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | `MCP_sim_results.npy` | `task1speed.ipynb` | `task1speed.sh` | `slurm-10933903.out` | `seff-10933903.out` |
| Penalized Linear Models (LM) | MCP | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in `Python`/`CuPy`) | `MCP_sim_results.npy` | `task1speed.ipynb` | `task1speed.sh` | `slurm-10933903.out` | `seff-10933903.out` |
| Penalized Logistic Models | SCAD | GPU Computing Time | coordinate descent (coded in `Python`/`CuPy`) | | `task2speed_SCAD_coord_time.ipynb` | `task2speed_SCAD_coord_time.sh` | `slurm-10933904.out` | `seff-10933904.out` |
| Penalized Logistic Models | MCP | GPU Computing Time | coordinate descent (coded in `Python`/`CuPy`) | | `task2speed_MCP_coord_time.ipynb` | `task2speed_MCP_coord_time.sh` | `slurm-10933905.out` | `seff-10933905.out` |
| Penalized Logistic Models | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters) | `SCAD_sim_results_AG_time.npy` | `task2speed_SCAD_AG_time.ipynb` | `task2speed_SCAD_AG_time.sh` | `slurm-10933906.out` | `seff-10933906.out` |
| Penalized Logistic Models | MCP | GPU Computing Time | AG (proposed optimization hyperparameters) | `MCP_sim_results_AG_time.npy` | `task2speed_MCP_AG_time.ipynb` | `task2speed_MCP_AG_time.sh` | `slurm-10933907.out` | `seff-10933907.out` |
| Penalized Logistic Models | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | `SCAD_sim_results.npy` | `task2speed_SCAD.ipynb` | `task2speed_SCAD.sh` | `slurm-10933908.out` | `seff-10933908.out` |
| Penalized Logistic Models | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | `MCP_sim_results.npy` | `task2speed_MCP.ipynb` | `task2speed_MCP.sh` | `slurm-10933909.out` | `seff-10933909.out` |
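The Output Data entries above are NumPy `.npy` files, so the saved arrays can be inspected directly with `np.load`. A minimal self-contained sketch, demonstrated on a stand-in array (substitute a result file name from the table, e.g. `results_SCAD_signal_recovery.npy`):

```python
import numpy as np

# Stand-in for one of the simulation output files listed in the table;
# replace the file name with e.g. results_SCAD_signal_recovery.npy.
demo = np.arange(6.0).reshape(2, 3)
np.save("demo_results.npy", demo)

loaded = np.load("demo_results.npy")
print(loaded.shape, loaded.dtype)  # → (2, 3) float64
```

The shape and meaning of each saved array depend on what the corresponding simulation script stored.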
- The summary of the simulation results is in this Jupyter notebook, which is the source of most of the simulation study results reported in the paper.
- The two original Jupyter notebooks are the simulation study for SCAD/MCP-penalized linear models and the simulation study for SCAD/MCP-penalized logistic models; all other notebooks and `Python` code were generated from and modified based on them. They are also the source of the plots in the main text.
- To run on the server, I divided the code into several chunks, which are in this folder for the Python simulations:
  - `task1` and `task2` contain files to test signal recovery performance for SCAD/MCP-penalized linear models and logistic models using AG;
  - `task1speed` and `task2speed` contain files to test ($\epsilon$-)convergence speed and computing times for SCAD/MCP-penalized linear models and logistic models using AG vs. proximal gradient vs. coordinate descent.
- The R code and results for the `ncvreg` simulations are contained in this directory: click here for penalized linear models or here for penalized logistic models.
- Some algebra calculations from the paper can be replicated using this SageMath notebook (SageMath); the MATLAB code to generate the plots for "Figure 1: Numerical plots for Corollary 1." is here.
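For intuition about the gradient-based methods compared in `task1speed`/`task2speed`, here is a generic Nesterov-style (FISTA-type) accelerated proximal gradient sketch for an MCP-penalized logistic model in NumPy. This is not the paper's AG method or its proposed hyperparameters (those are in the notebooks above); the momentum schedule, step size, penalty parameter `gamma=4`, and toy usage are standard textbook choices I picked only for illustration.

```python
import numpy as np

def mcp_prox(z, tau, lam, gamma):
    """Elementwise proximal operator of tau * MCP(lam, gamma); requires gamma > tau."""
    inner = np.sign(z) * np.maximum(np.abs(z) - tau * lam, 0.0) / (1.0 - tau / gamma)
    return np.where(np.abs(z) > gamma * lam, z, inner)  # identity outside the MCP region

def accelerated_prox_grad_logistic(X, y, lam, gamma=4.0, n_iter=1000):
    """FISTA-style accelerated proximal gradient for MCP-penalized logistic loss."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / (4.0 * n)  # smoothness constant of the logistic loss
    tau = 1.0 / L                              # step size; gamma must exceed tau
    beta, beta_prev, t = np.zeros(p), np.zeros(p), 1.0
    for _ in range(n_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        v = beta + ((t - 1.0) / t_next) * (beta - beta_prev)   # Nesterov extrapolation
        grad = X.T @ (1.0 / (1.0 + np.exp(-(X @ v))) - y) / n  # logistic gradient
        beta_prev, beta = beta, mcp_prox(v - tau * grad, tau, lam, gamma)
        t = t_next
    return beta
```

On standardized toy data with two strong signals, the iterates recover the signs of the active coefficients and zero out the rest; the paper's AG method differs in its extrapolation and step-size choices.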
- Breheny, P., & Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. *Annals of Applied Statistics*, 5(1), 232-253. https://doi.org/10.1214/10-AOAS388