Supplementary Material of "Constraint-Free Structure Learning with Smooth Acyclic Orientations" @ICLR 2024
The conda environment is specified in envs/environment.yml. To install the environment, run the following command:

conda env create -f envs/environment.yml
We use the notation n{samples}_d{nodes}_{ER|SF}{edge_factor}_{noise} to denote synthetic datasets. For example, the dataset n1000_d100_SF6_gauss has 1000 samples drawn from a scale-free graph with 100 nodes and edge factor 6, with Gaussian noise.
We admit the following noise terms: gauss, exp, and gumbel for the linear case, and mlp for the non-linear case.
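As a convenience, the naming scheme above can be parsed programmatically. The following helper is a hypothetical sketch (it is not part of the repository) that splits a dataset name into its components:

```python
import re

# Regex for the dataset naming scheme n{samples}_d{nodes}_{ER|SF}{edge_factor}_{noise}.
DATASET_RE = re.compile(
    r"n(?P<samples>\d+)_d(?P<nodes>\d+)"
    r"_(?P<graph>ER|SF)(?P<edge_factor>\d+)"
    r"_(?P<noise>gauss|exp|gumbel|mlp)"
)

def parse_dataset_name(name: str) -> dict:
    """Split a dataset name into its components."""
    match = DATASET_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid dataset name: {name}")
    info = match.groupdict()
    # Convert the numeric fields to integers.
    for key in ("samples", "nodes", "edge_factor"):
        info[key] = int(info[key])
    return info

print(parse_dataset_name("n1000_d100_SF6_gauss"))
```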
For each experiment, run the following command:

python -m benchmark grid {model} {dataset} config/grids/{grid_name} --n_grid_samples={NSAMPLES} --num_cpus={NCPUS} --n_repetitions=5

The command performs a grid search, whose results are stored in the grids folder, and validates the best configuration (according to the ROCAUC metric), whose results are stored in the validation folder. Both files are named {dataset}_{grid_name}.csv. We include the results of our runs in the grids and validation directories.
The available model names are: cosmo, dagma, nocurl, and notears.
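To sweep all linear models on one dataset, the per-model commands can be scripted. The loop below is a dry-run sketch (the dataset and budget values are illustrative, not prescribed by the paper): it prints each grid-search command, and dropping the leading echo would actually run them.

```shell
# Print (dry run) the grid-search command for each linear model
# on one example dataset; remove `echo` to execute the sweeps.
dataset=n1000_d100_ER4_gauss
for model in cosmo dagma nocurl notears; do
  echo python -m benchmark grid "$model" "$dataset" \
    config/grids/"$model".yaml \
    --n_grid_samples=100 --num_cpus=8 --n_repetitions=5
done
```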
The hyperparameter ranges for the grid search are defined in the config/grids/ directory:

cosmo.yaml: Linear COSMO
cosmo_nl.yaml: Non-Linear COSMO
dagma.yaml: Linear DAGMA
dagma_nl.yaml: Non-Linear DAGMA
nocurl.yaml: Linear NOCURL
nocurl_joint.yaml: Linear NOCURL-U, i.e., the unconstrained variant
notears.yaml: Linear NOTEARS
For instance, to run non-linear cosmo on n1000_d20_ER4_mlp, the command would be:

python -m benchmark grid cosmo n1000_d20_ER4_mlp config/grids/cosmo_nl.yaml --n_grid_samples=200 --num_cpus=50 --n_repetitions=5
The table.ipynb notebook can be used to generate the tables in the paper by selecting the number of nodes, the edge factor, and the noise type.
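The selection by nodes, edge factor, and noise can be illustrated with a minimal pandas sketch. This is an assumption-laden example, not the notebook's actual code: the column names ("dataset", "model", "rocauc") and the toy values below are hypothetical.

```python
import pandas as pd

# Toy stand-in for a validation results table (columns are hypothetical).
results = pd.DataFrame({
    "dataset": ["n1000_d20_ER4_gauss", "n1000_d100_SF6_gauss",
                "n1000_d100_SF6_mlp"],
    "model": ["cosmo", "cosmo", "dagma"],
    "rocauc": [0.91, 0.88, 0.85],
})

def select(df, nodes, edge_factor, noise):
    """Keep rows whose dataset name matches the requested setting."""
    pattern = f"_d{nodes}_[A-Z]{{2}}{edge_factor}_{noise}$"
    return df[df["dataset"].str.contains(pattern, regex=True)]

print(select(results, nodes=100, edge_factor=6, noise="gauss"))
```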
The step time simulation experiment can be replicated for each model by running the following command:

python steptime.py cosmo

The model names are, as before, cosmo, dagma, nocurl, and notears.
Each run produces a {modelname}.csv file in the steptime directory. Then, the steptime.ipynb notebook can be used to concatenate and plot the results across all models.
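The concatenation step performed by steptime.ipynb can be sketched as follows. This is a hypothetical reimplementation, assuming only that the steptime directory contains one CSV per model; the columns inside each CSV are left untouched.

```python
from pathlib import Path

import pandas as pd

def concat_steptimes(directory="steptime"):
    """Concatenate per-model step-time CSVs, tagging rows with the model name."""
    frames = []
    for path in sorted(Path(directory).glob("*.csv")):
        df = pd.read_csv(path)
        df["model"] = path.stem  # e.g. "cosmo" from cosmo.csv
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```

The combined frame can then be grouped by model for plotting.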