Official implementation of Learning Latent Structural Causal Models, a method for approximate Bayesian inference over the causal variables, structure, and parameters of latent SCMs from low-level data.
- Create and activate the conda environment:
conda create --name biols_env python=3.10
conda activate biols_env
- Load CUDA 12 if on the Mila cluster:
module load cuda/12.0/cudnn/9.1
- Install all packages and compile the Cython module by running:
bash install.sh
cd modules && cython -3 mine.pyx
cd c_modules && g++ -I <path_to_conda>/.conda/envs/biols_env/include/python3.10 -shared -pthread -fPIC -fwrapv -O3 -Wall -fno-strict-aliasing -o mine.so mine.c
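The <path_to_conda> placeholder in the g++ command depends on where conda is installed. As a rough sketch, the include directory can instead be looked up from the active interpreter with the standard-library sysconfig module. The helper names python_include_dir and build_compile_command are hypothetical and not part of this repository; the flags are copied from the command above.

```python
import shlex
import sysconfig


def python_include_dir() -> str:
    """Return the directory holding Python.h for the running interpreter."""
    return sysconfig.get_paths()["include"]


def build_compile_command(source: str = "mine.c", output: str = "mine.so") -> list[str]:
    """Assemble the g++ call from the install step, with the include path filled in.

    Hypothetical helper: only the flags come from the README.
    """
    return [
        "g++", "-I", python_include_dir(),
        "-shared", "-pthread", "-fPIC", "-fwrapv", "-O3", "-Wall",
        "-fno-strict-aliasing", "-o", output, source,
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into the shell.
    print(shlex.join(build_compile_command()))
```

Run this with biols_env activated so that the reported path belongs to that environment's Python 3.10.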
Data generation currently supports single- and multi-node interventions, with different kinds of projection (linear, nonlinear, SO(n)) from latent space to observation space. Image data can also be generated.
Examples:
- Linear projection:
python create_datasets.py --config create_dataset d5 linear_proj ws_datagen_interv_noise_fix_noise gaussian_intervs er1 multi_interv --data_seed 0
- Nonlinear projection:
python create_datasets.py --config create_dataset d5 nonlinear_proj ws_datagen_interv_noise_fix_noise gaussian_intervs er1 multi_interv --data_seed 0
- SO(n) projection:
python create_datasets.py --config create_dataset son_d5 ws_datagen_no_interv_noise_fix_noise gaussian_intervs er1 multi_interv --data_seed 0
- Image generation with the chemistry environment:
python create_datasets.py --config create_dataset d5 chemdata_proj ws_datagen_interv_noise_fix_noise gaussian_intervs er1 single_interv --n_pairs 2000 --n_interv_sets 20 --data_seed 0
single_interv can be used in place of multi_interv to generate data with single-node interventions.
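The vector-data examples above differ only in the projection configs and the intervention type, so they can be generated programmatically. The following is a minimal sketch, assuming the config names exactly as they appear in the examples; the dictionary PROJECTION_CONFIGS and the function datagen_command are hypothetical helpers, not part of this repository.

```python
import shlex

# Config names copied from the examples; grouping them by projection is illustrative.
PROJECTION_CONFIGS = {
    "linear": ["d5", "linear_proj", "ws_datagen_interv_noise_fix_noise"],
    "nonlinear": ["d5", "nonlinear_proj", "ws_datagen_interv_noise_fix_noise"],
    "son": ["son_d5", "ws_datagen_no_interv_noise_fix_noise"],
}


def datagen_command(projection: str, intervention: str = "multi_interv",
                    data_seed: int = 0) -> list[str]:
    """Build the create_datasets.py argument list for one projection type."""
    if intervention not in ("single_interv", "multi_interv"):
        raise ValueError(f"unknown intervention config: {intervention}")
    configs = ["create_dataset", *PROJECTION_CONFIGS[projection],
               "gaussian_intervs", "er1", intervention]
    return ["python", "create_datasets.py", "--config", *configs,
            "--data_seed", str(data_seed)]


if __name__ == "__main__":
    # Print one command per projection with single-node interventions.
    for projection in PROJECTION_CONFIGS:
        print(shlex.join(datagen_command(projection, "single_interv")))
```

The printed lines can be pasted into a shell or passed to subprocess.run; the image example is left out because it takes extra flags (--n_pairs, --n_interv_sets).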
- Running BIOLS on linearly projected data:
python biols_vector_data.py --config defaults biols_learn_L --biols_data_folder <biols_data_folder> --exp_name BIOLS_learnL
where biols_data_folder is the name (code) of the data folder obtained from the data generation step. For example, it might look like
er1-ws_datagen_fix_noise_interv_noise_nonlineargauss_SCM_2-linearproj-d005-D0100-multi-n_pairs2000-sets20-gaussianinterv
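The example folder name is a hyphen-separated string, so it can be split into fields to check which dataset a run points at. This is only a sketch: the field labels below (graph, latent_dim, obs_dim, and so on) are guesses from the example name, not names defined by the repository, and the split assumes no field contains a hyphen.

```python
# Hypothetical labels for the nine hyphen-separated fields of the example name.
FIELD_LABELS = (
    "graph", "datagen", "projection", "latent_dim", "obs_dim",
    "intervention", "n_pairs", "n_sets", "interv_type",
)


def split_data_folder(name: str) -> dict[str, str]:
    """Split a data folder name into labelled raw string fields."""
    parts = name.split("-")
    if len(parts) != len(FIELD_LABELS):
        raise ValueError(
            f"expected {len(FIELD_LABELS)} fields, got {len(parts)}: {name!r}"
        )
    return dict(zip(FIELD_LABELS, parts))


if __name__ == "__main__":
    example = ("er1-ws_datagen_fix_noise_interv_noise_nonlineargauss_SCM_2-"
               "linearproj-d005-D0100-multi-n_pairs2000-sets20-gaussianinterv")
    for label, value in split_data_folder(example).items():
        print(f"{label}: {value}")
```

Values are kept as raw strings (for example d005) because how the repository encodes them is not stated here.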
- Running BIOLS on nonlinearly projected data: instead of running biols_vector_data.py, run nonlinear_biols_vector.py
- To run on image data (images from the chemistry dataset proposed in Ke et al.), run biols_image_data.py instead, with the same arguments.
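Since the three entry points differ only in the script name, the choice can be sketched as a small lookup. The mapping and the function biols_command are hypothetical helpers. The nonlinear script name is read here as nonlinear_biols_vector.py, and passing it the same arguments as the linear script is an assumption; the README states this only for the image script.

```python
import shlex

# Script per data kind; the "nonlinear" entry reflects an assumed script name.
SCRIPT_FOR_DATA = {
    "linear": "biols_vector_data.py",
    "nonlinear": "nonlinear_biols_vector.py",
    "image": "biols_image_data.py",
}


def biols_command(data_kind: str, biols_data_folder: str,
                  exp_name: str = "BIOLS_learnL") -> list[str]:
    """Build the run command for one kind of data (hypothetical helper)."""
    script = SCRIPT_FOR_DATA[data_kind]  # KeyError for an unknown data kind
    return [
        "python", script, "--config", "defaults", "biols_learn_L",
        "--biols_data_folder", biols_data_folder, "--exp_name", exp_name,
    ]


if __name__ == "__main__":
    # Replace the placeholder with a folder name from the data generation step.
    print(shlex.join(biols_command("image", "<biols_data_folder>")))
```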