
How to select predictive models for causal inference

Overview

This package contains simulations for causal inference and estimators for the ATE and the CATE, as well as the code for the experiments described in the paper: How to select predictive models for causal inference?

Package Features

The package code is contained in caussim:

  • estimation contains CATE and ATE estimators usable with any scikit-learn compatible base estimator, and meta-learners such as TLearner, SLearner, or RLearner (a minimal sketch is given after this list).

  • simulations contains simulations with basis expansions (Nystroem and splines are available).

  • experiences contains the code used to run the extensive evaluation of causal metrics on ACIC 2016 and on handcrafted simulations.

  • reports contains the scripts used to derive the figures and tables presented in the paper. The main results are obtained by launching the reports/causal_scores_evaluation.py script (see Reports below).

  • utils.py contains plotting utilities.

  • pdistances contains naive implementations of the MMD, Total Variation, and Jensen-Shannon divergences used to measure population overlap (see the sketch after this list).

  • demos contains notebooks used to create the toy examples and risk maps for the 2D simulations.

  • data contains utilities to load the semi-simulated datasets (ACIC 2016, ACIC 2018, TWINS). A dedicated README is available in the root data folder.
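
For illustration, here is a minimal T-learner sketch written with plain scikit-learn rather than the caussim estimators (whose exact API may differ): one outcome model is fitted per treatment arm, and the CATE is estimated as the difference of their predictions.

import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
a = rng.binomial(1, 0.5, size=1000)                       # treatment assignment
y = X[:, 0] + a * (1 + X[:, 1]) + rng.normal(size=1000)   # true CATE is 1 + X[:, 1]

base = GradientBoostingRegressor()
mu_1 = clone(base).fit(X[a == 1], y[a == 1])              # outcome model on the treated
mu_0 = clone(base).fit(X[a == 0], y[a == 0])              # outcome model on the controls

cate_hat = mu_1.predict(X) - mu_0.predict(X)              # CATE estimates
ate_hat = cate_hat.mean()                                 # plug-in ATE estimate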
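
Similarly, naive overlap measures of the kind implemented in pdistances can be sketched in a few lines (this is a hypothetical re-implementation, not the actual pdistances API): Total Variation and Jensen-Shannon divergence between binned propensity scores of the two populations.

import numpy as np
from scipy.spatial.distance import jensenshannon

def binned_densities(e_treated, e_control, bins=50):
    # histogram the propensity scores of both populations on a common grid
    grid = np.linspace(0, 1, bins + 1)
    p, _ = np.histogram(e_treated, bins=grid)
    q, _ = np.histogram(e_control, bins=grid)
    return p / p.sum(), q / q.sum()

def total_variation(p, q):
    return 0.5 * np.abs(p - q).sum()

rng = np.random.default_rng(0)
e = rng.beta(2, 5, size=2000)                 # toy propensity scores
a = rng.binomial(1, e)                        # toy treatment assignment
p, q = binned_densities(e[a == 1], e[a == 0])
print(total_variation(p, q))                  # in [0, 1]; 0 means perfect overlap
print(jensenshannon(p, q) ** 2)               # JS divergence (squared JS distance)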

Experiences

Experience outputs are mainly CSV files (one per sampled dataset). To launch an experience, run python scripts/experiences/<experience.py>; it should write its CSV files to a dedicated subfolder, data/experiences/<dataset>/<experience_name>.
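
As a hypothetical example (the folder name below is an assumption following the layout above), the per-dataset CSV files of one experience can be gathered into a single dataframe for inspection:

from pathlib import Path
import pandas as pd

# assumed output folder, following data/experiences/<dataset>/<experience_name>
results_dir = Path("data/experiences/acic_2016/causal_scores_evaluation")
frames = [pd.read_csv(f) for f in sorted(results_dir.glob("*.csv"))]
results = pd.concat(frames, ignore_index=True)  # one block of rows per sampled dataset
print(results.shape)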

🔎 To replicate the main experience of the paper (Section 5), launch the script scripts/experiences/causal_scores_evaluation.py. Make sure that the dataset configuration at the beginning of the file is:

from caussim.experiences.base_config import DATASET_GRID_FULL_EXPES
DATASET_GRID = DATASET_GRID_FULL_EXPES

📢 Note that the results of Section 5 are already provided in the Zenodo archive, experiences.zip.

Reports

Report outputs are mainly the figures for the paper. To obtain the results, run pytest scripts/reports/<report.py>; it should write the figures to one or several corresponding folders under figures/.

The main report type is a pytest function contained in the reports/causal_scores_evaluation.py script. For each macro-dataset, it plots the results of running a given set of candidate estimators with a fixed nuisance estimator on several generation processes of the macro-dataset (often hundreds of sampled datasets):

pytest scripts/reports/causal_scores_evaluation.py

🔎 To replicate the main figure of the paper (Figure 3), launch the script scripts/reports/_1_r_risk_domination.py. It can take some time because of the large number of simulation results. Make sure that the appropriate experience results exist; the ones used in the paper are provided in experiences.zip.

Installation

  • We recommend using poetry and python>=3.9 to manage dependencies.

You can install caussim via poetry:

poetry install

or via pip, in which case you also need to install the dependencies listed in pyproject.toml:

pip install caussim

Dependencies:

python = ">=3.9, <3.11"
python-dotenv = "^0.15.0"
click = "^8.0.1"
yapf = "^0.31.0"
matplotlib = "^3.4.2"
numpy = "^1.20.3"
seaborn = "^0.11.1"
jupytext = "^1.11.5"
rope = "^0.19.0"
scikit-learn = "^1.0"
jedi = "^0.18.0"
tqdm = "^4.62.3"
tabulate = "^0.8.9"
statsmodels = "^0.13.1"
pyarrow = "^6.0.1"
submitit = "^1.4.1"
rpy2 = "^3.4.5"
moepy = "^1.1.4"
