A Python package implementing a variety of statistical/machine learning methods that rely on kernels (e.g. HSIC for independence testing).
- Independence testing with HSIC (Hilbert-Schmidt Independence Criterion), as introduced in *A Kernel Statistical Test of Independence* by A. Gretton, K. Fukumizu, C. Hui Teo, L. Song, B. Schölkopf, and A. Smola (NIPS 2007).
- Measurement of conditional independence with HSCIC (Hilbert-Schmidt Conditional Independence Criterion), as introduced in *A Measure-Theoretic Approach to Kernel Conditional Mean Embeddings* by J. Park and K. Muandet (NeurIPS 2020).
- Conditional independence testing with KCIT (Kernel-based Conditional Independence Test), as introduced in *Kernel-based Conditional Independence Test and Application in Causal Discovery* by K. Zhang, J. Peters, D. Janzing, and B. Schölkopf (UAI 2011).
- Two-sample testing (also known as homogeneity testing) with MMD (Maximum Mean Discrepancy), as presented in *A Fast, Consistent Kernel Two-Sample Test* by A. Gretton, K. Fukumizu, Z. Harchaoui, and B. K. Sriperumbudur (NIPS 2009) and in *A Kernel Two-Sample Test* by A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola (JMLR, volume 13, 2012).
Resource | Description |
---|---|
HSIC | For independence testing |
HSCIC | For the measurement of conditional independence |
KCIT | For conditional independence testing |
MMD | For two-sample testing |
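
To give a feel for the kind of quantity these resources compute, here is a minimal NumPy sketch of the biased (V-statistic) estimate of the squared MMD with a Gaussian kernel. It is an illustration only: the names `gaussian_gram` and `mmd2_biased` and the `bandwidth` parameter are hypothetical and are not taken from this package's API.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2)).

    Both inputs are 2-D arrays of shape (n_samples, n_features).
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y (hypothetical helper)."""
    k_xx = gaussian_gram(x, x, bandwidth)
    k_yy = gaussian_gram(y, y, bandwidth)
    k_xy = gaussian_gram(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


rng = np.random.default_rng(0)
same_a = rng.normal(size=(100, 2))
same_b = rng.normal(size=(100, 2))
shifted = rng.normal(loc=3.0, size=(100, 2))

# Samples from the same distribution should give a much smaller value
# than samples from clearly different distributions.
print("same distribution:     ", mmd2_biased(same_a, same_b))
print("different distribution:", mmd2_biased(same_a, shifted))
```

The statistic alone is not a test: a two-sample test additionally needs a null distribution or threshold, which is what the implementation schemes in this package provide.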
The following table details the implementation schemes for the different resources available in the package.
Resource | Implementation Scheme | NumPy-based implementation | PyTorch-based implementation |
---|---|---|---|
HSIC | Resampling (permuting the x_i's while leaving the y_i's unchanged) | Yes | No |
HSIC | Gamma approximation | Yes | No |
HSCIC | N/A | Yes | Yes |
KCIT | Gamma approximation | Yes | No |
KCIT | Monte Carlo simulation (weighted sum of χ² random variables) | Yes | No |
MMD | Gram matrix spectrum | Yes | No |
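
The resampling scheme listed for HSIC can be sketched in plain NumPy as follows. This is a hedged illustration of the general idea (permute the x sample, keep the y sample fixed, recompute the statistic), not the package's actual code: `hsic_permutation_test`, its parameters, the Gaussian kernel, and the fixed bandwidth are all assumptions made for the example.

```python
import numpy as np


def gaussian_gram(z, bandwidth=1.0):
    """Gaussian Gram matrix of a 2-D sample of shape (n_samples, n_features)."""
    sq = np.sum(z**2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * z @ z.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def hsic_permutation_test(x, y, n_permutations=200, bandwidth=1.0, seed=0):
    """Permutation test of independence based on the biased HSIC estimate.

    Returns the observed statistic trace(K H L H) / n^2 and a p-value
    estimated by permuting the x sample while the y sample stays fixed.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    k = gaussian_gram(x, bandwidth)
    l = gaussian_gram(y, bandwidth)

    # Centre L once: trace(K H L H) equals the elementwise sum of K * (H L H).
    h = np.eye(n) - np.ones((n, n)) / n
    l_centred = h @ l @ h

    observed = np.sum(k * l_centred) / n**2

    null_stats = np.empty(n_permutations)
    for b in range(n_permutations):
        perm = rng.permutation(n)
        # Permuting the x_i's amounts to permuting rows and columns of K.
        k_perm = k[np.ix_(perm, perm)]
        null_stats[b] = np.sum(k_perm * l_centred) / n**2

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_permutations)
    return observed, p_value


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y_dependent = x + 0.1 * rng.normal(size=(100, 1))
y_independent = rng.normal(size=(100, 1))

print("dependent pair:  ", hsic_permutation_test(x, y_dependent))
print("independent pair:", hsic_permutation_test(x, y_independent))
```

The permutation approach makes no distributional approximation but costs one statistic evaluation per permutation; the Gamma approximation listed in the table trades that cost for a parametric fit of the null distribution.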
- Joint independence testing with dHSIC.
- Goodness-of-fit testing.
- Methods for time series models.
- Bayesian statistical kernel methods.
- Regression by independence maximisation.