Commit

UPD: updating Documentation LTT
SZiane committed Jul 10, 2023
1 parent c5324b2 commit a039327
Showing 2 changed files with 11 additions and 8 deletions.
17 changes: 10 additions & 7 deletions doc/theoretical_description_multilabel_classification.rst
@@ -16,7 +16,7 @@ our training data :math:`(X, Y) = \{(x_1, y_1), \ldots, (x_n, y_n)\}` has an un

For any risk level :math:`\alpha` between 0 and 1, the methods implemented in MAPIE allow the user to construct a prediction
set :math:`\hat{C}_{n, \alpha}(X_{n+1})` for a new observation :math:`\left( X_{n+1},Y_{n+1} \right)` with a guarantee
- on the recall or the precision. RCPS, LTT and CRC give three slightly different guarantees:
+ on the recall. RCPS, LTT and CRC give three slightly different guarantees:

- RCPS:

@@ -32,6 +32,8 @@ on the recall or the precision. RCPS, LTT and CRC give three slightly different
.. math::
    \mathbb{P}\Big(\sup_{\lambda \in \hat{\Lambda}} R(\mathcal{T}_{\lambda}) \leq \alpha\Big) \geq 1 - \delta

+ Note that, unlike the other two methods, LTT can control any non-monotone loss. In MAPIE for multilabel classification,
+ we use CRC and RCPS for recall control and LTT for precision control.
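To make the recall guarantee concrete, here is a minimal numpy sketch (illustrative only, not MAPIE code) of the empirical recall risk :math:`1 - \text{recall}` induced by thresholded prediction sets :math:`\mathcal{T}_{\lambda}(x) = \{k : \text{score}_k(x) \geq \lambda\}`:

```python
import numpy as np

def recall_risk(y_true, y_scores, lam):
    """Empirical risk (1 - average recall) of the prediction sets
    T_lam(x) = {k : score_k(x) >= lam}. Both inputs have shape
    (n_samples, n_labels); y_true is binary."""
    y_pred = (y_scores >= lam).astype(int)
    # per-observation recall: fraction of true labels recovered by the set
    recall = (y_pred * y_true).sum(axis=1) / np.maximum(y_true.sum(axis=1), 1)
    return 1.0 - recall.mean()

y_true = np.array([[1, 0, 1], [0, 1, 1]])
y_scores = np.array([[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]])
risk = recall_risk(y_true, y_scores, lam=0.5)  # -> 0.25 on this toy data
```

Because raising :math:`\lambda` can only shrink the sets, this recall risk is monotone in :math:`\lambda`, which is the property CRC and RCPS exploit; the precision risk is not monotone, which is why precision control goes through LTT.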

1. Risk-Controlling Prediction Sets
-----------------------------------
@@ -163,21 +165,22 @@ With :
3. Learn Then Test
------------------
- The goal of this method is to control any loss whether monotone, bounded or not. The main goal of this method is to achieve risk control
- throught multiple hypothesis testing. We can express the goal of the procedure as follows:
+ The goal of this method is to control any loss, whether or not it is monotonic or bounded, by performing risk control through multiple
+ hypothesis testing. We can express the goal of the procedure as follows:

.. math::
    \mathbb{P}(R(\mathcal{T}_{\lambda}) \leq \alpha) \geq 1 - \delta

To find all the parameters :math:`\lambda` that satisfy the above condition, Learn Then Test proposes the following procedure:

- 0: First across the collections of functions :math:`(T_\lambda)_{\lambda\in\Lambda}`, we estimate the risk on the calibration data
+ 1: First across the collections of functions :math:`(T_\lambda)_{\lambda\in\Lambda}`, we estimate the risk on the calibration data
:math:`\{(x_1, y_1), \dots, (x_n, y_n)\}`.
- 1: For each :math:`\lambda_j` in a discrete set :math:`\Lambda = \{\lambda_1, \lambda_2,\dots, \lambda_n\}`, we associate the null hypothesis
+ 2: For each :math:`\lambda_j` in a discrete set :math:`\Lambda = \{\lambda_1, \lambda_2,\dots, \lambda_n\}`, we associate the null hypothesis
:math:`\mathbb{H}_j: R(\lambda_j)>\alpha`, as rejecting the hypothesis corresponds to selecting :math:`\lambda_j` as a point where the risk
is controlled.
- 2: For each null hypothesis, we compute a valid p-value using a concentration inequality.
- 3: Return :math:`\hat{\Lambda} = \mathbb{A}(\{p_j\}_{j\in\{1,\dots,\lvert \Lambda \rvert\}})`, where :math:`\mathbb{A}` is an algorithm
+ 3: For each null hypothesis, we compute a valid p-value using a concentration inequality. Here we choose to compute the Hoeffding-Bentkus p-value
+ introduced in the paper [3].
+ 4: Return :math:`\hat{\Lambda} = \mathbb{A}(\{p_j\}_{j\in\{1,\dots,\lvert \Lambda \rvert\}})`, where :math:`\mathbb{A}` is an algorithm
that controls the family-wise-error-rate (FWER).
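The steps above can be sketched in a few lines of numpy (an illustration, not MAPIE's implementation): for simplicity this uses the plain Hoeffding p-value for a loss bounded in :math:`[0, 1]` rather than the Hoeffding-Bentkus one, and Bonferroni as the FWER-controlling algorithm :math:`\mathbb{A}`:

```python
import numpy as np

def hoeffding_pvalue(r_hat, n, alpha):
    """Valid p-value for H_j: R(lambda_j) > alpha, from the empirical
    risk r_hat of a [0, 1]-bounded loss on n calibration points."""
    r_hat = np.asarray(r_hat, dtype=float)
    return np.where(r_hat < alpha, np.exp(-2.0 * n * (alpha - r_hat) ** 2), 1.0)

def ltt_select(r_hats, n, alpha, delta):
    """Return the indices of the lambdas kept by Bonferroni:
    reject H_j (i.e. keep lambda_j) when p_j <= delta / |Lambda|."""
    p_values = hoeffding_pvalue(r_hats, n, alpha)
    return np.flatnonzero(p_values <= delta / len(p_values))

# empirical risks of 5 candidate thresholds on n = 1000 calibration points
kept = ltt_select([0.30, 0.12, 0.05, 0.02, 0.01], n=1000, alpha=0.1, delta=0.1)
# -> indices [2, 3, 4]: only clearly sub-alpha empirical risks survive
```

Any FWER-controlling algorithm can replace Bonferroni here; the guarantee over the whole returned set :math:`\hat{\Lambda}` follows from the union bound over the rejected hypotheses.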


2 changes: 1 addition & 1 deletion mapie/control_risk/risks.py
@@ -70,7 +70,7 @@ def _compute_risk_precision(
y: NDArray
) -> NDArray:
"""
-    In `MapieMultiLabelClassifier` when`metric_control=precision`,
+    In `MapieMultiLabelClassifier` when `metric_control=precision`,
    compute the precision per observation for each of the
    different thresholds lambdas.
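As an illustration of the quantity this docstring describes, here is a simplified numpy stand-in (not the actual MAPIE implementation; the convention of precision 1 for empty prediction sets is an assumption):

```python
import numpy as np

def precision_per_obs(y_true, y_scores, lambdas):
    """Precision of the set {k : score_k >= lam} for every observation
    and every lam in lambdas; empty sets get precision 1 by convention."""
    n = y_true.shape[0]
    out = np.ones((n, len(lambdas)))
    for j, lam in enumerate(lambdas):
        y_pred = (y_scores >= lam).astype(int)
        set_size = y_pred.sum(axis=1)
        true_positives = (y_pred * y_true).sum(axis=1)
        nonempty = set_size > 0
        out[nonempty, j] = true_positives[nonempty] / set_size[nonempty]
    return out

prec = precision_per_obs(
    np.array([[1, 0, 1]]), np.array([[0.9, 0.6, 0.4]]), lambdas=[0.5, 0.95]
)  # -> [[0.5, 1.0]]
```

The precision risk is then 1 minus the mean of a column; it is not monotone in lambda (shrinking a set can lower precision if a correct label drops out first), which is why precision control uses LTT rather than CRC or RCPS.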
