diff --git a/README.md b/README.md
index 41e5ad7..55acfe0 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,9 @@
 Entropy regularization is given in the context of a multi-label image classification problem
 over a set of labels,
 $$
-    L(\bm{i}, \bm{l}) = L_{\iota}(\bm{i}, \bm{l}) \underbrace{+ \kappa \sum_{\mathbf{i}_k \in \bm{i}}\sum_{\mathbf{l}_j \in \bm{l}} p(\mathbf{l}_j; \mathbf{i}_k, \bm{l}) \log p(\mathbf{l}_j; \mathbf{i}_k, \bm{l})}_{\text{Entropy Regularization}},
+    L(\mathbf{i}, \mathbf{l}) = L_{\iota}(\mathbf{i}, \mathbf{l}) \underbrace{+ \kappa \sum_{i_k \in \mathbf{i}}\sum_{l_j \in \mathbf{l}} p(l_j; i_k, \mathbf{l}) \log p(l_j; i_k, \mathbf{l})}_{\text{Entropy Regularization}},
 $$
-where $L_{\iota}(\bm{i}, \bm{l})$ is the original loss of the model given a batch of input images $\bm{i}$ and labels $\bm{l}$, $\kappa$ is the regularization strength, and $p(\mathbf{l}_j; \mathbf{i}_k, \bm{l})$ is the probability of the $j$-th label given the $k$-th image.
+where $L_{\iota}(\mathbf{i}, \mathbf{l})$ is the original loss of the model given a batch of input images $\mathbf{i}$ and labels $\mathbf{l}$, $\kappa$ is the regularization strength, and $p(l_j; i_k, \mathbf{l})$ is the probability of the $j$-th label given the $k$-th image.
 
 The context of this is using such models as a source of rewards for Reinforcement Learning.
 The original application of this was to fine-tune CLIP models so that they have less noise and their semantic entropy reward trajectories are smoother.
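
For concreteness, here is a minimal PyTorch sketch of the regularized loss above. It is an illustration under stated assumptions, not this repo's implementation: the per-image label distribution $p(l_j; i_k, \mathbf{l})$ is taken as a softmax over logits (as in CLIP's image-text scores), `F.cross_entropy` stands in for $L_{\iota}$, and the names `entropy_regularized_loss` and `kappa` are hypothetical.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(
    logits: torch.Tensor,   # (batch, num_labels): one row of label scores per image i_k
    targets: torch.Tensor,  # (batch,): targets for the base loss L_iota
    kappa: float = 0.01,    # regularization strength
) -> torch.Tensor:
    """L(i, l) = L_iota(i, l) + kappa * sum_k sum_j p(l_j; i_k, l) log p(l_j; i_k, l)."""
    # L_iota: the model's original loss (cross-entropy here as a stand-in).
    base = F.cross_entropy(logits, targets)

    # p(l_j; i_k, l): per-image distribution over the label set, assumed
    # to come from a softmax over the logits.
    log_p = F.log_softmax(logits, dim=-1)  # numerically stable log p
    p = log_p.exp()

    # Entropy regularization term: kappa * sum over images and labels of p log p.
    reg = kappa * (p * log_p).sum()
    return base + reg

# Example usage with random data:
logits = torch.randn(8, 5)             # batch of 8 images, 5 candidate labels
targets = torch.randint(0, 5, (8,))
loss = entropy_regularized_loss(logits, targets, kappa=0.05)
```

Note that $\sum_j p \log p$ is the negative entropy, so with $\kappa > 0$ minimizing the total loss pushes the per-image label entropy up.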