diff --git a/README.md b/README.md
index 41e5ad7..55acfe0 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,9 @@
 Entropy regularization is given in the context of a multi-label image classification problem
 over a set of labels,
 $$
-    L(\bm{i}, \bm{l}) = L_{\iota}(\bm{i}, \bm{l}) \underbrace{+ \kappa \sum_{\mathbf{i}_k \in \bm{i}}\sum_{\mathbf{l}_j \in \bm{l}} p(\mathbf{l}_j; \mathbf{i}_k, \bm{l}) \log p(\mathbf{l}_j; \mathbf{i}_k, \bm{l})}_{\text{Entropy Regularization}},
+    L(\mathbf{i}, \mathbf{l}) = L_{\iota}(\mathbf{i}, \mathbf{l}) \underbrace{+ \kappa \sum_{i_k \in \mathbf{i}}\sum_{l_j \in \mathbf{l}} p(l_j; i_k, \mathbf{l}) \log p(l_j; i_k, \mathbf{l})}_{\text{Entropy Regularization}},
 $$
-where $L_{\iota}(\bm{i}, \bm{l})$ is the original loss of the model given a batch of input images $\bm{i}$ and labels $\bm{l}$, $\kappa$ is the regularization strength, and $p(\mathbf{l}_j; \mathbf{i}_k, \bm{l})$ is the probability of the $j$-th label given the $k$-th image.
+where $L_{\iota}(\mathbf{i}, \mathbf{l})$ is the original loss of the model given a batch of input images $\mathbf{i}$ and labels $\mathbf{l}$, $\kappa$ is the regularization strength, and $p(l_j; i_k, \mathbf{l})$ is the probability of the $j$-th label given the $k$-th image.
 
 The context of this is using such models as a source of rewards for Reinforcement Learning.
 The original application of this was to fine-tune CLIP models so that they have less noise and their semantic entropy reward trajectories are smoother.
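
For concreteness, here is a minimal PyTorch sketch of the regularized loss above. It is an illustration under stated assumptions, not this repo's implementation: the per-image label distribution $p(l_j; i_k, \mathbf{l})$ is taken as a softmax over logits (as in CLIP's image-text scores), `F.cross_entropy` stands in for $L_{\iota}$, and the names `entropy_regularized_loss` and `kappa` are hypothetical.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_loss(
    logits: torch.Tensor,   # (batch, num_labels): one row of label scores per image i_k
    targets: torch.Tensor,  # (batch,): targets for the base loss L_iota
    kappa: float = 0.01,    # regularization strength
) -> torch.Tensor:
    """L(i, l) = L_iota(i, l) + kappa * sum_k sum_j p(l_j; i_k, l) log p(l_j; i_k, l)."""
    # L_iota: the model's original loss (cross-entropy here as a stand-in).
    base = F.cross_entropy(logits, targets)

    # p(l_j; i_k, l): per-image distribution over the label set, assumed
    # to come from a softmax over the logits.
    log_p = F.log_softmax(logits, dim=-1)  # numerically stable log p
    p = log_p.exp()

    # Entropy regularization term: kappa * sum over images and labels of p log p.
    reg = kappa * (p * log_p).sum()
    return base + reg

# Example usage with random data:
logits = torch.randn(8, 5)             # batch of 8 images, 5 candidate labels
targets = torch.randint(0, 5, (8,))
loss = entropy_regularized_loss(logits, targets, kappa=0.05)
```

Note that $\sum_j p \log p$ is the negative entropy, so with $\kappa > 0$ minimizing the total loss pushes the per-image label entropy up.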