fix link

WWWonderer · Feb 6, 2024 · 735292e
1 parent 77fd68b
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/_posts/2024-02-01-multi-armed-bandit-and-epsilon-greedy.markdown b/_posts/2024-02-01-multi-armed-bandit-and-epsilon-greedy.markdown
@@ -113,12 +113,12 @@ Below is the key implementation part for the epsilon-greedy:
     R_over_t.append(curr_R) # cumulative rewards
 {% endhighlight %}
 
-Some experimentation results on a 3-arm problem can be seen here:
+Some experimentation results on a 3-armed problem can be seen here:
 
 ![p15_multibandit_epsilon_greedy_1](https://raw.githubusercontent.com/WWWonderer/tech_blog/main/assets/images/p15_multibandit_epsilon_greedy_1.png){:style="display:block; margin-left:auto; margin-right:auto"}
 
 We can conclude that the value of $\epsilon$ does affect convergence, and a small value works best in the long run. 
 
 Reference: [Reinforcement Learning - An Introduction][sutton_book] by Richard S. Sutton and Andrew G. Barto (chapter 2)
 
-[sutton_book]: https://github.com/tensorflow/nmt/blob/master/nmt/scripts/bleu.py
+[sutton_book]: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
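
For readers of this diff who want to reproduce the kind of 3-armed experiment the post refers to, here is a minimal sketch of epsilon-greedy with sample-average value estimates. It is not the post's actual implementation: the function name `run_epsilon_greedy`, the Gaussian reward model, the `true_means` values, and the `seed` parameter are all assumptions made for illustration. Only `curr_R` and `R_over_t` echo names visible in the context lines above.

```python
import random

def run_epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Hypothetical helper: run epsilon-greedy on a Gaussian k-armed bandit."""
    rng = random.Random(seed)
    k = len(true_means)
    Q = [0.0] * k        # estimated value of each arm
    N = [0] * k          # number of times each arm was pulled
    curr_R = 0.0         # running total of rewards
    R_over_t = []        # cumulative rewards, one entry per step

    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                    # explore: uniformly random arm
        else:
            a = max(range(k), key=lambda i: Q[i])   # exploit: current best estimate

        # Assumed reward model: unit-variance Gaussian around the arm's true mean.
        r = rng.gauss(true_means[a], 1.0)

        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]                   # incremental sample average

        curr_R += r
        R_over_t.append(curr_R)  # cumulative rewards

    return Q, N, R_over_t
```

Calling it with several values of `epsilon` on the same `true_means` (for example `run_epsilon_greedy([0.1, 0.5, 0.9], 0.1, 5000)`) and plotting each `R_over_t` gives curves comparable in spirit to the figure linked in the post. The exact shapes depend on the assumed reward model and seed.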