Some multi-armed bandit code. The code
- Reproduce some of the results of chapter 2 of the "Reinforcement Learning, An introduction" book, by Sutton and Barto.
- Reproduce some of the results of [1].
- Fit model using [1] of experimental data.
File descriptions:
models.py
-- Simulation of different environment-agent pairs. Each pair has a fixed structure.contextual_bandit.py
-- Simulation of the contextual bandit experiment. The environment has the structure of the real experiment. It is possible to define different action rules.ml.py
-- Maximum Likelihood estimation of bandit parameters.analysis.py
-- Analysis of the models fitted to the experimental data.bandit.py
-- Reproduce results from Sutton's book.parse.py
-- Parse experimental data.vis.py
-- Visualization of results.utils.py
-- Some useful functions.filter.py
-- Filter parameters for filtfilt (doesn't really belongs here).
[1] N. D. Daw, "Trial-by-trial data analysis using computational models," Decision making, affect, and learning: Attention and performance XXIII, vol. 23, p. 1, 2011.