# Reinforcement learning models of decision making

This repository contains materials for modeling choice data recorded while participants played a two-armed bandit task. Three reinforcement learning (RL) models are implemented, along with two heuristic models for comparison. All three RL models are variants of temporal difference learning.
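For orientation, here is a minimal sketch of the kind of two-armed bandit data these models are fit to. The reward probabilities, function name, and random-choice policy are illustrative assumptions, not code from this repository:

```python
import numpy as np

def simulate_bandit(n_trials=100, reward_probs=(0.7, 0.3), seed=0):
    """Simulate random choices on a two-armed bandit.

    Each arm pays out 1 with its own fixed probability, else 0.
    Returns arrays of choices (0 or 1) and rewards (0 or 1).
    NOTE: probabilities and the random policy are hypothetical examples.
    """
    rng = np.random.default_rng(seed)
    choices = rng.integers(0, 2, size=n_trials)          # pick an arm at random
    payout_p = np.asarray(reward_probs)[choices]          # probability of reward for each pick
    rewards = (rng.random(n_trials) < payout_p).astype(int)
    return choices, rewards
```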

These models were coded in the spring of 2022 by Kaustubh Kulkarni (reach me via email, GitHub, or Twitter with any questions).

## 🧮 Models

| Model | Description | Parameters | Details |
| --- | --- | --- | --- |
| Biased | Preference for one machine | `bias` | The bias parameter fits a preference for one machine over the other. |
| Heuristic | Switch to the other machine after two losses | `epsilon` | The epsilon parameter fits random choice that does not adhere to the switching strategy. |
| RW | Temporal difference learning (TDL) | `alpha`, `beta` | Alpha is the learning rate; beta is the inverse temperature. |
| RWDecay | TDL with decay toward a neutral value | `alpha`, `decay`, `beta` | Alpha and beta are as above; the decay parameter fits the speed at which values drift back toward a neutral value. |
| RWRL | TDL with separate learning rates | `alpha_pos`, `alpha_neg`, `beta` | Alpha_pos and alpha_neg are the learning rates for positive and negative prediction errors, respectively. |

## 🙏 Acknowledgments

I am grateful to Dr. Xiaosi Gu, Dr. Daniela Schiller, and the Center for Computational Psychiatry at Mount Sinai. I am also grateful to Project Jupyter for making it possible to create and share these materials in a Jupyter Book.

## 🎫 License

Content in this repository (i.e., any .md or .ipynb files in the content/ folder) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.