From e90d459fe349a0132247475c3c1328990997bac8 Mon Sep 17 00:00:00 2001
From: Shawn Rhoads
Date: Wed, 20 Dec 2023 18:31:24 -0500
Subject: [PATCH] added README and requirements.txt

---
 README.md        | 93 ++++++++++++++++++++++++++++++++++++++++++++++--
 requirements.txt |  7 ++++
 2 files changed, 98 insertions(+), 2 deletions(-)
 create mode 100644 requirements.txt

diff --git a/README.md b/README.md
index d99a5de..22e834c 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,91 @@
-# pyEM - Expectation Maximization with MAP
+
+
+# pyEM: Expectation Maximization with MAP estimation in Python
+
+
+
+This is a Python implementation of the Hierarchical Expectation Maximization algorithm with MAP estimation. [See below](#key-concepts) for more information on the algorithm.
+
+## Relevant Modules
+* `pyEM.fitting`: contains the main function (`EMfit`) for fitting models
+* `pyEM.math`: contains functions used for fitting
+* `pyEM.plotting`: contains functions for simple plotting
+
+## Usage
+Users should create new functions based on their modeling needs:
+1. Simulation function to simulate behavior (see `examples.rw_models.simulate`)
+2. Fit function to fit the model to the behavior (see `examples.rw_models.fit`)
+
+## Requirements
+This algorithm requires Python 3.7.10 with the following packages:
+```
+numpy: 1.21.6
+scipy: 1.6.2
+joblib: 1.1.0
+matplotlib: 3.5.3
+seaborn: 0.12.2
+pandas: 1.1.5
+tqdm: 4.65.0
+```
+
+We also use the following standard-library modules:
+```
+copy
+datetime
+pickle
+sys
+```
+
+## Installation
+To install the package, run `pip install` in a new Anaconda environment (recommended). You can also `pip install` it directly into your current environment:
+```
+conda create --name emfit pip python=3.7.10
+conda activate emfit
+pip install git+https://github.com/shawnrhoads/pyEM.git
+```
+
+To update the package, you can use pip:
+```
+pip install --upgrade git+https://github.com/shawnrhoads/pyEM.git
+```
+
+## Examples
+See `examples/RW.ipynb` for an example notebook that implements the algorithm using the Rescorla-Wagner model of reinforcement learning. This notebook simulates behavior and fits the model to the simulated data to demonstrate how hierarchical EM-MAP can be used for parameter recovery.
+
+## Future Implementations
+In future versions, I would love to add support for Python classes. For example, there could be a base model class with `simulate()` and `fit()` methods that can be inherited by other models. This would make the implementation more flexible and would allow different models to be used without changing the core code.
+
+## For Contributors
+This is meant to be a basic implementation of hierarchical EM with MAP, and much is still left out. Other researchers and educators are invited to help improve and expand the code here!
+
+Here are some ways you can help!
+- If you spot an error (e.g., a typo, bug, or inaccurate description), please open a new issue on GitHub by clicking on the GitHub Icon in the top right corner on any page and selecting "open issue". Alternatively, you can open a new issue directly through GitHub.
+- If credit for any content generated by others has been inadvertently omitted, please also open a new issue directly through GitHub.
+- If you have an idea for a new example tutorial or a new module to include, please open a new issue and/or submit a pull request directly to the repository on GitHub.
+
+
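As a rough illustration of the class-based design mentioned under Future Implementations, the sketch below shows one way a base class with `simulate()` and `fit()` methods could be inherited by a model. Everything here is hypothetical and not part of pyEM: the names `BaseModel` and `RescorlaWagnerValue`, the method signatures, and the grid search over squared error, which only stands in for a real fitting routine such as `EMfit`.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical base class: every model supplies simulate() and fit()."""

    @abstractmethod
    def simulate(self, params, rewards):
        """Return model predictions for a sequence of rewards."""

    @abstractmethod
    def fit(self, rewards, observed):
        """Return the parameters that best explain the observed data."""


class RescorlaWagnerValue(BaseModel):
    """Toy Rescorla-Wagner value learner with a single learning rate."""

    def simulate(self, params, rewards):
        alpha = params["alpha"]
        value, values = 0.0, []
        for reward in rewards:
            values.append(value)               # expectation before feedback
            value += alpha * (reward - value)  # prediction-error update
        return values

    def fit(self, rewards, observed):
        # Stand-in for a real fitting routine: grid search on squared error.
        candidates = [i / 100 for i in range(101)]

        def squared_error(alpha):
            predicted = self.simulate({"alpha": alpha}, rewards)
            return sum((p - o) ** 2 for p, o in zip(predicted, observed))

        return {"alpha": min(candidates, key=squared_error)}


# Parameter recovery on noiseless simulated data.
model = RescorlaWagnerValue()
rewards = [1, 0, 1, 1, 0, 1]
observed = model.simulate({"alpha": 0.3}, rewards)
recovered = model.fit(rewards, observed)
print(recovered)
```

With this layout, a new model would only need to subclass `BaseModel` and override the two methods; the calling code would stay the same.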
+
+## Key Concepts
+*Negative Log-Likelihood*
+
+The negative log-likelihood measures how well the model fits the observed data: lower values indicate a better fit. It is obtained by taking the negative natural logarithm of the likelihood function. The goal of maximum likelihood estimation (MLE) is to find the parameter values that minimize the negative log-likelihood, which is equivalent to maximizing the likelihood of the observed data given the model.
+
+*Prior Probability*
+
+The prior probability represents knowledge or beliefs about the parameters before observing the data. It is typically based on prior information or assumptions. Here, a normal distribution represents the prior beliefs about the parameters, with mean $\mu$ and standard deviation $\sqrt{\sigma}$ (that is, $\sigma$ denotes the variance).
+
+*MAP Estimation*
+
+MAP estimation incorporates the prior probability into the estimation process. Instead of maximizing only the likelihood (as in MLE), it maximizes the posterior probability, which is proportional to the product of the likelihood and the prior. Mathematically, MAP estimation can be expressed as:
+
+$\arg\max_{\theta} \left( \text{likelihood}(\theta \mid \text{data}) \times \text{prior}(\theta) \right)$
+
+where $\theta$ represents the model parameters.
+
+Combining the likelihood and the prior biases the parameter estimates towards the prior beliefs. Because we maximize the combined term, we seek parameter values that not only fit the data well (as indicated by the likelihood) but also align with the prior probability distribution.
+
+**Code originally adapted for Python from:**
+
Wittmann, M. K., Fouragnan, E., Folloni, D., Klein-Flügge, M. C., Chau, B. K., Khamassi, M., & Rushworth, M. F. (2020). Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys. Nature Communications, 11(1), 3771. https://doi.org/10.1038/s41467-020-17343-w
+
+See also:
+
Daw, N. D. (2011). Trial-by-trial data analysis using computational models. Decision making, affect, and learning: Attention and performance XXIII, 23(1). https://doi.org/10.1093/acprof:oso/9780199600434.003.0001 [pdf](https://www.princeton.edu/~ndaw/d10.pdf)
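
The MAP objective described under Key Concepts can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pyEM implementation: the data are assumed to be normally distributed with known noise, the function names (`normal_logpdf`, `negative_log_posterior`, `grid_argmin`) are hypothetical, and a coarse grid search stands in for a proper optimizer. Working in log space, maximizing likelihood times prior is the same as minimizing the negative log-likelihood minus the log prior.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def negative_log_likelihood(theta, data, noise_sd=1.0):
    """Negative log-likelihood of the data for a candidate mean theta."""
    return -sum(normal_logpdf(x, theta, noise_sd) for x in data)


def negative_log_posterior(theta, data, mu, sigma, noise_sd=1.0):
    """Negative log-likelihood minus log prior (up to a constant)."""
    # sigma is treated as a variance, so the prior sd is sqrt(sigma).
    log_prior = normal_logpdf(theta, mu, math.sqrt(sigma))
    return negative_log_likelihood(theta, data, noise_sd) - log_prior


def grid_argmin(objective, lo=-5.0, hi=5.0, steps=10001):
    """Return the grid point that minimizes the objective (illustrative only)."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=objective)


data = [1.8, 2.2, 2.5, 1.5]  # made-up observations
mu, sigma = 0.0, 1.0         # prior mean and prior variance (assumed values)

theta_mle = grid_argmin(lambda t: negative_log_likelihood(t, data))
theta_map = grid_argmin(lambda t: negative_log_posterior(t, data, mu, sigma))
print(theta_mle, theta_map)
```

In this toy setting the MAP estimate lies between the prior mean and the MLE, which is the bias towards the prior that the Key Concepts section describes.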
\ No newline at end of file
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..5c88106
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,7 @@
+numpy==1.21.6
+scipy==1.6.2
+joblib==1.1.0
+matplotlib==3.5.3
+seaborn==0.12.2
+pandas==1.1.5
+tqdm==4.65.0
\ No newline at end of file