On-policy TD learning for a classical-conditioning problem: classical_conditioning.py
The trace interval (the number of time steps between the offset of the conditioned stimulus (CS) and the onset of the unconditioned stimulus (US)) is controllable via num_state.
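To illustrate the idea, here is a minimal sketch of tabular TD(0) prediction on a deterministic chain of states. It is not the code in classical_conditioning.py. The function name `td_trace_conditioning`, the parameters `num_trials`, `alpha` and `gamma`, and the mapping of num_state onto the number of steps between the CS and the US are all assumptions made for illustration. The chain has no action choice, so this covers prediction only.

```python
import numpy as np


def td_trace_conditioning(num_state=10, num_trials=500, alpha=0.1, gamma=0.9):
    """Tabular TD(0) prediction on a chain of num_state time steps.

    Assumed setup (hypothetical, not taken from the original script):
    state 0 is the first step of the trial, and the US (reward 1) is
    delivered on the transition out of the last state.
    """
    # value[num_state] is the terminal state; it is never updated and stays 0.
    value = np.zeros(num_state + 1)
    for _ in range(num_trials):
        for s in range(num_state):
            # Reward is zero everywhere except on the final transition.
            reward = 1.0 if s == num_state - 1 else 0.0
            td_error = reward + gamma * value[s + 1] - value[s]
            value[s] += alpha * td_error
    return value[:num_state]


if __name__ == "__main__":
    # A longer chain means a longer delay before the US.
    print(td_trace_conditioning(num_state=5))
```

Under these assumptions the learned prediction rises toward the US: the value of step s approaches gamma ** (num_state - 1 - s), so increasing num_state lengthens the interval and lowers the prediction at the start of the trial.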
Example plot of the prediction: