Using TD learning to simulate working memory performance in a model of the prefrontal cortex and basal ganglia

Ahmed A. Moustafa, Anthony S. Maida*

*Corresponding author for this work

Research output: Contribution to journal > Article > Research > peer-review

24 Citations (Scopus)


Delayed-response tasks (DRTs) have been used to assess working memory (WM) processes in humans and nonhuman animals. Experiments have shown that the basal ganglia (BG) and dorsolateral prefrontal cortex (DLPFC) subserve DRT performance. Here, we report the results of simulation studies of a systems-level model of DRT performance. The model uses an actor-critic architecture and was trained with the temporal difference (TD) algorithm. The matrisomes of the BG represent the actor and the striosomes represent the critic. Unlike existing models, we hypothesize that the BG subserve the selection of both motor- and cognitive-related information in these tasks. We also assume that the learning of both processes is based on reward presentation. One novel feature of the model is the incorporation of delay-active neurons in the matrisomes, in addition to those in the DLPFC. Another is the subdivision of the matrisomal neurons into segregated winner-take-all (WTA) networks consisting of delay-active versus transiently active units. The model proposes a new neural mechanism to account for the occurrence of perseverative responses in WM tasks in subjects with striatal damage as well as in subjects with prefrontal damage. Simulation results show that the model accounts both for the time shifting of phasic dopamine signals and for the effects of partial reinforcement and reward magnitude on WM performance, at the behavioral and the neural level. The simulations also show that the TD algorithm can subserve learning in delayed-reversal tasks.
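The time shifting of the TD error mentioned in the abstract can be illustrated with a minimal tabular TD(0) critic. This is only a sketch under stated assumptions, not the authors' BG/DLPFC model: it has no actor, no WTA networks, and no delay-active units. The function name `run_td`, the one-state-per-time-step representation, the fixed zero value of the pre-cue baseline, and all parameter values are assumptions chosen for illustration.

```python
def run_td(n_trials=500, delay=5, alpha=0.1, gamma=1.0, reward=1.0):
    """Tabular TD(0) critic on a cue -> delay -> reward trial (illustrative only).

    V[t] is the learned value of the state t steps after cue onset.
    The pre-cue baseline is assumed to have value 0 and is never updated,
    which models a cue that arrives unpredictably.
    """
    V = [0.0] * (delay + 1)
    history = []  # one list of TD errors per trial
    for _ in range(n_trials):
        deltas = []
        # TD error at cue onset: baseline (value 0, no reward) -> first cue state.
        deltas.append(gamma * V[0] - 0.0)
        for t in range(delay + 1):
            r = reward if t == delay else 0.0        # reward only at the end of the delay
            v_next = V[t + 1] if t < delay else 0.0  # trial ends after the reward
            delta = r + gamma * v_next - V[t]        # TD error
            V[t] += alpha * delta                    # critic update
            deltas.append(delta)
        history.append(deltas)
    return V, history


values, history = run_td()
first_trial, last_trial = history[0], history[-1]
print("cue-onset TD error, first vs last trial:", first_trial[0], last_trial[0])
print("reward-time TD error, first vs last trial:", first_trial[-1], last_trial[-1])
```

In this sketch the TD error on the first trial occurs at reward delivery, and after training it appears at cue onset instead, while the error at reward time shrinks toward zero. In an actor-critic arrangement, the same error signal would also be used to adjust the actor's action preferences; that part is omitted here.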

Original language: English
Pages (from-to): 262-281
Number of pages: 20
Journal: Cognitive Systems Research
Issue number: 4
Publication status: Published - Dec 2007
Externally published: Yes

