Delayed-response tasks (DRTs) have been used to assess working memory (WM) processes in human and nonhuman animals. Experiments have shown that the basal ganglia (BG) and dorsolateral prefrontal cortex (DLPFC) subserve DRT performance. Here, we report the results of simulation studies of a systems-level model of DRT performance. The model uses an actor-critic architecture and was trained with the temporal difference (TD) algorithm. The matrisomes of the BG represent the actor and the striosomes represent the critic. In contrast to existing models, we hypothesize that the BG subserve the selection of both motor- and cognitive-related information in these tasks. We also assume that the learning of both processes is based on reward presentation. One novel feature of the model is the incorporation of delay-active neurons in the matrisomes, in addition to those in the DLPFC. Another is the subdivision of the matrisomal neurons into segregated winner-take-all (WTA) networks consisting of delay-active versus transiently active units. The model proposes a new neural mechanism to account for the occurrence of perseverative responses in WM tasks in subjects with striatal damage as well as in subjects with prefrontal damage. Simulation results show that the model accounts for both the time shifting of phasic dopamine signals and the effects of partial reinforcement and reward magnitude on WM performance at the behavioral and neural levels. The simulations further show that the TD algorithm can subserve learning in delayed-reversal tasks.
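To make the time-shifting claim concrete, the following is a minimal sketch of a generic tabular TD(0) critic, not the model reported here. It assumes a tapped-delay-line state (one state per time step), an unpredictable cue (pre-cue values are held at zero), and a single unit reward; the function name `td_critic_trials` and all parameter names and values (`n_steps`, `cue_t`, `reward_t`, `alpha`, `gamma`) are hypothetical choices made for illustration.

```python
import numpy as np

def td_critic_trials(n_steps=12, cue_t=2, reward_t=8, n_trials=500,
                     alpha=0.1, gamma=1.0):
    """Return the TD error at every time step of every trial.

    Illustrative assumptions (not taken from the source model):
    - one state per time step (tapped delay line), value table V
    - the cue arrives unpredictably, so pre-cue values stay at 0
    - a reward of 1.0 is delivered on arrival at step reward_t
    """
    V = np.zeros(n_steps)                    # critic's value estimate per step
    history = np.zeros((n_trials, n_steps))  # TD error per trial and step
    for trial in range(n_trials):
        for t in range(1, n_steps):
            r = 1.0 if t == reward_t else 0.0
            # TD error on the transition from step t-1 to step t
            delta = r + gamma * V[t] - V[t - 1]
            history[trial, t] = delta
            # Only states from cue onset onward are learned
            if t - 1 >= cue_t:
                V[t - 1] += alpha * delta
    return history

history = td_critic_trials()
first_peak = int(np.argmax(history[0]))   # step of the largest TD error, trial 1
last_peak = int(np.argmax(history[-1]))   # step of the largest TD error, last trial
print(first_peak, last_peak)
```

Under these assumptions, the largest TD error falls at the reward step on the first trial and at the cue step after training, which is the qualitative shift of the phasic signal referred to above. An actor (action selection) is deliberately omitted to keep the sketch short.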