The human brain has the impressive capacity to adapt how it processes information to high-level goals. While it is known that these cognitive control skills are malleable and can be improved through training, the underlying plasticity mechanisms are not well understood. Here, we develop and evaluate a model of how people learn when to exert cognitive control, which controlled process to use, and how much effort to exert. We derive this model from a general theory according to which the function of cognitive control is to select and configure neural pathways so as to make optimal use of finite time and limited computational resources. The central idea of our Learned Value of Control model is that people use reinforcement learning to predict the value of candidate control signals of different types and intensities based on stimulus features. This model correctly predicts the learning and transfer effects underlying the adaptive control-demanding behavior observed in an experiment on visual attention and four experiments on interference control in Stroop and Flanker paradigms. Moreover, our model explained these findings significantly better than an associative learning model and a Win-Stay Lose-Shift model. Our findings elucidate how learning and experience might shape people's ability and propensity to adaptively control their minds and behavior. We conclude by predicting under which circumstances these learning mechanisms might lead to self-control failure.
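The model's central idea, predicting the value of candidate control signals (type and intensity) from stimulus features and updating those predictions from reward, can be sketched as a simple feature-based delta-rule learner. This is an illustrative simplification, not the authors' implementation; the feature coding, payoff schedule, and parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate control signals: (type, intensity) pairs; the value of each
# signal is learned as a linear function of binary stimulus features.
signals = [(s, i) for s in ("color", "word") for i in (0.3, 0.6, 0.9)]
n_features = 4
W = np.zeros((len(signals), n_features))  # one weight vector per signal
alpha, beta = 0.1, 5.0                    # learning rate, softmax inverse temperature

def choose(features):
    """Softmax choice over the predicted values of each control signal."""
    values = W @ features
    p = np.exp(beta * values - np.max(beta * values))
    p /= p.sum()
    return rng.choice(len(signals), p=p), values

def update(idx, features, reward, cost):
    """Delta-rule update toward the reward net of the cost of control."""
    outcome = reward - cost
    W[idx] += alpha * (outcome - W[idx] @ features) * features

# One simulated trial: a stimulus for which higher-intensity control pays off.
features = np.array([1.0, 0.0, 1.0, 0.0])
idx, _ = choose(features)
sig_type, intensity = signals[idx]
reward = 1.0 if intensity > 0.5 else 0.2   # hypothetical payoff schedule
update(idx, features, reward, cost=0.1 * intensity)
```

Run over many trials, a learner of this form comes to select the signal type and intensity whose net value (reward minus effort cost) is highest for each stimulus, which is the learning-and-transfer behavior the abstract describes.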
Recent years have witnessed a surge of interest in understanding the neural and cognitive dynamics that drive sequential decision making in general and foraging behavior in particular. Due to the intrinsic properties of most sequential decision-making paradigms, however, previous research in this area has suffered from the difficulty to disentangle properties of the decision related to (a) the value of switching to a new patch, which increases monotonically, and (b) the conflict experienced between choosing to stay or leave, which first increases but then decreases after reaching the point of indifference between staying and switching. Here, we show how the same problems arise in studies of sequential decision-making under risk, and how they can be overcome, taking as a specific example recent research on the ‘pig’ dice game. In each round of the ‘pig’ dice game, people roll a die and accumulate rewards until they either decide to proceed to the next round or lose all rewards. By combining simulation-based dissections of the task structure with two experiments, we show how an extension of the standard paradigm, together with cognitive modeling of decision-making processes, allows us to disentangle properties related to either switch value or choice conflict. Our study elucidates the cognitive mechanisms of sequential decision making and underscores the importance of avoiding potential pitfalls of paradigms that are commonly used in this research area.
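Under the standard single-die rules of Pig, where rolling a 1 forfeits the points accumulated that round (the published task may differ in detail), the myopic stay-or-stop trade-off can be computed directly, and the confound the abstract describes becomes visible: the value of one more roll changes monotonically with the round total, while conflict peaks at the break-even point of indifference.

```python
# Myopic expected value of one more roll in standard Pig, where rolling
# a 1 wipes out the points accumulated this round (a simplification of
# the paradigm described in the abstract).

def roll_ev(turn_total):
    """Expected change in banked points from one additional roll."""
    gain = sum(range(2, 7)) / 6        # faces 2-6 each add their pip count
    loss = turn_total / 6              # face 1 forfeits the turn total
    return gain - loss

# Keep rolling while the expected change is positive; indifference (and
# hence maximal stay-vs-stop conflict) occurs at the break-even total.
break_even = next(t for t in range(0, 100) if roll_ev(t) <= 0)
```

The expected value `roll_ev(t)` falls linearly as `t` grows, whereas conflict first rises and then falls around `break_even`, which is exactly why the two quantities must be deconfounded experimentally.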
People can evaluate a set of options as a whole, or they can approach those same options with the purpose of making a choice between them. A common network has been implicated across these two types of evaluations, including regions of ventromedial prefrontal cortex and the posterior midline. We test the hypothesis that sub-components of this reward circuit are differentially involved in triggering more automatic appraisal of one's options (Dorsal Value Network) versus explicitly comparing between those options (Ventral Value Network). Participants undergoing fMRI were instructed to appraise how much they liked a set of products (Like) or to choose the product they most preferred (Choose). Activity in the Dorsal Value Network consistently tracked set liking, across both task-relevant (Like) and task-irrelevant (Choose) trials. In contrast, the Ventral Value Network was particularly sensitive to evaluation condition (more active during Choose than Like trials). Within vmPFC, anatomically distinct regions were dissociated in their sensitivity to choice (ventrally, in medial OFC) versus appraisal (dorsally, in pregenual ACC). Dorsal regions additionally tracked decision certainty across both types of evaluation. These findings suggest that separable mechanisms drive decisions about how good one's options are versus decisions about which option is best.
Win-win choices cause anxiety, often more so than decisions lacking the opportunity for a highly desired outcome. These anxious feelings can paradoxically co-occur with positive feelings, raising important implications for individual decision styles and general well-being. Across three studies, people chose between products that varied in personal value. Participants reported feeling most positive and most anxious when choosing between similarly high-valued products. Behavioral and neural results suggested that this paradoxical experience resulted from parallel evaluations of the expected outcome (inducing positive affect) versus the cost of choosing a response (inducing anxiety). Positive feelings were reduced when there was no high-value option, and anxiety was reduced when only one option was highly valued. Dissociable regions within the striatum and the medial prefrontal cortex (mPFC) tracked these dueling affective reactions during choice. Ventral regions, associated with stimulus valuation, tracked positive feelings and the value of the best item. Dorsal regions, associated with response valuation, tracked anxiety. In addition to tracking anxiety, the dorsal mPFC was associated with conflict during the current choice, and activity levels across individual items predicted whether that choice would later be reversed during an unexpected reevaluation phase. By revealing how win-win decisions elicit responses in dissociable brain systems, these results help resolve the paradox of win-win choices. They also provide insight into behaviors that are associated with these two forms of affect, such as why we are pulled toward good options but may still decide to delay or avoid choosing among them.
When choosing between options, whether menu items or career paths, we can evaluate how rewarding each one will be, or how congruent it is with our current choice goal (e.g., to point out the best option or the worst one). Past decision-making research interpreted findings through the former lens, but in these experiments the most rewarding option was always most congruent with the task goal (choosing the best option). It is therefore unclear to what extent expected reward vs. goal congruency can account for choice value findings. To deconfound these two variables, we performed three behavioral studies and an fMRI study in which the task goal varied between identifying the best vs. the worst option. Contrary to prevailing accounts, we find that goal congruency dominates choice behavior and neural activity. We separately identify dissociable signals of expected reward. Our findings call for a reinterpretation of previous research on value-based choice.
Animals, including humans, consistently exhibit myopia in two different contexts: foraging, in which they harvest locally beyond what is predicted by optimal foraging theory, and intertemporal choice, in which they exhibit a preference for immediate vs. delayed rewards beyond what is predicted by rational (exponential) discounting. Despite the similarity in behavior between these two contexts, previous efforts to reconcile these observations in terms of a consistent pattern of time preferences have failed. Here, via extensive behavioral testing and quantitative modeling, we show that rats exhibit similar time preferences in both contexts: they prefer immediate vs. delayed rewards and they are sensitive to opportunity costs of delays to future decisions. Further, a quasi-hyperbolic discounting model, a form of hyperbolic discounting with separate components for short- and long-term rewards, explains individual rats' time preferences across both contexts, providing evidence for a common mechanism for myopic behavior in foraging and intertemporal choice.
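Quasi-hyperbolic ("beta-delta") discounting can be written in a few lines: immediate rewards are undiscounted, while any delayed reward takes a one-off hit (beta) on top of exponential decay (delta). The parameter values below are illustrative, not the values fitted to the rats.

```python
def quasi_hyperbolic_value(reward, delay, beta=0.7, delta=0.95):
    """Beta-delta discounted value: D(0) = 1, D(t) = beta * delta**t for t >= 1.

    The separate beta term is the short-term component that produces the
    preference for immediate rewards; delta handles long-term decay.
    Parameter values here are hypothetical.
    """
    if delay == 0:
        return reward
    return beta * (delta ** delay) * reward

# The immediacy premium: value drops sharply as soon as any delay appears,
# then declines smoothly thereafter.
now = quasi_hyperbolic_value(10, 0)
soon = quasi_hyperbolic_value(10, 1)
later = quasi_hyperbolic_value(10, 10)
```

The discontinuity between `now` and `soon` is what lets a single model capture both over-harvesting in foraging (where each extra harvest is the immediate option) and impulsive intertemporal choice.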
Decision-making is typically studied as a sequential process from the selection of what to attend (e.g., between possible tasks, stimuli, or stimulus attributes) to which actions to take based on the attended information. However, people often process information across these various levels in parallel. Here we scan participants while they simultaneously weigh how much to attend to two dynamic stimulus attributes and what response to give. Regions of the prefrontal cortex track information about the stimulus attributes in dissociable ways, related to either the predicted reward (ventromedial prefrontal cortex) or the degree to which that attribute is being attended (dorsal anterior cingulate cortex, dACC). Within the dACC, adjacent regions track correlates of uncertainty at different levels of the decision, regarding what to attend versus how to respond. These findings bridge research on perceptual and value-based decision-making, demonstrating that people dynamically integrate information in parallel across different levels of decision-making.
The explore-exploit dilemma describes the trade-off that occurs any time we must choose between exploring unknown options and exploiting options we know well. Implicit in this trade-off is how we value future rewards - exploiting is usually better in the short term, but in the longer term the benefits of exploration can be huge. Thus, in theory there should be a tight connection between how much people value future rewards, i.e., how much they discount future rewards relative to immediate rewards, and how likely they are to explore, with less 'temporal discounting' associated with more exploration. By measuring individual differences in temporal discounting and correlating them with explore-exploit behavior, we tested whether this theoretical prediction holds in practice. We used the 27-item Delay-Discounting Questionnaire to estimate temporal discounting and the Horizon Task to quantify two strategies of explore-exploit behavior: directed exploration, where information drives exploration by choice, and random exploration, where behavioral variability drives exploration by chance. We find a clear correlation between temporal discounting and directed exploration, with more temporal discounting leading to less directed exploration. Conversely, we find no relationship between temporal discounting and random exploration. Unexpectedly, we find that the relationship with directed exploration appears to be driven by a correlation between temporal discounting and uncertainty seeking at short time horizons, rather than information seeking at long horizons. Taken together, our results suggest a nuanced relationship between temporal discounting and explore-exploit behavior that may be mediated by multiple factors.
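Questionnaires of this kind are typically scored under a hyperbolic discounting model, V = A / (1 + kD): each item implies the value of k at which a respondent would be indifferent between the immediate and the delayed reward, and a participant's choices bracket their individual k. The sketch below shows that indifference-point computation; the item amounts and delays are examples in the questionnaire's format, and the scoring procedure here is simplified.

```python
def indifference_k(immediate, delayed, delay_days):
    """k at which a hyperbolic discounter, V = A / (1 + k*D), is indifferent:
    immediate == delayed / (1 + k * delay_days), solved for k."""
    return (delayed / immediate - 1) / delay_days

# Example items in the questionnaire's format:
# (immediate reward $, delayed reward $, delay in days).
items = [(54, 55, 117), (31, 85, 7), (25, 60, 14)]
ks = [indifference_k(*it) for it in items]
```

A larger fitted k means steeper discounting of delayed rewards; the abstract's prediction is that participants with larger k should show less directed exploration on the Horizon Task.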
Recent research has highlighted a distinction between sequential foraging choices and traditional economic choices between simultaneously presented options. This was partly motivated by observations in Kolling, Behrens, Mars, and Rushworth, Science, 336(6077), 95–98 (2012) (hereafter, KBMR) that these choice types are subserved by different circuits, with dorsal anterior cingulate (dACC) preferentially involved in foraging and ventromedial prefrontal cortex (vmPFC) preferentially involved in economic choice. To support this account, KBMR used fMRI to scan human subjects making either a foraging choice (between exploiting a current offer or swapping for potentially better rewards) or an economic choice (between two reward-probability pairs). This study found that dACC better tracked values pertaining to foraging, whereas vmPFC better tracked values pertaining to economic choice. We recently showed that dACC's role in these foraging choices is better described by the difficulty of choosing than by foraging value, when correcting for choice biases and testing a sufficiently broad set of foraging values (Shenhav, Straccia, Cohen, & Botvinick, Nature Neuroscience, 17(9), 1249–1254, 2014). Here, we extend these findings in three ways. First, we replicate our original finding with a larger sample and a task modified to address remaining methodological gaps between our previous experiments and that of KBMR. Second, we show that dACC activity is best accounted for by choice difficulty alone (rather than in combination with foraging value) during both foraging and economic choices. Third, we show that patterns of vmPFC activity, inverted relative to dACC, also suggest a common function across both choice types. Overall, we conclude that both regions are similarly engaged by foraging-like and economic choice.
Animal studies have shown that acetylcholine decreases excitatory receptive field size and spread of excitation in early visual cortex. These effects are thought to be due to facilitation of thalamocortical synaptic transmission and/or suppression of intracortical connections. We have used functional magnetic resonance imaging (fMRI) to measure the spatial spread of responses to visual stimulation in human early visual cortex. The cholinesterase inhibitor donepezil was administered to normal healthy human subjects to increase synaptic levels of acetylcholine in the brain. Cholinergic enhancement with donepezil decreased the spatial spread of excitatory fMRI responses in visual cortex, consistent with a role of acetylcholine in reducing excitatory receptive field size of cortical neurons. Donepezil also reduced response amplitude in visual cortex, but the cholinergic effects on spatial spread were not a direct result of reduced amplitude. These findings demonstrate that acetylcholine regulates spatial integration in human visual cortex.