Learning in Non-Stationary Environments
Abstract
Real-world decision making is challenging due, in part, to changes in the underlying reward structure: the best option last week may be less rewarding today. Determining the best response is even more challenging when feedback validity is low. Presented here are the results of two experiments designed to determine the degree to which midbrain reward processing is responsible for detecting reward contingency changes when feedback validity is low. These results suggest that while midbrain reward systems may be involved in detecting unexpected uncertainty in non-stationary environments, other systems are likely involved when feedback validity is low – namely, the locus-coeruleus-norepinephrine system. Finally, a computational model that combines these systems is described and tested. Taken together, these results downplay the role of the midbrain reward system when feedback validity is low, and highlight the importance of the locus-coeruleus-norepinephrine system in detecting reward contingency changes.