Mathematical Biology Seminar, DMS, NJIT

......................................................................

NJIT Mathematical Biology Seminar

Tuesday, October 23, 2007, 4:00pm
Cullimore Hall 611
New Jersey Institute of Technology

......................................................................

Neural mechanisms for reinforcement learning in humans: combining neural, behavioral, and computational approaches

Nathaniel Daw

Center for Neural Science, New York University

Abstract

Decisions in the real world or the laboratory often involve substantial uncertainty about task contingencies, which is resolved only through trial-and-error learning. Examples include "bandit" problems, which require choosing between slot machines with unknown payoff probabilities, and more structured tasks such as traversing an unfamiliar maze. While one can define optimal trial-by-trial choices during the learning of such tasks, actually computing them is typically intractable. For this reason, the study of these problems in computer science ("reinforcement learning"; RL) concerns the development of tractable shortcuts and the analysis of the conditions under which they approximate optimality. RL algorithms therefore offer a set of detailed hypotheses for how subjects (and their brains) might plausibly approach difficult decision problems. Moreover, their formal properties license novel inquiries, such as whether subjects employ shortcuts well suited to the circumstances they face.

I discuss evidence concerning what shortcuts organisms use to address two key problems in decision-making: the problem of assigning credit for deferred reinforcement and the exploration-exploitation dilemma. First, the idea that the brain contains multiple decision systems (ubiquitous in psychology and systems neuroscience) can be recast in terms of multiple algorithmic strategies for the credit assignment problem. This recasting matches known psychological and neural properties of the systems, and rationalizes a body of data concerning which system animals deploy in various circumstances. Second, I detail our search for the behavioral and neural signatures of different heuristics for trading off the exploitation of favored options against the exploration of unfamiliar ones. I present preliminary evidence that humans employ so-called "novelty bonuses" to promote exploration, as had been postulated on the basis of dopaminergic recordings in primates.
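
For readers unfamiliar with the terms, the following is a minimal, generic sketch of a "bandit" learner with a novelty bonus. It is not the speaker's model: the function name `run_bandit`, the delta-rule update, the bonus form `bonus_weight / (1 + count)`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def run_bandit(payoff_probs, n_trials=500, learning_rate=0.1,
               bonus_weight=1.0, seed=0):
    """Greedy learner on a Bernoulli bandit with a count-based novelty bonus.

    Hypothetical illustration only; names and parameters are assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(payoff_probs)
    values = [0.0] * n_arms  # learned payoff estimate for each slot machine
    counts = [0] * n_arms    # number of times each machine has been chosen

    for _ in range(n_trials):
        # Each option's score is its learned value plus a bonus that
        # shrinks as the option becomes familiar.
        scores = [values[a] + bonus_weight / (1 + counts[a])
                  for a in range(n_arms)]
        # Pick the highest score; ties go to the lowest-numbered arm.
        choice = max(range(n_arms), key=lambda a: scores[a])

        # Payoff is 1 with the machine's (unknown to the learner) probability.
        reward = 1.0 if rng.random() < payoff_probs[choice] else 0.0

        # Delta rule: move the estimate toward the observed reward.
        values[choice] += learning_rate * (reward - values[choice])
        counts[choice] += 1

    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.5, 0.8])
    print("estimates:", [round(v, 2) for v in values])
    print("choices:  ", counts)
```

In this sketch, setting `bonus_weight=0.0` leaves a purely greedy learner that never samples any machine other than the first, which is one way to see why some incentive to explore unfamiliar options is needed.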

Last Modified: Aug 22, 2007
Horacio G. Rotstein
h o r a c i o @ n j i t . e d u
Last modified: Wed Oct 10 11:52:44 EDT 2007