, 2007 and Rilling et al., 2002) or when there are similarities between their own and the other’s personal characteristics (Mobbs et al., 2009). The sRPE was a specific form of reward prediction error related to the other, made in reference to the simulated-other and used for learning their hidden variables. Different forms of the other’s reward prediction error also modulated activity in the vmPFC. Activity in the vmPFC was correlated with an “observational” reward prediction error (the difference between the other’s stimulus choice outcome and the subject’s value of the stimulus) (Burke et al., 2010 and Cooper et al.,
2011). This error indicated which stimulus was more likely to be rewarding to subjects, whereas in the study presented here, the sRPE indicated which stimulus was more likely to be rewarding to the other. vmPFC signals have also been reported to be modulated by different perceptions of the other’s intentions (Cooper et al., 2010). An interesting avenue for future research is to deepen our understanding of the relationship between, and use of, different types of vicarious reward prediction errors involved in forms of fictive or counterfactual learning (Behrens et al., 2008, Boorman et al., 2011, Hayden et al., 2009 and Lohrenz et al., 2007).
Our findings demonstrate that during simulation, humans use another learning signal—the sAPE—to model the other’s internal variables. This error was entirely unexpected based on the direct recruitment hypothesis, and it indicates that simulation is dynamically refined during learning using observations of the other’s choices, thus also rejecting the stronger hypothesis. The sAPE significantly modulated BOLD signals in the dmPFC/dlPFC
and several other areas (Table 1), but the sRPE did not. This activation pattern suggests that these areas may have a particular role in utilizing the other’s choices rather than the other’s outcomes (Amodio and Frith, 2006). This view converges with earlier studies in a social context, in which subjects considered the other’s behaviors, choices, or intentions, but not necessarily their outcomes (Barraclough et al., 2004, Hampton et al., 2008, Izuma et al., 2008, Mitchell et al., 2006, Yoshida et al., 2010 and Yoshida et al., 2011), and also with studies in nonsocial settings (Gläscher et al., 2010, Li et al., 2011 and Rushworth, 2008). Among the other areas, the temporoparietal junction and posterior superior temporal sulcus (TPJ/pSTS) were noteworthy. Our results support a role for the TPJ/pSTS in utilizing the other’s choices, consistent with previous studies using RL paradigms in social settings (Behrens et al., 2008, Hampton et al., 2008 and Haruno and Kawato, 2009). Our finding that the dmPFC/dlPFC and TPJ/pSTS were significantly activated by the sAPE at both the value and action levels provides an important twist on the distinction between action and outcome encoding or between action and outcome monitoring (Amodio and Frith, 2006).