The Emerging Science of People-Watching: Forming Impressions From Third-Party Encounters

Traditional impression formation studies have focused almost exclusively on the perception and evaluation of isolated individuals. In recent years, however, portrayals of third-party encounters between two (or more) people have been used increasingly often to probe impressions about the interactions and relations between individuals. This tacit paradigm change has revealed an intriguing scope of judgments that concern how and why people relate to one another. Though these judgments recruit well-known neural networks of impression formation, their underlying cognitive operations and functional significance remain largely speculative. By providing an overview of recent theoretical and empirical approaches on encounter-based impressions, this article highlights their prevalent role in human social cognition.

Observing people in each other's company and making sense of their encounters, an endeavor sometimes referred to as people-watching, is widely known as an entertaining pastime. From a psychological perspective, however, this activity also signifies an impressive cognitive feat: By analyzing mere appearances and overt behavior, people-watchers form intricate impressions about those they witness without directly getting to know them. These impressions may even affect the observers' own intentions and actions. When navigating busy streets, for instance, most people refrain from penetrating the space between individuals whom they consider a meaningful social unit (Knowles, 2015).
Despite a long-standing psychological interest in rapid impression formation, systematic research on the perception and interpretation of so-called third-party encounters (TPEs) remains rare. Instead, impression formation studies typically ask participants to observe and evaluate isolated individuals or their parts (e.g., a face). This single-target approach has successfully established that people's visible attributes, including their facial appearances, elicit consensual judgments about their social group memberships, emotional states, and/or personalities (cf. Macrae & Quadflieg, 2010; Penton-Voak et al., 2013). Yet it has failed to capture the scope of impressions derived from the interactions and relations between people.
This oversight is surprising considering that the latter are regularly exploited in the media. A case in point is Coca-Cola's contemporary "Taste the feeling" campaign in which the portrayal of intimate moments between friends and lovers acts as a pivotal marketing tool (see Fig. 1). Psychologists and neuroscientists, by contrast, have just begun to study the effects of TPEs on uninvolved bystanders. By doing so, they have launched a new line of research that presents human dyads or triads as the observational unit of interest (see Fig. 2). But what has this tacit paradigm change uncovered? To address this question, the current article reflects on recent insights into encounter-based impressions.

What Are Encounter-Based Impressions?
Everyday experience attests that observing TPEs elicits a wide range of social judgments. Although some of these judgments can also occur in response to single targets (e.g., emotion recognition) and may simply get modified based on people's social context (cf. Hess, Blaison, & Kafetsios, 2016), others address specifically how and why an encounter between multiple individuals unfolds. These encounter-based impressions strictly require the observation of a particular combination of people and are inherently relational in nature. Action perception studies (e.g., Fawcett & Gredebäck, 2013; Sinke, Sorger, Goebel, & de Gelder, 2010), for instance, have revealed that observers of TPEs rapidly assess whether co-occurring individuals engage in independent or joint actions (e.g., reading vs. hugging), in mirroring or complementary actions (e.g., shaking hands vs. giving/receiving a gift), and/or in goal-compatible or -incompatible actions (e.g., cooperating vs. competing).
The existing work clearly illustrates that TPEs can invite numerous relational judgments of social relevance. Whether these judgments have been investigated in a truly comprehensive manner to date is less certain. To advance our understanding of how people think about their social world, future work must integrate different lines of research on the perception and evaluation of human encounters and define an evidence-based taxonomy of encounter-based impressions. Though developing such a taxonomy will require significant research effort, it promises to shed light on a cardinal aspect of human social cognition that has escaped empirical attention for far too long.

What Function(s) Do Encounter-Based Impressions Serve?
Similarly deserving of scientific scrutiny is the functional significance of encounter-based impressions. Initial data suggest that these impressions arise at a very young age. Six-month-old infants, for instance, express surprise when dyadic interactions entail irrational behavior (e.g., inappropriate feeding actions; Gredebäck & Melinder, 2010). Early interest in TPEs may reflect the fact that they provide ample opportunity for observational learning. Empirical support for this idea comes from research with 1.5-year-old toddlers who have been found to imitate actions they have first seen in other people's encounters (Shimpi, Akhtar, & Moore, 2013). The educational effects of TPEs, however, extend well beyond childhood. Adult observers of positive interactions between members of their own racial group and racial outgroup members, for example, tend to improve their attitudes toward the outgroup (Christ et al., 2014). These data suggest that encounter-based impressions prompt the acquisition of new skills and attitudes throughout people's lifetime.
In addition, encounter-based impressions seem to provide vital social insights. Based on the careful analysis of TPEs in their immediate environment, observers can, after all, identify (and avoid) individuals prone to dangerous, uncooperative, unfair, or immoral social behavior (Hamlin, 2013). They can further discover cooperative and/or influential individuals interested in forming new alliances and/or detect coalitions that threaten their own social standing (Schmid Mast & Hall, 2004). Given that humans must forge close bonds with others to survive in the face of adversity, evolutionary pressures may even have facilitated the development of cognitive mechanisms dedicated toward understanding TPEs (Bryant et al., 2016). Tentative support for this view comes from research revealing cross-cultural similarities in encounter-based impressions (Place, Todd, Zhuang, Penke, & Asendorpf, 2012). But are these impressions actually accurate? Accuracy seems necessary for these impressions to function as an effective social monitoring device, yet the literature on impression formation indicates that consensus in social judgments and accuracy are not always linked (Penton-Voak, Pound, Little, & Perrett, 2006).
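The distinction between consensus and accuracy can be made concrete with a small numerical sketch. The Python snippet below is an illustration only: the ratings, the criterion values, and the function names (`consensus`, `accuracy`) are hypothetical and are not taken from any of the studies cited here. Consensus is operationalized as the mean pairwise correlation among observers, and accuracy as the correlation between the observers' averaged judgment and an external criterion; other operationalizations are possible.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical ratings, invented for illustration:
# three observers each judge the same five encounters.
observer_ratings = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
# Hypothetical criterion (e.g., what the observed people reported themselves).
criterion = [3, 1, 4, 2, 3]


def consensus(ratings):
    """Mean pairwise correlation between observers (agreement with each other)."""
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def accuracy(ratings, truth):
    """Correlation between the averaged observer judgment and the criterion."""
    mean_judgment = [sum(column) / len(column) for column in zip(*ratings)]
    return pearson(mean_judgment, truth)


print("consensus:", round(consensus(observer_ratings), 2))
print("accuracy:", round(accuracy(observer_ratings, criterion), 2))
```

In this invented data set the observers agree closely with one another, yet their shared judgment barely tracks the criterion, which is the pattern the question of accuracy is concerned with.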

Are Encounter-Based Impressions Accurate?
Observing TPEs typically provides access to an abundance of visual information. From short glances at human encounters, observers can learn whether two (or more) people look alike, share physical proximity, smile or lean toward each other, mimic each other's expressions and postures, and engage in eye contact, interpersonal touch, or direct communication (via gestures or speech acts). Upon longer inspection, observers can further extract the frequency, duration, and coordination of various nonverbal events (e.g., reciprocated smiles) and the degree of motion synchrony and turn-taking between people. Though recent eye-tracking studies have revealed that observers of TPEs typically look back and forth between the different individuals involved in an encounter (Villani et al., 2015), it is less clear what exactly they are looking for.
Early research on the topic simply assumed that observers would identify visual information of diagnostic value for the impressions they were trying to form. Yet a seminal study by Bernieri, Gillis, Davis, and Grahe (1996) challenged this view. In this study, dyads of strangers were filmed during a discussion and afterward asked to rate how much rapport they had felt during their exchange. The researchers then showed the recorded videos (without sound) to a new group of participants and asked them to also assess the dyads' levels of rapport. This approach revealed little overlap between the discussants' and the observers' judgments. Further analyses demonstrated that observer judgments were largely based on the discussants' number of smiles, whereas the discussants' own ratings were mainly linked to their degree of physical closeness during the exchange. In other words, in this study, observers drew inaccurate conclusions about other people's rapport because they relied too heavily on nondiagnostic visual input.
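The logic of this comparison can be sketched as a simple correlational analysis in the spirit of a lens-model approach. The Python snippet below is a minimal illustration under stated assumptions and not a reconstruction of the original analysis: all numbers are invented, and the names `lens_summary`, `validity`, and `utilization` are hypothetical. Cue validity is taken as the correlation between a visual cue and the discussants' self-reported rapport, cue utilization as the correlation between the same cue and the observers' judgments.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical toy data for six dyads (invented for illustration only).
cues = {
    "smiles": [2, 9, 4, 8, 3, 7],     # number of smiles shown by the dyad
    "proximity": [5, 2, 6, 3, 1, 4],  # closeness score, higher = closer
}
self_rapport = [6, 3, 7, 4, 2, 5]      # rapport reported by the discussants
observer_rapport = [3, 9, 4, 8, 2, 7]  # rapport judged by observers


def lens_summary(cues, criterion, judgment):
    """Return cue validities, cue utilizations, and overall judgment accuracy."""
    # Validity: how strongly each cue relates to the criterion.
    validity = {name: pearson(values, criterion) for name, values in cues.items()}
    # Utilization: how strongly each cue relates to the observers' judgments.
    utilization = {name: pearson(values, judgment) for name, values in cues.items()}
    # Accuracy: agreement between the observers' judgments and the criterion.
    accuracy = pearson(criterion, judgment)
    return validity, utilization, accuracy


validity, utilization, accuracy = lens_summary(cues, self_rapport, observer_rapport)
for name in cues:
    print(name, "validity:", round(validity[name], 2),
          "utilization:", round(utilization[name], 2))
print("accuracy:", round(accuracy, 2))
```

In this invented data set, observers lean on the cue that does not track self-reported rapport (smiles) and neglect the cue that does (proximity), so their judgments show little correspondence with the criterion.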
Observers' ability to distinguish between diagnostic and nondiagnostic visual input during TPE processing can differ, however, depending on their impression formation goal. Observers are reasonably accurate, for example, at judging whether others are acquainted with one another, are romantically interested in each other, or have different levels of authority (Latif, Barbosa, Vatikiotis-Bateson, Castelhano, & Munhall, 2014; Place et al., 2012; Schmid Mast & Hall, 2004). Improvements in accuracy have also been detected whenever observers are able to witness relatively unstructured TPEs (e.g., two people solving a puzzle) compared to TPEs that are constrained by social norms (e.g., two people introducing themselves to each other; Puccinelli, Tickle-Degnen, & Rosenthal, 2004). Based on these data, it must be concluded that accuracy in encounter-based impressions depends critically on the quality of the visual input available and the type of impression drawn.
There is further evidence that accuracy rates in encounter-based impressions are also determined by observers' mental health. People with autism, for example, are less accurate at forming encounter-based impressions than healthy controls (Byrge, Dubois, Tyszka, Adolphs, & Kennedy, 2015). This finding is particularly noteworthy as it signals that differences in people's encounter-based impressions could ultimately inform clinical assessments of psychological deficits. Though a similar goal has guided much single-target research in the past, little progress has been made adopting this traditional approach (cf. Dalili, Penton-Voak, Harmer, & Munafo, 2015). Despite 20 years of research, for instance, it remains uncertain whether basic emotion recognition is disturbed in autism (Uljarevic & Hamilton, 2013). Given this unsatisfactory development, probing typical and atypical social-cognitive functioning with TPEs promises to provide a particularly fertile avenue for future research.

How Are Encounter-Based Impressions Accomplished?
In an initial attempt to understand how exactly TPEs trigger far-reaching social judgments, two major psychological strategies have been proposed: On the one hand, it has been postulated that observers of TPEs extract salient visual subcomponents (e.g., number of smiles) to evaluate human encounters (Bernieri et al., 1996). On the other hand, it has been argued that observers compare incoming visual information against stored templates of typical human encounters in a holistic manner (Neri, 2009). At present, there is no conclusive empirical evidence that favors either strategy. Both strategies may even be used at different stages of the impression formation process.
When it comes to the basic visual processing of TPEs, however, the important role of prior templates has recently been demonstrated. Specifically, it has been shown that human agents are more easily detected in point light displays when they are seen in the presence of other people who interact with them than when seen in isolation or in the presence of independently acting individuals (e.g., Manera, Del Giudice, Bara, Verfaillie, & Becchio, 2011). These data indicate that observers' expectations about typical visual properties that characterize dyadic encounters (e.g., motion coordination between agents) can actually facilitate the perception of their partaking individuals. Similar top-down effects have been found at later stages of the impression formation process. Observers' ideas about typical social relationships, for instance, can bias their interpretation of people's actions: An ambiguous shove between two men may be considered playful if both look alike in terms of race, but aggressive otherwise (Duncan, 1976). The findings suggest that both the visual analysis and social evaluation of TPEs are guided by observers' expectations.
Future research should scrutinize the exact nature of people's expectations during TPE processing in further detail. It must be addressed, for instance, whether observers primarily analyze human encounters based on people's actions as recently proposed (de la Rosa et al., 2015). If this were indeed the case, changes in action understanding based on people's group memberships (as described above) should require some time to emerge given that the initial processing of TPEs should be untarnished by these memberships. Equally deserving of further exploration is the question of to what degree mental processes of simulation, rather than expectancy-based evaluation, may fuel encounter-based impressions. The mental simulation of other people's actions and internal states is often considered a hallmark of social cognition. But when faced with two (or more) targets simultaneously, whom would observers simulate, especially if the different targets endorse competing goals? Initial work on the topic does not suggest a lack of simulation in the face of TPEs but rather the simultaneous simulation of multiple agents (Cracco, De Coster, Andres, & Brass, 2015). Given that this conclusion rests solely on experiments with isolated hand actions, however, a re-examination of these effects with full-body TPEs seems warranted.

What Are the Neural Correlates of Encounter-Based Impressions?
During the last 5 years, photographs and video clips of TPEs have featured increasingly often as stimuli in neuroimaging studies. This development is partially inspired by the so-called social intelligence hypothesis. According to this hypothesis, the evolutionary benefit of being able to understand and track numerous social relationships may have facilitated the development of relatively large brains in humans. Although the hypothesis remains controversial (Benson-Amram, Dantzer, Stricker, Swanson, & Holekamp, 2016), work stimulated by it has revealed several noteworthy insights. First and foremost, it has shown that observing TPEs generally relies on the recruitment of three well-known brain networks (e.g., Canessa et al., 2012; Georgescu et al., 2014; Wang & Quadflieg, 2015): the person perception network (PPN, involved in the visual analysis of human faces and bodies), the action perception network (APN, involved in understanding people's actions), and the mentalizing network (MN, involved in understanding people's mental states).
Second, it has provided accumulating evidence that activity in all three networks increases whenever co-occurring individuals engage in joint rather than independent actions (Centelles, Assaiante, Nazarian, Anton, & Schmitz, 2011; Kujala, Carlson, & Hari, 2012). Third, it has demonstrated that activity in all three networks differs based on what type of encounter-based impressions observers form during TPE exposure. When participants are prompted to evaluate differences in power relative to differences in weight between two individuals, for example, enhanced PPN, APN, and MN activity can be observed (Mason et al., 2014). Taken together, these studies provide further support for the notion that observers are particularly attentive toward the encounters of others and easily engage in a wide range of relational appraisals during TPE processing.
Beyond these initial insights, however, the exact functional contributions of the different networks during TPE processing remain poorly understood. This lack of understanding largely reflects the fact that existing neuroimaging studies on encounter-based impressions differ substantially in terms of their stimuli and procedures. Though this circumstance reiterates the richness and diversity of such impressions, it also means that some fascinating findings await systematic replication. Witnessing mismatching actions between agents (e.g., one person trying to high-five another who intends to shake hands), for example, seems to enhance activity in the extrastriate body area (EBA), a brain region of the PPN dedicated toward the encoding of body postures (Sinke et al., 2010). These data tentatively suggest that the EBA may generate perceptual predictions about compatible body postures between individuals and engage in additional processing whenever these predictions get violated. By confirming and advancing this line of work, psychologists may ultimately be able to decipher what type of mental (i.e., perceptual and cognitive) templates guide the formation of encounter-based impressions in humans.

Conclusion
Although encounter-based impressions have been a topic of investigation since the 1970s (cf. Duncan, 1976), there has been a rapid increase in this area of research in recent years. In consequence, impressions about people's interactions and relations have become a pivotal subject of study. Further research in this field promises not only to advance our understanding of the human mind but also to inform best practices in clinical psychology (e.g., by refining standardized assessments of social-cognitive functioning), educational psychology (e.g., by defining optimal circumstances for observational learning), and economic psychology (e.g., by outlining easily accessible impression-based marketing strategies). For this field to live up to its potential, however, psychological scientists should embrace a more systematic and coordinated approach toward exploring the nature, prevalence, and functional significance of encounter-based impressions in everyday life.

Recommended Reading
Aloni, M., & Bernieri, F. J. (2004). (See References). A representative study that provides further support for the claim that observers of brief third-party encounters typically struggle to determine other people's rapport as discussed in the current article.
Duncan, B. L. (1976). (See References). A historical classic; one of the first papers to raise attention to encounter-based impressions.
A thorough eye-tracking study capturing people's tendency to look back and forth between those who constitute a third-person encounter in order to assess salient visual information about them.