Colloquium (11/18/2015): Robert J. Podesva (Stanford University)

The Role of the Body in Structuring Sociophonetic Variation

Scholars of gesture and bodily hexis have long recognized the centrality of the body in speech production (Bourdieu 1984, McNeill 1992, Kendon 1997). Yet theories of variation have generally been constructed based on analyses of what can be observed in the audio channel alone (cf. Mendoza-Denton and Jannedy 2011). This paper draws on a multimodal analysis of audiovisual data to illustrate that voice quality and vowel quality are strongly constrained by body movement and facial expression.

Dyadic interactions between friends were recorded in a sound-attenuated environment staged like a living room. The acoustic analysis focuses on the incidence of creaky voice (identified using Kane et al.’s (2013) neural network model) and on vowel quality (the lowering and retraction of the front lax vowels, in accordance with the California Vowel Shift). Computer vision techniques were additionally applied to quantify the magnitude of body movements (movement amplitude) and to identify when speakers were smiling.
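The abstract does not say which computer vision technique was used to measure movement amplitude. As a purely illustrative sketch, the following assumes simple frame differencing: each video frame is treated as a grid of grayscale intensities, and movement within a phrase is approximated by the mean absolute pixel change between consecutive frames. The names `frame_difference` and `movement_amplitude`, and the toy frames, are hypothetical and are not taken from the study.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (an assumed representation).
Frame = Sequence[Sequence[float]]


def frame_difference(prev: Frame, curr: Frame) -> float:
    """Mean absolute pixel change between two equally sized frames."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def movement_amplitude(frames: List[Frame]) -> float:
    """Average frame-to-frame change across the frames of one phrase."""
    if len(frames) < 2:
        return 0.0
    diffs = [frame_difference(a, b) for a, b in zip(frames, frames[1:])]
    return sum(diffs) / len(diffs)


# Toy 2x2 "videos": one with no change, one where pixels brighten over time.
still = [[[10, 10], [10, 10]]] * 3
moving = [
    [[10, 10], [10, 10]],
    [[10, 30], [10, 10]],
    [[30, 30], [10, 10]],
]

print(movement_amplitude(still))
print(movement_amplitude(moving))
```

Under this assumption, a phrase in which the speaker sits still yields a value near zero, while larger gestures yield larger values, giving a per-phrase predictor that could be entered into a statistical model alongside the acoustic measures.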

Results show that body movement and facial expression predict the realization of both linguistic variables. Creaky voice was more common in phrases where speakers moved less, in interactions where speakers reported feeling less comfortable, and, for women, in phrases where they were not smiling. The front lax vowels were lower (more shifted) among women and, regardless of sex, in phrases where speakers were smiling.

Speakers use their bodies in non-random ways to structure linguistic variation, so analysts can improve quantitative models of variation by attending to forms of embodied affect. Focusing on the body can also facilitate the development of more comprehensive social analyses of variation, since many existing analyses rely solely on correlations between linguistic practice and social category membership. I conclude by discussing the implications of an embodied view of variation for language change.