Dyadic interactions between friends were recorded in a sound-attenuated environment staged like a living room. The acoustic analysis focuses on the incidence of creaky voice (identified using Kane et al.’s 2013 neural network model) and vowel quality (the lowering and retraction of the front lax vowels, in accordance with the California Vowel Shift). Computer vision techniques were additionally applied to quantify the magnitude of body movements (movement amplitude) and to identify when speakers were smiling.
Results show that body movement and facial expression predict the realization of both linguistic variables. Creaky voice was more common in phrases where speakers moved less, in interactions where speakers reported feeling less comfortable, and, for women, in phrases where they were not smiling. The front lax vowels were lower (more shifted) among women and, regardless of speaker sex, in phrases where speakers were smiling.
Speakers use their bodies in non-random ways to structure linguistic variation, so analysts can improve quantitative models of variation by attending to forms of embodied affect. Focusing on the body can also facilitate the development of more comprehensive social analyses of variation, since many existing analyses rely solely on correlations between linguistic practice and social category membership. I conclude by discussing the implications of an embodied view of variation for language change.