Here, I’ll present the first Bayesian model which can simultaneously learn word meanings and perform pragmatic inference. In addition to capturing standard phenomena in both of these literatures, it gives insight into how the literal meaning of words like “some” can be acquired from observations of pragmatically strengthened uses, and provides a theory of how novel, task-appropriate linguistic conventions arise and persist within a single dialogue, such as occurs in the well-known phenomenon of lexical alignment. Over longer time scales such effects should accumulate to produce language change; however, unlike traditional iterated learning models, our simulated agents do not converge on a sample from their prior, but instead show an emergent bias towards belief in more useful lexicons. Our model also makes the interesting prediction that different classes of implicature should be differentially likely to conventionalize over time. Finally, I’ll argue that the mathematical “trick” needed to convince word learning and pragmatics to work together in the same model is in fact capturing a real truth about the psychological mechanisms needed to support human culture, and, more speculatively, suggest that it may point the way towards a general mechanism for reconciling qualitative, externalist theories of social interaction with quantitative, internalist models of low-level perception and action, while preserving the key claims of both approaches.
Building a Bayesian bridge between the physics and the phenomenology of social interaction
What is word meaning, and where does it live? Both naive intuition and scientific theories in fields such as discourse analysis and socio- and cognitive linguistics place word meanings, at least in part, outside the head: in important ways, they are properties of speech communities rather than individual speakers. Yet, from a neuroscientific perspective, we know that actual speakers and listeners have no access to such consensus meanings: the physical processes which generate word tokens in usage can only depend directly on the idiosyncratic goals, history, and mental state of a single individual. It is not clear how these perspectives can be reconciled. This gulf is thrown into sharp perspective by current Bayesian models of language processing: models of learning have taken the former perspective, and models of pragmatic inference and implicature have taken the latter. As a result, these two families of models, though built using the same mathematical framework and often by the same people, turn out to contain formally incompatible assumptions.