The Psychological Status of Collocation: Evidence from ERPs

Jennifer Hughes

CASS, Lancaster University

In this presentation, I discuss the results of four ERP experiments which collectively aim to find out whether or not there is a neurophysiological difference in the way that the brain processes pairs of words which form collocations compared to pairs of words which do not form collocations. The ERP (or event-related potential) technique is a method of measuring the changes in voltage that occur in the brain in response to particular stimuli. The stimuli used in this research consist of corpus-derived adjective-noun bigrams which form strong collocations (Condition 1), and matched adjective-noun bigrams which do not form collocations (Condition 2). The bigrams are embedded into sentences which are presented in a word-by-word fashion.

In Experiment 1, I pilot a procedure for determining whether or not a detectable neurophysiological difference exists between Conditions 1 and 2 for native speakers of English. In Experiment 2, I replicate the pilot study using a different group of native English speakers; then, in Experiment 3, I replicate the procedure using non-native speakers of English (specifically, native speakers of Mandarin Chinese). In Experiment 4, I then investigate the psychological validity of different association measures, namely transition probability, mutual information, log-likelihood, z-score, t-score, Dice coefficient, MI3, and raw frequency.
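For readers less familiar with the association measures named above, the following is a minimal sketch of how they are commonly computed from four corpus counts. It uses the standard textbook formulations and is not necessarily the exact variant used in these experiments; in particular, treating transition probability as the forward probability of the noun given the adjective is an assumption. The function name `association_measures` and its parameter names are hypothetical.

```python
import math


def association_measures(o11, f1, f2, n):
    """Compute common bigram association measures.

    o11: observed frequency of the bigram (word1 followed by word2)
    f1:  corpus frequency of word1 (here, the adjective)
    f2:  corpus frequency of word2 (here, the noun)
    n:   total number of bigram positions in the corpus
    """
    # Expected bigram frequency if the two words were independent.
    e11 = f1 * f2 / n

    # Full 2x2 contingency table, needed for log-likelihood.
    observed = [o11, f1 - o11, f2 - o11, n - f1 - f2 + o11]
    expected = [
        e11,
        f1 * (n - f2) / n,
        (n - f1) * f2 / n,
        (n - f1) * (n - f2) / n,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    log_likelihood = 2 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

    return {
        "raw_frequency": o11,
        # Assumption: forward transition probability P(word2 | word1).
        "transition_probability": o11 / f1,
        "mutual_information": math.log2(o11 / e11),
        "mi3": math.log2(o11 ** 3 / e11),
        "t_score": (o11 - e11) / math.sqrt(o11),
        "z_score": (o11 - e11) / math.sqrt(e11),
        "dice": 2 * o11 / (f1 + f2),
        "log_likelihood": log_likelihood,
    }


# Hypothetical counts, chosen only for illustration.
scores = association_measures(o11=100, f1=1000, f2=2000, n=1_000_000)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because the measures weight the observed and expected frequencies differently, they can rank the same set of bigrams in different orders, which is why comparing them against a behavioural or neurophysiological measure is informative.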

The results reveal that there is a neurophysiological difference in the way that the brain processes corpus-derived collocational bigrams compared to matched non-collocational bigrams, and that this difference is larger for the non-native speakers than for the native speakers. Moreover, while there is a strong correlation between the amplitude of the brain response and all of the association measures studied in Experiment 4, the strongest correlations exist between amplitude and hybrid association measures, including z-score, MI3, and Dice coefficient. This suggests that mutual information and log-likelihood, which are two of the most commonly used association measures in corpus linguistics (Gries 2014a:37), are not necessarily always the optimal choice. I discuss these results in relation to prior literature from the fields of corpus linguistics and cognitive neuroscience.

