The Trinity Lancaster Corpus of L2 spoken production, which is currently being developed at Lancaster University in collaboration with Trinity College London, is the largest corpus of its kind. It is based on examinations of spoken English conducted by Trinity College London, a major international examination board, and contains interactions between exam candidates (L2 speakers of English) and examiners (L1 speakers of English).
The corpus allows the study of individual differences between speakers as well as the effect of speaker background (L1, gender, age, education, etc.) on language production. At present, the corpus contains approximately 3 million running words from over 1,500 L2 speakers from nine countries (Spain, Italy, China, India, Sri Lanka, Mexico, Argentina, Brazil and Russia). The age of the L2 speakers ranges from 9 to 72 years, with a balanced sample of each of the following age groups: young (under 20 years of age), middle-aged (20-30 years old) and mature (31 and older). The corpus also provides a balanced sample of L2 speech in terms of proficiency, covering levels B1 to C2 of the Common European Framework.
The presentation will discuss:
- methodological decisions connected with sampling and transcription
- specific features of L2 speech
- structure of the corpus (to date)