AHRB (Arts and Humanities Research Board) research grant
The British Academy larger research grant
Personnel: Geoffrey Leech (grant holder); Nicholas Smith (research associate)
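The 'Brown family' of corpora, consisting of four equivalent one-million-word corpora of British and American written English, provides an unprecedented opportunity for investigating grammatical change over a short period of a language's history. The four corpora are Brown and Frown (American English) and LOB and FLOB (British English), with Brown and LOB representing the earlier end of the period and Frown and FLOB the later.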
These corpora are 'equivalent' in the sense that they were compiled using the same corpus design and sampling methods, and can therefore be compared so as to reveal changes in frequency in the use of the language within the 30-year period 1961-1991, and also to show how American and British English differed over that period.
The Recent Grammatical Change project focused initially on the British pair of corpora (LOB and FLOB), building on previous work done by Christian Mair and Marianne Hundt at Freiburg (Germany) where the Frown and FLOB Corpora were created. In collaboration with Freiburg, we POS-tagged the FLOB Corpus, using the C8 tagset, a somewhat more detailed tagset than the ones previously used at Lancaster. At the same time, we converted the LOB corpus (previously tagged at Lancaster and Oslo in 1980-83) to the new C8 tagset, to ensure comparability with the tagging of FLOB.
We concentrated on those areas of English grammar where it has previously been suggested that change has been taking place in the recent past. The primary areas we investigated were verb categories such as modal auxiliaries, semi-modals, aspect, tense and mood. A provisional frequency analysis was also undertaken of a number of noun phrase categories (e.g. relative pronouns, genitives vs. of-phrases, noun-noun sequences) and a few other categories such as questions and punctuation.
After the British corpora had been analysed, we turned our attention to the American corpora Brown and Frown. To ensure comparability between Brown and Frown, as well as with the British corpora LOB and FLOB, we (re-)tagged them using the same C8 tagset. Time has not so far allowed a manual post-editing of the tagged American corpora, and so we have to live with the 2% error in tagging and rely on approximate frequency comparisons, using an error coefficient derived, for each tag, from the corpus of previous manual corrections.
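One way to picture the error coefficient is as a per-tag correction ratio. The sketch below is an illustration under stated assumptions, not the project's actual procedure: it assumes the coefficient for a tag is the ratio of its count after manual correction to its count in the automatic output, measured on the manually corrected corpus, and that this ratio is then applied to raw automatic counts from the uncorrected corpus. The function names and all counts are hypothetical.

```python
def error_coefficient(auto_count: int, corrected_count: int) -> float:
    """Ratio of manually corrected to automatically assigned counts for one tag.

    Assumption: both counts come from the same manually post-edited corpus.
    """
    if auto_count <= 0:
        raise ValueError("auto_count must be positive")
    return corrected_count / auto_count


def adjusted_frequency(raw_count: int, coefficient: float) -> float:
    """Estimate the 'true' frequency of a tag in an uncorrected corpus."""
    return raw_count * coefficient


# Hypothetical figures: the tagger assigned a tag 1000 times in the
# corrected corpus, of which 980 instances survived manual post-editing.
coeff = error_coefficient(auto_count=1000, corrected_count=980)

# Hypothetical raw count of the same tag in an uncorrected corpus.
estimate = adjusted_frequency(raw_count=5000, coefficient=coeff)
```

A coefficient below 1 would indicate that the tagger over-assigns the tag, and a coefficient above 1 that it under-assigns it.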
Findings show a decline of c. 10% in the use of the modal auxiliaries in both American and British English. As a group, the semi-modals, in contrast, show a highly significant increase (although some semi-modals such as had better show a decrease). Other findings include: a remarkable rise (c. 30%) in the use of the present progressive and a decline in the use of the passive. Provisional results from the noun phrase analysis include a decline in wh- relative clauses (beginning with who, whom, which, etc.), especially in the use of pied-piping constructions (of which, to whom, etc.). On the other hand, zero relativization has been increasing, especially in the case of preposition stranding. On the whole, these changes are observed in both the American and the British corpora, though with some marked differences in individual frequency profiles. The 'Americanization' of British English is suggested by the modal auxiliary profiles: here British English appears to follow American English, where the decline of modals is further advanced both in the 1961 and the 1991 data. In addition to American influence, another general trend of 'colloquialization' appears to explain some changes.
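Frequency comparisons of this kind can be sketched as a percentage change together with a significance test. The summary above does not say which test was used; the example below assumes the log-likelihood ratio (G2) that is widely used for comparing frequencies across two corpora, and the counts are hypothetical rather than project figures.

```python
import math


def percent_change(freq_old: float, freq_new: float) -> float:
    """Percentage change from the earlier to the later corpus."""
    return (freq_new - freq_old) / freq_old * 100.0


def log_likelihood(freq_a: int, size_a: int, freq_b: int, size_b: int) -> float:
    """Log-likelihood ratio (G2) for one item's frequency in two corpora.

    freq_* are raw counts of the item; size_* are corpus sizes in words.
    """
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Hypothetical counts for one category in two one-million-word corpora.
change = percent_change(1000, 900)
g2 = log_likelihood(1000, 1_000_000, 900, 1_000_000)

# With one degree of freedom, G2 above 3.84 corresponds to p < 0.05
# and G2 above 6.63 to p < 0.01.
significant_at_05 = g2 > 3.84
```

Because the four corpora are of equal size, raw counts can be compared directly here; with corpora of unequal size the expected values take care of the normalisation.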
Although this study is limited by its concentration on written English, we were able to make some provisional comparisons with spoken British English during the same period. For this, we were kindly allowed to use data from the Survey of English Usage at UCL. The trends observed were broadly comparable with the pattern shown by the written English corpora, except that they were spectacularly more extreme in the decline of modals (c. 45%) and the rise of semi-modals (c. 32%). This observation gives credence to the idea that the spoken language leads the way in grammatical change - however, results must remain tentative, based as they are on relatively small samples of data.
Simplistic assumptions implied by the terms Americanization and colloquialization are upheld by our investigations, but with some puzzling exceptions. For example, the generalization that writers tend to adopt progressively more colloquial habits in written English is not always borne out in LOB and FLOB, which show (a) a non-significant increase in the use of the get-passive, and (b) a significant increase in the use of Latinate affixes.
Moreover, the simplistic assumption that the 'new wave' of semi-modals (have to, going to, etc.) is supplanting the 'older wave' of modal auxiliaries (must, will, may, etc.) is undermined by the LOB-FLOB comparison, which shows the core modals to be seven times more common than the semi-modals - hence the rise of the semi-modals cannot explain more than a small part of the modals' decline. The only case where this argument has some credibility is the dramatic fall in the use of must. This could be partly explained by a rising use of have to, which in 1991 is more frequent than must in both written and spoken corpus data.
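The arithmetic behind this argument can be made explicit. The sketch below assumes, purely for illustration, that the sevenfold ratio and the c. 10% modal decline reported earlier apply to the same baseline; the function name is hypothetical.

```python
def required_rise_percent(frequency_ratio: float, decline_percent: float) -> float:
    """Rise (in %) the rarer category would need to absorb the commoner one's decline.

    frequency_ratio: how many times more common the declining category is.
    decline_percent: percentage decline of the commoner category.
    """
    return frequency_ratio * decline_percent


# Core modals seven times as common as semi-modals, declining by c. 10%:
# the semi-modals would have to rise by about 70% of their own baseline
# to account for the whole of that decline.
needed = required_rise_percent(frequency_ratio=7, decline_percent=10)
```

Any observed rise in the semi-modals well below that figure would leave most of the modal decline unexplained, which is the point made above.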
The simple explanation of Americanization also does not apply to all cases. Strangest of all is the observation that the semi-modals, which have been strongly associated with American English, are observed to be less common in American than in British English in both sets of corpora. Hence there can be no easy conclusion to the effect that American English, championing the semi-modals, is causing their extended use in British English.