The UCREL Corpus Research Seminar (CRS) is a forum for all staff, visiting academics, and postgraduate research students interested in corpus-based research in any area of linguistics. CRS is run by UCREL (University Centre for Computer Corpus Research on Language), a joint research centre of the Department of Linguistics and English Language and the School of Computing and Communications.
CRS meetings offer an opportunity to present work in progress and receive helpful feedback, discuss relevant research, approaches and methods, gain experience in using corpus interfaces and tools, and stay up to date with corpus-based research at Lancaster University. We welcome anyone who is a newcomer to this exciting and growing area of linguistics. We also welcome presentations from researchers in other departments and universities.
On this site, along with general information, you will find a list of upcoming seminars and an archive of past seminars. If you have any suggestions for things to add to the site, please get in touch.
In 2016/2017 CRS meetings will be on Thursdays at 3pm during term time, unless otherwise indicated.
Notifications of seminars are sent to the UCREL Mailing List; sign up if you would like to receive them and other UCREL-related messages. You can also follow us on Twitter, where we post updates on upcoming seminars.
If you need more information or want to give a presentation, please contact one of the CRS organisers:
The previous website has an archive of past seminars.
We acknowledge the following funding for external speakers: the CASS and UCREL research centres, the Faculty of Arts and Social Sciences, the Department of Linguistics and English Language and the School of Computing and Communications.
The UCREL Corpus Research Seminars this academic year (2016/17) will be on Thursdays 3pm-4pm during term time, unless otherwise indicated. Please check our upcoming page for the time and location of any future presentations. If you would like to give a talk or have a suggestion for an external speaker to invite, then please get in touch.
Thursday 3rd November 2016
Management School LT9
A computational stylistic comparison between English used on Chinese governmental websites and English used on US and UK governmental websites
English texts on Chinese governmental websites are often criticised for being 'Chinglish' or 'lifeless'. This project investigates how English versions of Chinese governmental websites can improve their stylistic quality. The project is a computational stylistic comparison between English texts on Chinese governmental websites and English texts on UK and US governmental websites. The approach is corpus-based and employs Biber's (1988) multidimensional analysis. A corpus of websites (comprising two subcorpora) had previously been downloaded using wget -m. Perl scripts were used to extract text content from the web pages to form a plain-text (.txt) file for each website, and word frequency lists and trigrams have also been extracted. Keyword lists for the two subcorpora have been generated based on a COCA word frequency list. Several issues remain to be dealt with before further analysis can be conducted, including: whether it is possible to separate 'real content' from purely repetitive content (such as menus, navigation and copyright notices) when data comes from web pages; what the alternatives to manual annotation are when this is not a practical option given the massive size of the corpus; and how to identify which features to consider to make the comparison more significant.
Week 6: 17th November 2016 (3:00-4:00pm)
Charles Carter A15
Talk on the Visualising English Print project (title pending)
Week 11: 19th January 2017 (3:00-4:00pm)
Management School LT9
Heads and adjuncts in the recent history of English: syntax and processing on the move