The Hansard at Huddersfield project

Alexander Von Lunen & Hugo Sanjurjo Gonzalez

University of Huddersfield

The AHRC-funded Hansard at Huddersfield project follows on from the recent SAMUELS project, which semantically tagged the Hansard Corpus. The Hansard Corpus is the collection of UK House of Commons and House of Lords debates from 1803 to 2005, and the SAMUELS project annotated this corpus with grammatical and semantic tags based on the Historical Thesaurus Semantic Tagger.

The main goal of the Hansard at Huddersfield project is to introduce some common corpus-linguistic methods to the general public in a simplified form. Most methods from corpus linguistics cater for a specialist audience, yet they could also help the general public by making the record of the British Parliament more accessible. The project pursues this through intuitive searches and associated visualisations. Timelines, word clouds, sunburst visualisations and line charts present linguistic information such as frequencies, linguistic tags and relations in a simpler and more understandable way. We therefore expect that non-academic users and the general public will be able to get the most out of the Hansard Corpus without needing any linguistic expertise, and to obtain more than a simple list of full-text search results.

Week 13 2018/2019

Thursday 31st January 2019

Management School LT 11