Wmatrix corpus analysis of UK General Election Manifestos

Analysis of earlier UK General Election Manifestos

The 2005 UK General Election Manifestos were analysed in the online tutorials for Wmatrix showing the comparison of the Liberal Democrats and Labour Party Manifestos. Further examples of the application to the 2010 general election manifestos can be seen on my blog. The plain text versions of the 2010 UK election manifestos can be downloaded for use in your favourite text analysis software (with thanks to Martin Wynne for editing two of the files). TEI encoded versions of the 2010 election manifestos are now available (with thanks to Lou Burnard).

Analysis of 2015 UK General Election Manifestos

In the following table, each manifesto is presented along with a set of corpus analyses.

In addition, you can download crosstabs of word, POS and semantic tag frequencies with log-likelihood and dispersion statistics. These are tab delimited so can be opened in your favourite spreadsheet application.

Conservatives Labour Liberal Democrats Green Party Plaid Cymru Scottish National Party UKIP

Party website www.conservatives.com www.labour.org.uk www.libdems.org.uk www.greenparty.org.uk www.plaid.cymru www.snp.org www.ukip.org

Launch date 14 April 2015 13 April 2015 15 April 2015 14 April 2015 31 March 2015 20 April 2015 15 April 2015

Direct link PDF PDF PDF | TXT PDF PDF PDF PDF

Local copy PDF PDF PDF PDF PDF PDF PDF

Edited text version TXT TXT TXT TXT TXT TXT TXT

Number of words (tokens) 29,371 17,662 32,269 39,433 17,616 17,826 26,179

Number of unique words (types) 4,366 3,138 4,783 6,221 3,124 2,786 5,182

Word frequency list TXT TXT TXT TXT TXT TXT TXT

Key word cloud Figure 1 Figure 3 Figure 5 Figure 7 Figure 9 Figure 11 Figure 13

Key semantic tag cloud Figure 2 Figure 4 Figure 6 Figure 8 Figure 10 Figure 12 Figure 14

As in previous elections, I have made local copies of manifestos available since they often disappear offline after the election. To prepare the text versions, I used Adobe Reader to save as text and then manually edited (using TextWrangler) a small number of items such as headers, footers and page numbers to store them in pseudo-XML-style tags. In addition, I have followed the input format guidelines for CLAWS and converted n-dashes, pound signs, begin and end quotes to XML entities. The Liberal Democrats manifesto is downloadable in text form directly from their website, but I still needed to carry out the character edits on their file. Note that this text version does not contain the index from the end of the PDF version.

A note on word counts: Wmatrix counts semantically meaningful multiword expressions (MWEs) as one item, so other corpus software may well provide different counts here. These figures also depend on tokenisation in our NLP pipeline (by CLAWS).

Word frequency lists are tab delimited so you can load them in to your favourite spreadsheet program. MWEs are shown in the word frequency lists as words connected by underscore characters.

Key word and semantic tag clouds are produced by comparing the data with the BNC Written Sampler corpus. The larger the font, the higher the log-likelihood score, so larger items are more significantly overused compared to the reference corpus. See the Wmatrix main page and online tutorials if you want more details about how this works.

Figure 1: Conservatives Key Word Cloud

Figure 2: Conservatives Key Semantic Tag Cloud

Figure 3: Labour Key Word Cloud

Figure 4: Labour Key Semantic Tag Cloud

Figure 5: Liberal Democrats Key Word Cloud

Figure 6: Liberal Democrats Key Semantic Tag Cloud

Figure 7: Green Key Word Cloud

Figure 8: Green Key Semantic Tag Cloud

Figure 9: Plaid Cymru Key Word Cloud

Figure 10: Plaid Cymru Key Semantic Tag Cloud

Figure 11: SNP Key Word Cloud

Figure 12: SNP Key Semantic Tag Cloud

Figure 13: UKIP Key Word Cloud

Figure 14: UKIP Key Semantic Tag Cloud

	Conservatives	Labour	Liberal Democrats	Green Party	Plaid Cymru	Scottish National Party	UKIP
Party website	www.conservatives.com	www.labour.org.uk	www.libdems.org.uk	www.greenparty.org.uk	www.plaid.cymru	www.snp.org	www.ukip.org
Launch date	14 April 2015	13 April 2015	15 April 2015	14 April 2015	31 March 2015	20 April 2015	15 April 2015
Direct link	PDF	PDF	PDF \| TXT	PDF	PDF	PDF	PDF
Local copy	PDF	PDF	PDF	PDF	PDF	PDF	PDF
Edited text version	TXT	TXT	TXT	TXT	TXT	TXT	TXT
Number of words (tokens)	29,371	17,662	32,269	39,433	17,616	17,826	26,179
Number of unique words (types)	4,366	3,138	4,783	6,221	3,124	2,786	5,182
Word frequency list	TXT	TXT	TXT	TXT	TXT	TXT	TXT
Key word cloud	Figure 1	Figure 3	Figure 5	Figure 7	Figure 9	Figure 11	Figure 13
Key semantic tag cloud	Figure 2	Figure 4	Figure 6	Figure 8	Figure 10	Figure 12	Figure 14