The emergence of social media has created new opportunities for social scientists to investigate how organisations, individuals and media contribute to the formation of opinions about important and complex issues such as climate change.
This talk will introduce ongoing work in the NTAP project (Networks of Texts and People, 2012-15, Research Council of Norway, www.ntap.no). NTAP is developing methods and tools to analyse the distribution, flow and development of knowledge and opinions across online social networks. We have so far downloaded the complete content of about 3,000 English-language blogs related to the topic of climate change. The text analysis techniques being applied are at the interface of corpus linguistics, text mining and information extraction. These are being coupled with techniques for social network analysis and information visualisation.
After a general overview of the project, the talk will explore two ideas related more specifically to corpus linguistics. The first is that network analysis techniques for community detection can be used to create sub-corpora that exhibit distinctive language use within a topical blog corpus. The second is that local grammar fragments induced from a large set of concordance lines can reveal interesting usage patterns; in our climate change blogs we have c. 20,000 instances of the term "sea levels" and c. 230,000 instances of the term "climate change".
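
As a rough illustration of the second idea, the following Python sketch collects concordance lines for a term and counts the word sequences that recur immediately to its right. It is not the NTAP method: counting recurring sequences is only a crude stand-in for local grammar induction, and the function names, the tokenisation rule and the three sample sentences are all invented for this example.

```python
import re
from collections import Counter

def concordance(texts, term, width=3):
    """Return (left, right) context windows for each occurrence of `term`."""
    term_tokens = term.lower().split()
    n = len(term_tokens)
    lines = []
    for text in texts:
        # Simplistic tokenisation (an assumption): lowercase alphanumeric runs.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == term_tokens:
                left = tuple(tokens[max(0, i - width):i])
                right = tuple(tokens[i + n:i + n + width])
                lines.append((left, right))
    return lines

def frequent_right_patterns(lines, length=2, min_count=2):
    """Count word sequences of `length` tokens directly after the term."""
    counts = Counter(
        right[:length] for _, right in lines if len(right) >= length
    )
    return [(p, c) for p, c in counts.most_common() if c >= min_count]

# Invented sample sentences, not taken from the blog corpus.
posts = [
    "Sea levels are rising faster than expected.",
    "Many fear that sea levels are rising along the coast.",
    "Some claim sea levels will fall.",
]

lines = concordance(posts, "sea levels")
patterns = frequent_right_patterns(lines)
print(patterns)
```

With tens of thousands of concordance lines, the same counting could be run over left contexts and over longer windows, and the surviving sequences inspected by hand as candidate fragments.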