Do men really swear more? Rethinking the statistical procedures in corpus-based sociolinguistic studies.

Vaclav Brezina

CASS, Lancaster University

In this talk, I will discuss different statistical procedures available for the analysis of sociolinguistic data in large language corpora. I will demonstrate that the traditional approach of using aggregated data with the log-likelihood statistic is in principle unreliable. I will show that when this method is used, random (and therefore sociolinguistically irrelevant) speaker groupings can often yield statistically significant results. Finally, I will offer suggestions for alternative methodologies (and statistical procedures) that take into account within-group differences and therefore produce more meaningful results.
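
The point about random speaker groupings can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not the analysis presented in the talk: the speaker counts are synthetic (drawn from a log-normal distribution chosen purely for illustration), every speaker is assumed to contribute the same number of tokens, and the function names are hypothetical. The log-likelihood statistic is computed in the two-term form commonly used for comparing aggregated corpus frequencies, and 3.84 is the chi-square critical value for p < 0.05 with one degree of freedom.

```python
import math
import random

CRITICAL_VALUE = 3.84  # chi-square critical value, p < 0.05, 1 degree of freedom


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) for one word's frequency in two aggregated subcorpora."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def random_split_significance_rate(counts, tokens_per_speaker, n_splits, rng):
    """Share of random half/half speaker splits that reach 'significance'."""
    half = len(counts) // 2
    significant = 0
    for _ in range(n_splits):
        shuffled = counts[:]
        rng.shuffle(shuffled)
        group_a, group_b = shuffled[:half], shuffled[half:]
        # Aggregate each group into a single frequency, ignoring who said what.
        g2 = log_likelihood(
            sum(group_a), len(group_a) * tokens_per_speaker,
            sum(group_b), len(group_b) * tokens_per_speaker,
        )
        if g2 > CRITICAL_VALUE:
            significant += 1
    return significant / n_splits


rng = random.Random(42)
# Synthetic data (an assumption): 100 speakers whose use of one word varies widely.
speaker_counts = [int(rng.lognormvariate(3.0, 1.0)) for _ in range(100)]
rate = random_split_significance_rate(speaker_counts, 10_000, 1_000, rng)
print(f"Random splits flagged as significant: {rate:.1%}")
```

If the test behaved as intended for speaker groups, roughly 5% of random splits would exceed the critical value. When individual speakers differ a lot, the aggregated test treats each token as an independent observation and flags far more splits than that.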

Week 2 2013/2014

Thursday 17th October 2013

FASS Meeting Room 1