Lancaster University

Lancaster Summer Schools
in Corpus Linguistics and other Digital methods (#LancsSS17)

Lancaster University, UK – 27th to 30th June 2017

UCREL Summer School in corpus-based NLP

About the UCREL NLP Summer School

The UCREL Summer School in corpus-based natural language processing (NLP) was a new stream added last year to the highly successful series that began in 2011. Sponsored by UCREL at Lancaster University – one of the world's leading and longest-established centres for corpus-based research – its aim is to support students of computer science and computational linguistics in the development of advanced skills in corpus-based NLP methods. An additional aim is to foster interdisciplinary research and networking via joint sessions with other summer school streams.

Who is the UCREL NLP Summer School for?

The UCREL NLP Summer School is intended primarily for postgraduate research students in computer science and informatics (and secondarily for Master's-level students, postdoctoral researchers, and others) who require in-depth knowledge of corpus-based NLP methodologies for their degree projects. Please note that this summer school assumes existing programming knowledge and skills. It is not aimed at complete beginners in coding.

If you are a social scientist with an interest in the analysis of social issues via text and discourse, then the Corpus Linguistics for Social Science Summer School will be more relevant. If you are a linguist who already has some experience with corpus linguistics, the Corpus Linguistics for Language Studies Summer School is a better event for you.

What topics does the UCREL NLP Summer School cover?

The programme consists of a series of linked, intensive two-hour sessions, some involving practical work and others more discussion-oriented. The instructors include external guest speakers as well as speakers from Lancaster University. The speakers in the 2017 syllabus and their (provisional) session titles include:

  • Stephen Wattam – Web scraping theory and methods
  • Paul Rayson – Web as corpus creation and cleaning
  • Alistair Baron – Authorship analysis for online text
  • Scott Piao – Semantic tagging, multilinguality, development and applications
  • Andrew Moore – Sentiment analysis
  • Mahmoud El-Haj – Text Classification using Machine Learning
  • Daniel Kershaw – Large-scale data mining
  • Eddie Bell – Vector-based methods
  • John Mariani and Paul Rayson – Poster/demo session for attendees to present their own work

There are additional daily lectures shared with the other Summer School events, each illustrating cutting-edge research using corpus data.

The full timetable will be made available on this page when completed.

In addition, participants in this Summer School will have the opportunity to meet and consult with members of the CASS Challenge Panel, a group of prominent specialists in corpus methodology.

