Table of contents

Preface (p. viii)
Mariko Abe: A Corpus-based Contrastive Analysis of Spoken and Written Learner Corpora: The Case of Japanese-speaking Learners of English (p. 1)
Aduriz I., Aranzabe M.J., Arriola J.M., Atutxa A., Díaz de Ilarraza A., Ezeiza N., Gojenola K., Oronoz M., Soroa A., and Urizar R.: Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing (p. 10)
Khurshid Ahmad, Pensiri Manomaisupat, David Cheng, Tugba Taskaya, Saif Ahmad, Lee Gillam, Andrew Hippisley: The mood of the (financial) markets: In a corpus of words and of pictures (p. 12)
Sandra M. Aluísio, Gisele M. Pinheiro, Marcelo Finger, Maria das Graças V. Nunes, Stella E. O. Tagnin: The Lacio-Web Project: overview and issues in Brazilian Portuguese corpora creation (p. 14)
Dawn Archer, Tony McEnery, Paul Rayson, Andrew Hardie: Developing an automated semantic analysis system for Early Modern English (p. 22)
Dawn Archer, Andrew Hardie, Tony McEnery, Scott Piao: A corpus of seventeenth-century English news reportage: construction, encoding and applications (p. 32)
Bertol Arrieta, Arantza Díaz de Ilarraza, Koldo Gojenola, Montse Maritxalar, Maite Oronoz: A database system for storing second language learner corpora (p. 33)
Jørg Asmussen: Towards a methodology for corpus-based studies of linguistic change: Contrastive observations and their possible diachronic interpretations in the Korpus 2000 and Korpus 90 General Corpora of Danish (p. 42)
Eric Atwell: A New Machine Learning Algorithm for Neoposy: coining new Parts of Speech (p. 43)
Eric Atwell, Paul Gent, Julia Medori, Clive Souter: Detecting student copying in a corpus of science laboratory reports: simple and smart approaches (p. 48)
Francis Henrik Aubert, Stella E. O. Tagnin: A Corpus of Sworn Translations – for linguistic and historical research (p. 54)
Bogdan Babych, Anthony Hartley, Eric Atwell: Statistical modelling of MT output corpora for Information Extraction (p. 62)
Paul Baker, Andrew Hardie, Tony McEnery, and Sri B.D. Jayaram: Constructing Corpora of South Asian Languages (p. 71)
Federica Barbieri: The "new" quotatives in American English: A cross-register comparison (p. 81)
Marco Baroni and Silvia Bernardini: A preliminary analysis of collocational differences in monolingual comparable corpora (p. 82)
Sabine Bartsch: Investigating cross-linguistic constraints on the premodification of adjectival past participles and desubstantival adjectives. A corpus-based study of English and German (p. 92)
Kate Beeching: Synchronic and diachronic variation: the how and why of sociolinguistic corpora (p. 102)
Luisa Bentivogli, Christian Girardi, Emanuele Pianta: The MEANING Italian Corpus (p. 103)
Julie Carson-Berndsen, Ulrike Gut and Robert Kelly: Discovering regularities in non-native speech (p. 113)
P. Beust, S. Ferrari, V. Perlerin: NLP model and tools for detecting and interpreting metaphors in domain-specific corpora (p. 114)
Philippe Blache, Marie-Laure Guénot and Tristan van Rullen: A corpus-based technique for grammar development (p. 124)
Birte Bös: Towards an integrated model of service encounters (p. 132)
Roderick Bovingdon and Angelo Dalli: Statistical analysis of the source origin of Maltese (p. 140)
Lou Burnard, Tony Dodd: Xara: an XML aware tool for corpus searching (p. 142)
Marianna N. Christou: Expressions and structures of the delexical verb KANΩ [“MAKE” / “DO”] in Modern Greek language: A corpus-based approach to newspaper articles (p. 145)
Ken Cosh and Pete Sawyer: Using natural language processing tools to assist semiotic analysis of information systems (p. 155)
H. Cunningham, V. Tablan, K. Bontcheva, M. Dimitrov: Language engineering tools for collaborative corpus annotation (p. 165)
Mark Davies: Annotation without lexicons: an alternative to the standard bootstrapping approach (p. 174)
Joost van de Weijer: Consonant variation within words (p. 184)
Debbie Elliott, Anthony Hartley and Eric Atwell: Rationale for a multilingual corpus for machine translation evaluation (p. 191)
John Elliott and Debbie Elliott: The Human Language Chorus Corpus (HULCC) (p. 201)
Jens Fauth, Hans-Jörg Schmid: Detecting gender-preferential patterns of linguistic features in face-to-face communication (p. 211)
Valéria D. Feltrim, Sandra M. Aluísio, Maria das Graças V. Nunes: Analysis of the rhetorical structure of computer science abstracts in Portuguese (p. 212)
Katerina T. Frantzi: Updating LSP dictionaries with collocational information (p. 219)
Robert Gaizauskas, Lou Burnard, Paul Clough and Scott Piao: Using the XARA XML-Aware Corpus Query Tool to Investigate the METER Corpus (p. 227)
Ana Llinares García: Repetition and young learners' initiations in the L2: a corpus-driven analysis (p. 237)
Sandrine Garnier, Youhanizou Tall, Sisay Fissaha, Johann Haller: Learner Corpora: Design, Development and Applications - Development of NLP tools for CALL based on learner corpora (German as a foreign language) (p. 246)
Sara Gesuato: The company women and men keep: what collocations can reveal about culture (p. 253)
Vojko Gorjanc: Tracking lexical changes in the reference corpus of Slovene texts (p. 263)
Stefan Grondelaers, Dirk Speelman, Dirk Geeraerts: A corpus-based approach to informality: the case of Internet chat (p. 264)
Leif Grönqvist and Magnus Gunnarsson: A method for finding word clusters in spoken language (p. 265)
Xiaotian Guo: Between Verbs and Nouns and Between the Base Form and the Other Forms of Verbs – A Contrastive Study into COLEC and LOCNESS (p. 274)
Le An Ha: A method for word segmentation in Vietnamese (p. 282)
Silvia Hansen-Schirra: Linguistic enrichment and exploitation of the Translational English Corpus (p. 288)
Andrew Hardie: Developing a tagset for automated part-of-speech tagging in Urdu (p. 298)
Nigel Harwood: Personal pronouns and academic writing: a multidisciplinary corpus-based critical pragmatic approach to EAP (p. 308)
Laura Hasler, Constantin Orasan and Ruslan Mitkov: Building better corpora for summarisation (p. 309)
Chris Heffer: Not KWIC but Quick: KeyWords in Court (p. 319)
Kris Heylen and Dirk Speelman: A corpus-based analysis of word order variation: The order of verb arguments in the German middle field (p. 320)
Knut Hofland: A web-based concordance system for spoken language corpora (p. 330)
Shelley Ching-yu Hsieh: The Corpus of Mandarin Chinese and German Animal Expressions (p. 332)
Susan Hunston: Frame, phrase or function: a comparison of frame semantics and local grammars (p. 342)
Emi Izumi, Toyomi Saiga, Thepchai Supnithi, Kiyotaka Uchimoto, Hitoshi Isahara: The development of the spoken corpus of Japanese learner English and the applications in collaboration with NLP techniques (p. 359)
Inés Jacob, Joseba Abaitua, Josu Gómez: Automatic feeding of translation memory tools (p. 367)
Steven Jones, M. Lynne Murphy: Antonymy in Childhood: a corpus-based approach to acquisition (p. 372)
Randall L. Jones: An Analysis of Lexical Text Coverage in Contemporary German (p. 373)
Stig W. Jørgensen, Carsten Hansen, Jette Drost, Dorte Haltrup, Anna Braasch, Sussi Olsen: Domain specific corpus building and lemma selection in a computational lexicon (p. 374)
Tomoko Kaneko: How non-native speakers express anger, surprise, anxiety and grief: a corpus-based comparative study (p. 384)
Sachie Karasawa: Patterns of elaboration and interlanguage development: an exploratory corpus analysis of college student essays (p. 394)
Hannah Kermes, Stefan Evert: Text analysis meets corpus linguistics (p. 402)
Adam Kilgarriff: Linguistic Search Engine (p. 412)
Paul Kingsbury: A methodology for inducing a chronology of the Pāli Canon (p. 413)
Gerry Knowles, Zuraidah Mohd Don: Tagging a corpus of Malay texts, and coping with 'syntactic drift' (p. 422)
Natalie Kübler and Cécile Frérot: Verbs in specialised corpora: from manual corpus-based description to automatic extraction in an English-French parallel corpus (p. 429)
Toshihiko Kubota: A Study on Abridgement for Spoken Word Titles (p. 439)
David YW Lee: Spoken Academic Lexicogrammar and Discourse Patterns (p. 440)
Geoffrey Leech, Martin Weisser: Generic speech act annotation for task-oriented dialogues (p. 441)
Agnieszka Lenko-Szymanska: The curse and the blessing of mobile phones - a corpus-based study into Polish and American rhetoric strategies (p. 447)
Robert Liebscher and David Groppe: Rethinking context availability for concrete and abstract words: a corpus study (p. 449)
Laura Löfberg, Dawn Archer, Scott Piao, Paul Rayson, Tony McEnery, Krista Varantola, Jukka-Pekka Juntunen: Porting an English semantic tagger to the Finnish language (p. 457)
Nadine Lucas, Bruno Crémilleux, Leny Turmel: Signalling well-written academic articles in an English corpus by text mining techniques (p. 465)
Anke Lüdeling and Stefan Evert: Linguistic experience and productivity: corpus evidence for fine-grained distinctions (p. 475)
Michaela Mahlberg: High frequency nouns in English: aspects of a grammatical description (p. 484)
Belinda Maia: Constructing comparable and parallel corpora for terminology extraction - work in progress (p. 485)
Manolis Maragoudakis, Katia Kermanidis and Nikos Fakotakis: Towards a Bayesian Stochastic Part-Of-Speech and Case Tagger of Natural Language Corpora (p. 486)
Kevin Mark: Learner corpus building and a ‘living’ university foreign language curriculum (p. 496)
Tony McEnery, Zhonghua Xiao: Fuck revisited (p. 504)
Dan McIntyre, Carol Bellard-Thomson, John Heywood, Tony McEnery, Elena Semino and Mick Short: The Construction of a Corpus to Investigate the Presentation of Speech, Thought and Writing in Written and Spoken British English (p. 513)
John McKenny: Seeing the wood and the trees: Reconciling findings from discourse and lexical analysis (p. 523)
Magnus Merkel, Michael Petterstedt and Lars Ahrenberg: Interactive Word Alignment for Corpus Linguistics (p. 533)
José María Guirao Miras, Ana González Ledesma, Guillermo de la Madrid Heitzmann, Manuel Alcántara Plá, Antonio Moreno Sandoval: Relating lexical items to sociolinguistic features in a spontaneous speech corpus of Spanish (p. 543)
Juan M. Montero and M. Mar Duque: ANESTTE: a writer’s assistant for a specific purpose language (p. 544)
Olga Moudraia: The Student Engineering Corpus: Analysing Word Frequency (p. 552)
JoAnne Neff, Francisco Ballesteros, Emma Dafouz, Francisco Martínez, Juan-Pedro Rica: Formulating Writer Stance: A Contrastive Study of EFL Learner Corpora (p. 562)
Diane Nicholls: The Cambridge Learner Corpus - error coding and analysis for lexicography and ELT (p. 572)
Judy Noguchi, Thomas Orr, Yukio Tono: Using a dedicated corpus to identify features of professional English usage: What do “we” do in science journal articles? (p. 582)
Attila Novák, Viktor Nagy, Csaba Oravecz: Corpus assisted development of a Hungarian morphological analyser and guesser (p. 583)
Toshifumi Oba and Eric Atwell: Using the HTK speech recogniser to analyse prosody in a corpus of German spoken learners’ English (p. 591)
Marija Omazic: The metacommunicative setting of phraseological units and their modifications – evidence from the British National Corpus (p. 599)
Nelleke Oostdijk: Corpus linguistics meets language technology: deep syntactic parsing for question answering (p. 603)
Maeve Paris: Extending computer-assisted text analysis techniques to the detection of source code plagiarism and collusion: assisting manual inspection (p. 611)
Núria Gala Pavia, Salah Aït-Mokhtar: Lexicalising a robust parser grammar using the WWW (p. 620)
Julien Perrez and Liesbeth Degand: On the combination of corpus-based and experimental methodologies in the study of causal, contrastive and metadiscourse connectives in L1 and L2 text comprehension and production (p. 627)
Scott S.L. Piao and Tony McEnery: A Tool for Text Comparison (p. 637)
James Pustejovsky, Patrick Hanks, Roser Saurí, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro and Marcia Lazo: The TIMEBANK Corpus (p. 647)
Andrew Roberts and Eric Atwell: The use of corpora for automatic evaluation of grammar inference systems (p. 657)
Juhani Rudanko: More on horror aequi: evidence from large corpora (p. 662)
Sarah Rule, Emma Marsden, Florence Myles, Rosamond Mitchell: Constructing a database of French interlanguage oral corpora (p. 669)
Geoffrey Sampson: Are we nearly there yet, Mum? (p. 678)
Hans-Jörg Schmid, Jens Fauth: Women's and men's style: fact or fiction? New grammatical evidence (p. 679)
Serge Sharoff: Methods and tools for development of the Russian Reference Corpus (p. 680)
Bayan Abu Shawar and Eric Atwell: Using dialogue corpora to train a chatbot (p. 681)
Gerardo Sierra, Alfonso Medina, Rodrigo Alarcón, César A. Aguilar: Towards the Extraction of Conceptual Information from Corpora (p. 691)
Kiril Simov, Alexander Simov, Milen Kouylekov: Constraints for corpora development and validation (p. 698)
Milena Slavcheva: Corpus shallow parsing: meeting point between paradigmatic knowledge encoding (p. 706)
Nicholas Smith: A quirky progressive? A corpus-based exploration of the will + be + -ing construction in recent and present day British English (p. 714)
Harold Somers: Some Issues in the Mark-up of Handwriting in a Learner Corpus (p. 724)
Dirk Speelman, Stefan Grondelaers, Dirk Geeraerts: A profile-based calculation of region and register variation: the synchronic and diachronic status of the national variants of Dutch (p. 733)
Somayajulu G. Sripada, Ehud Reiter, Jim Hunter and Jin Yu: Exploiting a parallel TEXT - DATA corpus (p. 734)
Asa M. Stepak: A proposed mathematical theory explaining the sequence of grammatical categories (p. 744)
Petra Storjohann: The lexicographic use of corpora and computational tools for disambiguation (p. 754)
Jozsef Szakos: Cultures and Corpora: Extracting Anthropological Information from Corpora of Formosan Endangered Languages (p. 763)
Jun Arata Takahashi: Do we talk (or write?) differently over the Net? - A lexical enquiry into ‘a’ Net-EN - (p. 764)
Kaoru Takahashi: A Study of Text Types and Register Variation in the British National Corpus (p. 773)
Yuri Tambovtsev: The Structure of the Consonant Patterns in the Spanish Speech Sound Chain as a Clue of Typological Closeness (p. 774)
Yuri Tambovtsev: Phonological similarity between Basque and other world languages based on the frequency of occurrence of certain typological consonantal features (p. 775)
Tess Yu-Shan Ke, Liang-Feng Chen, Chien-Chung Chen: Investigation on the uses of temporal subordinators by NS and NNS in academic spoken English (p. 780)
Carole Tiberius, Dunstan Brown, Greville Corbett: Ambiguity in Russian Morphology (p. 790)
Juhani Toivanen, Tapio Seppänen, Eero Väyrynen: Creation and utilisation of the MediaTeam Emotional Speech Corpus (p. 791)
Yukio Tono: Learner corpora: design, development and applications (p. 800)
Montserrat Civit Torruella, Mª Antònia Martí Antonín, Lluís Padró Cirera: Using hybrid probabilistic-linguistic knowledge to improve pos-tagging performance (p. 810)
Patrick Tschorn, Anke Lüdeling: Morphological knowledge and alignment of English-German parallel corpora (p. 818)
Francesca Vaghi, Marco Venuti: The Economist and The Financial Times. A study of movement metaphors (p. 828)
Bertus van Rooy and Lande Schäfer: An evaluation of three POS taggers for the tagging of the Tswana Learner English Corpus (p. 835)
Tamás Váradi: Shallow parsing of Hungarian business news (p. 845)
Isabel Verdaguer and Anna Poch: Collocational and colligational patterns in lexical sets: A corpus-based study (p. 852)
Maria Verde: Shedding light on SHED, CAST and THROW as nodes of extended lexical units (p. 859)
Shih-Ping Wang: Mutual information and corpus-based approaches to reduplicative fixed expressions (p. 869)
Julie Weeds and David Weir: Finding and evaluating sets of nearest neighbours (p. 879)
David Wible, Ping-Yu Huang: Using learner corpora to examine L2 acquisition of tense-aspect markings (p. 889)
Sandra Williams and Ehud Reiter: A corpus analysis of discourse relations for Natural Language Generation (p. 899)
Andrew Wilson, Celia Worth: Building and annotating corpora of spoken Welsh and Gaelic (p. 909)
Andrew Wilson, Celia Worth: Conceptual Glossaries of the Latin Vulgate Bible (p. 918)
Andrew Wilson, Olga Moudraia: Quantitative or Qualitative Content Analysis? Experiences from a cross-cultural comparison of female students' attitudes to shoe fashions in Germany, Poland and Russia (p. 919)
Martin Wynne, Rowan Wilson, Ylva Berglund: Virtual Corpora at the Oxford Text Archive (p. 920)
Yang Xiaojun: Survey and Prospect of China’s Corpus-Based Researches (p. 930)
Debra Ziegeler, Sarah Lee: Analysing a Corpus-based Semantic Investigation of English Dialects (p. 931)
Heike Zinsmeister, Ulrich Heid: Identifying predicatively used adverbs by means of a statistical grammar model (p. 932)