Second Workshop on Arabic Corpus Linguistics (WACL-2)
Workshop in conjunction with the Corpus Linguistics 2013 conference
Monday 22nd July 2013 – Lancaster University, UK
Keynote speaker: Claire Brierley, University of Leeds
“Natural Language Processing working together with Arabic and Islamic Studies”
Eric Atwell, University of Leeds
Andrew Hardie, Lancaster University
Call for papers
The call for papers is now closed. See the timetable below.
We invite proposals for the full-day Second Workshop on Arabic Corpus Linguistics, to be held in conjunction with the
Corpus Linguistics 2013 conference. Following on from the successful first WACL
in 2011, as well as the related LRE-REL event in 2012,
WACL-2 will again take place at Lancaster University.
The aim of this series of workshops is to create a venue for exploring progress in the field of research into the Arabic
language using corpora, from across the many areas of corpus linguistics and computational linguistics where the analysis of
Arabic structure and usage is an active issue.
The scope of the workshop encompasses both (a) the design, construction and annotation of Arabic corpora, and (b) the use
of corpora in research on the Arabic language – in any relevant area, including (but not limited to!) lexis and lexicography,
syntax, collocation, NLP systems and analysis tools, contrastive and historical studies, stylistics, and discourse analysis.
All varieties of Arabic – including the different Colloquial Arabics as well as Classical/Qur’anic and Modern Standard forms
of the language – are within the workshop's purview.
Abstracts for presentations are invited on any of these areas, or on any other topic related to the study of Arabic-language
corpora. Presentations either describing finished research or reporting work in progress are welcome. Submissions from postgraduate
students are especially welcome.
Abstracts should be up to 600 words; presentations will be in the usual format (20 minutes for the presentation and 10 minutes for questions).
Please submit abstracts by email to Andrew Hardie (firstname.lastname@example.org).
Please use the same abstract format prescribed by the main conference – a template can be found at
http://corpora.lancs.ac.uk/submission/template. Acceptable file
formats are Microsoft Word .doc(x), RTF, or OpenDocument text (.odt). Please use Unicode characters for any Arabic text examples.
All abstracts should be in English rather than Arabic; English will be the language of the workshop.
Please note that we will not accept for WACL-2 any abstract which has been accepted in the main CL2013 conference in
verbatim form. We are happy to consider submissions arising from a research project which is also being presented at the main
conference, but the content must not be identical or overlap substantially. For example, it might be appropriate to submit to
WACL-2 a presentation focusing on matters of interest to Arabic specialists, while submitting an abstract of broader methodological
or theoretical interest to the main conference.
- Closing date for abstracts: Monday February 25th 2013, extended to Monday March 4th 2013
- Responses to abstract submission: before Monday March 1st 2013
Participants should register for the workshop day via the CL2013 website (this can be done in addition to, or independently of,
registration for the main conference). See this page for details:
Registration opens on February 15th 2013.
DRAFT Timetable (subject to revision!)
||The effects of speakers' gender, age, and region on overall performance of Arabic automatic speech recognition systems using the phonetically rich and balanced Modern Standard Arabic speech corpus.
Mohammad A. M. ABUSHARIAH, Majdi SAWALHA, The University of Jordan
||Arabic Learner Corpus v1: a new resource for Arabic language research.
Abdullah ALFAIFI, Eric ATWELL, University of Leeds, UK
||Discourse markers at the local level in NP opinion articles.
Fatima ALKOHLANI, University of Business and Technology
||The design and construction of the 50 million words KSUCCA King Saud University Corpus of Classical Arabic.
Maha ALRABIAH, AbdulMalik AL-SALMAN, King Saud University, Saudi Arabia
Eric ATWELL, University of Leeds, UK
||Tea and coffee break
||Converging linguistic evidence on two flavors of production: The synonymy of Arabic COME verbs.
Antti ARPPE, Dana ABDULRAHIM, University of Alberta, Canada
||arTenTen: a new, vast corpus for Arabic.
Yonatan BELINKOV, MIT, USA
Nizar HABASH, Columbia University, USA
Adam KILGARRIFF, Lexical Computing Ltd, UK
Noam ORDAN, University of Haifa, Israel
Ryan ROTH, Columbia University, USA
Vitek SUCHOMEL, Masaryk University, CZ, and Lexical Computing Ltd, UK
||KEYNOTE PRESENTATION: Natural Language Processing working together with Arabic and Islamic Studies.
Claire BRIERLEY, School of Computing, University of Leeds, UK
||KALIMAT a multipurpose Arabic corpus.
Mahmoud EL-HAJ, Lancaster University, UK
Rim KOULALI, Mohammed 1 University, Morocco
||AraSAS: A semantic tagger for Arabic.
Ghada MOHAMED, University of Bahrain
Amanda POTTS, Andrew HARDIE, Lancaster University, UK
||When collocational and expressive choices are imbued with political stances and ideological opinions: A corpus-based critical discourse analysis of Islamic identity and Egyptian identity in the news media of pre-revolutionary Egypt.
Safwat A. S. MOHAMMED, Lancaster and Cairo Universities, UK and Egypt
||Tea and coffee break
||Using Subordinate Clauses as Subjects of Verbal Sentences in Modern Standard Arabic.
Ashraf ABDOU, American University in Cairo, Egypt
||Unifying linguistic annotations and ontologies for the Arabic Quran.
Noorhan ABBAS, Luluh ALDHUBAYI, Hend AL-KHALIFA, Zainab ALQASSEM, Eric ATWELL, Kais DUKES, Majdi SAWALHA, Abdul-Baquee Muhammad SHARAF, University of Leeds, UK, and King Saud University, Saudi Arabia
||Corpus-based lexicography in a language with a long lexicographical tradition: The case of Arabic.
Tressy ARTS, Karen MCNEIL, Oxford University Press, UK
||The role of large-scale Arabic corpora in the tasks of sort-out and throw-out of sensory subdivisions of the entry in the general-purpose monolingual Arabic reference works.
Sultan Nasser A. ALMUJAIWEL, King Saud University, Saudi Arabia
||A hybrid approach for prepositional phrase attachment in MSA and EA.
Rania AL-SABBAGH, Abbas BENMAMOUN, University of Illinois at Urbana-Champaign, USA
||Developing tools for Arabic corpus for researchers.
Bassam HAMMO, Faisal AL-SHARGI, Sane YAGI, Nadim OBEID, University of Jordan
||Multi-level analysis and annotation of Arabic corpora for text-to-sign-language Machine Translation.
Abdelaziz LAKHFIF, Ferhat Abbas University, Algeria
Mohammed T. LASKRI, Badji Mokhtar University, Algeria
Eric ATWELL, University of Leeds, UK
||Representation of Muslim Brotherhood in Egyptian newspapers.
Sara YOUSSEF, American University in Cairo, Egypt
||Close of WACL-2 Workshop on Arabic Corpus Linguistics.
POSTER PRESENTATIONS (on show during 2 hours of breaks)
A: Annotating the Arabic Quran with a classical semantic ontology.
Nora ABBAS, Eric ATWELL, University of Leeds, UK
B: Generating an Arabic Sentiment Corpus from social media.
Samah ALHAZMI, John MCNAUGHT, University of Manchester, UK
C: Towards an automatic development of Named Entities Corpus from Arabic Wikipedia.
Fahd ALOTAIBI, Mark LEE, University of Birmingham, UK
D: Linguistics features to confirm the chronological order of the Quran.
Sameer ALREHAILI, Eric ATWELL, University of Leeds, UK
E: Quran ontologies and keywords for Question Answering.
Aisha JILANI, Lee MCCLUSKEY, Di CAI, University of Huddersfield, UK
F: Unsupervised morphology learning using the Quranic Arabic Corpus.
Bilal KHALIQ, John CARROLL, University of Sussex, UK
G: Corpus based unsupervised learning of Arabic morphology.
Abdellah LAKHDARI, Hadda CHERROUN, Amar Telidji University, Algeria
H: Enriching Algerian Arabic dialects corpora.
K. MEFTOUH, Badji Mokhtar University, Algeria
S. HARRAT, Ecole Normale Superieure de Bouzareah, Algeria
K. SMAILI, LORIA, France
M. ABBAS, CRSTDLA, Algeria
I: Quranic verse similarity based on Word Sense Disambiguation.
Farah MEHBOOB, Institute of Business Administration, Pakistan
J: Development and implementation of a computational algorithm to predict the classical Arabic conjugate pattern focusing on weak verbs.
Haq NAWAZ, Mufti Ahmad ALI, Jamia Ashrafia Lahore, Pakistan
K: Arabic social media analysis for the construction and the enrichment of NLP tools.
Fatiha SADAT, University of Quebec in Montreal, Canada
L: Accelerating the processing of large corpora: using Grid Computing for lemmatizing the 176 million words Arabic Internet Corpus.
Majdi SAWALHA, University of Jordan
Eric ATWELL, University of Leeds
Mohammad A. M. ABUSHARIAH, University of Jordan
M: Early results for named entity recognition in a Hadith corpus.
Muazzam Ahmed SIDDIQUI, Mostafa El-Sayed SALEH, Abobakr BAGAIS, King Abdulaziz University, Saudi Arabia