last update: February 28th, 2013
Advanced learner varieties
In recent years, second language acquisition (SLA) research has shown increasing interest in advanced stages of acquisition and in questions of near-native competence. However, there are still relatively few studies of advanced learners compared with studies of learners at early and intermediate stages of the learning process. Advanced learners have typically mastered the L2 rules of morphosyntax, and their written production is largely free of serious grammatical errors. Nevertheless, their writing often sounds unidiomatic and differs subtly from texts produced by native speakers (NSs). The exact reasons for this non-nativeness or foreign-soundingness of learner writing are difficult to pin down, and it is therefore frequently described with vague cover terms such as "unidiomaticity" or "style".
In the last 15 years or so, corpus-based research into learner language has contributed to a much clearer picture of advanced interlanguage. These studies have yielded substantial empirical evidence that, for example, texts produced by advanced learners and native speakers differ in terms of frequencies of certain words, phrases and syntactic structures. This research also shows that learners use features that are more typical of speech than of academic prose, which suggests that they are largely unaware of register differences.
Moreover, there is evidence that learners from various language backgrounds have similar problems and face similar challenges on their way to near-native proficiency. For example, advanced learners still struggle with the acquisition of linguistic phenomena that are optional and/or highly L2-specific, often located at the interfaces of linguistic subfields (e.g. syntax-semantics, syntax-pragmatics). Also, when learners write academic prose, many of the observed difficulties appear to stem from a lack of understanding of the rules surrounding academic writing, or from a lack of practice, rather than from interference from first-language academic conventions. Due to these similarities, the project refers to the interlanguage of these learners as advanced learner varieties (ALVs).
Despite the growing interest in the concept of advancedness, the field is still struggling with
- a definition and clarification of the concepts "advanced learner / advancedness" and "nativelikeness" or "near-native competence"
- an in-depth description of the ALV, especially when it comes to learners' acquisition of optional and highly L2-specific phenomena in all linguistic subsystems
- the operationalization of such a description in terms of criteria for the assessment of advancedness
Aims of the project
The project has four major aims:
- create an electronic corpus for a detailed, empirical, quantitative and qualitative description of ALVs, the Corpus of Academic Learner English (CALE)
- produce detailed case studies that examine individual determinants of lexico-grammatical variation (or the interplay of several), e.g. weight/complexity, information status, animacy; genre, writing proficiency
- develop a corpus-driven, text-centred method based on linguistic criteria for the assessment of writing proficiency in the academic register
- apply findings to teaching academic writing at advanced levels (e.g. English for Academic Purposes)
More specifically, the following sub-projects are currently on the research agenda:
- syntactic weight and information status and their impact on the positioning of object NPs with particle verbs and transitive verbs + PP
- the (non-)representation of agentivity in L2 academic writing
- emphatic do in advanced learner English
- verb-noun collocations and semantic prosody/preference in advanced learner English
CALE, the Corpus of Academic Learner English
The Corpus of Academic Learner English (CALE) is a specialised corpus of academic learner writing for a detailed, empirical, quantitative and qualitative description of ALVs. Possible native-speaker control corpora for CALE are the Michigan Corpus of Upper-Level Student Papers (MICUSP) and the British Academic Written English corpus (BAWE). In its first and current stage, the corpus includes data from German university students of English. In the long run, CALE is planned to include academic written English from advanced learners of other first-language backgrounds, e.g. Polish, Italian, Spanish, Portuguese, and Chinese, to enable cross-linguistic and typological comparison. We are currently negotiating cooperation agreements with several partners. If you are interested in joining the project, please send an e-mail to the principal investigator.
When CALE is completed, it will comprise seven academic text types that are typically produced in university content courses, e.g. research papers, reading reports, abstracts and reviews.
News & events
CALE in the press
- April 2012: ALV-project and CALE relocate to the University of Bremen
- 08 June 2011: poster presentation "Korpuslinguistische Methoden in der Erforschung sprachlicher Variation - Korpusbasiertes Arbeiten im Rahmen des Forschungsprojekts 'Determinanten sprachlicher Variation'", Tag der Forschung 2011, JGU Mainz
- 28 February 2011: Follow-up workshop, research project "Determinants of linguistic variation" in Mainz
- 10 December 2010: Workshop on linguistic technologies and linguistic annotation in Mainz-Germersheim
- 4-5 October 2010: Kick-off workshop, research project "Determinants of linguistic variation" in Mainz
- Fall/winter 2012/13: Corpus Linguistics (Zaytseva)
- Fall/winter 2012/13: Analysing Learner Language (Zaytseva)
Past and upcoming conferences
- Corpus Linguistics 2013, Lancaster University, UK; 22nd to 26th July 2013
- Learner Corpus Research 2013, Bergen, Norway, September 27–29, 2013
- ICAME 34 "English corpus linguistics on the move: Applications and implications", Santiago de Compostela, Spain; 22nd to 26th May 2013; pre-conference workshop "(Learner) Corpora and their application in language testing and assessment"
- Compiling and Using Learner Corpora to Teach and Assess Productive and Interactive Skills in Foreign Languages at University Level, Padova, Italy, May 16-17, 2013
- ICAME 33 "Corpora at the centre and crossroads of English linguistics", Leuven (Belgium), 30 May - 3 June 2012
- L2 Proficiency Assessment Workshop, Montpellier (France), 24 - 25 February 2012
Publications and presentations
Recent and upcoming publications
CALLIES, Marcus (to appear). "Agentivity as a determinant of lexico-syntactic variation in L2 academic writing". International Journal of Corpus Linguistics 18(3).
CALLIES, Marcus, DIEZ-BEDMAR, Maria Belen & ZAYTSEVA, Ekaterina (submitted). "Using learner corpora for testing and assessing L2 proficiency", in Leclercq, P., H. Hilton & A. Edmonds (eds.), Proficiency Assessment Issues in SLA Research: Measures and Practices (Second Language Acquisition series). Clevedon: Multilingual Matters.
CALLIES, Marcus (in press). "Advancing the research agenda of Interlanguage Pragmatics: The role of learner corpora". Yearbook of Corpus Linguistics and Pragmatics 2013: New Domains and Methodologies. New York: Springer.
CALLIES, Marcus (in press). "Die Lernerkorpuslinguistik als Brücke zwischen Sprachwissenschaft, Fremdsprachenerwerbsforschung und Fremdsprachendidaktik", in Bürgel, C. & D. Siepmann (eds.), Sprachwissenschaft - Fremdsprachendidaktik: Neue Impulse. Baltmannsweiler: Schneider Verlag Hohengehren.
CALLIES, Marcus & ZAYTSEVA, Ekaterina (in press). "The Corpus of Academic Learner English (CALE) – A new resource for the study and assessment of advanced language proficiency", in Granger, S., G. Gilquin & F. Meunier (eds.), Twenty Years of Learner Corpus Research: Looking back, Moving ahead (Corpora and Language in Use - Proceedings Vol. 1). Louvain-la-Neuve: Presses universitaires de Louvain.
CALLIES, Marcus, ZAYTSEVA, Ekaterina & PRESENT-THOMAS, R. (in press). "Writing Assessment in Higher Education: Making the framework work". Dutch Journal of Applied Linguistics 2(1).
CALLIES, Marcus & ZAYTSEVA, Ekaterina (in press). "The Corpus of Academic Learner English (CALE) – A new resource for the assessment of writing proficiency in the academic register". Dutch Journal of Applied Linguistics 2(1).
CALLIES, Marcus & ZAYTSEVA, Ekaterina (2011). "The Corpus of Academic Learner English (CALE): A new resource for the study of lexico-grammatical variation in advanced learner varieties", in Hedeland, Hanna, Thomas Schmidt and Kai Wörner (eds.), Multilingual Resources and Multilingual Applications (Hamburg Working Papers in Multilingualism B 96), 51-56.
ZAYTSEVA, Ekaterina (2011). "Register, genre, rhetorical functions: Variation in English native-speaker and learner writing", in Hedeland, Hanna, Thomas Schmidt and Kai Wörner (eds.), Multilingual Resources and Multilingual Applications (Hamburg Working Papers in Multilingualism B 96), 239-242.
Recent and upcoming presentations
CALLIES, Marcus & ZAYTSEVA, Ekaterina. Operationalizing and assessing writing proficiency in the academic register: The Corpus of Academic Learner English (CALE). ICAME 34 pre-conference workshop "(Learner) Corpora and their application in language testing and assessment"
CALLIES, Marcus. Using learner corpora for language testing and assessment: Current practices and future challenges. Plenary talk, Compiling and Using Learner Corpora to Teach and Assess Productive and Interactive Skills in Foreign Languages at University Level, Padova, Italy, May 16-17, 2013
CALLIES, Marcus. Emphatic do in advanced learner English - A contrastive interlanguage analysis of spoken and written corpora. ICAME 33 - Corpora at the centre and crossroads of English linguistics. 30 May - 3 June 2012, Leuven, Belgium.
CALLIES, Marcus. Compiling a new Language-for-Specific-Purposes learner corpus: the Corpus of Academic Learner English (CALE). 4th International Conference on Corpus Linguistics. 22 - 24 March 2012, Universidad de Jaén, Spain.
ZAYTSEVA, Ekaterina. Using Learner Corpora for Testing and Assessing L2 Proficiency. L2 Proficiency Assessment Workshop, 24 - 25 February 2012, Montpellier, France.