Possibilities and Drawbacks of Using an Online Application for Semi-automatic Corpus Analysis to Investigate Discourse Markers and Alternative Fluency Variables
Abstract
To bridge planning phases in spontaneous speech production, learners and native speakers use strategies such as (un)filled pauses, smallwords or discourse markers. Small-scale studies in this vein have demonstrated that learners differ from native speakers in that they underuse smallwords and discourse markers and rely on other fluency-enhancing strategies instead. In this paper, we present a corpus-based study that investigates fluency-enhancing strategies in four components of the Louvain International Database of Spoken English Interlanguage (LINDSEI; Gilquin et al. 2010), covering the English of learners with four different native languages, namely Spanish, German, Bulgarian and Japanese. We investigate 216 different fluencemes (i.e. fluency-enhancing features; Götz in Fluency in native and nonnative English speech, John Benjamins, Amsterdam, 2013) in 200 transcribed interviews with advanced learners of English. An online coding application, which was specially designed and programmed for this project, enables us to cover such a large amount of data. We report in detail on the design, functionality and (dis-)advantages of the online application, the multilevel coding system we implemented, and the methodological challenges we face. We also present the findings of a first pilot study, which reveals considerable variation in fluenceme frequencies both between and within groups of learners defined by native language, whereas the distributional patterns of fluencemes are rather similar across varieties.