The use of language data in the analysis of language and communication has become commonplace, due in part to the increasing range of software tools and functions, and in part to the fact that linguists today are more receptive to the data-driven research methods that have become standard in other disciplines. In particular, corpus linguistics has revolutionised different fields of language study by bringing data (in the form of corpora) to language description. Although the large corpora introduced in the 1980s were designed to be representative of the ‘standard’ language written and spoken at the time, applied linguists and language teachers were quick to recognise the huge pedagogic potential behind them. This special issue of ReCALL promotes research into different ways in which corpora can be used directly by language teachers and learners in what has come to be known as ‘data-driven learning’ (DDL), rather than being mediated by specialists for purposes of language analysis and description.
In English, for example, when is ‘enter’ followed by ‘into’? Can ‘moreover’ be used in less formal registers? Does ‘blond(e)’ have the same collocates as its cognate in various other languages (e.g. ‘beer’ or ‘tobacco’)? Is ‘youths’ simply a synonym for ‘young people’? Is ‘therefore’ mainly used in sentence-initial or mid-sentence position in academic writing? What kind of things do we ‘end up doing’? Are there differences in meaning or use between ‘widely’, ‘largely’ and ‘broadly’? My student wrote ‘the last several years’, but this sounds odd to a British teacher – why? Why did my teacher underline ‘an important number’ as wrong in my essay? Corpora contain the data necessary to pursue all sorts of queries such as these, most commonly in the form of frequency lists, clusters, words in context (concordances), collocates and colligates, distributions, and so on. Large corpora can be accessed online in many different languages, and software exists to help create and query specialised corpora for specific needs.
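By way of illustration only, the kinds of output mentioned above (a frequency list, a concordance and a list of collocates) can be sketched in a few lines of Python. This is a minimal sketch and not the method of any particular concordancer: the function names, the crude tokeniser and the four sample sentences are all invented here for the example, and a real corpus would be far larger and more carefully prepared.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokeniser)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(tokens, n=10):
    """Return the n most frequent tokens with their counts."""
    return Counter(tokens).most_common(n)


def concordance(tokens, node, width=4):
    """Return keyword-in-context lines as (left context, node, right context) tuples."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, span=2):
    """Count the words occurring within `span` tokens either side of the node word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - span):i])
            counts.update(tokens[i + 1:i + 1 + span])
    return counts


# Invented sample text standing in for a corpus (an assumption for this sketch).
sample = (
    "They entered into an agreement last year. "
    "She entered the room quietly. "
    "The firms entered into negotiations. "
    "He entered the building."
)
tokens = tokenize(sample)

# Print each concordance line with the node word aligned in a central column.
for left, node, right in concordance(tokens, "entered", width=3):
    print(f"{left:>25}  {node}  {right}")
```

Aligning the node word in a central column is what allows a learner to scan down the lines and notice, for instance, which nouns follow ‘entered into’ as opposed to ‘entered the’ in this toy sample.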
An ongoing question is whether and under what conditions direct corpus use presents any substantial advantage over other learning methodologies, and whether it makes sense to promote corpus use for specific learners with particular needs in a given context, and for a variety of different purposes. This special issue of ReCALL has sought to gather both qualitative and quantitative empirical studies investigating various aspects of corpus use in language teaching and learning. Such research is essential to afford further insight into both the possibilities and limitations of using language corpora for different purposes, whether in mainstream practice among ‘ordinary’ teachers and learners, or for more innovative or specialised uses.
Among other questions we might ask: Can a DDL approach be appropriate for younger or less advanced learners? Can corpora be used deductively as well as inductively, and in teacher-mediated materials on paper as well as for direct consultation via a concordancer? How does corpus use compare to traditional methodologies, or to dictionary use for encoding and decoding? Can a corpus be useful both as a reference resource and as a learning tool? How exactly can learners use corpora to correct errors and improve their writing generally? Are corpora only useful for reading and writing, or can they help with listening and speaking as well? Can corpora be used beyond the level of individual words, from formulaic sequences to discourse? How do learners react to using corpora for different purposes and with varying degrees of autonomy? Can learners make use of corpora other than the large online ones that are best known – including discipline-specific corpora, corpora they build themselves, or corpora including texts they produce themselves?
These and other issues are all addressed at various points in this special issue by specialists from around the world – in Europe, Asia and America – all practising language teachers as well as researchers. Though the contributors are keen not to ignore any difficulties they may encounter, the results are in all cases encouraging and point to a number of future avenues worthy of further exploration.