The follow-up exercise then gives students the opportunity to produce similar comment-utterances using idioms, in response to stimulus utterances. This is contextualised using two structures which are very familiar to language teachers: However, some closed sets are very large e. Hatcher did not base her statements on a corpus, but did conclude that get will be used only for the two types of events just treated: The two lists are comparable.

All three share features associated with written language, that is to say the high frequency of: This chapter gives consideration to how we define idioms and how they can be extracted from a corpus. Replicating in the classroom and in materials, however artificially, the contexts in which idioms typically occur is likely to be more motivating to learners than decontextualised attempts to understand and remember these tricky items, not least because in actual contexts idioms often contain their own paraphrases or at least many clues as to their meaning. Combining a dictionary and a corpus can be a valuable route in a pedagogical context. A corpus can reveal the regular, patterned preferences of the language users represented in it, speaking and writing in the contexts in which the corpus was gathered. We aim to show how a corpus can reveal a lot about the pragmatic force of grammatical choices.

We can generate a list of chunks for the whole of a big corpus to get some idea of the general distribution of chunks.
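As a rough illustration of how such a chunk list might be generated, the sketch below counts every contiguous n-word sequence in a tokenised text. The function names `tokenize` and `chunk_counts` and the sample sentence are hypothetical, and the tokeniser is deliberately naive: it drops punctuation, so chunks can run across sentence boundaries, which dedicated corpus software would normally handle more carefully.

```python
from collections import Counter
import re

def tokenize(text):
    # Lowercase the text and keep runs of letters and apostrophes as word tokens.
    return re.findall(r"[a-z']+", text.lower())

def chunk_counts(tokens, n):
    # Slide a window of n tokens across the text and count each sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical sample; a real study would read in a whole corpus instead.
sample = "you know what i mean . i mean you know what i mean"
tokens = tokenize(sample)

two_word = chunk_counts(tokens, 2)
three_word = chunk_counts(tokens, 3)

# List the most frequent three-word chunks with their counts.
for chunk, count in three_word.most_common(3):
    print(" ".join(chunk), count)
```

Running the same function with n set to 2, 3, 4 and 5 over a whole corpus gives one frequency list per chunk length, and these lists can then be compared with the frequencies of single words.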

Forensic linguistics

Another area which is increasingly using language corpora as a tool is forensic linguistics, which broadly concerns itself with the use of language in law and crime investigation.

Lexicography

Language corpora have many applications beyond language description for its own sake. For instance, if you wished to build a corpus of your own classroom interactions, you would first need to record the classes and then transcribe them. They focus on the use of non-specific ellos English equivalent: The chancellor also asks us to bargain away whatever obligations or int 3:


Stylistics

In other language-related fields, corpora are also being used. Good dictionaries of idioms encode such information for the user, based on large-scale observations of corpora. It also focuses on key issues and debates that have emerged around corpus research.

We say bitterly disappointed in preference to (but not to the absolute prohibition of) sourly disappointed (there is nothing to stop, say, a poet using this unusual collocation); tea is usually strong, but cars are powerful, and so on.

It is helpful, at this stage, to make clear what it is not. We are also very conscious in this book that there is a proliferation of corpora dedicated to the English language.

If you have no requirement to know where overlapping utterances and interruptions occur, then there is no point in spending time transcribing to that level of detail.

The frequency curve does not decline at a regular rate across the whole of the vocabulary; there is a continental shelf of high-frequency, core items, after which the curve takes a nose-dive into the vast depths of tens of thousands of relatively low-frequency words.
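One simple way to look at the shape of this curve is to ask what share of all running words is accounted for by the most frequent word types. The sketch below is only an illustration under assumptions: the function name `coverage` and the toy sentence are hypothetical, and a real frequency list would be built from a full corpus.

```python
from collections import Counter

def coverage(tokens, top_k):
    # Proportion of all tokens accounted for by the top_k most frequent word types.
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_k))
    return covered / len(tokens)

# Hypothetical toy text standing in for a corpus.
toy_tokens = "the cat sat on the mat and the dog sat on the rug".split()

for k in (1, 3, 5):
    print(k, round(coverage(toy_tokens, k), 2))
```

On a large corpus, coverage would be expected to rise steeply over the first few thousand types and then flatten, which is the pattern described above.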

Chunks are ready for use at any moment and do not need re-assembling every time they are used.

While L 13 junkies, pop history freaks and casual bargain hunters.

[Figure: three-, four- and five-word chunks and common single words; axis: occurrences]

arising from the extremely high frequency and weak collocability of its component words and their inevitable repeated collision in the corpus, or do such co-occurrences reveal anything about how we communicate with one another?

In any corpus, items apparently belonging to closed sets will not necessarily occur with equal frequency. Dog and bark collocate significantly; cat and bark are not likely to do so to any significant extent.
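The strength of a collocation can be estimated in several ways. The sketch below uses pointwise mutual information (PMI) over adjacent word pairs, which is one common measure and not necessarily the one used in the studies discussed here. The function name `bigram_pmi` and the toy text are hypothetical.

```python
import math
from collections import Counter

def bigram_pmi(tokens, w1, w2):
    # PMI = log2( P(w1 w2) / (P(w1) * P(w2)) ) for w1 immediately followed by w2.
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair = bigrams[(w1, w2)]
    if pair == 0:
        return None  # the two words are never adjacent in this sample
    p_pair = pair / (n - 1)   # n - 1 adjacent pairs in a text of n tokens
    p_w1 = unigrams[w1] / n
    p_w2 = unigrams[w2] / n
    return math.log2(p_pair / (p_w1 * p_w2))

# Hypothetical toy text; real scores need a large corpus to be meaningful.
toy = "dogs bark and dogs bark but cats sleep and cats purr".split()

print(bigram_pmi(toy, "dogs", "bark"))  # positive: the pair co-occurs more than chance predicts
print(bigram_pmi(toy, "cats", "bark"))  # None: the pair never occurs in the sample
```

A positive score means the pair occurs together more often than the individual frequencies of the two words would predict. PMI is known to favour rare words, so in practice it is usually combined with a minimum frequency threshold or replaced by measures such as t-score or log-likelihood.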


Record Collector magazi 14 as keen on trail ab23at as they are on bargain hunting. However, when these individual forms are studied in patterns, the picture changes. Vocabulary skills include ways of maximising learning opportunities during interaction e.

In such cases, the corpus which is chosen must best represent the language or language variety, and, if comparing varieties, the corpora themselves must be comparable.

But there is still undoubtedly a place in many educational contexts for learning about the colourful, cultural aspects of language and for observing cultures as they live through their words and actions, without any presupposition that the goal is short-term or even long-term lexical acquisition or production.

Language corpora can be composed of written or spoken texts, or a mix of both, and nowadays the capability exists to add multimedia elements, such as video clips, to corpora of spoken language.

Vagueness is central to informal conversation, and its absence can make utterances blunt and pedantic, especially in such domains as references to number and quantity, where approximation rather than precision is the norm in conversation (compare that with technical and scientific discourse, where precision is usually sought after and admired).

The expression let sth.


This compares with Biber et al. Why should learners who do not necessarily wish to sound like native speakers bother with them? Let us look at the word bargain using a dictionary and some corpus examples: