Corpus linguistics is the study of language as expressed in samples (corpora) or "real world" text. The method works from the evidence of such samples to derive the abstract rules that govern a natural language or relate it to another language. Corpora were originally compiled by hand but are now largely derived by automated processes.
The corpus approach runs counter to Noam Chomsky's view that real language is riddled with performance-related errors and therefore requires careful analysis of small speech samples obtained in a highly controlled laboratory setting.
The problem with laboratory-selected sentences is similar to the one facing lab-based psychology: researchers have no measure of how ethnographically representative their data are.
Corpus linguistics does away with Chomsky's competence/performance split: adherents believe that reliable language analysis is best carried out on field-collected samples, in natural contexts and with minimal experimental interference. Within corpus linguistics there are divergent views on the value of corpus annotation. John Sinclair advocates minimal annotation, allowing texts to 'speak for themselves', while others, such as the Survey of English Usage team (based at University College London), advocate annotation as a path to greater linguistic understanding and rigour.
Source: Wikipedia.org
CORPORA
AMERICAN NATIONAL CORPUS (ANC)
BERGEN CORPUS OF LONDON TEENAGER LANGUAGE (COLT)
BRITISH ACADEMIC SPOKEN ENGLISH CORPUS (BASE)
BRITISH NATIONAL CORPUS (BNC)
CAMBRIDGE AND NOTTINGHAM CORPUS OF DISCOURSE IN ENGLISH (CANCODE)
CAMBRIDGE INTERNATIONAL CORPUS (CIC)
COLLINS WORDBANKS ONLINE ENGLISH CORPUS
6 Online Corpus
What can we use a corpus for with our students?
Mainly to keep them up to date with common colloquial language: idioms, collocations, slang and reduced forms, which they may need when they encounter real English in movies or songs. A corpus can also serve as the basis for a research project.
Welcome, everyone! I am glad to present our new Lexico Grammar blog, where you can find all the information you need for this course. Please feel free to comment on the entries, leave questions and so on. Remember that we are all learning together.
Friday, November 26, 2010
Sunday, August 23, 2009
Corpus
• CORPUS
From Corpus to Classroom
WHAT IS A CORPUS?
A corpus is a collection of texts, written or spoken, which is stored on a computer.
A corpus is a principled collection of texts available for qualitative and quantitative analysis.
It must represent something and its merits will often be judged on how representative it is.
WHAT CAN WE USE FROM IT?
• COLLOCATIONS
Words that habitually combine with one particular partner rather than with its near-synonyms:
Depend on
Look up
Wooden box (ADJECTIVE+NOUN)
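One way a class might check collocations such as the ones above is to count which word follows a chosen node word. The sketch below is only an illustration: the sample sentences and the function name right_collocates are made up for this example, and a real corpus would hold millions of words rather than a few lines.

```python
from collections import Counter

# Hypothetical mini-corpus invented for this example.
SAMPLE = (
    "it will depend on the weather . you can depend on me . "
    "look up the word . they depend on us . look up the number ."
)

def right_collocates(text, node):
    """Count the words that appear immediately after `node`."""
    tokens = text.lower().split()
    # Pair every token with the token that follows it.
    return Counter(
        nxt for word, nxt in zip(tokens, tokens[1:]) if word == node
    )

depend_counts = right_collocates(SAMPLE, "depend")
look_counts = right_collocates(SAMPLE, "look")
print(depend_counts.most_common(1))
print(look_counts.most_common(1))
```

In this tiny sample "depend" is always followed by "on" and "look" by "up", which is the kind of pattern students can then search for in a full online corpus.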
• WORDS/CHUNKS
A SMALL COMPONENT OF LANGUAGE:
I
YOU
I DON’T KNOW
A LOT OF
ONE OF THE
I MEAN
THE
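Chunks like the ones listed above are usually found by counting every sequence of two or three consecutive words in a corpus. The following is a rough sketch under stated assumptions: the scrap of "speech" is invented, and chunk_counts is a hypothetical helper, not part of any corpus tool.

```python
from collections import Counter

# Hypothetical scrap of transcribed speech, invented for this example.
SPEECH = (
    "i don't know i mean a lot of people say that "
    "i don't know why but i mean one of the reasons is a lot of noise"
)

def chunk_counts(text, n):
    """Count every run of `n` consecutive words in `text`."""
    tokens = text.lower().split()
    chunks = (
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return Counter(chunks)

two_word = chunk_counts(SPEECH, 2)
three_word = chunk_counts(SPEECH, 3)
print(three_word.most_common(3))
```

Sorting the counts brings the recurring chunks to the top, while one-off word sequences stay at the bottom of the list.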
• DISCOURSE MARKERS
OPENINGS AND CLOSINGS
YOU KNOW
I MEAN
ANYWAY
MIND YOU
WELL
• FREQUENCY
HOW OFTEN A WORD IS REPEATED IN A GIVEN TYPE OF DISCOURSE
S1—S2—S3
W1—W2—W3
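If the S and W labels above stand for spoken and written frequency bands (an assumption; the post does not say), the idea can be illustrated by comparing how often a word occurs in a spoken and a written sample. Raw counts are not comparable when samples differ in size, so the sketch below normalizes to occurrences per thousand words. Both samples and the function name per_thousand are invented for this example.

```python
from collections import Counter

# Hypothetical samples invented for this example.
SPOKEN = "well you know i mean it was well good you know"
WRITTEN = (
    "the report shows that the results were good "
    "and the method was sound"
)

def per_thousand(text, word):
    """Occurrences of `word` per 1,000 words of `text`."""
    tokens = text.lower().split()
    return 1000 * Counter(tokens)[word] / len(tokens)

for word in ("well", "the"):
    print(word, per_thousand(SPOKEN, word), per_thousand(WRITTEN, word))
```

Published corpora normally report frequency per million words; per thousand is used here only because the samples are so small.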
• REGISTER
FORMAL/INFORMAL/COLLOQUIAL
TECHNICAL