How to use corpora?

This session introduces a selection of useful corpora and concordance tools for teachers, and provides the skills needed to explore corpus data and discover patterns of language use.

1. The British National Corpus (BNC)

The British National Corpus (BNC) was originally created by Oxford University Press between the 1980s and the early 1990s, and it is an essential tool for linguistic data analysis. It contains 100 million words of British English text, including both written texts and transcriptions of spoken data. It is completely free to use but requires registration.
The official website:

1.1 Registration
(BNC registration link)

In this short video clip, Prof. Handke explains how to create a BNC account and how to use this well-known corpus.

Linguistic Data Analysis - Using the BNC. Retrieved from

1.2 The BNC frequency search:

This example demonstrates a "frequency-of-occurrence" analysis of words in the British National Corpus. The central task is to generate a list of the occurrences of word forms derived from the base GENERATE.

The BNC frequency search. Retrieved from
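The idea behind a frequency-of-occurrence search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample text and the function name `count_word_forms` are hypothetical, and the BNC's own search interface works differently and runs over the full corpus.

```python
import re
from collections import Counter

# Hypothetical sample text; a real search would run over the whole corpus.
SAMPLE_TEXT = (
    "The turbines generate power and generate heat. One generator failed, "
    "so the plant generated less power while a new generation of turbines "
    "was built. Generating power cheaply generates debate."
)


def count_word_forms(text, stem):
    """Count each word form in `text` that begins with `stem`, ignoring case."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token.startswith(stem))


frequencies = count_word_forms(SAMPLE_TEXT, "generat")

# List the forms from most to least frequent.
for form, count in frequencies.most_common():
    print(form, count)
```

Matching on the stem "generat" rather than the full base form picks up forms such as "generating" and "generation", which drop the final "e".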

1.3 BNC Frequency Search with POS-Tags:

How do we find all verbs that have been derived by means of en- prefixation? The answer, which Prof. Handke discusses in this short video clip, is to use the BNC's part-of-speech (POS) tag option in conjunction with the frequency search.

BNC Frequency Search with POS-Tags. Retrieved from
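To see why POS tags matter for this question, consider the following Python sketch. It assumes a tiny hand-tagged word list (hypothetical data) with tags written in the style of the BNC's CLAWS5 tagset, where lexical verb tags begin with "VV". The function name `verbs_with_prefix` is also hypothetical, and this is not the BNC's own query syntax.

```python
from collections import Counter

# Hypothetical hand-tagged tokens as (word, tag) pairs. Tags are written in
# the style of the CLAWS5 tagset, where lexical verb tags begin with "VV".
TAGGED_TOKENS = [
    ("they", "PNP"), ("enlarged", "VVD"), ("the", "AT0"), ("entrance", "NN1"),
    ("to", "TO0"), ("enable", "VVI"), ("access", "NN1"),
    ("we", "PNP"), ("encourage", "VVB"), ("energy", "NN1"), ("saving", "NN1"),
    ("it", "PNP"), ("enables", "VVZ"), ("growth", "NN1"),
    ("visitors", "NN2"), ("enter", "VVB"), ("here", "AV0"),
]


def verbs_with_prefix(tagged_tokens, prefix, verb_tag_prefix="VV"):
    """Count verb forms that start with `prefix`, using the POS tag as a filter."""
    return Counter(
        word.lower()
        for word, tag in tagged_tokens
        if tag.startswith(verb_tag_prefix) and word.lower().startswith(prefix)
    )


en_verbs = verbs_with_prefix(TAGGED_TOKENS, "en")
for form, count in sorted(en_verbs.items()):
    print(form, count)
```

The tag filter excludes nouns such as "entrance" and "energy", but a spelling pattern alone still matches verbs like "enter", which merely begin with "en" and are not derived by prefixation. The resulting list therefore needs a manual check.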

2. BYU Corpora

The BYU Corpora were created by Mark Davies, Professor of Corpus Linguistics at Brigham Young University. The collection contains multiple corpora, which are probably the most widely used corpora currently available, with more than 130,000 distinct researchers, teachers, and students using them each month. Many corpus-based resources are also provided.
The official website:

[Image: screenshot of the BYU corpora table]

The screenshot of the table above was retrieved from

3. Corpus of Contemporary American English (COCA)

The Corpus of Contemporary American English (COCA) was created by Mark Davies, Professor of Corpus Linguistics at Brigham Young University. It is the largest freely available corpus of English, and the only large and balanced corpus of American English. COCA is probably the most widely used corpus of English, and it is related to many other corpora of English created by Davies, which offer unparalleled insights into variation in English. The corpus contains more than 560 million words of text (20 million words each year from 1990 to 2017), divided equally among spoken language, fiction, popular magazines, newspapers, and academic texts.
The official website:

3.1 Introduction to the COCA

This is a series of short videos developed by Dr. Angel Ma from EdUHK. The videos introduce a variety of search functions available on the COCA website, including Basic Frequency Search, Wildcard Search, Part of Speech Search, Synonym Search, Collocates, Compare, KWIC, and the Chart function.

Gaining familiarity with these search functions can help you design vocabulary teaching and learning activities for students.

Introduction to COCA functions by Dr Angel Ma (EdUHK)

EdUHK Corpus Tutorial: Register

EdUHK Corpus Tutorial Part 1: Basic Search

EdUHK Corpus Tutorial Part 2: Wildcard Search

EdUHK Corpus Tutorial Part 3: Part of Speech

EdUHK Corpus Tutorial Part 4: Synonym

EdUHK Corpus Tutorial Part 5: Collocates

EdUHK Corpus Tutorial Part 6: Compare

EdUHK Corpus Tutorial Part 7: KWIC

EdUHK Corpus Tutorial Part 8: Chart
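As a rough illustration of what a collocates search does, the Python sketch below counts the words that appear within a fixed window around a node word. The sample text and the function name `collocates` are hypothetical, and COCA's own Collocates function is more sophisticated than this simplified sketch (for example, this version does not stop at sentence boundaries).

```python
import re
from collections import Counter

# Hypothetical sample text standing in for a corpus.
SAMPLE_TEXT = (
    "She made a strong argument. He drinks strong coffee every morning. "
    "They serve strong coffee and strong tea. A strong wind blew."
)


def collocates(text, node, window=1):
    """Count the words found within `window` tokens to the left or right of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


strong_collocates = collocates(SAMPLE_TEXT, "strong", window=1)
for word, count in strong_collocates.most_common():
    print(word, count)
```

A classroom activity could compare the output for near-synonyms such as "strong" and "powerful" to show learners which nouns each adjective typically combines with.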

4. Word and Phrase

4.1 Introduction to Word and Phrase

The official website:

Using Word and Phrase: Introduction. Retrieved from

4.2 Explore collocates in Word and Phrase

This tutorial demonstrates how you can use Word and Phrase to explore collocates.

Using Word and Phrase: Exploring collocates. Retrieved from

4.3 Doing academic searches

This tutorial demonstrates how you can use the Academic function of Word and Phrase to improve language usage.

Using Word and Phrase: Doing academic searches. Retrieved from

4.4 Using Word and Phrase to analyze and improve learners’ writing

This video covers the basic functions of Analyze Texts in Word and Phrase and demonstrates how this function can greatly enhance learners’ writing.

Word and Phrase. Retrieved from

5. Compleat Lexical Tutor

Compleat Lexical Tutor was developed by Tom Cobb of the University of Quebec at Montreal (UQAM) to provide applications for testing, improving, and researching vocabulary learning. The site provides resources for teaching not only English but also French, German, and Spanish. It also includes a concordancer, a vocabulary profiler, an exercise maker, interactive resources, and much more.
The official website:

5.1 How can we use Compleat Lexical Tutor in the classroom?

This tutorial is an introduction to Compleat Lexical Tutor, covering the Hypertext builder, Dictator, the interactive quiz option, the Cloze builder, frequency lists, and frequency-based vocabulary tests.

lextutor@dade. Retrieved from

5.2 Corpus Concordance English

Corpus Concordance English is a powerful and user-friendly concordancer in Compleat Lexical Tutor, where you can search for collocations and check whether a particular use of a word is appropriate. It covers various corpora for language teaching and learning at different school levels.
The official website:

6. AntConc

AntConc is a concordance tool for analysing electronic texts in order to discover patterns in language use. It was created by Laurence Anthony of Waseda University. It is one of the best-designed and most user-friendly corpus tools. For teachers, it is an effective way to select target words, design lessons, and prepare teaching materials.
The official website:

6.1 Getting started with AntConc

This video demonstrates how to download and get started with AntConc.

AntConc 3.4.0 Tutorial 1: Getting Started. Retrieved from

6.2 Basic features of Concordance function in AntConc

This video shows the basic features of the AntConc Concordance tool.

AntConc 3.4.0 Tutorial 2: Concordance Tool - Basic Features. Retrieved from
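The basic output of a concordancer is a set of keyword-in-context (KWIC) lines, with the search word aligned in a central column. The Python sketch below shows the idea under simple assumptions. The sample text and the function name `kwic` are hypothetical, and the code is not how AntConc itself is implemented.

```python
import re

# Hypothetical sample text standing in for a loaded corpus file.
SAMPLE_TEXT = (
    "Teachers use a corpus to find patterns. A corpus shows how words are "
    "used. Students can search the corpus too."
)


def kwic(text, keyword, width=30):
    """Return one line per hit, with `width` characters of context on each side."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in one column.
        lines.append(f"{left:>{width}} {match.group()} {right:<{width}}")
    return lines


concordance_lines = kwic(SAMPLE_TEXT, "corpus")
for line in concordance_lines:
    print(line)
```

Aligning the keyword in a single column makes it easy to scan the words immediately to its left and right, which is how recurring patterns become visible.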

6.3 Advanced features of Concordance function in AntConc

This video shows the advanced features of the AntConc Concordance tool.

AntConc 3.4.0 Tutorial 3: Concordance Tool - Advanced Features. Retrieved from