Our English Speech Corpus – Corpus-aided English Speaking Learning and Teaching System

The English Speech Corpus with Different Proficiency Levels was expanded and redeveloped from the previous small-scale spoken corpus. It contains 78 sets of spontaneous speech data and 13 sets of classroom presentation data. Of the 78 spontaneous speech sets, 48 were collected from learners in the Chinese Mainland and Hong Kong, and 30 were retrieved from official IELTS Speaking videos.

In addition to the authentic speech data, the corpus includes detailed annotations based on the four IELTS Speaking assessment criteria (i.e., fluency and coherence, lexical resource, grammatical range and accuracy, and pronunciation). These annotations can help identify the difficulties that Chinese learners of English at different proficiency levels encounter when learning to speak English. The high-quality recordings are well suited for use by teachers, learners, and researchers around the world.