The Parallel EAP (English for Academic Purposes) corpora website provides a corpus-based online learning system that helps language teachers to teach, and learners to learn, academic English by searching, examining and researching two specialised EAP corpora on English Language Studies/Education. The goal is to develop advanced English learners’ ability to use lexis, grammar and lexico-grammatical items in academic writing. These EAP corpora can be used for research, teaching and learning purposes. Advanced English learners may find this website a useful resource for self-learning to improve their academic writing skills. Teachers can use the website as a rich resource to prepare materials for teaching academic writing to language or non-language majors. In addition, researchers may use it to investigate academic lexis, grammar or other issues related to academic writing or L2 writing in general. The same holds true for novice researchers, i.e., postgraduate students or final year undergraduates who work on research projects concerning L2 or academic writing.
This website accommodates two distinct corpora: (1) an EAP Learner Corpus and (2) an EAP Professional Corpus. The former consists of one million words from course assignments submitted for different language-related or linguistics courses by English majors (undergraduates and postgraduates) at a higher education institution in Hong Kong. The latter is composed of one million words from published research journal articles or book chapters in the discipline of language studies/education. Each corpus is divided into six subject sections that match six sub-disciplines in language studies/education: (1) general linguistics, (2) SLA, (3) vocabulary, (4) morphology, (5) ELT research reports, (6) comparative language studies.
Unlike most existing corpus-based websites, the two corpora hosted on this website are both syntactically and semantically tagged. The Part of Speech (POS) search allows users to examine the syntactic (structural) properties of keywords, and the semantic search enables them to study the semantic categories of keywords. A key feature of this corpus website is that users can search keywords syntactically and semantically, either separately or simultaneously, to compare the two corpora in varied ways. In addition, users can search keywords in different sections (i.e. introduction, literature review, research method, results, discussion and conclusion) of the component ELT research reports in each corpus in order to study the distinctive features of lexis and grammar use in experts’ and learners’ writing respectively. Finally, collocations of the keywords searched can be generated automatically to allow users to examine the collocational features in the writing of both experts and learners.
Due to technological constraints, Google Chrome is the recommended browser for performing searches on this corpora website.
© Copyright 2014. All rights reserved. Enquiry: maqingangel@gmail.com
The Parallel EAP Corpora search engine provides a set of features which are grouped into different tabs as shown below.
Figure 1: Interface of the Parallel EAP Corpora
The POS Search tab and Semantic Search tab are shown below. These tabs allow searches based on a number of search parameters. The simplest search is to enter a word in the Search Keyword input box and press ENTER, without modifying any other search parameters. In the example here, the word “study” is entered in the search box.
The POS search results are shown below.
Figure 2: POS search results
The semantic search results are shown below.
Figure 3: Semantic search results
There is a toggle button to switch between two modes, namely “Basic Mode” and “Advanced Mode”. Its label will also change according to the current status.
In POS Search Basic Mode, fewer search parameters are available and users are not required to specify these features in detail. Users will only need to select the parts of speech for the keyword in the search box via the POS Tag attribute.
In a similar vein, fewer search parameters are available in the Semantic Search Basic Mode and users are not required to specify them in detail. Users will only need to select the main semantic labels for the keyword they would like to search in the search box via the Semantic Tag attribute (see below).
Figure 4: Semantic Tags in Basic Mode
In Advanced Mode, in addition to the primary level of POS tags, more detailed categories are available, as shown in the figure below. Users may narrow down the search by selecting a secondary level of POS tags to search.
Figure 5: Secondary level tags for POS Search Advanced Mode
For detailed information on the POS tag sets and their meanings you may refer to this website http://ucrel.lancs.ac.uk/claws7tags.html.
In Advanced Mode, users can choose to limit the scope of search by the “Subject” attribute (see below).
Figure 6: The “Subject” attribute in POS Search Advanced Mode
In Advanced Mode, if the user selects “ELT Research” in the “Subject” attribute, the “Section” attribute will appear (see below).
Figure 7: The “Section” attribute in POS Search Advanced Mode
In Advanced Mode, users can choose to limit the scope of search by the “Genre” attribute (see below).
Figure 8: The “Genre” attribute in POS Search Advanced Mode
Like the case of POS Search, only the primary level tags are available in Basic Mode, whereas in Advanced Mode, the secondary level tags are available for selection when any primary level tag is chosen (see below).
Figure 9: Secondary level tags for Semantic Search Advanced Mode
Apart from the Semantic Tag parameter, the search parameters of Semantic Search are the same as those of POS Search; the results are also displayed in a similar way (see below).
Figure 10: Search results in Semantic Search
For detailed information on the semantic tag sets and their meanings you may refer to this website http://ucrel.lancs.ac.uk/usas/.
The POS Tag can be specified to narrow down the search results. The list of possible values is shown in the following figure.
Figure 11: POS Tag
For example, only “Noun” will be included in the search results with this setting (see below):
Figure 12: Selecting “Noun” POS Tag to search the word “study”
Results: The tag display is turned on to show that the matched keywords have the POS Tag NN1.
Figure 13: POS Tag Search Results
The Semantic Tag can be specified to narrow down the search results. The list of possible values is shown in the following figure.
Figure 14: Semantic Tag
When the default match mode “Exact” is selected, the exactly matched word (“study” in this case) will be searched and included in the results (see below).
Figures 15 and 16: The match mode function in POS Search and Semantic Search, respectively
The match mode parameter allows you to adjust the mode of matching. There are four matching modes, namely “Exact”, “Starts with”, “Contains” and “Ends with”:
“Exact” is the default mode of search. Search results using this matching mode will include lexical items exactly the same as the keyword, i.e. “study”.
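For “Starts with”, any lexical items starting with the keyword entered will be identified, like “studying”, “study-leave”, etc. (Inflected forms such as “studied” do not begin with the string “study” and are therefore not matched.)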
“Contains” mode will search for any lexical items that contain the keyword entered, like “case-study”, “self-studying”, etc.
“Ends with”: any lexical items ending with the keyword entered will be identified, like “case-study” and “meta-study”, but not “studying” or “study-leave”.
Because every match found by the other three modes also contains the keyword, the “Contains” match mode returns the largest number of search results of the four modes.
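The four match modes can be illustrated with a minimal Python sketch. This is not the website's implementation: the function name `matches` and the case-insensitive comparison are assumptions made for illustration only.

```python
def matches(word: str, keyword: str, mode: str = "Exact") -> bool:
    """Return True if `word` matches `keyword` under the given match mode.

    Hypothetical helper; case-insensitive matching is an assumption.
    """
    word, keyword = word.lower(), keyword.lower()
    if mode == "Exact":
        return word == keyword
    if mode == "Starts with":
        return word.startswith(keyword)
    if mode == "Contains":
        return keyword in word
    if mode == "Ends with":
        return word.endswith(keyword)
    raise ValueError(f"unknown match mode: {mode}")


# Example words taken from the descriptions of the four modes.
words = ["study", "studying", "case-study", "self-studying",
         "study-leave", "meta-study"]

for mode in ("Exact", "Starts with", "Contains", "Ends with"):
    print(mode, [w for w in words if matches(w, "study", mode)])
```

In this sketch every word accepted by “Exact”, “Starts with” or “Ends with” is also accepted by “Contains”, which is why that mode yields the longest list.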
The Concordance Length parameter controls the length of the contents displayed in the results. Users can slide the circle to change the value. The minimum length is 20 characters (see below).
Figure 17: Concordance Length (Minimum Length)
At the minimum setting, the contents displayed are shorter (see below).
Figure 18: The results displayed show 20 characters of the sentence containing the keyword
The maximum allowed is 120 characters (see below).
Figure 19: Concordance Length (Maximum Length)
The larger contexts of the results can also be examined (see below).
Figure 20: The results displayed show 120 characters of the sentence containing the keyword
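The effect of the Concordance Length setting can be sketched in Python as follows. How the website actually cuts the sentence is not documented here, so this sketch assumes that the length counts characters of context on each side of the keyword. The function name `concordance_line` and the bracket marking are hypothetical.

```python
MIN_LENGTH, MAX_LENGTH = 20, 120  # limits of the slider described above


def concordance_line(sentence: str, keyword: str, length: int) -> str:
    """Return the keyword with up to `length` characters of context per side.

    Assumption: `length` applies to each side of the first occurrence of
    the keyword; the real site may measure the window differently.
    """
    # Clamp the requested length to the allowed range.
    length = max(MIN_LENGTH, min(MAX_LENGTH, length))
    start = sentence.lower().find(keyword.lower())
    if start == -1:
        # Keyword not present: fall back to the start of the sentence.
        return sentence[:length]
    end = start + len(keyword)
    left = sentence[max(0, start - length):start]
    right = sentence[end:end + length]
    return f"{left}[{sentence[start:end]}]{right}"


example = ("The findings of the present study suggest that learners rely "
           "on a narrow range of reporting verbs in academic writing.")
print(concordance_line(example, "study", 20))
print(concordance_line(example, "study", 120))
```

Shorter settings are convenient for scanning many lines at once, while longer settings expose more of the surrounding sentence.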
By default, all texts in both corpora will be included in the POS and Semantic Search. Users, however, can choose to limit the scope of search by the corpus type (“learner” or “professional”), as shown below:
Figures 21 and 22: The “Corpus” type in POS Search and Semantic Search, respectively
Users can also limit the scope by subject in POS and Semantic Search (see below):
Figures 23 and 24: The “Subject” attribute in POS Search and Semantic Search, respectively
If the user selects “ELT Research” in the “Subject” attribute, the “Section” attribute will appear in Advanced Mode (see below):
Figures 25 and 26: The “Section” attribute in POS Search and Semantic Search, respectively
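The scope attributes above behave like successive filters on the corpus. The following Python sketch shows one way to express that logic. The `Sentence` record, its field names and the sample sentences are hypothetical and do not reflect the website's internal data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sentence:
    """Hypothetical record for one corpus sentence."""
    text: str
    corpus: str                    # "Learner" or "Professional"
    subject: str                   # e.g. "ELT Research", "SLA"
    section: Optional[str] = None  # only set for ELT research reports


def in_scope(s: Sentence, corpus: Optional[str] = None,
             subject: Optional[str] = None,
             section: Optional[str] = None) -> bool:
    """Return True if the sentence falls inside the selected search scope.

    A value of None means "no restriction" for that attribute.
    """
    if corpus is not None and s.corpus != corpus:
        return False
    if subject is not None and s.subject != subject:
        return False
    # The Section attribute only applies when "ELT Research" is selected.
    if section is not None and subject == "ELT Research" and s.section != section:
        return False
    return True


# Invented sample data, for illustration only.
sample = [
    Sentence("This study examines vocabulary use.", "Learner", "ELT Research", "introduction"),
    Sentence("The results are discussed below.", "Learner", "ELT Research", "results"),
    Sentence("Previous studies report similar findings.", "Professional", "SLA"),
]

selected = [s.text for s in sample
            if in_scope(s, corpus="Learner", subject="ELT Research",
                        section="introduction")]
print(selected)
```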
The search results are tabulated for easy navigation with a number of columns indicating their different attributes. The results are based on sentence units, which means if a sentence has more than one match, the sentence will only appear once in the result section (see below).
Word column: indicates the matched word
Contents column: shows the sentence containing the matched word. The length of the sentence fragment shown depends on the “Concordance Length” parameter in the search form
Corpus column: shows which corpus the result comes from: “Learner” or “Professional”
Subject column: shows the subject to which this result belongs
Genre column: shows the genre of the searched text
Word Filter (a selection list at top right corner): displays only the results of the selected word among the words in the Word Filter list
Show n entries (at top left corner): changes the number of entries on one page. There are choices of 10, 25, 50, 100 entries on one page
Tag Display switch (a button at top left corner): lets users choose whether or not to show the tags of each word. The default is off. When it is switched on, the button label changes to “Tag Display On” and the results are updated accordingly (see below):
Figure 27: Results generated when Tag Display is On
When a concordance line is clicked, a window will appear showing the whole essay, with the currently matched sentence highlighted (see below):
Figure 28: The highlighted text of the entire paragraph containing the keyword
The “Show / hide Tags” button at the bottom functions similarly and will toggle the display of tags for the essay. Clicking the “Close” button will dismiss the essay display.
After every search, a word list will be generated consisting of all matched words according to the search parameters. For example, a search of “study” with “Contains” match mode will result in the following word list:
Figure 29: POS Word List generated for the word “study” with match mode “Contains”
The list shows the matched word, its POS tag, its frequency in the corpus, and the adjusted frequency per million for cross-corpus comparison. The description of each POS Tag can be examined by clicking the “[?]” hyperlink beside each POS Tag. For example, clicking the link beside the NN1 tag will show the following popup message:
Figure 30: The explanation of the POS Tag
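The adjusted frequency per million in the word list is a normalised count that makes figures from corpora of different sizes comparable. A minimal Python sketch of the usual calculation follows. The function name `per_million` and the counts in the example are hypothetical, and the website's exact rounding is not known.

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Invented counts: the same raw frequency in sub-corpora of different
# sizes gives different normalised frequencies.
print(per_million(150, 500_000))    # sub-corpus of 500,000 words
print(per_million(150, 1_000_000))  # full corpus of one million words
```

When the searched corpus contains exactly one million words, the raw frequency and the frequency per million coincide. The two values differ once the search scope is narrowed to a smaller part of a corpus.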
From the word list generated above, clicking any word will generate another list showing collocation details of the word. For example, clicking “studying” with POS VVG will result in the following collocation details, shown below in the word list table.
Figure 31: Collocation results generated for the word “studying” with POS VVG
The collocation results show the preceding and following 5 words of the chosen word, sorted by descending frequency (indicated in the parentheses) at each position.
Clicking any collocated word in the above table will further show the actual context where the chosen word and collocated words occur; see the example below:
Figure 32: More contexts and information shown for the collocated words
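The position-based collocation table described above can be approximated with a short Python sketch. It assumes simple whitespace tokenisation and ignores POS tags, whereas the website works on tagged text. The function name `collocates_by_position` and the sample sentences are hypothetical.

```python
from collections import Counter, defaultdict


def collocates_by_position(sentences, node, span=5):
    """Count the words found at each position within `span` words of `node`.

    Returns a dict mapping an offset (-span..-1, 1..span) to a list of
    (word, frequency) pairs sorted by descending frequency.
    """
    table = defaultdict(Counter)
    node = node.lower()
    for sentence in sentences:
        tokens = sentence.lower().split()  # assumption: whitespace tokens
        for i, token in enumerate(tokens):
            if token != node:
                continue
            for offset in range(-span, span + 1):
                j = i + offset
                if offset != 0 and 0 <= j < len(tokens):
                    table[offset][tokens[j]] += 1
    return {offset: counts.most_common()
            for offset, counts in sorted(table.items())}


# Invented sentences, for illustration only.
sample_sentences = [
    "students are studying english grammar",
    "they were studying english",
]

for offset, words in collocates_by_position(sample_sentences, "studying").items():
    print(offset, words)
```

Negative offsets are positions before the chosen word and positive offsets are positions after it, corresponding to the preceding and following columns of the collocation table.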
After every search, a word list will be generated consisting of all matched words according to the search parameters. For example, a search of “study” with “Exact” match mode will result in the following word list:
Figure 33: Semantic Word List generated for the word “study” with match mode “Exact”
The list shows the matched word, its Semantic tag, its frequency in the corpus, and the adjusted frequency per million for cross-corpus comparison. The description of each Semantic Tag can be examined by clicking the “[?]” hyperlink beside each tag. For example, clicking the link beside the Z2 tag will show the following popup message:
Figure 34: The explanation of the Semantic Tag
From the word list generated above, clicking any word will generate another list showing collocation details of the word. For example, clicking “study” with Semantic Tag Z2 will result in the following collocation details, shown below in the word list table.
Figure 35: Collocation results generated for the word “study” with Semantic Tag Z2
The collocation results show the preceding and following 5 words of the chosen word, sorted by descending frequency (indicated in the parentheses) at each position.
Clicking any collocated word in the above table will further show the actual context where the chosen word and collocated word occur; see an example below:
Figure 36: More contexts and information shown for the collocated words
Users may sometimes need to juxtapose two different search results for easier comparison, for example the exact-match results for “study” and for “studying”. The comparison feature supports this: the user performs the two searches one after the other, changing the search parameters as needed, and chooses a “Result Destination” for each search so that the results are displayed on the Comparison tab.
For example, searching for “study” with exact match in the “Learner Corpus” and sending the result to “Comparison A” gives the result shown below:
Figure 37: Display search result “study” from Learner Corpus in Comparison A
Then, searching for “study” with exact match in the “Professional Corpus” and sending the result to “Comparison B” gives the result shown below:
Figure 38: Display search result “study” from Professional Corpus in Comparison B
On the Comparison tab, the first and second search results will be shown on the same page, as below.
Figure 39: A side by side comparison on the word “study” from POS Search
The Comparison tab allows comparison not only between different POS searches, but also between a POS Search and a Semantic Search, or any other combination. The “Result Destination” chosen for each search determines where its results are displayed in the Comparison tab.
The example below shows POS Search in “Comparison A” and Semantic Search in “Comparison B”.
Figure 40: A side by side comparison on the word “study” between POS Search and Semantic Search