COR: Lab 1: Searching a Corpus

This lab will give you practice searching through a part-of-speech-tagged corpus.

Upload the final paper as a PDF.
It should be called lab1-yoursurname.pdf.

Replace the ??? with your answers.

Which corpus did you use?
???

      
Q1: (3)

Here's a list of some wordforms that can be nouns, verbs, or
adjectives. For each, determine the frequency of the wordform
in each grammatical category. [Only look at these exact wordforms, not
inflected/derived variants, and don’t worry about other uses,
e.g. prepositional ones.]

Word form	N freq	V freq	A freq
---------------------------------------
green		???	???	???
top		???	???	???
fly		???	???	???
meet		???	???	???
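
[Hint: if your corpus can be exported as (word, tag) pairs, counts like
those in the table can also be computed with a short script. The sketch
below is only an illustration: the sample sentence is a hand-made toy,
not corpus data, and the tags assume a Penn Treebank-style tagset (NN*,
VB*, JJ*), which may differ from the tagset of your corpus. The names
coarse and category_counts are made up for this example.]

```python
from collections import Counter

# Toy hand-made sample, NOT real corpus data.
# Assumption: Penn Treebank-style tags; adapt to your corpus's tagset.
tagged = [
    ("the", "DT"), ("green", "JJ"), ("fly", "NN"), ("can", "MD"),
    ("fly", "VB"), ("to", "TO"), ("the", "DT"), ("top", "NN"),
    ("of", "IN"), ("the", "DT"), ("green", "NN"), (".", "."),
    ("we", "PRP"), ("meet", "VBP"), ("at", "IN"), ("the", "DT"),
    ("top", "JJ"), ("floor", "NN"), (".", "."),
]

def coarse(tag):
    """Map a fine-grained tag to N, V, or A; return None for anything else."""
    if tag.startswith("NN"):
        return "N"
    if tag.startswith("VB"):
        return "V"
    if tag.startswith("JJ"):
        return "A"
    return None

def category_counts(tagged_tokens, wordforms):
    """Count how often each exact wordform occurs as N, V, or A."""
    counts = {w: Counter() for w in wordforms}
    for word, tag in tagged_tokens:
        w = word.lower()
        cat = coarse(tag)
        # Only the listed wordforms are counted; inflected variants are ignored.
        if w in counts and cat is not None:
            counts[w][cat] += 1
    return counts

counts = category_counts(tagged, ["green", "top", "fly", "meet"])
for w, c in counts.items():
    print(w, c["N"], c["V"], c["A"], sep="\t")
```

[Matching on the exact lowercased wordform keeps inflected forms such as
"flies" or "meeting" out of the counts, as the question requires.]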

Do these agree with your intuitions?

???


Q2: (4)

List some examples of idioms of the type "the Xer the Yer", e.g. "the
more the merrier", "the bigger the better", and show the quer(y|ies)
you used to find them.

???
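
[Hint: the pattern is "the" + comparative + "the" + comparative. As a
rough sketch of that idea over (word, tag) pairs, assuming Penn
Treebank-style comparative tags (JJR, RBR) and a hand-made toy sample
rather than corpus data, one could write the following. The function
name find_the_xer_the_yer is invented for this example; your corpus
interface will have its own query syntax.]

```python
# Assumption: comparatives are tagged JJR or RBR (Penn Treebank style).
COMPARATIVE = {"JJR", "RBR"}

def find_the_xer_the_yer(tagged_tokens):
    """Return (X, Y) pairs for each adjacent 'the X-er the Y-er' sequence."""
    hits = []
    for i in range(len(tagged_tokens) - 3):
        (w1, _), (x, tx), (w3, _), (y, ty) = tagged_tokens[i:i + 4]
        if (w1.lower() == "the" and w3.lower() == "the"
                and tx in COMPARATIVE and ty in COMPARATIVE):
            hits.append((x.lower(), y.lower()))
    return hits

# Toy hand-made sample, NOT real corpus data.
sample = [
    ("The", "DT"), ("more", "JJR"), ("the", "DT"), ("merrier", "JJR"),
    (".", "."), ("the", "DT"), ("bigger", "JJR"), ("dog", "NN"),
    ("ate", "VBD"), ("the", "DT"), ("better", "JJR"), ("bone", "NN"),
]

hits = find_the_xer_the_yer(sample)
print(hits)
```

[This version only matches four adjacent tokens, so it would miss longer
variants with material between the slots, e.g. "the more you eat, the
fatter you get". Whether to widen the query is part of the exercise.]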

What kind of words appear in the first slot (X)?

???

What kind of words appear in the second slot (Y)?

???

Do they fall into neat semantic classes? 

???

Are the classes independent?

???

Q3: (3)

It is often claimed that "less" is used with uncountable nouns and
"fewer" with countable nouns.

Design some queries to test this claim and show the queries and results:

???
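
[Hint: a tagged corpus does not usually mark countability directly, so
one possible proxy is the number of the noun that follows the
quantifier: a plural noun is countable, while a singular noun may be
either. The sketch below counts singular and plural nouns immediately
after each quantifier. It assumes Penn Treebank-style tags (NN singular
or mass, NNS plural) and uses a hand-made toy sample, not corpus data;
the function name following_noun_tags and the list of quantifiers are
choices made for this example.]

```python
from collections import Counter

def following_noun_tags(tagged_tokens, quantifiers):
    """Count NN vs NNS tags on the token right after each quantifier."""
    counts = {q: Counter() for q in quantifiers}
    for (word, _), (_, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        w = word.lower()
        # Assumption: NN = singular or mass noun, NNS = plural noun.
        if w in counts and next_tag in ("NN", "NNS"):
            counts[w][next_tag] += 1
    return counts

# Toy hand-made sample, NOT real corpus data.
sample = [
    ("less", "JJR"), ("water", "NN"), ("and", "CC"),
    ("fewer", "JJR"), ("cups", "NNS"), (",", ","),
    ("few", "JJ"), ("people", "NNS"), ("but", "CC"),
    ("less", "JJR"), ("items", "NNS"),
]

noun_counts = following_noun_tags(sample, ["less", "few", "fewer"])
for q, c in noun_counts.items():
    print(q, "NN:", c["NN"], "NNS:", c["NNS"])
```

[Because NN covers both mass nouns and singular count nouns, the NN
counts need manual inspection; only the NNS counts bear directly on the
claim. Nouns separated from the quantifier by an adjective are also
missed by this adjacent-token check.]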

How accurate is the claim, according to the text in your corpus?

???

What would be a more accurate claim?

???


COR (Corpus Linguistics) main page.

Francis Bond <bond@ieee.org> <francis.bond@upol.cz>