Fall 1999 Qualifying Exam for Natural Language Processing

*** Questions ***

Part 1, Grammars (60%)

Write a grammar that accepts the sentences in list A and rejects the sentences in list B.

List A                                   List B
Joe is reading the book.                 *Joe has reading the book.
Joe had won a letter.                    *Joe had win.
Joe has to win.                          *Joe winning.
Joe will have the letter.                *Joe will had the letter.
The man could have had the book.         *The man can have having the book.
Joe believes that John has to win.       *Joe won that John has to win.

Obviously, your grammar should not target these specific sentences, and should handle auxiliaries and tense/aspect in a general way. However, don't worry about other linguistic features; just make sure your grammar accepts the sentences in List A and does not accept the sentences in List B.

The grammar formalism is up to you. For partial credit, include some documentation. In particular, give a list of the aspects of English your grammar takes care of. Then, briefly document your rules, referring to this list.

Part 2, Evaluation of NLP Systems (20%)

Suppose that lexical resources are generated automatically by a system. Specifically, suppose the system learns clusters of words that are similar to one another, using a similarity metric and an unsupervised clustering technique. Discuss how the system could be evaluated.

Part 3 (20%)

Suppose that someone claims in a paper that their system achieves 95% accuracy, measured as (number correct) / (total number), on the task of assigning a word sense to every word (token) that appears in a corpus.

Assume that the system was tested on unseen test data, and measured against a gold-standard set of manual annotations. Assume that the agreement among the human annotators is good, the researchers are honest, and the system does not contain bugs.

Shall we conclude that word-sense disambiguation is effectively solved? To answer this, write a critique of the evaluation.

*** Answers ***

Part 1:

NOTE: everything is singular, so the student does not have to worry about number. Also, everything is active, so the student does not have to worry about passives. (etc.)

S --> NP VP

NP --> propernoun

NP --> det noun

NP --> det noun PP

PP --> prep NP

VP --> be(pres) verb(pres-participle, transitive) NP

VP --> be(pres) verb(pres-participle, intransitive)

VP --> have(pres or past) verb(past-participle, transitive) NP

VP --> have(pres or past) verb(past-participle, intransitive)

etc. -- one rule for each.

propernoun --> Joe

det --> the

noun --> letter

noun --> man

Lexicon --

be : is (pres), was (past), be (root)

read: reads (pres), read (past), reading (pres-participle), transitive, intransitive
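
One way to avoid writing a separate VP rule for every auxiliary combination is to let each auxiliary select the form of the verb that follows it. The sketch below is a hypothetical Python recognizer along those lines, not part of the original answer key. The names FORMS, FRAMES, vp, and accepts, the form labels (fin, base, ing, en), and the extra lexical entries (John, book, a, win, believe, have, will, can, could) are all assumptions added for illustration. Like the rules above, it covers only third-person singular, active sentences.

```python
# Verb forms: "fin" = finite (3rd singular), "base", "ing" = present
# participle, "en" = past participle. All entries are illustrative.
FORMS = {
    "be":      {"fin": ["is", "was"], "base": ["be"], "ing": ["being"], "en": ["been"]},
    "have":    {"fin": ["has", "had"], "base": ["have"], "ing": ["having"], "en": ["had"]},
    "read":    {"fin": ["reads", "read"], "base": ["read"], "ing": ["reading"], "en": ["read"]},
    "win":     {"fin": ["wins", "won"], "base": ["win"], "ing": ["winning"], "en": ["won"]},
    "believe": {"fin": ["believes", "believed"], "base": ["believe"],
                "ing": ["believing"], "en": ["believed"]},
}
# Complement frames for main verbs: no object, NP object, or "that" + S.
FRAMES = {
    "read": {"none", "np"},
    "win": {"none", "np"},
    "have": {"np"},
    "believe": {"np", "that"},
}
MODALS = {"will", "can", "could"}
DETS = {"the", "a"}
NOUNS = {"book", "letter", "man"}
PROPER = {"Joe", "John"}


def is_form(word, lemma, form):
    return word in FORMS[lemma][form]


def np(toks, i):
    """Return the set of positions where an NP starting at i can end."""
    ends = set()
    if i < len(toks) and toks[i] in PROPER:
        ends.add(i + 1)
    if i + 1 < len(toks) and toks[i] in DETS and toks[i + 1] in NOUNS:
        ends.add(i + 2)
    return ends


def vp(toks, i, form):
    """Return end positions of a VP at i whose first verb has the given form."""
    ends = set()
    if i >= len(toks):
        return ends
    w = toks[i]
    if form == "fin" and w in MODALS:          # modal selects a base form
        ends |= vp(toks, i + 1, "base")
    if is_form(w, "be", form):                 # progressive: be + -ing
        ends |= vp(toks, i + 1, "ing")
    if is_form(w, "have", form):
        ends |= vp(toks, i + 1, "en")          # perfect: have + past participle
        if i + 1 < len(toks) and toks[i + 1] == "to":
            ends |= vp(toks, i + 2, "base")    # "have to" + base form
    for lemma, frames in FRAMES.items():       # main verb plus its complement
        if not is_form(w, lemma, form):
            continue
        if "none" in frames:
            ends.add(i + 1)
        if "np" in frames:
            ends |= np(toks, i + 1)
        if "that" in frames and i + 1 < len(toks) and toks[i + 1] == "that":
            ends |= s(toks, i + 2)
    return ends


def s(toks, i):
    """S --> NP VP, where the first verb of the VP must be finite."""
    ends = set()
    for j in np(toks, i):
        ends |= vp(toks, j, "fin")
    return ends


def accepts(sentence):
    words = sentence.rstrip(".").split()
    toks = [t if t in PROPER else t.lower() for t in words]
    return len(toks) in s(toks, 0)
```

Under these assumptions, accepts("Joe has to win.") returns True and accepts("Joe had win.") returns False. The sketch is only meant to separate List A from List B; it still over-generates elsewhere (for example, it accepts "Joe is having won"), so a fuller grammar would need to constrain the order of the auxiliaries.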

Part 2:

There are lots of answers that could be given here. A sample:

1. The clusters could be treated as word senses. A mapping to a dictionary of sense distinctions could be developed; given this mapping, the system could be evaluated on a corpus annotated according to the dictionary.

2. A lexical expert could judge the reasonableness of a random sample of the similarity judgements produced by the system.

3. The clusters of similar words could be evaluated by using them to improve the performance of another application. For example, they can be used to select parse trees from the set of ambiguous trees produced by a grammar, or to perform discourse segmentation. One needs a way to evaluate the application, of course.

4. The results could be compared to Roget's Thesaurus or WordNet.
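
A comparison against a reference resource, as in item 4, needs a score. One common choice is cluster purity: each induced cluster is credited with its most frequent reference class. The sketch below is a hypothetical illustration; the function name purity, the toy clusters, and the reference labels are invented, and no real thesaurus or WordNet API is used.

```python
from collections import Counter


def purity(clusters, gold):
    """Fraction of words that fall in the majority reference class of their cluster.

    clusters: list of lists of words produced by the system.
    gold: dict mapping each word to its reference class label.
    """
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(gold[w] for w in c).most_common(1)[0][1]
        for c in clusters
        if c
    )
    return majority / total


# Toy data (invented): two induced clusters and their reference classes.
clusters = [["car", "truck", "bus", "apple"], ["pear", "banana"]]
gold = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "apple": "fruit", "pear": "fruit", "banana": "fruit",
}
score = purity(clusters, gold)  # (3 + 2) / 6
```

Purity alone rewards many tiny clusters (one word per cluster scores 1.0), so in practice it would be reported alongside a complementary measure or the number of clusters.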

Part 3:

The main problem with the paper is that the author does not report results separately for ambiguous and unambiguous tokens. Because unambiguous tokens are trivially assigned the correct sense, pooling them with the ambiguous tokens artificially inflates the accuracy figure.
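
The size of this effect can be worked out with simple arithmetic. The sketch below assumes that every unambiguous token is scored as correct; the function name and the 80% share of unambiguous tokens are hypothetical figures chosen for illustration, not numbers from the paper in question.

```python
def accuracy_on_ambiguous(overall_accuracy, unambiguous_share):
    """Accuracy on ambiguous tokens implied by a pooled accuracy figure.

    Assumes every unambiguous token counts as correct, so that
    overall = unambiguous_share * 1.0 + (1 - unambiguous_share) * ambiguous.
    """
    ambiguous_share = 1.0 - unambiguous_share
    return (overall_accuracy - unambiguous_share) / ambiguous_share


# Hypothetical corpus: 80% of tokens have only one sense.
implied = accuracy_on_ambiguous(0.95, 0.80)  # (0.95 - 0.80) / 0.20
```

With these illustrative numbers, the reported 95% corresponds to roughly 75% accuracy on the tokens that actually require disambiguation.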

Another issue the student could discuss is how representative the corpus is.

The student could also question how fine-grained the sense distinctions are in the dictionary that is used. If the distinctions are too coarse-grained, the system might, in essence, simply be reporting the results of part-of-speech tagging.
 
 



Last modified: July 18, 2000.
gradrep@cs.nmsu.edu