NEW MEXICO STATE UNIVERSITY
Department of Computer Science

Computational Linguistics Qualifying Examination
August 23rd, 1999 -- 1-3pm

Part 1, Grammars (60%)

Write a grammar that accepts the sentences in list A and rejects the sentences in list B.

List A                                 List B

Joe is reading the book.               *Joe has reading the book.
Joe had won a letter.                  *Joe had win.
Joe has to win.                        *Joe winning.
Joe will have the letter.              *Joe will had the letter.
The man could have had the book.       *The man can have having the book.
Joe believes that John has to win.     *Joe won that John has to win.

Your grammar should not be tailored to these specific sentences; it should handle auxiliaries and tense/aspect in a general way. Do not worry about other linguistic features, however: it is enough that your grammar accepts the sentences in List A and rejects the sentences in List B.

The grammar formalism is up to you. For partial credit, include some documentation. In particular, give a list of the aspects of English your grammar takes care of. Then, briefly document your rules, referring to this list.
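As an illustration of what a general treatment of auxiliaries might look like, the following is a minimal sketch of a recognizer in Python. It is only one possible approach, not the expected answer. The lexicon, the verb-form labels (fin, base, ing, en), the subcategorization frames, and all function names are assumptions made for this example. The idea is that each auxiliary selects the form of the verb phrase that follows it: a modal selects a base form, "have" selects a past participle (or "to" plus a base form), and "be" selects a present participle. Subject-verb agreement is deliberately ignored.

```python
# Toy lexicon (hypothetical): word -> list of (lemma, form).
# Forms: "fin" (finite), "base", "ing" (present participle), "en" (past participle).
VERBS = {
    "is": [("be", "fin")],
    "has": [("have", "fin")],
    "had": [("have", "fin"), ("have", "en")],
    "have": [("have", "base")],
    "having": [("have", "ing")],
    "reading": [("read", "ing")],
    "win": [("win", "base")],
    "won": [("win", "fin"), ("win", "en")],
    "winning": [("win", "ing")],
    "believes": [("believe", "fin")],
}
MODALS = {"will", "can", "could"}  # modals are treated as finite only
NAMES = {"joe", "john"}
DETS = {"the", "a"}
NOUNS = {"book", "letter", "man"}
# Complement frames for main verbs: "none", "np", or "that_s" (that + sentence).
FRAMES = {
    "read": {"np"},
    "win": {"np", "none"},
    "have": {"np"},
    "believe": {"that_s"},
}


def parse_np(toks, i):
    """Return the set of positions where an NP starting at i can end."""
    ends = set()
    if i < len(toks) and toks[i] in NAMES:
        ends.add(i + 1)
    if i + 1 < len(toks) and toks[i] in DETS and toks[i + 1] in NOUNS:
        ends.add(i + 2)
    return ends


def parse_vp(toks, i, form):
    """Return end positions of a VP at i whose head verb has the given form."""
    ends = set()
    if i >= len(toks):
        return ends
    word = toks[i]
    # Modal: finite, selects a base-form VP.
    if form == "fin" and word in MODALS:
        ends |= parse_vp(toks, i + 1, "base")
    for lemma, verb_form in VERBS.get(word, []):
        if verb_form != form:
            continue
        if lemma == "be":  # progressive: be + V-ing
            ends |= parse_vp(toks, i + 1, "ing")
        if lemma == "have":
            ends |= parse_vp(toks, i + 1, "en")  # perfect: have + V-en
            if i + 1 < len(toks) and toks[i + 1] == "to":
                ends |= parse_vp(toks, i + 2, "base")  # have to + V
        # Main-verb readings, licensed by the verb's complement frames.
        frames = FRAMES.get(lemma, set())
        if "none" in frames:
            ends.add(i + 1)
        if "np" in frames:
            ends |= parse_np(toks, i + 1)
        if "that_s" in frames and i + 1 < len(toks) and toks[i + 1] == "that":
            ends |= parse_s(toks, i + 2)
    return ends


def parse_s(toks, i):
    """S -> NP VP[fin]."""
    ends = set()
    for j in parse_np(toks, i):
        ends |= parse_vp(toks, j, "fin")
    return ends


def accepts(sentence):
    """True if the whole sentence is recognized as an S."""
    toks = sentence.lower().rstrip(".").split()
    return len(toks) in parse_s(toks, 0)


LIST_A = [
    "Joe is reading the book.", "Joe had won a letter.", "Joe has to win.",
    "Joe will have the letter.", "The man could have had the book.",
    "Joe believes that John has to win.",
]
LIST_B = [
    "Joe has reading the book.", "Joe had win.", "Joe winning.",
    "Joe will had the letter.", "The man can have having the book.",
    "Joe won that John has to win.",
]
```

Calling accepts on each sentence of LIST_A and LIST_B gives a quick regression check for any change to the rules. In this sketch the last sentence of List B is ruled out by the complement frames rather than by the auxiliary system, since "win" is not listed as taking a "that" clause.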

Part 2, Evaluation of NLP Systems (40%)

1. (20%) Suppose that lexical resources are generated automatically by a system. Specifically, suppose the system learns clusters of words that are similar to one another, using a similarity metric and an unsupervised clustering technique. Discuss how the system could be evaluated.

2. (20%) Suppose that someone claims in a paper that their system achieves 95% accuracy ( $\frac{\text{number correct}}{\text{total number}}$ ) on the task of assigning a word sense to every word (token) that appears in a corpus.

Assume that the system was tested on unseen test data, and measured against a gold-standard set of manual annotations. Assume that the agreement among the human annotators is good, the researchers are honest, and the system does not contain bugs.

Shall we conclude that word-sense disambiguation is effectively solved? To answer this, write a critique of the evaluation.
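To make the accuracy measure in question 2 concrete, here is a minimal sketch in Python of how such a figure could be computed from system output and gold-standard annotations. The function name, the sense-label format, and the toy data are assumptions made for this example; they do not describe any particular system.

```python
def accuracy(predicted, gold):
    """Token-level accuracy: number correct / total number."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical toy data: one sense label per token.
gold_senses = ["bank/1", "river/1", "run/2", "bank/2"]
system_senses = ["bank/1", "river/1", "run/2", "bank/1"]

score = accuracy(system_senses, gold_senses)  # 3 of 4 tokens match
```

Note that this measure gives every token equal weight, regardless of how many senses the word has.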

Graduate Representative Account
2000-08-03