Marshall R. Mayberry, III, and Risto Miikkulainen (1994). Lexical Disambiguation Based on Distributed Representations of Context Frequency. In Proceedings of the 16th Annual Conference of the Cognitive Science Society, Atlanta, GA.
Here is the abstract:
A model for lexical disambiguation is presented that is based on combining the frequencies of past contexts of ambiguous words. The frequencies are encoded in the word representations and define the words' semantics. A Simple Recurrent Network (SRN) parser combines the context frequencies one word at a time, always producing the most likely interpretation of the current sentence at its output. This disambiguation process is most striking when the interpretation involves semantic flipping, that is, an alternation between two opposing meanings as more words are read in. The sense of ``throwing a ball'' alternates between ``dance'' and ``baseball'' as indicators such as the agent, location, and recipient are input. The SRN parser demonstrates how the context frequencies are dynamically combined to determine the interpretation of such sentences. We hypothesize that several other aspects of ambiguity resolution are based on similar mechanisms, and can be naturally approached from the distributed connectionist viewpoint.
The SRN has been trained on a corpus of 125 active and 125 passive sentences, each having three parameters: an agent, a location, and a recipient. Each of these parameters can take on one of five values. These are graduated from values that are strongly associated with the baseball sense of the word ``ball'', through values that are neutral with respect to the sense, and finally to values that are strongly associated with the dance sense of ``ball''. Thus, for example, if the passive sentence
The ball was thrown in the clubroom for the fans by the emcee.
is analyzed on a word-by-word basis, a hearer could be expected to anticipate the dance sense of ``ball'' upon hearing clubroom, which is moderately associated with that sense in the training corpus, then to switch over to the baseball sense upon encountering the word fans, which is more strongly associated with baseball, and finally back to dance when emcee is processed, because of that word's strong association with that sense of ``ball''.
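The corpus layout and the flipping in the example sentence can be illustrated with a small toy sketch. It is not the SRN from the paper: the signed scores (-2 for strongly baseball through +2 for strongly dance), the running sum, and the function name `running_sense` are all assumptions made for illustration, chosen only to match the description above (clubroom moderately dance, fans strongly baseball, emcee strongly dance).

```python
from itertools import product

# Hypothetical association levels: negative leans "baseball", positive leans "dance".
LEVELS = (-2, -1, 0, 1, 2)

# Each sentence fixes one of five values for each of the three parameters.
ROLES = ("agent", "location", "recipient")
combos = list(product(LEVELS, repeat=len(ROLES)))
corpus_size = {"active": len(combos), "passive": len(combos)}


def running_sense(scores):
    """Return the leaning after each cue, using a running sum of the scores.

    Summing scores is an assumed stand-in for the network's behavior,
    not the mechanism described in the paper.
    """
    total = 0
    senses = []
    for score in scores:
        total += score
        if total > 0:
            senses.append("dance")
        elif total < 0:
            senses.append("baseball")
        else:
            senses.append("neutral")
    return senses


# Cue order in the passive example: location, recipient, agent.
# The numeric scores are assumed from the prose, not taken from the paper.
example = [("clubroom", 1), ("fans", -2), ("emcee", 2)]
flips = running_sense([score for _, score in example])

print(corpus_size)  # {'active': 125, 'passive': 125}
print(flips)        # ['dance', 'baseball', 'dance']
```

Five values for each of three parameters give 5 x 5 x 5 = 125 combinations, which matches the 125 active and 125 passive sentences in the corpus.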
For the sake of explicitness, the words used in the lexicon have been handcoded (see the paper for details) so that the activation of the last unit of the word labeled by ``Ball'' in the output reveals its sense. This value is displayed under the ``Computed'' label at the bottom of the demo, together with the ``Predicted'' value based on the actual frequency of this context in the training corpus.
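One plausible reading of a frequency-based ``Predicted'' value is a relative frequency: among the training sentences that share the context seen so far, the fraction that carry a given sense. The sketch below shows only that idea. The toy rows, the word ``guests'', and the function name `predicted_dance` are invented for illustration and are not the paper's corpus or its exact computation.

```python
from collections import Counter

# Hypothetical toy training data: (location, recipient, sense) triples.
# These rows are invented for illustration; they are not the paper's corpus.
training = [
    ("clubroom", "fans", "dance"),
    ("clubroom", "fans", "baseball"),
    ("clubroom", "fans", "baseball"),
    ("clubroom", "guests", "dance"),
]


def predicted_dance(context):
    """Fraction of training rows starting with `context` whose sense is 'dance'.

    Returns None when no training row matches the context.
    """
    prefix = tuple(context)
    matches = [row for row in training if row[:len(prefix)] == prefix]
    if not matches:
        return None
    counts = Counter(row[-1] for row in matches)
    return counts["dance"] / len(matches)


after_location = predicted_dance(["clubroom"])
after_recipient = predicted_dance(["clubroom", "fans"])
print(after_location, after_recipient)
```

With these toy rows the estimate drops from one half to one third once the recipient is known, the same kind of shift the ``Computed'' value is compared against in the demo.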
Given this background, we can understand the components of the demo:
The blue line gives the predicted activation (and, therefore, sense of the word "ball") based on its frequency in the given context during training. The red line shows the activation the network actually learned. The activations of the units themselves are distributed across the spectrum with a rough breakout (depending on how many colors are actually allocated on your monitor) as follows:
martym@cs.utexas.edu Last update: 1.8 2000/06/24 04:27:45 jbednar