Special Issue on Empirical Natural Language Processing
AI Magazine,
Vol. 18, No. 4, Winter 1997
Abstracts for papers in the collection edited by Eric Brill and Raymond
Mooney
An Overview of Empirical Natural Language Processing
Eric Brill and
Raymond J. Mooney
18(4): Winter 1997, 13-24
In recent years, there has been a resurgence in research
on empirical methods in natural language
processing. These methods employ learning techniques
to automatically extract linguistic knowledge
from natural language corpora rather than require
the system developer to manually encode
the requisite knowledge. The current special issue
reviews recent research in empirical methods in
speech recognition, syntactic parsing, semantic
processing, information extraction, and machine
translation. This article presents an introduction
to the series of specialized articles on these topics
and attempts to describe and explain the growing
interest in using learning methods to aid the development
of natural language processing systems.
Linguistic Knowledge and Empirical Methods in Speech Recognition
Andreas Stolcke
18(4): Winter 1997, 25-32
Automatic speech recognition is one of the fastest-growing
and most commercially promising applications
of natural language technology. The technology
has reached a point where carefully
designed systems for suitably constrained applications
are a reality. Commercial systems are available
today for such tasks as large-vocabulary dictation
and voice control of medical equipment. This
article reviews how state-of-the-art speech-recognition
systems combine statistical modeling, linguistic
knowledge, and machine learning to achieve
their performance and points out some of the
research issues in the field.
Statistical Techniques for Natural Language Parsing
Eugene Charniak
18(4): Winter 1997, 33-44
I review current statistical work on syntactic parsing
and then consider part-of-speech tagging,
which was the first syntactic problem to be successfully
attacked by statistical techniques and which also
serves as a good warm-up for the main topic: statistical
parsing. Here, I consider both the simplified
case in which the input string is viewed as a string
of parts of speech and the more interesting case in
which the parser is guided by statistical information
about the particular words in the sentence. Finally,
I anticipate future research directions.
Corpus-Based Approaches to Semantic Interpretation in NLP
Hwee Tou Ng and John Zelle
18(4): Winter 1997, 45-64
In recent years, there has been a flurry of research
into empirical, corpus-based learning approaches
to natural language processing (NLP). Most empirical
NLP work to date has focused on relatively
low-level language processing such as part-of-speech
tagging, text segmentation, and syntactic
parsing. The success of these approaches has stimulated
research in using empirical learning techniques
in other facets of NLP, including semantic
analysis: uncovering the meaning of an utterance.
This article is an introduction to some of the
emerging research in the application of corpus-based
learning techniques to problems in semantic
interpretation. In particular, we focus on two important
problems in semantic interpretation,
namely, word-sense disambiguation and semantic
parsing.
Empirical Methods in Information Extraction
Claire
Cardie
18(4): Winter 1997, 65-80
This article surveys the use of empirical, machine-learning
methods for a particular natural language
understanding task: information extraction.
The author presents a generic architecture for
information-extraction systems and then surveys
the learning algorithms that have been developed
to address the problems of accuracy, portability,
and knowledge acquisition for each component of
the architecture.
Automating Knowledge Acquisition for Machine Translation
Kevin Knight
18(4): Winter 1997, 81-96
Machine translation of human languages (for example,
Japanese, English, Spanish) was one of the
earliest goals of computer science research, and it
remains an elusive one. Like many AI tasks, translation
requires an immense amount of knowledge
about language and the world. Recent approaches
to machine translation frequently make use of
text-based learning algorithms to fully or partially
automate the acquisition of knowledge. This article
illustrates these approaches.