Natural language understanding is a subfield of natural language processing
that builds automated systems to understand natural language.
It is such an ambitious task that it is sometimes referred to as an AI-complete problem,
implying that its difficulty is equivalent to solving the central
artificial intelligence problem -- making computers as intelligent as people.
Despite its complexity, natural language understanding remains a fundamental problem
in natural language processing,
of both theoretical and empirical importance.
In recent years, startling progress has been made on natural language processing tasks at different levels,
providing a great opportunity for deeper natural language understanding.
In this thesis, we focus on the task of semantic parsing, which maps a natural language sentence into a
complete, formal meaning representation in a meaning
representation language.
We present two novel, state-of-the-art learned syntax-based semantic parsers built on
statistical syntactic parsing techniques.
This choice is motivated by two considerations.
First, syntax-based semantic parsing is theoretically well-founded
in computational semantics.
Second, adopting a syntax-based approach allows us to directly leverage
the enormous progress made in statistical syntactic parsing.
The first semantic parser, SCISSOR,
adopts an integrated syntactic-semantic parsing approach,
in which a statistical syntactic parser
is augmented with semantic parameters to produce a semantically-augmented parse tree (SAPT).
This integrated approach makes both syntactic and semantic information
available during parsing,
yielding an accurate combined syntactic-semantic analysis.
The performance of SCISSOR is further improved by using discriminative reranking to incorporate
non-local features.
The second semantic parser, SYNSEM,
exploits an existing syntactic parser to produce disambiguated parse trees that
drive the compositional semantic interpretation.
This pipeline approach allows semantic parsing to conveniently leverage the most recent
progress in statistical syntactic parsing.
We report experimental results
on two real applications: an interpreter for coaching instructions in
robotic soccer and a natural-language database interface,
showing that
SCISSOR and SYNSEM improve over other systems
mainly on long sentences, where knowledge of syntax,
given in the form of annotated SAPTs or syntactic parses from an existing parser, helps semantic composition.
SYNSEM also significantly improves results with limited training
data, and is shown to be robust to syntactic errors.
PhD Thesis, Department of Computer Science, University of Texas at Austin. 165 pages.
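
To make the idea of composing a meaning representation from a semantically-augmented parse tree concrete, the following is a minimal sketch and not the thesis's actual algorithm. It assumes a toy tree in which each node carries a syntactic label and an optional semantic label, and a simplified composition rule in which a node's semantic label is applied to the meanings of its children. The Node class, the compose function, the labels, and the query-style output format are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a toy semantically-augmented parse tree (illustrative only)."""
    syn: str                                   # syntactic label, e.g. "NP"
    sem: Optional[str] = None                  # hypothetical semantic label, or None
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                 # surface word for leaf nodes


def compose(node: Node) -> Optional[str]:
    """Build a meaning representation bottom-up from the tree.

    Simplified rule (an assumption of this sketch): a node's semantic label
    acts as a predicate over the meanings of its children; nodes without a
    semantic label pass their single meaningful child upward.
    """
    args = [m for m in (compose(c) for c in node.children) if m is not None]
    if node.sem is None:
        if not args:
            return None                        # semantically vacuous subtree
        if len(args) == 1:
            return args[0]                     # pass the only meaning upward
        raise ValueError(f"cannot combine {args} under unlabeled node {node.syn}")
    if not args:
        return node.sem                        # constant or argument-less predicate
    return f"{node.sem}({', '.join(args)})"


# Toy tree for "what is the capital of texas" (labels are assumptions).
tree = Node("S", "answer", [
    Node("WHNP", word="what"),
    Node("VP", children=[
        Node("VBZ", word="is"),
        Node("NP", "capital", [
            Node("NP", children=[Node("DT", word="the"), Node("NN", word="capital")]),
            Node("PP", "loc_2", [
                Node("IN", word="of"),
                Node("NP", "stateid('texas')", word="texas"),
            ]),
        ]),
    ]),
])

mr = compose(tree)
print(mr)  # answer(capital(loc_2(stateid('texas'))))
```

In this sketch the syntactic structure determines which meanings combine, which is the sense in which syntax can help semantic composition on longer sentences. A learned parser would additionally have to choose among many candidate trees and labels, which this toy example does not attempt.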