UTCS Artificial Intelligence
Statistical Relational Learning for Natural Language Information Extraction (2007)
Razvan Bunescu and Raymond J. Mooney
Understanding natural language presents many challenging problems that lend themselves to statistical relational learning (SRL). Historically, both logical and probabilistic methods have found wide application in natural language processing (NLP). NLP inevitably involves reasoning about an arbitrary number of entities (people, places, and things) that have an unbounded set of complex relationships between them. Representing and reasoning about unbounded sets of entities and relations has generally been considered a strength of predicate logic. However, NLP also requires integrating uncertain evidence from a variety of sources in order to resolve numerous syntactic and semantic ambiguities. Effectively integrating multiple sources of uncertain evidence has generally been considered a strength of Bayesian probabilistic methods and graphical models. Consequently, NLP problems are particularly suited for SRL methods that combine the strengths of first-order predicate logic and probabilistic graphical models. In this article, we review our recent work on using Relational Markov Networks (RMNs) for information extraction, the problem of identifying phrases in natural language text that refer to specific types of entities. We use the expressive power of RMNs to represent and reason about several specific relationships between candidate entities and thereby collectively identify the appropriate set of phrases to extract. We present experiments on learning to extract protein names from biomedical text, which demonstrate the advantage of this approach over existing IE methods.
View: PDF, PS
Citation: In Introduction to Statistical Relational Learning, L. Getoor and B. Taskar (Eds.), pp. 535-552, Cambridge, MA, 2007. MIT Press.
Bibtex:
@InCollection{bunescu:bkchapter07b,
  title={Statistical Relational Learning for Natural Language Information Extraction},
  author={Razvan Bunescu and Raymond J. Mooney},
  booktitle={Introduction to Statistical Relational Learning},
  editor={L. Getoor and B. Taskar},
  address={Cambridge, MA},
  publisher={MIT Press},
  pages={535-552},
  url="http://www.cs.utexas.edu/users/ai-lab?bunescu:bkchapter07b",
  year={2007}
}
People
Razvan Bunescu
Ph.D. Alumni
bunescu [at] ohio edu
Raymond J. Mooney
Faculty
mooney [at] cs utexas edu
Areas of Interest
Information Extraction
Machine Learning
Statistical Relational Learning
Labs
Machine Learning