UTCS Artificial Intelligence
Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction (2003)
Mary Elaine Califf and Raymond J. Mooney
Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for machine learning. We present a system, RAPIER, that uses pairs of sample documents and filled templates to induce pattern-match rules that directly extract fillers for the slots in the template. RAPIER employs a bottom-up learning algorithm that incorporates techniques from several inductive logic programming systems and acquires unbounded patterns that include constraints on the words, part-of-speech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.
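As a rough illustration of the kind of rule the abstract describes, the sketch below represents a pattern-match rule as three sequences of per-token constraints (text before the filler, the filler itself, and text after it), where each constraint may restrict the word, the part-of-speech tag, or the semantic class. This is not the paper's implementation: the names `Token`, `Constraint`, `Rule`, and `extract`, the example sentence, its tags, and the `state` semantic class are all assumptions made for illustration, and each constraint here matches exactly one token, which is a simplification. It shows only how a rule of this shape is applied, not how one is learned.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Token:
    word: str
    pos: str                    # part-of-speech tag
    sem: Optional[str] = None   # semantic class, if one is known


@dataclass(frozen=True)
class Constraint:
    """Constrains a single token; an empty set means 'no restriction'."""
    words: FrozenSet[str] = frozenset()
    pos_tags: FrozenSet[str] = frozenset()
    sem_classes: FrozenSet[str] = frozenset()

    def matches(self, tok: Token) -> bool:
        return (
            (not self.words or tok.word.lower() in self.words)
            and (not self.pos_tags or tok.pos in self.pos_tags)
            and (not self.sem_classes or tok.sem in self.sem_classes)
        )


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: pre-filler, filler, and post-filler patterns."""
    slot: str
    pre: Tuple[Constraint, ...]
    filler: Tuple[Constraint, ...]
    post: Tuple[Constraint, ...]

    def extract(self, tokens: Sequence[Token]) -> List[str]:
        pattern = self.pre + self.filler + self.post
        start, end = len(self.pre), len(self.pre) + len(self.filler)
        fillers = []
        # Slide the whole pattern over the document, one token at a time.
        for i in range(len(tokens) - len(pattern) + 1):
            window = tokens[i:i + len(pattern)]
            if all(c.matches(t) for c, t in zip(pattern, window)):
                fillers.append(" ".join(t.word for t in window[start:end]))
        return fillers


# Illustrative rule for a hypothetical "city" slot:
# the word "in", then a proper noun (the filler), then "," and a state name.
rule = Rule(
    slot="city",
    pre=(Constraint(words=frozenset({"in"})),),
    filler=(Constraint(pos_tags=frozenset({"NNP"})),),
    post=(
        Constraint(words=frozenset({","})),
        Constraint(sem_classes=frozenset({"state"})),
    ),
)

tokens = [
    Token("located", "VBN"),
    Token("in", "IN"),
    Token("Atlanta", "NNP"),
    Token(",", ","),
    Token("Georgia", "NNP", sem="state"),
    Token(".", "."),
]

print(rule.extract(tokens))
```

Leaving a constraint set empty lets one rule mix specific and general conditions, for example fixing the word before the filler while requiring only a tag or a semantic class elsewhere.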
View: PDF, PS
Citation: Journal of Machine Learning Research (2003), pp. 177-210.
Bibtex:
@Article{califf:jmlr03,
  title={Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction},
  author={Mary Elaine Califf and Raymond J. Mooney},
  journal={Journal of Machine Learning Research},
  key={Rapier},
  pages={177-210},
  url="http://www.cs.utexas.edu/users/ai-lab?califf:jmlr03",
  year={2003}
}
People
Mary Elaine Califf, Ph.D. Alumni, mecaliff [at] ilstu edu
Raymond J. Mooney, Faculty, mooney [at] cs utexas edu
Areas of Interest
Inductive Logic Programming
Information Extraction
Machine Learning
Labs
Machine Learning