UTCS Artificial Intelligence
Relational Learning of Pattern-Match Rules for Information Extraction (1998)
Mary Elaine Califf and Raymond J. Mooney
Information extraction is a form of shallow text processing which locates a specified set of relevant items in natural language documents. Such systems can be useful, but require domain-specific knowledge and rules, and are time-consuming and difficult to build by hand, making information extraction a good testbed for the application of machine learning techniques to natural language processing. This paper presents a system, RAPIER, that takes pairs of documents and filled templates and induces pattern-match rules that directly extract fillers for the slots in the template. The learning algorithm incorporates techniques from several inductive logic programming systems and learns unbounded patterns that include constraints on the words and part-of-speech tags surrounding the filler. Encouraging results are presented on learning to extract information from computer job postings from the newsgroup misc.jobs.offered.
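To make the idea of a pattern-match rule concrete, the following is a minimal illustrative sketch, not RAPIER's actual rule representation or learning algorithm. It assumes a rule consists of three patterns (before the filler, the filler itself, and after the filler), each a fixed-length list of per-token constraints on words and part-of-speech tags. The names `Element`, `Rule`, and `extract`, the example sentence, and its tags are all hypothetical; the fixed-length matching is a simplification of the unbounded patterns mentioned above, and no learning is shown.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


@dataclass(frozen=True)
class Element:
    """Constraint on one token; a None field means 'unconstrained'."""
    words: Optional[FrozenSet[str]] = None
    tags: Optional[FrozenSet[str]] = None

    def matches(self, token: Token) -> bool:
        word, tag = token
        if self.words is not None and word.lower() not in self.words:
            return False
        if self.tags is not None and tag not in self.tags:
            return False
        return True


@dataclass
class Rule:
    """Hypothetical rule: context before, the filler, context after."""
    slot: str
    pre: List[Element]
    filler: List[Element]
    post: List[Element]

    def extract(self, tokens: List[Token]) -> List[str]:
        elements = self.pre + self.filler + self.post
        start = len(self.pre)
        end = start + len(self.filler)
        found = []
        # Slide a window over the document and test every constraint.
        for i in range(len(tokens) - len(elements) + 1):
            window = tokens[i:i + len(elements)]
            if all(e.matches(t) for e, t in zip(elements, window)):
                found.append(" ".join(w for w, _ in window[start:end]))
        return found


# Made-up job-posting fragment with assumed POS tags.
tokens = [("Requires", "VBZ"), ("experience", "NN"), ("in", "IN"),
          ("Java", "NNP"), ("programming", "NN"), (".", ".")]

rule = Rule(
    slot="language",
    pre=[Element(words=frozenset({"in"}))],
    filler=[Element(tags=frozenset({"NNP"}))],
    post=[Element(words=frozenset({"programming"}))],
)

print(rule.extract(tokens))
```

In this sketch the filler is constrained only by its tag, while the surrounding context is constrained by specific words, which mirrors the abstract's description of constraints on the words and tags around the filler.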
Citation:
In Proceedings of AAAI Spring Symposium on Applying Machine Learning to Discourse Processing, pp. 6-11, Stanford, CA, March 1998.
Bibtex:
@InProceedings{califf:aaai-amldp98,
  title={Relational Learning of Pattern-Match Rules for Information Extraction},
  author={Mary Elaine Califf and Raymond J. Mooney},
  booktitle={Proceedings of AAAI Spring Symposium on Applying Machine Learning to Discourse Processing},
  month={March},
  address={Stanford, CA},
  pages={6-11},
  url="http://www.cs.utexas.edu/users/ai-lab?califf:aaai-amldp98",
  year={1998}
}
People
Mary Elaine Califf
Ph.D. Alumni
mecaliff [at] ilstu edu
Raymond J. Mooney
Faculty
mooney [at] cs utexas edu
Areas of Interest
Inductive Logic Programming
Information Extraction
Machine Learning
Labs
Machine Learning