UTCS Artificial Intelligence
Using Information Extraction to Aid the Discovery of Prediction Rules from Text (2000)
Un Yong Nahm and Raymond J. Mooney
Text mining and information extraction (IE) are both topics of significant recent interest. Text mining concerns applying data mining, a.k.a. knowledge discovery from databases (KDD), techniques to unstructured text. Information extraction is a form of shallow text understanding that locates specific pieces of data in natural-language documents, transforming unstructured text into a structured database. This paper describes a system called DiscoTEX that combines IE and KDD methods to perform a text mining task: discovering prediction rules from natural-language corpora. An initial version of DiscoTEX is constructed by integrating an IE module based on Rapier and a rule-learning module, Ripper. We present encouraging results on applying these techniques to a corpus of computer job postings from an Internet newsgroup.
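The two-stage idea in the abstract (extract structured records from text, then mine rules over those records) can be illustrated with a minimal sketch. This is not DiscoTEX, Rapier, or Ripper: the keyword patterns below stand in for a learned extractor, and a simple support/confidence count stands in for a rule learner. The slot names, the sample postings, and the helper names extract_slots and mine_rules are all hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword patterns standing in for a learned extractor (not Rapier).
SLOT_PATTERNS = {
    "language": re.compile(r"\b(Java|Perl|SQL|HTML)\b", re.IGNORECASE),
    "platform": re.compile(r"\b(Windows|Unix|Linux)\b", re.IGNORECASE),
    "area": re.compile(r"\b(database|web|networking)\b", re.IGNORECASE),
}


def extract_slots(text):
    """Stage 1 (IE): map one free-text posting to {slot: set of lowercase fillers}."""
    return {
        slot: {match.lower() for match in pattern.findall(text)}
        for slot, pattern in SLOT_PATTERNS.items()
    }


def mine_rules(records, min_support=2, min_confidence=0.8):
    """Stage 2 (KDD): find rules (slot, filler) -> (slot, filler).

    A plain co-occurrence count with support and confidence thresholds,
    used here as a stand-in for a real rule learner (not Ripper).
    """
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = sorted(
            (slot, filler) for slot, fillers in record.items() for filler in fillers
        )
        item_counts.update(items)
        for antecedent in items:
            for consequent in items:
                if antecedent != consequent:
                    pair_counts[(antecedent, consequent)] += 1
    rules = []
    for (antecedent, consequent), support in pair_counts.items():
        confidence = support / item_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Made-up postings, loosely in the style of a job newsgroup.
POSTINGS = [
    "Seeking Java developer for web applications on Unix.",
    "Web developer needed: Java and HTML, Windows platform.",
    "Database administrator with SQL experience on Unix.",
    "SQL and Perl programmer for database work, Linux.",
]

RECORDS = [extract_slots(posting) for posting in POSTINGS]
RULES = mine_rules(RECORDS)

for antecedent, consequent, support, confidence in RULES:
    print(f"{antecedent} -> {consequent} (support={support}, confidence={confidence:.2f})")
```

On this toy data the surviving rules link a language filler to an area filler (for example, a posting that mentions Java also mentions web work). A real system would learn the extraction patterns from annotated examples and would evaluate the rules on held-out documents.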
View: PDF, PS
Citation:
In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (KDD-2000) Workshop on Text Mining, pp. 51--58, Boston, MA, August 2000.
Bibtex:
@inproceedings{nahm:kdd00,
  title={Using Information Extraction to Aid the Discovery of Prediction Rules from Text},
  author={Un Yong Nahm and Raymond J. Mooney},
  booktitle={Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (KDD-2000) Workshop on Text Mining},
  month={August},
  address={Boston, MA},
  pages={51--58},
  url="http://www.cs.utexas.edu/users/ai-lab?nahm:kdd00",
  year={2000}
}
People
Raymond J. Mooney, Faculty, mooney [at] cs utexas edu
Un Yong Nahm, Ph.D. Alumni, pebronia [at] acm org
Areas of Interest
Machine Learning
Text Data Mining
Labs
Machine Learning