The C++ code for RAPIER is available via anonymous FTP. See the README file for details. Pointers to papers on RAPIER can be found on our Natural Language Learning research page. Below is the standard reference.
Proceedings of the Sixteenth National Conference on Artificial
Intelligence (AAAI-99), Orlando, FL, pp. 328-334, July 1999.
Information extraction is a form of shallow text processing that locates a
specified set of relevant items in a natural-language document. Systems for
this task require significant domain-specific knowledge and are time-consuming
and difficult to build by hand, making them a good application for machine
learning. This paper presents a system, RAPIER, that takes pairs of
sample documents and filled templates and induces pattern-match rules that
directly extract fillers for the slots in the template. RAPIER employs
a bottom-up learning algorithm which incorporates techniques from several
inductive logic programming systems and acquires unbounded patterns that
include constraints on the words, part-of-speech tags, and semantic classes
present in the filler and the surrounding text. We present encouraging
experimental results on two domains.
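
To make the idea of a pattern-match rule concrete, here is a minimal Python sketch of how such a rule might be applied to a tagged sentence. It is an illustration only, not the RAPIER code or its rule format: the names `Token`, `Constraint`, `Rule`, and `extract` are hypothetical, the three-part split into text before the filler, the filler, and text after it is an assumption drawn from the abstract's mention of "the filler and the surrounding text", and each constraint matches exactly one token, whereas the abstract describes unbounded patterns. The learning algorithm itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    word: str
    pos: str                    # part-of-speech tag
    sem: Optional[str] = None   # semantic class, if one is known


@dataclass(frozen=True)
class Constraint:
    """Constrains a single token; an empty set means 'anything is allowed'."""
    words: frozenset = frozenset()
    pos_tags: frozenset = frozenset()
    sem_classes: frozenset = frozenset()

    def matches(self, tok: Token) -> bool:
        if self.words and tok.word.lower() not in self.words:
            return False
        if self.pos_tags and tok.pos not in self.pos_tags:
            return False
        if self.sem_classes and tok.sem not in self.sem_classes:
            return False
        return True


@dataclass(frozen=True)
class Rule:
    """Hypothetical three-part rule: context before, filler, context after."""
    before: Tuple[Constraint, ...]
    filler: Tuple[Constraint, ...]
    after: Tuple[Constraint, ...]


def extract(rule: Rule, tokens: List[Token]) -> List[str]:
    """Slide the whole pattern over the tokens and return each matched filler."""
    pattern = rule.before + rule.filler + rule.after
    start = len(rule.before)
    end = start + len(rule.filler)
    fillers = []
    for i in range(len(tokens) - len(pattern) + 1):
        window = tokens[i:i + len(pattern)]
        if all(c.matches(t) for c, t in zip(pattern, window)):
            fillers.append(" ".join(t.word for t in window[start:end]))
    return fillers


# Toy input: a hand-tagged fragment (the tags and classes are made up here).
tokens = [
    Token("located", "VBN"),
    Token("in", "IN"),
    Token("Atlanta", "NNP", "city"),
    Token(",", ","),
    Token("Georgia", "NNP", "state"),
]

# Toy rule: a proper noun that follows the word "in" and precedes a comma.
rule = Rule(
    before=(Constraint(words=frozenset({"in"})),),
    filler=(Constraint(pos_tags=frozenset({"NNP"})),),
    after=(Constraint(words=frozenset({","})),),
)

print(extract(rule, tokens))  # prints ['Atlanta']
```

In this sketch the surrounding-text constraints only decide whether a match is accepted; just the tokens under the filler constraints are returned as the slot value.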