ELIXIR


ELIXIR is a library for writing wrappers in Java.

The code for ELIXIR is available via anonymous ftp. Pointers to papers on ELIXIR can be found on our Information Extraction research page. The standard reference is given below.

  • ELIXIR: A Library for Writing Wrappers in Java
    Edward Wild
    Undergraduate Honor Thesis, Department of Computer Sciences, University of Texas at Austin, Dec 2001.

    ELIXIR is a library for writing wrappers in Java. ELIXIR provides a way to combine text extraction and spidering in wrappers. Since wrappers using ELIXIR are Java programs, they are easy to integrate with other Java programs. The user can also extend the functionality of ELIXIR by implementing new ItemExtractors. In one experiment, a wrapper written using ELIXIR showed an 89% reduction in non-comment source statements compared with a wrapper written using a prototype of ELIXIR. In another experiment, a wrapper written using ELIXIR showed a 90% reduction in non-comment source statements compared with a wrapper written using SPHINX, a Java toolkit for writing spiders.
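
The abstract mentions extending ELIXIR by implementing new ItemExtractors. As a rough illustration of that idea only, the sketch below defines a hypothetical ItemExtractor interface and one regex-based implementation. The interface shape, the extract method, and the class names RegexItemExtractor and ExtractorSketch are assumptions made for this example; they are not taken from ELIXIR, whose actual API is described in the thesis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// HYPOTHETICAL interface: ELIXIR's real ItemExtractor signature is not
// given on this page, so this shape is an assumption for illustration.
interface ItemExtractor {
    // Returns the items found in the text of one page.
    List<String> extract(String page);
}

// One possible extractor: collects capture group 1 of every regex match.
class RegexItemExtractor implements ItemExtractor {
    private final Pattern pattern;

    RegexItemExtractor(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public List<String> extract(String page) {
        List<String> items = new ArrayList<>();
        Matcher m = pattern.matcher(page);
        while (m.find()) {
            items.add(m.group(1));
        }
        return items;
    }
}

class ExtractorSketch {
    public static void main(String[] args) {
        String page = "<a href=\"a.html\">A</a> <a href=\"b.html\">B</a>";
        // Extract the target of every href attribute on the page.
        ItemExtractor links = new RegexItemExtractor("href=\"([^\"]*)\"");
        System.out.println(links.extract(page));
    }
}
```

Because the wrapper is ordinary Java, an extractor like this can be passed around, composed, or unit-tested the same way as any other object, which is the kind of integration benefit the abstract describes.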


    mooney@cs.utexas.edu