weka.extraction
Class Extractor

java.lang.Object
  extended by weka.extraction.Extractor
All Implemented Interfaces:
java.lang.Cloneable
Direct Known Subclasses:
ClusteringExtractor

public abstract class Extractor
extends java.lang.Object
implements java.lang.Cloneable

An abstract extractor class. An extractor is trained on a set of objects and can then be used to perform extraction on a test set.
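
The train-then-test life cycle described above can be sketched roughly as follows. This is a minimal illustration, not the Weka implementation: `Docs` is a hypothetical stand-in for `Instances`, `SketchExtractor` only mirrors the members documented on this page, and `KnownWordExtractor` is an invented toy subclass. The generic type parameters are also assumptions, since the documented API uses raw collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExtractorLifecycleSketch {

    /** Hypothetical stand-in for Weka's Instances: a plain list of document texts. */
    public static class Docs {
        final List<String> texts;
        public Docs(String... texts) { this.texts = Arrays.asList(texts); }
    }

    /** Stand-in that mirrors the members documented for Extractor. */
    public abstract static class SketchExtractor {
        protected ArrayList<Object[]> m_statistics = new ArrayList<>();

        public abstract void trainExtractor(Docs labeledData, Docs unlabeledData)
                throws Exception;

        public abstract void testExtractor(
                Docs testData,
                HashMap<String, HashMap<String, List<Integer>>> docFillerMap)
                throws Exception;

        public ArrayList<Object[]> getStatistics() { return m_statistics; }
    }

    /** Toy subclass: remembers training words, then counts known words per test document. */
    public static class KnownWordExtractor extends SketchExtractor {
        private final Set<String> known = new HashSet<>();

        @Override
        public void trainExtractor(Docs labeledData, Docs unlabeledData) {
            // unlabeledData is ignored: this toy extractor is not transductive.
            for (String text : labeledData.texts) {
                known.addAll(Arrays.asList(text.split("\\s+")));
            }
        }

        @Override
        public void testExtractor(
                Docs testData,
                HashMap<String, HashMap<String, List<Integer>>> docFillerMap) {
            // docFillerMap is accepted but unused in this sketch.
            for (int i = 0; i < testData.texts.size(); i++) {
                int hits = 0;
                for (String word : testData.texts.get(i).split("\\s+")) {
                    if (known.contains(word)) hits++;
                }
                // One statistics row per test document: {document index, known-word count}.
                m_statistics.add(new Object[] {i, hits});
            }
        }
    }

    /** Train, test, then read back the collected statistics. */
    public static ArrayList<Object[]> run() {
        KnownWordExtractor extractor = new KnownWordExtractor();
        extractor.trainExtractor(new Docs("Acme hired Alice"), new Docs());
        extractor.testExtractor(new Docs("Alice left Acme", "Bob stayed"), new HashMap<>());
        return extractor.getStatistics();
    }

    public static void main(String[] args) {
        for (Object[] row : run()) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The row layout stored in `m_statistics` here is invented for the sketch; the page itself only says the list holds Object arrays.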


Field Summary
protected  java.util.ArrayList m_statistics
          An ArrayList of Object arrays containing statistics.
 
Constructor Summary
Extractor()
           
 
Method Summary
static Extractor forName(java.lang.String extractorName, java.lang.String[] options)
          A helper method that returns an Extractor for the given name and options; may be needed by command-line Weka.
 java.util.ArrayList getStatistics()
          Returns the list of statistics collected during extraction.
abstract  void testExtractor(Instances testData, java.util.HashMap docFillerMap)
          Perform extraction on a set of data.
abstract  void trainExtractor(Instances labeledData, Instances unlabeledData)
          Trains the extractor on the given training data.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_statistics

protected java.util.ArrayList m_statistics
An ArrayList of Object arrays containing statistics.

Constructor Detail

Extractor

public Extractor()
Method Detail

trainExtractor

public abstract void trainExtractor(Instances labeledData,
                                    Instances unlabeledData)
                             throws java.lang.Exception
Trains the extractor on the given training data.

Parameters:
labeledData - a set of training data
unlabeledData - a set of unlabeled data; used only by extractors that implement transductive learning
Throws:
java.lang.Exception

testExtractor

public abstract void testExtractor(Instances testData,
                                   java.util.HashMap docFillerMap)
                            throws java.lang.Exception
Perform extraction on a set of data.

Parameters:
testData - a set of instances on which to perform extraction
docFillerMap - a map from the uniqueID of each instance (document) to a HashMap, which in turn maps each filler to a list of Integer positions
Throws:
java.lang.Exception
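
The nested shape of docFillerMap can be illustrated with a small sketch. It assumes that uniqueIDs and fillers are both Strings, which this page does not state; the documented parameter is a raw java.util.HashMap. The class name `DocFillerMapSketch`, the helper `addFiller`, and the sample IDs and fillers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class DocFillerMapSketch {

    /** Records that the given filler occurs at the given position in document docId. */
    public static void addFiller(
            HashMap<String, HashMap<String, List<Integer>>> docFillerMap,
            String docId, String filler, int position) {
        docFillerMap
            .computeIfAbsent(docId, k -> new HashMap<>())     // document -> filler map
            .computeIfAbsent(filler, k -> new ArrayList<>())  // filler -> positions
            .add(position);
    }

    /** Builds a small example map covering two documents. */
    public static HashMap<String, HashMap<String, List<Integer>>> buildExample() {
        HashMap<String, HashMap<String, List<Integer>>> map = new HashMap<>();
        addFiller(map, "doc-001", "Acme Corp", 12);
        addFiller(map, "doc-001", "Acme Corp", 57);
        addFiller(map, "doc-001", "Austin", 30);
        addFiller(map, "doc-002", "Globex", 4);
        return map;
    }

    public static void main(String[] args) {
        HashMap<String, HashMap<String, List<Integer>>> map = buildExample();
        System.out.println(map.get("doc-001").get("Acme Corp")); // prints [12, 57]
    }
}
```

A generically typed HashMap like this one can be passed where a raw java.util.HashMap is expected, though the compiler issues no check that the callee honors the element types.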

getStatistics

public java.util.ArrayList getStatistics()
Returns the list of statistics collected during extraction.


forName

public static Extractor forName(java.lang.String extractorName,
                                java.lang.String[] options)
                         throws java.lang.Exception
A helper method that returns an Extractor for the given name and options; may be needed by command-line Weka.

Throws:
java.lang.Exception
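
This page does not say how forName resolves the name. One common pattern for such a factory is reflective instantiation by class name, sketched below under that assumption. `SketchExtractor`, `DummyExtractor`, `setOptions`, and `getOptions` are hypothetical stand-ins and are not part of the documented API.

```java
public class ForNameSketch {

    /** Hypothetical stand-in for weka.extraction.Extractor. */
    public abstract static class SketchExtractor {
        private String[] options = new String[0];

        public void setOptions(String[] options) { this.options = options.clone(); }

        public String[] getOptions() { return options.clone(); }
    }

    /** Hypothetical concrete extractor with a public no-argument constructor. */
    public static class DummyExtractor extends SketchExtractor { }

    /**
     * Loads the named class, checks that it is an extractor, instantiates it
     * through its no-argument constructor, and hands it the options.
     */
    public static SketchExtractor forName(String extractorName, String[] options)
            throws Exception {
        Class<?> cls = Class.forName(extractorName);
        if (!SketchExtractor.class.isAssignableFrom(cls)) {
            throw new Exception(extractorName + " is not an extractor");
        }
        SketchExtractor extractor =
                (SketchExtractor) cls.getDeclaredConstructor().newInstance();
        if (options != null) {
            extractor.setOptions(options);
        }
        return extractor;
    }

    public static void main(String[] args) throws Exception {
        SketchExtractor extractor =
                forName(DummyExtractor.class.getName(), new String[] {"-N", "3"});
        System.out.println(extractor.getClass().getSimpleName()); // prints DummyExtractor
    }
}
```

In this sketch an unknown class name surfaces as a ClassNotFoundException, which is consistent with the documented `throws java.lang.Exception` but is not confirmed by the source.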