weka.deduping
Class DedupingEvaluation

java.lang.Object
  extended by weka.deduping.DedupingEvaluation

public class DedupingEvaluation
extends java.lang.Object

Class for evaluating the performance of a Deduper.


Field Summary
protected  double[][] m_ConfusionMatrix
          Array for storing the confusion matrix.
protected  int m_numClusters
          The number of produced clusters
protected  Instances m_testInstances
          Test instances
protected  Instances m_trainInstances
          Training instances
 
Constructor Summary
DedupingEvaluation()
          A default constructor
 
Method Summary
protected  int countPresentClasses(Instances instances)
          A helper function that determines how many classes are actually represented in an Instances object
 java.util.ArrayList evaluateModel(Deduper deduper, Instances testInstances)
          Evaluates the deduper on a given set of test instances
 java.lang.String globalInfo()
          Returns a string describing this evaluator
 void trainDeduper(Deduper deduper, Instances trainingData, Instances testData)
          Trains a deduper on the supplied data
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_numClusters

protected int m_numClusters
The number of produced clusters


m_trainInstances

protected Instances m_trainInstances
Training instances


m_testInstances

protected Instances m_testInstances
Test instances


m_ConfusionMatrix

protected double[][] m_ConfusionMatrix
Array for storing the confusion matrix.
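
This page does not describe the layout of the matrix. As an illustration only, the sketch below assumes one row per produced cluster and one column per true class, with each cell counting the instances that fall into that cluster/class pair. The class name ConfusionMatrixSketch, the method build, and the plain int[] inputs are hypothetical stand-ins and are not part of weka.deduping.

```java
import java.util.Arrays;

// Hypothetical illustration; not the DedupingEvaluation implementation.
final class ConfusionMatrixSketch {

    /**
     * Builds a clusters-by-classes count matrix.
     * Assumed layout: rows are cluster indices, columns are true class indices.
     *
     * @param clusterOf   cluster index assigned to each instance
     * @param classOf     true class index of each instance
     * @param numClusters number of produced clusters
     * @param numClasses  number of classes
     */
    static double[][] build(int[] clusterOf, int[] classOf,
                            int numClusters, int numClasses) {
        if (clusterOf.length != classOf.length) {
            throw new IllegalArgumentException(
                "one cluster index and one class index per instance required");
        }
        double[][] matrix = new double[numClusters][numClasses];
        for (int i = 0; i < clusterOf.length; i++) {
            // Count instance i in the cell for its (cluster, class) pair.
            matrix[clusterOf[i]][classOf[i]] += 1.0;
        }
        return matrix;
    }

    public static void main(String[] args) {
        int[] clusterOf = {0, 0, 1, 1, 1};
        int[] classOf   = {0, 0, 0, 1, 1};
        double[][] m = build(clusterOf, classOf, 2, 2);
        System.out.println(Arrays.deepToString(m)); // prints [[2.0, 0.0], [1.0, 2.0]]
    }
}
```

In this toy run, cluster 0 is pure (both members belong to class 0), while cluster 1 mixes one class 0 instance with two class 1 instances.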

Constructor Detail

DedupingEvaluation

public DedupingEvaluation()
A default constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this evaluator

Returns:
a description of the evaluator suitable for displaying in the Explorer/Experimenter GUI

trainDeduper

public void trainDeduper(Deduper deduper,
                         Instances trainingData,
                         Instances testData)
                  throws java.lang.Exception
Trains a deduper on the supplied data

Parameters:
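deduper - a deduper to train
trainingData - the instances used to train the deduper
testData - the test instances supplied alongside the training data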
Throws:
java.lang.Exception

evaluateModel

public java.util.ArrayList evaluateModel(Deduper deduper,
                                         Instances testInstances)
                                  throws java.lang.Exception
Evaluates the deduper on a given set of test instances

Parameters:
deduper - the deduper to evaluate
testInstances - set of test instances for evaluation
Returns:
a list of arrays containing the basic statistics for each point
Throws:
java.lang.Exception - if the model could not be evaluated successfully

countPresentClasses

protected int countPresentClasses(Instances instances)
A helper function that determines how many classes are actually represented in an Instances object

Parameters:
instances - a set of instances
Returns:
the number of classes present among the instances
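
The implementation of this helper is not shown on this page. A minimal sketch of the idea, assuming each instance's class is available as a numeric index (Weka stores nominal class values as doubles), could look like the following. The class name PresentClassCounter and the plain double[] input are hypothetical stand-ins for an Instances object, used so that the sketch compiles without Weka on the classpath.

```java
// Hypothetical illustration; not the DedupingEvaluation implementation.
final class PresentClassCounter {

    /**
     * Counts how many distinct class indices actually occur.
     *
     * @param classValues class index of each instance; NaN marks a missing class
     * @param numClasses  number of classes declared for the dataset
     * @return the number of declared classes with at least one instance
     */
    static int countPresentClasses(double[] classValues, int numClasses) {
        boolean[] seen = new boolean[numClasses];
        int present = 0;
        for (double value : classValues) {
            if (Double.isNaN(value)) {
                continue; // skip instances whose class is missing
            }
            int index = (int) value;
            if (!seen[index]) {
                seen[index] = true;
                present++;
            }
        }
        return present;
    }

    public static void main(String[] args) {
        // Four classes are declared, but only classes 0 and 2 occur.
        double[] classValues = {0, 2, 2, 0, 2};
        System.out.println(countPresentClasses(classValues, 4)); // prints 2
    }
}
```

The distinction matters when a test set is a subset of a larger dataset: the declared number of classes can exceed the number of classes that have at least one instance in the subset.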