weka.experiment
Class SemiSupClustererSplitEvaluator

java.lang.Object
  extended by weka.experiment.SemiSupClustererSplitEvaluator
All Implemented Interfaces:
OptionHandler, java.io.Serializable, SplitEvaluator

public class SemiSupClustererSplitEvaluator
extends java.lang.Object
implements SplitEvaluator, OptionHandler

A SplitEvaluator that produces results for a semi-supervised clustering scheme on a nominal class attribute.

-W clustername
Specify the full class name of the clusterer to evaluate.

-C class index
The index of the class for which IR statistics are to be output. (default 1)

See Also:
Serialized Form

Field Summary
protected  Clusterer m_Clusterer
          The semi-supervised clusterer used for evaluation
protected  java.lang.String m_ClustererOptions
          The clusterer options (if any)
protected  java.lang.String m_ClustererVersion
          The clusterer version
protected  java.lang.String m_result
          Holds the statistics for the most recent application of the clusterer
 
Constructor Summary
SemiSupClustererSplitEvaluator()
          No args constructor.
 
Method Summary
 java.lang.String clustererTipText()
          Returns the tip text for this property
 int getClassForIRStatistics()
          Get the value of ClassForIRStatistics.
 Clusterer getClusterer()
          Get the value of Clusterer.
 java.lang.Object[] getKey()
          Gets the key describing the current SplitEvaluator.
 java.lang.String[] getKeyNames()
          Gets the names of each of the key columns produced for a single run.
 java.lang.Object[] getKeyTypes()
          Gets the data types of each of the key columns produced for a single run.
 java.lang.String[] getOptions()
          Gets the current settings of the Clusterer.
 java.lang.String getRawResultOutput()
          Gets the raw output from the clusterer
 java.lang.Object[] getResult(java.util.ArrayList labeledTrainPairs, Instances labeledTrain, Instances unlabeledData, Instances test, Instances unlabeledTest)
          Gets the results for the supplied train and test datasets.
 java.lang.Object[] getResult(Instances unlabeledTrain, Instances test)
          Dummy method that exists only for compatibility with the SplitEvaluator interface.
 java.lang.Object[] getResult(Instances labeledTrain, Instances unlabeledData, Instances totalTrainWithLabels, Instances test, int startingIndexOfTest)
          Gets the results for the supplied train and test datasets.
 java.lang.Object[] getResult(Instances labeledTrain, Instances unlabeledTrain, Instances test, int numClasses)
          Gets the results for the supplied train and test datasets.
 java.lang.Object[] getResult(Instances labeledTrain, Instances unlabeledTrain, Instances test, int numClasses, int startingIndexOfTest)
          Gets the results for the supplied train and test datasets.
 java.lang.String[] getResultNames()
          Gets the names of each of the result columns produced for a single run.
 java.lang.Object[] getResultTypes()
          Gets the data types of each of the result columns produced for a single run.
 java.lang.String globalInfo()
          Returns a string describing this split evaluator
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 void setAdditionalMeasures(java.lang.String[] additionalMeasures)
          Does nothing, since cluster evaluation does not allow additional measures
 void setClassForIRStatistics(int v)
          Set the value of ClassForIRStatistics.
 void setClusterer(Clusterer newClusterer)
          Sets the clusterer.
 void setClustererName(java.lang.String newClustererName)
          Set the Clusterer to use, given its class name.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Returns a text description of the split evaluator.
protected  void updateOptions()
          Updates the options that the current clusterer is using.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Clusterer

protected Clusterer m_Clusterer
The semi-supervised clusterer used for evaluation


m_result

protected java.lang.String m_result
Holds the statistics for the most recent application of the clusterer


m_ClustererOptions

protected java.lang.String m_ClustererOptions
The clusterer options (if any)


m_ClustererVersion

protected java.lang.String m_ClustererVersion
The clusterer version

Constructor Detail

SemiSupClustererSplitEvaluator

public SemiSupClustererSplitEvaluator()
No args constructor.

Method Detail

setAdditionalMeasures

public void setAdditionalMeasures(java.lang.String[] additionalMeasures)
Does nothing, since cluster evaluation does not allow additional measures

Specified by:
setAdditionalMeasures in interface SplitEvaluator
Parameters:
additionalMeasures - a list of method names

globalInfo

public java.lang.String globalInfo()
Returns a string describing this split evaluator

Returns:
a description of the split evaluator suitable for displaying in the explorer/experimenter gui

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-W classname
Specify the full class name of the clusterer to evaluate.

-C class index
The index of the class for which IR statistics are to be output. (default 1)

All options after -- will be passed to the clusterer.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
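
The option layout described above can be illustrated with a small stand-alone sketch. It does not call this class or any WEKA code; it only shows how an options array divides at the "--" separator. The class name OptionSplitSketch, the clusterer name my.pkg.MyClusterer, and the "-N 3" clusterer option are hypothetical placeholders.

```java
import java.util.Arrays;

public class OptionSplitSketch {

    /** Options before the first "--": these belong to the split evaluator itself. */
    public static String[] evaluatorOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    /** Options after the first "--": these are passed on to the clusterer. */
    public static String[] clustererOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? new String[0]
                       : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        String[] options = {
            "-W", "my.pkg.MyClusterer", // hypothetical clusterer class name
            "-C", "1",                  // class index for IR statistics
            "--",
            "-N", "3"                   // hypothetical clusterer-specific option
        };
        System.out.println(Arrays.toString(evaluatorOptions(options)));
        System.out.println(Arrays.toString(clustererOptions(options)));
    }
}
```

An array laid out this way is the kind of value that would be handed to setOptions, assuming the named clusterer class is on the classpath.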

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Clusterer.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

getKeyTypes

public java.lang.Object[] getKeyTypes()
Gets the data types of each of the key columns produced for a single run. The number of key fields must be constant for a given SplitEvaluator.

Specified by:
getKeyTypes in interface SplitEvaluator
Returns:
an array containing objects of the type of each key column. The objects should be Strings or Doubles.

getKeyNames

public java.lang.String[] getKeyNames()
Gets the names of each of the key columns produced for a single run. The number of key fields must be constant for a given SplitEvaluator.

Specified by:
getKeyNames in interface SplitEvaluator
Returns:
an array containing the name of each key column

getKey

public java.lang.Object[] getKey()
Gets the key describing the current SplitEvaluator. For example, this may contain the name of the clusterer used for clusterer predictive evaluation. The number of key fields must be constant for a given SplitEvaluator.

Specified by:
getKey in interface SplitEvaluator
Returns:
an array of objects containing the key.

getResultTypes

public java.lang.Object[] getResultTypes()
Gets the data types of each of the result columns produced for a single run. The number of result fields must be constant for a given SplitEvaluator.

Specified by:
getResultTypes in interface SplitEvaluator
Returns:
an array containing objects of the type of each result column. The objects should be Strings or Doubles.
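
The "objects of the type of each column" convention can be sketched as follows: each array slot holds an exemplar object (a String or a Double) whose class declares the column type, and a result row can be checked against it. This is a minimal stand-alone illustration, not this class's implementation; the class name ColumnTypeSketch and the two example columns are assumptions.

```java
public class ColumnTypeSketch {

    /**
     * Returns true if the row has one value per column and every non-null
     * value has the same class as that column's type exemplar.
     * A null value stands for a missing value and is always accepted.
     */
    public static boolean matchesTypes(Object[] types, Object[] row) {
        if (types.length != row.length) {
            return false;
        }
        for (int i = 0; i < row.length; i++) {
            if (row[i] != null && !types[i].getClass().equals(row[i].getClass())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical two-column layout: a String column and a Double column.
        Object[] types = { "", Double.valueOf(0) };

        Object[] goodRow = { "summary text", Double.valueOf(0.75) };
        Object[] missing = { "summary text", null };            // missing value
        Object[] badRow  = { "summary text", Integer.valueOf(1) };

        System.out.println(matchesTypes(types, goodRow)); // true
        System.out.println(matchesTypes(types, missing)); // true
        System.out.println(matchesTypes(types, badRow));  // false
    }
}
```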

getResultNames

public java.lang.String[] getResultNames()
Gets the names of each of the result columns produced for a single run. The number of result fields must be constant for a given SplitEvaluator.

Specified by:
getResultNames in interface SplitEvaluator
Returns:
an array containing the name of each result column

getResult

public java.lang.Object[] getResult(Instances unlabeledTrain,
                                    Instances test)
Dummy method that exists only for compatibility with the SplitEvaluator interface.

Specified by:
getResult in interface SplitEvaluator
Parameters:
unlabeledTrain - the training Instances.
test - the testing Instances.
Returns:
the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).

getResult

public java.lang.Object[] getResult(java.util.ArrayList labeledTrainPairs,
                                    Instances labeledTrain,
                                    Instances unlabeledData,
                                    Instances test,
                                    Instances unlabeledTest)
                             throws java.lang.Exception
Gets the results for the supplied train and test datasets.

Parameters:
labeledTrainPairs - the constraint pairs having labels on them
labeledTrain - the labeled training Instances.
unlabeledData - the unlabeled training (+ test for transductive) Instances.
test - the testing Instances.
Returns:
the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).
Throws:
java.lang.Exception - if a problem occurs while getting the results

getResult

public java.lang.Object[] getResult(Instances labeledTrain,
                                    Instances unlabeledTrain,
                                    Instances test,
                                    int numClasses)
                             throws java.lang.Exception
Gets the results for the supplied train and test datasets.

Parameters:
labeledTrain - the labeled training Instances.
unlabeledTrain - the unlabeled training Instances.
test - the testing Instances.
Returns:
the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).
Throws:
java.lang.Exception - if a problem occurs while getting the results

getResult

public java.lang.Object[] getResult(Instances labeledTrain,
                                    Instances unlabeledData,
                                    Instances totalTrainWithLabels,
                                    Instances test,
                                    int startingIndexOfTest)
                             throws java.lang.Exception
Gets the results for the supplied train and test datasets.

Parameters:
labeledTrain - the labeled training Instances.
unlabeledData - the unlabeled training (+ test for transductive) Instances.
test - the testing Instances.
startingIndexOfTest - from where test data starts in unlabeledData, useful if clustering is transductive
Returns:
the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).
Throws:
java.lang.Exception - if a problem occurs while getting the results
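
One way to read the startingIndexOfTest parameter is sketched below with plain lists standing in for Instances. It assumes that, in the transductive case, the test instances are appended after the unlabeled training instances, so the index marks where the test portion begins. That layout is an assumption made for illustration, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TransductiveIndexSketch {

    /** Builds the combined unlabeled set: unlabeled training items followed by test items. */
    public static <T> List<T> combine(List<T> unlabeledTrain, List<T> test) {
        List<T> all = new ArrayList<>(unlabeledTrain);
        all.addAll(test);
        return all;
    }

    /** Recovers the test portion, which starts at startingIndexOfTest. */
    public static <T> List<T> testPortion(List<T> unlabeledData, int startingIndexOfTest) {
        return new ArrayList<>(
            unlabeledData.subList(startingIndexOfTest, unlabeledData.size()));
    }

    public static void main(String[] args) {
        List<String> unlabeledTrain = Arrays.asList("u1", "u2", "u3");
        List<String> test = Arrays.asList("t1", "t2");

        List<String> unlabeledData = combine(unlabeledTrain, test);
        int startingIndexOfTest = unlabeledTrain.size(); // 3 in this example

        System.out.println(unlabeledData);                                   // [u1, u2, u3, t1, t2]
        System.out.println(testPortion(unlabeledData, startingIndexOfTest)); // [t1, t2]
    }
}
```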

getResult

public java.lang.Object[] getResult(Instances labeledTrain,
                                    Instances unlabeledTrain,
                                    Instances test,
                                    int numClasses,
                                    int startingIndexOfTest)
                             throws java.lang.Exception
Gets the results for the supplied train and test datasets.

Parameters:
labeledTrain - the labeled training Instances.
unlabeledTrain - the unlabeled training Instances.
test - the testing Instances.
startingIndexOfTest - from where test data starts in unlabeledTrain, useful if clustering is transductive
Returns:
the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).
Throws:
java.lang.Exception - if a problem occurs while getting the results

clustererTipText

public java.lang.String clustererTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getClusterer

public Clusterer getClusterer()
Get the value of Clusterer.

Returns:
Value of Clusterer.

setClusterer

public void setClusterer(Clusterer newClusterer)
Sets the clusterer.

Parameters:
newClusterer - the new clusterer to use.

getClassForIRStatistics

public int getClassForIRStatistics()
Get the value of ClassForIRStatistics.

Returns:
Value of ClassForIRStatistics.

setClassForIRStatistics

public void setClassForIRStatistics(int v)
Set the value of ClassForIRStatistics.

Parameters:
v - Value to assign to ClassForIRStatistics.

updateOptions

protected void updateOptions()
Updates the options that the current clusterer is using.


setClustererName

public void setClustererName(java.lang.String newClustererName)
                      throws java.lang.Exception
Set the Clusterer to use, given its class name. A new clusterer will be instantiated.

Throws:
java.lang.Exception - if the class name is invalid.
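
Instantiating an object from its class name, and failing when the name is invalid, can be sketched with standard Java reflection. This is not necessarily how setClustererName is implemented; the class InstantiateByNameSketch is hypothetical, and java.util.ArrayList stands in for a clusterer class so the example runs without WEKA.

```java
public class InstantiateByNameSketch {

    /**
     * Creates a new instance of the named class via its no-argument constructor.
     * Throws an exception (for example ClassNotFoundException) if the name is invalid.
     */
    public static Object newInstanceOf(String className) throws Exception {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a clusterer class name.
        Object instance = newInstanceOf("java.util.ArrayList");
        System.out.println(instance.getClass().getName());

        try {
            newInstanceOf("no.such.Clusterer"); // hypothetical invalid name
        } catch (ClassNotFoundException e) {
            System.out.println("invalid class name");
        }
    }
}
```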

getRawResultOutput

public java.lang.String getRawResultOutput()
Gets the raw output from the clusterer

Specified by:
getRawResultOutput in interface SplitEvaluator
Returns:
the raw output from the clusterer

toString

public java.lang.String toString()
Returns a text description of the split evaluator.

Returns:
a text description of the split evaluator.