weka.core.metrics
Class BarHillelMetricMatlab

java.lang.Object
  extended by weka.core.metrics.Metric
      extended by weka.core.metrics.LearnableMetric
          extended by weka.core.metrics.BarHillelMetricMatlab
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable

public class BarHillelMetricMatlab
extends LearnableMetric
implements OptionHandler

Class for performing RCA according to Bar-Hillel's algorithm.
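
As a rough illustration of the quantity RCA (Relevant Component Analysis) is built around, the sketch below computes a within-chunklet covariance matrix from a data matrix and per-point chunklet assignments in plain Java. It is not this class's code, which delegates the computation to Matlab. The names RcaSketch and withinChunkletCovariance, and the convention that a negative assignment marks a point outside any chunklet, are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of weka.core.metrics. */
class RcaSketch {

    /**
     * Averages (x - chunkletMean)(x - chunkletMean)^T over all points that
     * belong to a chunklet. Points with a negative assignment are skipped
     * (assumed convention). At least one point must belong to a chunklet.
     */
    static double[][] withinChunkletCovariance(double[][] data, int[] chunklet) {
        int dim = data[0].length;
        Map<Integer, double[]> sums = new HashMap<>();
        Map<Integer, Integer> counts = new HashMap<>();
        // First pass: per-chunklet attribute sums and sizes.
        for (int i = 0; i < data.length; i++) {
            if (chunklet[i] < 0) continue;
            double[] s = sums.computeIfAbsent(chunklet[i], k -> new double[dim]);
            for (int a = 0; a < dim; a++) s[a] += data[i][a];
            counts.merge(chunklet[i], 1, Integer::sum);
        }
        // Second pass: accumulate outer products of deviations from the chunklet mean.
        double[][] cov = new double[dim][dim];
        int n = 0;
        for (int i = 0; i < data.length; i++) {
            if (chunklet[i] < 0) continue;
            double[] s = sums.get(chunklet[i]);
            int c = counts.get(chunklet[i]);
            for (int a = 0; a < dim; a++) {
                double da = data[i][a] - s[a] / c;
                for (int b = 0; b < dim; b++) {
                    cov[a][b] += da * (data[i][b] - s[b] / c);
                }
            }
            n++;
        }
        for (int a = 0; a < dim; a++) {
            for (int b = 0; b < dim; b++) cov[a][b] /= n;
        }
        return cov;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {2, 0}, {5, 5}, {5, 7}, {9, 9}};
        int[] chunklet = {0, 0, 1, 1, -1};  // last point is unconstrained
        double[][] c = withinChunkletCovariance(data, chunklet);
        System.out.println(c[0][0] + " " + c[0][1] + " " + c[1][1]);
    }
}
```

In RCA the inverse of this matrix is typically used as the matrix of a Mahalanobis-style distance; the inversion step is omitted here to keep the sketch short.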

See Also:
Serialized Form

Field Summary
static int CONVERSION_EXPONENTIAL
           
static int CONVERSION_LAPLACIAN
          We can have different ways of converting from distance to similarity
static int CONVERSION_UNIT
           
protected  double[][] m_attrMatrix
          full matrix returned by Matlab code
 java.lang.String m_chunkletAssignmentFilename
          Name of the file where chunklet assignments will be stored
protected  int m_conversionType
          The distance-to-similarity conversion method; Laplacian by default
 java.lang.String m_dataFilename
          Name of the file where dataMatrix will be stored
protected  java.lang.String m_RCAMFile
          Name of the Matlab program file that computes RCA
static Tag[] TAGS_CONVERSION
           
 
Fields inherited from class weka.core.metrics.LearnableMetric
m_attrWeights, m_classifier, m_classifierClassName, m_classifierRequiresNominalClass, m_numPosDiffInstances, m_posNegDiffInstanceRatio, m_trainable
 
Fields inherited from class weka.core.metrics.Metric
m_attrIdxs, m_classIndex, m_numAttributes
 
Constructor Summary
BarHillelMetricMatlab()
          Create a default new metric
BarHillelMetricMatlab(int numAttributes)
          Create a new metric.
BarHillelMetricMatlab(int[] _attrIdxs)
          Creates a new metric which takes specified attributes.
 
Method Summary
 void buildAttributeMatrix(Instances data, int[] clusterAssignments)
           
 void buildMetric(Instances data)
          Create a new metric for operating on specified instances
 void buildMetric(int numAttributes)
          Generates a new Metric.
 void buildMetric(int numAttributes, java.lang.String[] options)
          Generates a new Metric.
 Instance createDiffInstance(Instance instance1, Instance instance2)
          Create an instance with features corresponding to components of the two given instances
 double distance(Instance instance1, Instance instance2)
          Returns a distance value between two instances.
 double distanceNonWeighted(Instance instance1, Instance instance2)
          Returns a distance value between two instances.
 Instance getCentroidInstance(Instances instances, boolean fastMode, boolean normalized)
          Given a cluster of instances, return the centroid of that cluster
 SelectedTag getConversionType()
          Returns the type of distance-to-similarity conversion
 double[] getGradients(Instance instance1, Instance instance2)
          Get the values of the partial derivatives for the metric components for a particular instance pair
static java.lang.String getLogTimestamp()
          Get a timestamp string to use as a weak unique ID
 java.lang.String[] getOptions()
          Gets the current settings of this metric.
 boolean isDistanceBased()
          The computation of a metric can be either based on distance, or on similarity
 void learnMetric(Instances data)
          Train the distance metric.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class
 void prepareMatlab(java.lang.String filename)
          Create the Matlab m-file for RCA
 double[][] readMatrix(java.lang.String name)
          Read column vectors from a text file
 void resetMetric()
          Reset all values that have been learned
static void runMatlab(java.lang.String inFile, java.lang.String outFile)
          Run matlab in command line with a given argument
 void setConversionType(SelectedTag conversionType)
          Set the type of distance to similarity conversion.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 double similarity(Instance instance1, Instance instance2)
          Returns a similarity estimate between two instances.
 double similarityNonWeighted(Instance instance1, Instance instance2)
          Returns a similarity estimate between two instances without using the weights.
 
Methods inherited from class weka.core.metrics.LearnableMetric
clone, getExternal, getNumPosDiffInstances, getPosNegDiffInstanceRatio, getTrainable, getWeights, meanOrMode, normalizeInstanceWeighted, setExternal, setNumPosDiffInstances, setPosNegDiffInstanceRatio, setTrainable, setWeights, useClassifier, useNoClassifier, usesClassifier
 
Methods inherited from class weka.core.metrics.Metric
forName, getAttrIdxs, getAttrIdxsWithoutLastClass, getAttrIndxs, getClassIndex, getNumAttributes, length, normalizeInstance, setAttrIdxs, setAttrIdxs, setClassIndex
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_attrMatrix

protected double[][] m_attrMatrix
full matrix returned by Matlab code


CONVERSION_LAPLACIAN

public static final int CONVERSION_LAPLACIAN
We can have different ways of converting from distance to similarity

See Also:
Constant Field Values

CONVERSION_UNIT

public static final int CONVERSION_UNIT
See Also:
Constant Field Values

CONVERSION_EXPONENTIAL

public static final int CONVERSION_EXPONENTIAL
See Also:
Constant Field Values

TAGS_CONVERSION

public static final Tag[] TAGS_CONVERSION

m_conversionType

protected int m_conversionType
The distance-to-similarity conversion method; Laplacian by default


m_RCAMFile

protected java.lang.String m_RCAMFile
Name of the Matlab program file that computes RCA


m_dataFilename

public java.lang.String m_dataFilename
Name of the file where dataMatrix will be stored


m_chunkletAssignmentFilename

public java.lang.String m_chunkletAssignmentFilename
Name of the file where chunklet assignments will be stored

Constructor Detail

BarHillelMetricMatlab

public BarHillelMetricMatlab(int numAttributes)
                      throws java.lang.Exception
Create a new metric.

Parameters:
numAttributes - the number of attributes that the metric will work on

BarHillelMetricMatlab

public BarHillelMetricMatlab()
Create a default new metric


BarHillelMetricMatlab

public BarHillelMetricMatlab(int[] _attrIdxs)
                      throws java.lang.Exception
Creates a new metric which takes specified attributes.

Parameters:
_attrIdxs - An array containing the attribute indices that will be used in the metric
Method Detail

resetMetric

public void resetMetric()
                 throws java.lang.Exception
Reset all values that have been learned

Specified by:
resetMetric in class LearnableMetric
Throws:
java.lang.Exception

buildMetric

public void buildMetric(int numAttributes)
                 throws java.lang.Exception
Generates a new Metric. Has to initialize all fields of the metric with default values.

Specified by:
buildMetric in class Metric
Parameters:
numAttributes - the number of attributes that the metric will work on
Throws:
java.lang.Exception - if the distance metric has not been generated successfully.

buildMetric

public void buildMetric(int numAttributes,
                        java.lang.String[] options)
                 throws java.lang.Exception
Generates a new Metric. Has to initialize all fields of the metric with default values

Specified by:
buildMetric in class Metric
Parameters:
options - an array of options suitable for passing to setOptions. May be null.
numAttributes - the number of attributes that the metric will work on
Throws:
java.lang.Exception - if the distance metric has not been generated successfully.

buildMetric

public void buildMetric(Instances data)
                 throws java.lang.Exception
Create a new metric for operating on specified instances

Specified by:
buildMetric in class Metric
Parameters:
data - instances that the metric will be used on
Throws:
java.lang.Exception

distance

public double distance(Instance instance1,
                       Instance instance2)
                throws java.lang.Exception
Returns a distance value between two instances.

Specified by:
distance in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if distance could not be estimated.
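
This page does not state how m_attrMatrix enters the distance computation. One common reading for a learned full matrix A, assumed here, is the Mahalanobis-style form sqrt((x - y)^T A (x - y)). The sketch below works on plain double arrays rather than Instance objects; FullMatrixDistanceSketch and its method are hypothetical names, not part of this class.

```java
/** Hypothetical helper, not part of weka.core.metrics. */
class FullMatrixDistanceSketch {

    /** Computes sqrt((x - y)^T A (x - y)); A is assumed square and positive semi-definite. */
    static double distance(double[] x, double[] y, double[][] a) {
        int n = x.length;
        double[] diff = new double[n];
        for (int i = 0; i < n; i++) diff[i] = x[i] - y[i];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                sum += diff[i] * a[i][j] * diff[j];
            }
        }
        // Clamp tiny negative values caused by rounding before taking the root.
        return Math.sqrt(Math.max(sum, 0.0));
    }

    public static void main(String[] args) {
        double[][] identity = {{1, 0}, {0, 1}};
        double[][] stretched = {{4, 0}, {0, 1}};
        double[] x = {0, 0};
        double[] y = {3, 4};
        System.out.println(distance(x, y, identity));   // reduces to Euclidean distance
        System.out.println(distance(x, y, stretched));  // first attribute counts more
    }
}
```

With the identity matrix the form reduces to ordinary Euclidean distance, which makes it a convenient sanity check.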

setConversionType

public void setConversionType(SelectedTag conversionType)
Set the type of distance to similarity conversion. Values other than CONVERSION_LAPLACIAN, CONVERSION_UNIT, or CONVERSION_EXPONENTIAL will be ignored


getConversionType

public SelectedTag getConversionType()
Returns the type of distance-to-similarity conversion

Returns:
one of CONVERSION_LAPLACIAN, CONVERSION_UNIT, or CONVERSION_EXPONENTIAL

isDistanceBased

public boolean isDistanceBased()
The computation of a metric can be either based on distance, or on similarity

Specified by:
isDistanceBased in class Metric

similarity

public double similarity(Instance instance1,
                         Instance instance2)
                  throws java.lang.Exception
Returns a similarity estimate between two instances. Similarity is obtained by inverting the distance value using one of three methods: CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT.

Specified by:
similarity in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if similarity could not be estimated.
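
The exact conversion formulas are not given on this page. The sketch below shows forms that are commonly associated with these three names: 1/(1 + d) for Laplacian, exp(-d) for exponential, and 1 - d for unit (the last one matching the "dist=1-sim" note in the option list). Treat all three as assumptions; ConversionSketch and its enum are hypothetical and deliberately avoid reusing the class's integer constants.

```java
/** Hypothetical helper; the formulas are common choices, not confirmed by this page. */
class ConversionSketch {

    enum Conversion { LAPLACIAN, EXPONENTIAL, UNIT }

    /** Maps a non-negative distance to a similarity score. */
    static double toSimilarity(double distance, Conversion type) {
        switch (type) {
            case LAPLACIAN:
                return 1.0 / (1.0 + distance);   // 1 at d = 0, decays towards 0
            case EXPONENTIAL:
                return Math.exp(-distance);      // 1 at d = 0, decays faster
            case UNIT:
                return 1.0 - distance;           // only meaningful for d in [0, 1]
            default:
                throw new IllegalArgumentException("Unknown conversion: " + type);
        }
    }

    public static void main(String[] args) {
        for (Conversion c : Conversion.values()) {
            System.out.println(c + ": " + toSimilarity(0.25, c));
        }
    }
}
```

Note that the unit form can go negative for distances above 1, so it presumes distances that are already normalized.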

buildAttributeMatrix

public void buildAttributeMatrix(Instances data,
                                 int[] clusterAssignments)
                          throws java.lang.Exception
Throws:
java.lang.Exception

readMatrix

public double[][] readMatrix(java.lang.String name)
                      throws java.lang.Exception
Read column vectors from a text file

Parameters:
name - file name
Returns:
a double[][] value
Throws:
java.lang.Exception - if an error occurs
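
The file layout expected by readMatrix is not documented here. As a minimal sketch, assuming one matrix row per line with whitespace-separated numbers and blank lines ignored, a reader could look like the following. MatrixReaderSketch and parseMatrix are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of weka.core.metrics. */
class MatrixReaderSketch {

    /** Parses whitespace-separated numbers, one matrix row per non-blank line. */
    static double[][] parseMatrix(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue;  // skip blank lines
            String[] tokens = trimmed.split("\\s+");
            double[] row = new double[tokens.length];
            for (int j = 0; j < tokens.length; j++) {
                row[j] = Double.parseDouble(tokens[j]);
            }
            rows.add(row);
        }
        return rows.toArray(new double[0][]);
    }

    /** Reads the whole file into memory and parses it. */
    static double[][] readMatrix(Path file) throws IOException {
        return parseMatrix(Files.readAllLines(file));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("matrix", ".txt");
        Files.write(tmp, Arrays.asList("1.0 0.5", "0.5 2.0"));
        double[][] m = readMatrix(tmp);
        System.out.println(m.length + " rows, " + m[0].length + " columns");
        Files.delete(tmp);
    }
}
```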

prepareMatlab

public void prepareMatlab(java.lang.String filename)
Create the Matlab m-file for RCA

Parameters:
filename - file where matlab script is created

runMatlab

public static void runMatlab(java.lang.String inFile,
                             java.lang.String outFile)
Run matlab in command line with a given argument

Parameters:
inFile - file to be input to Matlab
outFile - file where results are stored
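
How this method actually launches Matlab is not shown on this page. One way to do it from Java, sketched below under the assumption that the script is fed on standard input and the output is captured to a file (the equivalent of `matlab < inFile > outFile`), uses ProcessBuilder. MatlabRunnerSketch and the file names in main are hypothetical, and a `matlab` executable on the PATH is assumed.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper, not part of weka.core.metrics. */
class MatlabRunnerSketch {

    /** Configures the process without starting it, so the setup can be inspected. */
    static ProcessBuilder buildCommand(String inFile, String outFile) {
        // -nodisplay and -nosplash keep Matlab from opening any windows.
        ProcessBuilder pb = new ProcessBuilder("matlab", "-nodisplay", "-nosplash");
        pb.redirectInput(new File(inFile));    // script is read from standard input
        pb.redirectOutput(new File(outFile));  // standard output goes to the result file
        pb.redirectErrorStream(true);          // merge error output into the same file
        return pb;
    }

    /** Starts Matlab and blocks until it exits; returns the exit code. */
    static int run(String inFile, String outFile) throws IOException, InterruptedException {
        Process process = buildCommand(inFile, outFile).start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        ProcessBuilder pb = buildCommand("rca_script.m", "rca_output.txt");
        System.out.println(pb.command());
    }
}
```

Separating buildCommand from run keeps the configuration testable on machines where Matlab is not installed.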

getLogTimestamp

public static java.lang.String getLogTimestamp()
Get a timestamp string to use as a weak unique ID


getCentroidInstance

public Instance getCentroidInstance(Instances instances,
                                    boolean fastMode,
                                    boolean normalized)
Given a cluster of instances, return the centroid of that cluster

Specified by:
getCentroidInstance in class LearnableMetric
Parameters:
instances - objects belonging to a cluster
fastMode - whether fast mode should be used for SparseInstances
normalized - normalize centroids for SPKMeans
Returns:
a centroid instance for the given cluster
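
As a minimal sketch of the idea on plain arrays, the centroid is the attribute-wise mean of the cluster members, and the normalized variant rescales it to unit length, which is what spherical k-means (SPKMeans) works with. The fastMode path for sparse instances is not covered; CentroidSketch is a hypothetical name.

```java
/** Hypothetical helper, not part of weka.core.metrics. */
class CentroidSketch {

    /** Attribute-wise mean of the points, optionally scaled to unit Euclidean length. */
    static double[] centroid(double[][] points, boolean normalized) {
        int dim = points[0].length;
        double[] c = new double[dim];
        for (double[] p : points) {
            for (int j = 0; j < dim; j++) c[j] += p[j];
        }
        for (int j = 0; j < dim; j++) c[j] /= points.length;
        if (normalized) {
            double norm = 0.0;
            for (double v : c) norm += v * v;
            norm = Math.sqrt(norm);
            if (norm > 0.0) {  // leave an all-zero centroid untouched
                for (int j = 0; j < dim; j++) c[j] /= norm;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] cluster = {{3, 0}, {3, 8}};
        double[] mean = centroid(cluster, false);
        double[] unit = centroid(cluster, true);
        System.out.println(mean[0] + ", " + mean[1]);
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```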

getGradients

public double[] getGradients(Instance instance1,
                             Instance instance2)
                      throws java.lang.Exception
Get the values of the partial derivatives for the metric components for a particular instance pair

Specified by:
getGradients in class LearnableMetric
Parameters:
instance1 - the first instance
instance2 - the second instance
Throws:
java.lang.Exception

createDiffInstance

public Instance createDiffInstance(Instance instance1,
                                   Instance instance2)
Create an instance with features corresponding to components of the two given instances

Specified by:
createDiffInstance in class LearnableMetric
Parameters:
instance1 - first instance
instance2 - second instance

learnMetric

public void learnMetric(Instances data)
                 throws java.lang.Exception
Train the distance metric. A specific metric will take care of its own training either via a metric learner or by itself.

Specified by:
learnMetric in class LearnableMetric
Throws:
java.lang.Exception

distanceNonWeighted

public double distanceNonWeighted(Instance instance1,
                                  Instance instance2)
                           throws java.lang.Exception
Returns a distance value between two instances.

Specified by:
distanceNonWeighted in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if distance could not be estimated.

similarityNonWeighted

public double similarityNonWeighted(Instance instance1,
                                    Instance instance2)
                             throws java.lang.Exception
Returns a similarity estimate between two instances without using the weights.

Specified by:
similarityNonWeighted in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

getOptions

public java.lang.String[] getOptions()
Gets the current settings of this metric.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-N
Normalize the Euclidean distance by vector lengths

-E
Use exponential conversion from distance to similarity (default laplacian conversion)

-U
Use unit conversion from similarity to distance (dist=1-sim) (default laplacian conversion)

-R
The metric is trainable and will be trained using the current MetricLearner (default non-trainable)

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
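
To make the flag semantics above concrete, the sketch below parses the same four flags into a small settings object in plain Java. It does not use WEKA's own option utilities and is not this class's implementation; OptionSketch and its fields are hypothetical. When both -E and -U are given, this sketch lets the later one win, which is an arbitrary choice.

```java
/** Hypothetical helper, not part of weka.core.metrics. */
class OptionSketch {
    boolean normalize;                 // set by -N
    String conversion = "laplacian";   // default; changed by -E or -U
    boolean trainable;                 // set by -R

    /** Interprets the flags from the option list; unknown flags are rejected. */
    static OptionSketch parse(String[] options) {
        OptionSketch settings = new OptionSketch();
        for (String opt : options) {
            switch (opt) {
                case "-N": settings.normalize = true; break;
                case "-E": settings.conversion = "exponential"; break;
                case "-U": settings.conversion = "unit"; break;
                case "-R": settings.trainable = true; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + opt);
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        OptionSketch s = parse(new String[] {"-N", "-E"});
        System.out.println(s.normalize + " " + s.conversion + " " + s.trainable);
    }
}
```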

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

main

public static void main(java.lang.String[] argv)
Main method for testing this class

Parameters:
argv - should contain the command line arguments to the evaluator/transformer (see AttributeSelection)