weka.core.metrics
Class WeightedDotP

java.lang.Object
  extended by weka.core.metrics.Metric
      extended by weka.core.metrics.LearnableMetric
          extended by weka.core.metrics.WeightedDotP
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable

public class WeightedDotP
extends LearnableMetric
implements OptionHandler

The WeightedDotP class implements the weighted dot product distance metric.

See Also:
Serialized Form

Field Summary
static int CONVERSION_EXPONENTIAL
           
static int CONVERSION_LAPLACIAN
          We can have different ways of converting from similarity to distance
static int CONVERSION_UNIT
           
protected  int m_conversionType
          The method of converting, by default laplacian
protected  boolean m_lengthNormalized
          Should cosine similarity be normalized by the length of two instance vectors?
protected  MetricLearner m_metricLearner
          A metric learner responsible for training the parameters of the metric
static Tag[] TAGS_CONVERSION
           
 
Fields inherited from class weka.core.metrics.LearnableMetric
m_attrWeights, m_classifier, m_classifierClassName, m_classifierRequiresNominalClass, m_numPosDiffInstances, m_posNegDiffInstanceRatio, m_trainable
 
Fields inherited from class weka.core.metrics.Metric
m_attrIdxs, m_classIndex, m_numAttributes
 
Constructor Summary
WeightedDotP()
          Creates an empty metric class
WeightedDotP(int numAttributes)
          Creates a new metric.
WeightedDotP(int[] _attrIdxs)
          Creates a new metric which takes specified attributes.
 
Method Summary
 void buildMetric(Instances data)
          Create a new metric for operating on specified instances
 void buildMetric(int numAttributes)
          Generates a new Metric.
 void buildMetric(int numAttributes, java.lang.String[] options)
          Generates a new Metric.
 Instance createDiffInstance(Instance instance1, Instance instance2)
          Create an Instance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]
 Instance createDiffInstanceNonSparse(Instance instance1, Instance instance2)
          Create an Instance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]
 SparseInstance createDiffInstanceSparse(SparseInstance instance1, SparseInstance instance2)
          Create a SparseInstance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]
 SparseInstance createDiffInstanceSparseNonSparse(SparseInstance instance1, Instance instance2)
          Create a SparseInstance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]
 double distance(Instance instance1, Instance instance2)
          Returns distance between two instances using the current conversion type (CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT, ...)
 double distanceNonWeighted(Instance instance1, Instance instance2)
          Returns distance between two instances using the current conversion type (CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT, ...), without using the weights
 Instance getCentroidInstance(Instances instances, boolean fastMode, boolean normalized)
          Given a cluster of instances, return the centroid of that cluster
 SelectedTag getConversionType()
          Returns the type of similarity-to-distance conversion
 double[] getGradients(Instance instance1, Instance instance2)
          Get the values of the partial derivatives for the metric components for a particular instance pair
 boolean getLengthNormalized()
          Check whether similarity is normalized by the length of the vectors
 MetricLearner getMetricLearner()
          Get the distance metric learner
 java.lang.String[] getOptions()
          Gets the current settings of WeightedDotP.
 boolean isDistanceBased()
          The computation of a metric can be either based on distance, or on similarity
 void learnMetric(Instances data)
          Updates the weights
 double lengthWeighted(Instance instance)
          Get the norm-2 length of an instance assuming all attributes are numeric and utilizing the attribute weights
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
           
 void resetMetric()
          Reset all values that have been learned
 void setConversionType(SelectedTag conversionType)
          Set the type of similarity to distance conversion.
 void setLengthNormalized(boolean lengthNormalized)
          Set normalization by instance length to be on or off
 void setMetricLearner(MetricLearner metricLearner)
          Set the distance metric learner
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 double similarity(Instance instance1, Instance instance2)
          Returns a dot product similarity value between two instances.
 double similarityInternal(Instance instance1, Instance instance2)
          Returns a dot product similarity value between two instances.
 double similarityNonSparse(Instance instance1, Instance instance2)
          Returns a dot product similarity value between two non-sparse instances
 double similarityNonSparseNonWeighted(Instance instance1, Instance instance2)
          Returns a dot product similarity value between two non-sparse instances, without using the weights
 double similarityNonWeighted(Instance instance1, Instance instance2)
          Returns a dot product similarity value between two instances without using the weights.
 double similaritySparse(SparseInstance instance1, SparseInstance instance2)
          Returns a dot product similarity value between two sparse instances.
 double similaritySparseNonSparse(SparseInstance instance1, Instance instance2)
          Returns a dot product similarity value between a sparse instance and a non-sparse instance
 double similaritySparseNonSparseNonWeighted(SparseInstance instance1, Instance instance2)
          Returns a dot product similarity value between a sparse instance and a non-sparse instance, without using the weights
 double similaritySparseNonWeighted(SparseInstance instance1, SparseInstance instance2)
          Returns a dot product similarity value between two sparse instances, without using the weights.
 
Methods inherited from class weka.core.metrics.LearnableMetric
clone, getExternal, getNumPosDiffInstances, getPosNegDiffInstanceRatio, getTrainable, getWeights, meanOrMode, normalizeInstanceWeighted, setExternal, setNumPosDiffInstances, setPosNegDiffInstanceRatio, setTrainable, setWeights, useClassifier, useNoClassifier, usesClassifier
 
Methods inherited from class weka.core.metrics.Metric
forName, getAttrIdxs, getAttrIdxsWithoutLastClass, getAttrIndxs, getClassIndex, getNumAttributes, length, normalizeInstance, setAttrIdxs, setAttrIdxs, setClassIndex
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_lengthNormalized

protected boolean m_lengthNormalized
Should cosine similarity be normalized by the length of two instance vectors?


CONVERSION_LAPLACIAN

public static final int CONVERSION_LAPLACIAN
We can have different ways of converting from similarity to distance

See Also:
Constant Field Values

CONVERSION_UNIT

public static final int CONVERSION_UNIT
See Also:
Constant Field Values

CONVERSION_EXPONENTIAL

public static final int CONVERSION_EXPONENTIAL
See Also:
Constant Field Values

TAGS_CONVERSION

public static final Tag[] TAGS_CONVERSION

m_conversionType

protected int m_conversionType
The method of converting, by default laplacian


m_metricLearner

protected MetricLearner m_metricLearner
A metric learner responsible for training the parameters of the metric

Constructor Detail

WeightedDotP

public WeightedDotP()
Creates an empty metric class


WeightedDotP

public WeightedDotP(int numAttributes)
             throws java.lang.Exception
Creates a new metric.

Parameters:
numAttributes - the number of attributes that the metric will work on

WeightedDotP

public WeightedDotP(int[] _attrIdxs)
             throws java.lang.Exception
Creates a new metric which takes specified attributes.

Parameters:
_attrIdxs - An array containing attribute indices that will be used in the metric
Method Detail

resetMetric

public void resetMetric()
                 throws java.lang.Exception
Reset all values that have been learned

Specified by:
resetMetric in class LearnableMetric
Throws:
java.lang.Exception

buildMetric

public void buildMetric(int numAttributes)
                 throws java.lang.Exception
Generates a new Metric. Has to initialize all fields of the metric with default values.

Specified by:
buildMetric in class Metric
Parameters:
numAttributes - the number of attributes that the metric will work on
Throws:
java.lang.Exception - if the distance metric has not been generated successfully.

buildMetric

public void buildMetric(int numAttributes,
                        java.lang.String[] options)
                 throws java.lang.Exception
Generates a new Metric. Has to initialize all fields of the metric with default values

Specified by:
buildMetric in class Metric
Parameters:
options - an array of options suitable for passing to setOptions. May be null.
numAttributes - the number of attributes that the metric will work on
Throws:
java.lang.Exception - if the distance metric has not been generated successfully.

buildMetric

public void buildMetric(Instances data)
                 throws java.lang.Exception
Create a new metric for operating on specified instances

Specified by:
buildMetric in class Metric
Parameters:
data - instances that the metric will be used on
Throws:
java.lang.Exception

similarity

public double similarity(Instance instance1,
                         Instance instance2)
                  throws java.lang.Exception
Returns a dot product similarity value between two instances.

Specified by:
similarity in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if similarity could not be estimated.
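
As a rough illustration of what a weighted dot product similarity computes, the following stand-alone sketch works on plain double arrays rather than Weka Instance objects. The class name WeightedDotSketch, its method names, and the assumption that the weights enter as a per-attribute factor (the sum of w[i] * x[i] * y[i]), with optional division by the weighted vector lengths, are illustrative and are not taken from the WeightedDotP source.

```java
// Illustrative sketch only; not the WeightedDotP implementation.
final class WeightedDotSketch {

    /** Assumed form of the weighted dot product: sum of w[i] * x[i] * y[i]. */
    static double weightedDot(double[] x, double[] y, double[] w) {
        if (x.length != y.length || x.length != w.length) {
            throw new IllegalArgumentException("x, y and w must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += w[i] * x[i] * y[i];
        }
        return sum;
    }

    /**
     * Similarity with optional length normalization (assumed to mirror the
     * effect of m_lengthNormalized): the dot product divided by the product
     * of the two weighted norm-2 lengths. Weights are assumed non-negative.
     */
    static double similarity(double[] x, double[] y, double[] w, boolean lengthNormalized) {
        double dot = weightedDot(x, y, w);
        if (!lengthNormalized) {
            return dot;
        }
        double lengths = Math.sqrt(weightedDot(x, x, w)) * Math.sqrt(weightedDot(y, y, w));
        // Returning 0 for a zero-length vector is a choice made by this sketch.
        return lengths == 0.0 ? 0.0 : dot / lengths;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {4.0, 5.0, 6.0};
        double[] w = {0.5, 1.0, 2.0};
        System.out.println(similarity(x, y, w, false));
        System.out.println(similarity(x, y, w, true));
    }
}
```

With all weights set to 1 and normalization switched on, this sketch reduces to the ordinary cosine similarity.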

similarityNonWeighted

public double similarityNonWeighted(Instance instance1,
                                    Instance instance2)
                             throws java.lang.Exception
Returns a dot product similarity value between two instances without using the weights.

Specified by:
similarityNonWeighted in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similarityInternal

public double similarityInternal(Instance instance1,
                                 Instance instance2)
                          throws java.lang.Exception
Returns a dot product similarity value between two instances.

Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similaritySparse

public double similaritySparse(SparseInstance instance1,
                               SparseInstance instance2)
                        throws java.lang.Exception
Returns a dot product similarity value between two sparse instances.

Parameters:
instance1 - First sparse instance.
instance2 - Second sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similarityNonSparse

public double similarityNonSparse(Instance instance1,
                                  Instance instance2)
                           throws java.lang.Exception
Returns a dot product similarity value between two non-sparse instances

Parameters:
instance1 - First non-sparse instance.
instance2 - Second non-sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similaritySparseNonSparse

public double similaritySparseNonSparse(SparseInstance instance1,
                                        Instance instance2)
                                 throws java.lang.Exception
Returns a dot product similarity value between a sparse instance and a non-sparse instance

Parameters:
instance1 - First sparse instance.
instance2 - Second non-sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similaritySparseNonWeighted

public double similaritySparseNonWeighted(SparseInstance instance1,
                                          SparseInstance instance2)
                                   throws java.lang.Exception
Returns a dot product similarity value between two sparse instances, without using the weights.

Parameters:
instance1 - First sparse instance.
instance2 - Second sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similarityNonSparseNonWeighted

public double similarityNonSparseNonWeighted(Instance instance1,
                                             Instance instance2)
                                      throws java.lang.Exception
Returns a dot product similarity value between two non-sparse instances, without using the weights

Parameters:
instance1 - First non-sparse instance.
instance2 - Second non-sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

similaritySparseNonSparseNonWeighted

public double similaritySparseNonSparseNonWeighted(SparseInstance instance1,
                                                   Instance instance2)
                                            throws java.lang.Exception
Returns a dot product similarity value between a sparse instance and a non-sparse instance, without using the weights

Parameters:
instance1 - First sparse instance.
instance2 - Second non-sparse instance.
Throws:
java.lang.Exception - if similarity could not be estimated.

createDiffInstance

public Instance createDiffInstance(Instance instance1,
                                   Instance instance2)
Create an Instance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]

Specified by:
createDiffInstance in class LearnableMetric
Parameters:
instance1 - first instance
instance2 - second instance
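
The feature vector described above can be sketched on plain arrays as follows. The class DiffFeaturesSketch and the method diffFeatures are hypothetical names used only for illustration; the real method operates on Weka Instance objects and may treat class or missing attributes differently.

```java
import java.util.Arrays;

// Illustrative sketch only; plain arrays stand in for Weka Instance objects.
final class DiffFeaturesSketch {

    /** Returns [x1*y1, x2*y2, ..., xn*yn], the per-attribute products. */
    static double[] diffFeatures(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double[] products = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            products[i] = x[i] * y[i];
        }
        return products;
    }

    public static void main(String[] args) {
        double[] features = diffFeatures(new double[]{1, 2, 3}, new double[]{4, 5, 6});
        System.out.println(Arrays.toString(features)); // prints [4.0, 10.0, 18.0]
    }
}
```

Summing these features gives the unweighted dot product x'y, so each feature is the contribution of one attribute to the similarity.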

createDiffInstanceSparse

public SparseInstance createDiffInstanceSparse(SparseInstance instance1,
                                               SparseInstance instance2)
Create a SparseInstance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]

Parameters:
instance1 - first sparse instance
instance2 - second sparse instance

createDiffInstanceNonSparse

public Instance createDiffInstanceNonSparse(Instance instance1,
                                            Instance instance2)
Create an Instance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]

Parameters:
instance1 - first instance
instance2 - second instance

createDiffInstanceSparseNonSparse

public SparseInstance createDiffInstanceSparseNonSparse(SparseInstance instance1,
                                                        Instance instance2)
Create a SparseInstance with features corresponding to internal "features": for x'y returns an instance with the following features: [x1*y1, x2*y2, ..., xn*yn]

Parameters:
instance1 - first sparse instance
instance2 - second instance

distance

public double distance(Instance instance1,
                       Instance instance2)
                throws java.lang.Exception
Returns distance between two instances using the current conversion type (CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT, ...)

Specified by:
distance in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if distance could not be estimated.
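
The similarity-to-distance conversion can be pictured with the sketch below. Only the unit conversion (dist = 1 - sim) is stated in this page, under the -U option of setOptions. The Laplacian form 1 / (1 + sim) and the exponential form exp(-sim) are assumptions about what those names commonly denote and may not match the WeightedDotP source. The enum and method names are hypothetical.

```java
// Illustrative sketch only; the LAPLACIAN and EXPONENTIAL formulas are ASSUMED.
final class ConversionSketch {

    /** Stand-in for the CONVERSION_* integer constants. */
    enum Conversion { LAPLACIAN, UNIT, EXPONENTIAL }

    static double toDistance(double similarity, Conversion type) {
        switch (type) {
            case UNIT:
                // Documented under the -U option: dist = 1 - sim.
                return 1.0 - similarity;
            case EXPONENTIAL:
                // ASSUMED form: distance decays exponentially with similarity.
                return Math.exp(-similarity);
            case LAPLACIAN:
            default:
                // ASSUMED form: 1 / (1 + sim); undefined at sim = -1.
                return 1.0 / (1.0 + similarity);
        }
    }

    public static void main(String[] args) {
        for (Conversion type : Conversion.values()) {
            System.out.println(type + ": " + toDistance(0.25, type));
        }
    }
}
```

All three forms decrease as similarity grows, which is the property a distance derived from a similarity needs.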

distanceNonWeighted

public double distanceNonWeighted(Instance instance1,
                                  Instance instance2)
                           throws java.lang.Exception
Returns distance between two instances using the current conversion type (CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT, ...), without using the weights

Specified by:
distanceNonWeighted in class Metric
Parameters:
instance1 - First instance.
instance2 - Second instance.
Throws:
java.lang.Exception - if distance could not be estimated.

setConversionType

public void setConversionType(SelectedTag conversionType)
Set the type of similarity-to-distance conversion. Values other than CONVERSION_LAPLACIAN, CONVERSION_UNIT, or CONVERSION_EXPONENTIAL will be ignored.


getConversionType

public SelectedTag getConversionType()
Returns the type of similarity-to-distance conversion

Returns:
one of CONVERSION_LAPLACIAN, CONVERSION_UNIT, or CONVERSION_EXPONENTIAL

setLengthNormalized

public void setLengthNormalized(boolean lengthNormalized)
Set normalization by instance length to be on or off

Parameters:
lengthNormalized - if true, similarity is normalized by the length of the vectors

getLengthNormalized

public boolean getLengthNormalized()
Check whether similarity is normalized by the length of the vectors


learnMetric

public void learnMetric(Instances data)
                 throws java.lang.Exception
Updates the weights

Specified by:
learnMetric in class LearnableMetric
Throws:
java.lang.Exception

setMetricLearner

public void setMetricLearner(MetricLearner metricLearner)
Set the distance metric learner

Parameters:
metricLearner - the metric learner

getMetricLearner

public MetricLearner getMetricLearner()
Get the distance metric learner


isDistanceBased

public boolean isDistanceBased()
The computation of a metric can be either based on distance, or on similarity

Specified by:
isDistanceBased in class Metric

getCentroidInstance

public Instance getCentroidInstance(Instances instances,
                                    boolean fastMode,
                                    boolean normalized)
Given a cluster of instances, return the centroid of that cluster

Specified by:
getCentroidInstance in class LearnableMetric
Parameters:
instances - objects belonging to a cluster
fastMode - whether fast mode should be used for SparseInstances
normalized - normalize centroids for SPKMeans
Returns:
a centroid instance for the given cluster
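
A minimal sketch of a centroid computation is given below, assuming all attributes are numeric, the centroid is the component-wise mean, and "normalized" means scaling the mean to unit norm-2 length (as spherical k-means would require). These are assumptions, the fastMode flag is not modeled, and the names CentroidSketch and centroid are hypothetical.

```java
// Illustrative sketch only; not the getCentroidInstance implementation.
final class CentroidSketch {

    /** Component-wise mean of the rows, optionally scaled to unit length (assumed behavior). */
    static double[] centroid(double[][] cluster, boolean normalized) {
        if (cluster.length == 0) {
            throw new IllegalArgumentException("cluster must not be empty");
        }
        double[] mean = new double[cluster[0].length];
        for (double[] row : cluster) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += row[i];
            }
        }
        for (int i = 0; i < mean.length; i++) {
            mean[i] /= cluster.length;
        }
        if (normalized) {
            double sumSq = 0.0;
            for (double v : mean) {
                sumSq += v * v;
            }
            double norm = Math.sqrt(sumSq);
            // An all-zero mean is left unchanged; this is a choice made by the sketch.
            if (norm > 0.0) {
                for (int i = 0; i < mean.length; i++) {
                    mean[i] /= norm;
                }
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] cluster = {{1.0, 0.0}, {0.0, 1.0}};
        System.out.println(java.util.Arrays.toString(centroid(cluster, false)));
        System.out.println(java.util.Arrays.toString(centroid(cluster, true)));
    }
}
```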

getGradients

public double[] getGradients(Instance instance1,
                             Instance instance2)
                      throws java.lang.Exception
Get the values of the partial derivatives for the metric components for a particular instance pair

Specified by:
getGradients in class LearnableMetric
Parameters:
instance1 - the first instance
instance2 - the second instance
Throws:
java.lang.Exception

lengthWeighted

public double lengthWeighted(Instance instance)
Get the norm-2 length of an instance assuming all attributes are numeric and utilizing the attribute weights
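
One plausible reading of a weighted norm-2 length is sketched below on plain arrays. It assumes each attribute weight multiplies the squared attribute value, i.e. sqrt(sum of w[i] * x[i]^2); how WeightedDotP actually applies the weights is not stated in this page. The class and method signature are hypothetical.

```java
// Illustrative sketch only; the placement of the weights is ASSUMED.
final class WeightedLengthSketch {

    /** Weighted norm-2 length: sqrt(sum of w[i] * x[i] * x[i]), with non-negative weights. */
    static double lengthWeighted(double[] x, double[] w) {
        if (x.length != w.length) {
            throw new IllegalArgumentException("x and w must have the same length");
        }
        double sumSq = 0.0;
        for (int i = 0; i < x.length; i++) {
            sumSq += w[i] * x[i] * x[i];
        }
        return Math.sqrt(sumSq);
    }

    public static void main(String[] args) {
        // With unit weights this is the ordinary Euclidean length.
        System.out.println(lengthWeighted(new double[]{3.0, 4.0}, new double[]{1.0, 1.0})); // prints 5.0
    }
}
```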


setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-N
Normalize the dot product by vector lengths

-E
Use exponential conversion from similarity to distance (default laplacian conversion)

-U
Use unit conversion from similarity to distance (dist=1-sim) (default laplacian conversion)

-R
The metric is trainable and will be trained using the current MetricLearner (default non-trainable)

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

getOptions

public java.lang.String[] getOptions()
Gets the current settings of WeightedDotP.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

main

public static void main(java.lang.String[] args)