java.lang.Object
weka.core.metrics.Metric
weka.core.metrics.LearnableMetric
weka.core.metrics.KL
The KL class implements a weighted Kullback-Leibler divergence.
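As a rough illustration of what a weighted Kullback-Leibler divergence computes, the sketch below sums w_i * p_i * log2(p_i / q_i) over the components of two distributions. The class name WeightedKlSketch, the explicit weight array w, and the use of base-2 logarithms are assumptions made for this example and are not taken from the KL class itself.

```java
/** Illustrative sketch only; not the weka.core.metrics.KL implementation. */
final class WeightedKlSketch {

    /**
     * Weighted KL divergence in bits: sum_i w[i] * p[i] * log2(p[i] / q[i]).
     * Components with p[i] == 0 contribute nothing; if q[i] == 0 where
     * p[i] > 0 the divergence is unbounded and infinity is returned.
     */
    static double weightedKl(double[] p, double[] q, double[] w) {
        if (p.length != q.length || p.length != w.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                if (q[i] <= 0.0) {
                    return Double.POSITIVE_INFINITY;
                }
                sum += w[i] * p[i] * (Math.log(p[i] / q[i]) / Math.log(2.0));
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        double[] w = {1.0, 1.0}; // unit weights reduce to the plain KL divergence
        System.out.println(weightedKl(p, q, w));
    }
}
```

The infinite result for a zero in q is the reason smoothing matters for this kind of metric: a smoothed distribution never assigns zero probability to a token.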
Field Summary
Modifier and Type | Field | Description
static int | CONVERSION_EXPONENTIAL |
static int | CONVERSION_LAPLACIAN | We can have different ways of converting from distance to similarity
static int | CONVERSION_UNIT |
double | LOG2 |
protected double | m_alpha |
protected double | m_alphaDecayRate |
protected int | m_conversionType | The method of converting, by default laplacian
protected double | m_currAlpha |
protected HashMapVector | m_datasetFrequencies | Frequencies over the entire dataset used for smoothing
protected java.util.HashMap | m_instanceConstraintMap | A hashmap that maps every instance to a set of instances with which JS has been computed
protected java.util.HashMap | m_instanceNormHash | We hash sum(p log(p)) terms for the input instances to speed up computation
protected double | m_lambdaJM | The lambda value for the Jelinek-Mercer smoothing
protected MetricLearner | m_metricLearner | A metric learner responsible for training the parameters of the metric
protected int | m_numTotalTokens | Total number of tokens in the dataset
protected double | m_pseudoCountDirichlet | The pseudocount value for the Dirichlet smoothing
protected int | m_smoothingType | The smoothing method
protected boolean | m_useDITCSmoothing | Even with unsmoothed data, DITC-type smoothing can be used
protected boolean | m_useIDivergence | We can switch between regular KL divergence and I-divergence
static int | SMOOTHING_DIRICHLET |
static int | SMOOTHING_JELINEK_MERCER |
static int | SMOOTHING_UNSMOOTHED | Different smoothing methods for obtaining probability distributions from frequencies
static Tag[] | TAGS_CONVERSION |
static Tag[] | TAGS_SMOOTHING |
Fields inherited from class weka.core.metrics.LearnableMetric |
m_attrWeights, m_classifier, m_classifierClassName, m_classifierRequiresNominalClass, m_numPosDiffInstances, m_posNegDiffInstanceRatio, m_trainable |
Fields inherited from class weka.core.metrics.Metric |
m_attrIdxs, m_classIndex, m_numAttributes |
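The smoothing fields above (SMOOTHING_JELINEK_MERCER with m_lambdaJM, SMOOTHING_DIRICHLET with m_pseudoCountDirichlet, and the dataset-wide m_datasetFrequencies and m_numTotalTokens) suggest the two standard language-model smoothing schemes. The following is a minimal sketch of their textbook forms; the class name SmoothingSketch, the parameter names, and the convention that lambda weights the dataset-wide estimate are assumptions, not details confirmed by this page.

```java
/** Illustrative sketch of two textbook smoothing schemes; not the KL class code. */
final class SmoothingSketch {

    /**
     * Jelinek-Mercer: linear interpolation between the document estimate
     * and the dataset-wide estimate, with lambda on the dataset side.
     */
    static double jelinekMercer(double docFreq, double docTotal,
                                double dataFreq, double dataTotal, double lambda) {
        return (1.0 - lambda) * (docFreq / docTotal) + lambda * (dataFreq / dataTotal);
    }

    /**
     * Dirichlet: add pseudoCount virtual tokens distributed according
     * to the dataset-wide estimate, then renormalize.
     */
    static double dirichlet(double docFreq, double docTotal,
                            double dataFreq, double dataTotal, double pseudoCount) {
        return (docFreq + pseudoCount * (dataFreq / dataTotal)) / (docTotal + pseudoCount);
    }

    public static void main(String[] args) {
        // A token seen 2 times in a 10-token document and 50 times in a 1000-token dataset.
        System.out.println(jelinekMercer(2, 10, 50, 1000, 0.5));
        System.out.println(dirichlet(2, 10, 50, 1000, 10));
        // A token absent from the document still receives non-zero probability.
        System.out.println(jelinekMercer(0, 10, 50, 1000, 0.5));
    }
}
```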
Constructor Summary
KL() | Create a default new metric
KL(int numAttributes) | Create a new metric.
KL(int[] _attrIdxs) | Creates a new metric which takes specified attributes.
Method Summary
Modifier and Type | Method | Description
void | buildMetric(Instances data) | Create a new metric for operating on specified instances
void | buildMetric(int numAttributes) | Generates a new Metric.
void | buildMetric(int numAttributes, java.lang.String[] options) | Generates a new Metric.
java.lang.Object | clone() | Create a copy of this metric
protected double | convertFrequency(double freq, double numTotalTokens, java.lang.String token) | Given a frequency of a given token in a document, convert it to a probability value for that document's distribution
Instance | convertInstance(Instance instance) | Take an instance and convert it for use by the metric
Instance | createDiffInstance(Instance instance1, Instance instance2) | Create an instance with features corresponding to dot-product components of the two given instances
Instance | createDiffInstanceJS(Instance instance1, Instance instance2) | Create an instance with features corresponding to JS components
protected Instance | createDiffInstanceJSNonSparse(Instance instance1, Instance instance2) | Create a nonsparse instance with features corresponding to dot-product components of the two given instances
protected SparseInstance | createDiffInstanceJSSparse(SparseInstance instance1, SparseInstance instance2) | Create a sparse instance with features corresponding to dot-product components of the two given instances
protected Instance | createDiffInstanceJSSparseNonSparse(SparseInstance instance1, Instance instance2) | Create an instance with features corresponding to dot-product components of the two given instances
protected Instance | createDiffInstanceNonSparse(Instance instance1, Instance instance2) | Create a nonsparse instance with features corresponding to dot-product components of the two given instances
protected SparseInstance | createDiffInstanceSparse(SparseInstance instance1, SparseInstance instance2) | Create a sparse instance with features corresponding to dot-product components of the two given instances
protected Instance | createDiffInstanceSparseNonSparse(SparseInstance instance1, Instance instance2) | Create an instance with features corresponding to dot-product components of the two given instances
double | distance(Instance instance1, Instance instance2) | Returns a distance value between two instances.
double | distanceInternal(Instance instance1, Instance instance2) | Returns a distance value between two instances.
double | distanceJS(Instance instance1, Instance instance2) | Returns Jensen-Shannon distance value between two instances.
double | distanceJSNonSparse(Instance instance1, Instance instance2) | Returns Jensen-Shannon distance between non-sparse instances without using the weights
double | distanceJSSparse(SparseInstance instance1, SparseInstance instance2) | Returns Jensen-Shannon distance between two sparse instances.
double | distanceJSSparseNonSparse(SparseInstance instance1, Instance instance2) | Returns Jensen-Shannon distance between a non-sparse instance and a sparse instance
double | distanceNonSparse(Instance instance1, Instance instance2) | Returns a distance value between non-sparse instances without using the weights
double | distanceNonWeighted(Instance instance1, Instance instance2) | Returns distance between two instances without using the weights.
double | distanceSparse(SparseInstance instance1, SparseInstance instance2) | Returns a distance value between two sparse instances.
double | distanceSparseNonSparse(SparseInstance instance1, Instance instance2) | Returns a distance value between a non-sparse instance and a sparse instance
double | getAlpha() | Get the initial value of the smoothing parameter alpha in DITC smoothing
double | getAlphaDecayRate() | Get the initial value of the decay rate of alpha in DITC smoothing
Instance | getCentroidInstance(Instances instances, boolean fastMode, boolean normalized) | Given a cluster of instances, return the centroid of that cluster
SelectedTag | getConversionType() | Return the type of distance-to-similarity conversion
double | getCurrAlpha() | Get the current value of the smoothing parameter alpha in DITC smoothing
double[] | getGradients(Instance instance1, Instance instance2) | Get the values of the partial derivatives for the metric components for a particular instance pair
double | getLambdaJM() | Get the lambda parameter for Jelinek-Mercer smoothing
MetricLearner | getMetricLearner() | Get the distance metric learner
protected java.lang.String | getMetricLearnerSpec() | Gets the classifier specification string, which contains the class name of the classifier and any options to the classifier
java.lang.String[] | getOptions() | Gets the current settings of KL.
double | getPseudoCountDirichlet() | Get the pseudo-count value for Dirichlet smoothing
SelectedTag | getSmoothingType() | Return the type of smoothing
boolean | getUseDITCSmoothing() | Check whether DITC smoothing is used
boolean | getUseIDivergence() | Check whether regular KL divergence or I-divergence is used
boolean | isDistanceBased() | The computation of a metric can be either based on distance, or on similarity
void | learnMetric(Instances data) | Train the metric
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] args) |
void | resetMetric() | Reset all values that have been learned
void | setAlpha(double alpha) | Set the initial value of the smoothing parameter alpha in DITC smoothing
void | setAlphaDecayRate(double alphaDecayRate) | Set the initial value of the decay rate of alpha in DITC smoothing
void | setConversionType(SelectedTag conversionType) | Set the type of distance-to-similarity conversion.
void | setLambdaJM(double lambdaJM) | Set the lambda parameter for Jelinek-Mercer smoothing
void | setMetricLearner(MetricLearner metricLearner) | Set the distance metric learner
void | setOptions(java.lang.String[] options) | Parses a given list of options.
void | setPseudoCountDirichlet(double pseudoCountDirichlet) | Set the pseudo-count value for Dirichlet smoothing
void | setSmoothingType(SelectedTag smoothingType) | Set the type of smoothing
void | setUseDITCSmoothing(boolean useDITC) | Switch between using and not using DITC smoothing
void | setUseIDivergence(boolean useID) | Switch between regular KL divergence and I-divergence
double | similarity(Instance instance1, Instance instance2) | Returns a similarity estimate between two instances.
double | similarityNonWeighted(Instance instance1, Instance instance2) | Returns a similarity estimate between two instances without using the weights.
void | updateAlpha() | Update the current value of alpha by the decay rate
Methods inherited from class weka.core.metrics.LearnableMetric |
getExternal, getNumPosDiffInstances, getPosNegDiffInstanceRatio, getTrainable, getWeights, meanOrMode, normalizeInstanceWeighted, setExternal, setNumPosDiffInstances, setPosNegDiffInstanceRatio, setTrainable, setWeights, useClassifier, useNoClassifier, usesClassifier |
Methods inherited from class weka.core.metrics.Metric |
forName, getAttrIdxs, getAttrIdxsWithoutLastClass, getAttrIndxs, getClassIndex, getNumAttributes, length, normalizeInstance, setAttrIdxs, setAttrIdxs, setClassIndex |
Methods inherited from class java.lang.Object |
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
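The distanceJS family of methods listed above is described as returning a Jensen-Shannon distance. For orientation, here is a small sketch of the standard Jensen-Shannon divergence, the average of the KL divergences of each distribution from their midpoint. The class name JensenShannonSketch and the base-2 logarithm are assumptions for this example; the page does not say whether the KL class applies weights or a square root on top of this quantity.

```java
/** Illustrative sketch of the standard Jensen-Shannon divergence. */
final class JensenShannonSketch {
    private static final double LOG2 = Math.log(2.0);

    /** KL(p || m) in bits, skipping zero-probability components of p. */
    private static double klBits(double[] p, double[] m) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / m[i]) / LOG2;
            }
        }
        return sum;
    }

    /** JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), where m = (p + q) / 2. */
    static double jensenShannon(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double[] m = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            m[i] = 0.5 * (p[i] + q[i]); // m[i] > 0 whenever p[i] > 0 or q[i] > 0
        }
        return 0.5 * klBits(p, m) + 0.5 * klBits(q, m);
    }

    public static void main(String[] args) {
        System.out.println(jensenShannon(new double[]{1.0, 0.0}, new double[]{0.0, 1.0}));
        System.out.println(jensenShannon(new double[]{0.5, 0.5}, new double[]{0.5, 0.5}));
    }
}
```

Unlike plain KL divergence, this quantity is symmetric and stays finite without smoothing, because the midpoint m is non-zero wherever either input is.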
Field Detail |
public final double LOG2
protected boolean m_useIDivergence
protected HashMapVector m_datasetFrequencies
protected java.util.HashMap m_instanceNormHash
protected int m_numTotalTokens
public static final int SMOOTHING_UNSMOOTHED
public static final int SMOOTHING_DIRICHLET
public static final int SMOOTHING_JELINEK_MERCER
public static final Tag[] TAGS_SMOOTHING
protected int m_smoothingType
protected boolean m_useDITCSmoothing
protected double m_alpha
protected double m_alphaDecayRate
protected double m_currAlpha
protected double m_pseudoCountDirichlet
protected double m_lambdaJM
public static final int CONVERSION_LAPLACIAN
public static final int CONVERSION_UNIT
public static final int CONVERSION_EXPONENTIAL
public static final Tag[] TAGS_CONVERSION
protected int m_conversionType
protected MetricLearner m_metricLearner
protected java.util.HashMap m_instanceConstraintMap
Constructor Detail |
public KL(int numAttributes) throws java.lang.Exception
numAttributes
- the number of attributes that the metric will work on
public KL()
public KL(int[] _attrIdxs) throws java.lang.Exception
_attrIdxs
- An array containing attribute indices that will be used in the metric
Method Detail |
public void resetMetric() throws java.lang.Exception
resetMetric
in class LearnableMetric
java.lang.Exception
public void buildMetric(int numAttributes) throws java.lang.Exception
buildMetric
in class Metric
numAttributes
- the number of attributes that the metric will work on
java.lang.Exception
- if the distance metric has not been generated successfully.
public void buildMetric(int numAttributes, java.lang.String[] options) throws java.lang.Exception
buildMetric
in class Metric
numAttributes
- the number of attributes that the metric will work on
options
- an array of options suitable for passing to setOptions. May be null.
java.lang.Exception
- if the distance metric has not been generated successfully.
public void buildMetric(Instances data) throws java.lang.Exception
buildMetric
in class Metric
data
- instances that the metric will be used on
java.lang.Exception
public Instance convertInstance(Instance instance)
InstanceConverter
convertInstance
in interface InstanceConverter
protected double convertFrequency(double freq, double numTotalTokens, java.lang.String token)
freq
- frequency of a token
token
- the token
public double distance(Instance instance1, Instance instance2) throws java.lang.Exception
distance
in class Metric
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if distance could not be estimated.
public double distanceInternal(Instance instance1, Instance instance2) throws java.lang.Exception
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if distance could not be estimated.
public double distanceSparse(SparseInstance instance1, SparseInstance instance2) throws java.lang.Exception
instance1
- First sparse instance.
instance2
- Second sparse instance.
java.lang.Exception
- if distance could not be estimated.
public double distanceSparseNonSparse(SparseInstance instance1, Instance instance2) throws java.lang.Exception
instance1
- sparse instance.
instance2
- non-sparse instance.
java.lang.Exception
- if distance could not be estimated.
public double distanceNonSparse(Instance instance1, Instance instance2) throws java.lang.Exception
instance1
- non-sparse instance.
instance2
- non-sparse instance.
java.lang.Exception
- if distance could not be estimated.
public double distanceJS(Instance instance1, Instance instance2) throws java.lang.Exception
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if distanceJS could not be estimated.
public double distanceJSSparse(SparseInstance instance1, SparseInstance instance2) throws java.lang.Exception
instance1
- First sparse instance.
instance2
- Second sparse instance.
java.lang.Exception
- if distanceJS could not be estimated.
public double distanceJSSparseNonSparse(SparseInstance instance1, Instance instance2) throws java.lang.Exception
instance1
- sparse instance.
instance2
- non-sparse instance.
java.lang.Exception
- if distanceJS could not be estimated.
public double distanceJSNonSparse(Instance instance1, Instance instance2) throws java.lang.Exception
instance1
- non-sparse instance.
instance2
- non-sparse instance.
java.lang.Exception
- if distanceJS could not be estimated.
public double similarity(Instance instance1, Instance instance2) throws java.lang.Exception
similarity
in class Metric
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if similarity could not be estimated.
public double distanceNonWeighted(Instance instance1, Instance instance2) throws java.lang.Exception
distanceNonWeighted
in class Metric
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if distance could not be estimated.
public double similarityNonWeighted(Instance instance1, Instance instance2) throws java.lang.Exception
similarityNonWeighted
in class Metric
instance1
- First instance.
instance2
- Second instance.
java.lang.Exception
- if similarity could not be estimated.
public double[] getGradients(Instance instance1, Instance instance2) throws java.lang.Exception
getGradients
in class LearnableMetric
instance1
- the first instance
instance2
- the second instance
java.lang.Exception
public void learnMetric(Instances data) throws java.lang.Exception
learnMetric
in class LearnableMetric
java.lang.Exception
public void setMetricLearner(MetricLearner metricLearner)
metricLearner
- the metric learner
public MetricLearner getMetricLearner()
public Instance createDiffInstance(Instance instance1, Instance instance2)
createDiffInstance
in class LearnableMetric
instance1
- first instance
instance2
- second instance
protected SparseInstance createDiffInstanceSparse(SparseInstance instance1, SparseInstance instance2)
instance1
- first sparse instance
instance2
- second sparse instance
protected Instance createDiffInstanceSparseNonSparse(SparseInstance instance1, Instance instance2)
instance1
- first sparse instance
instance2
- second non-sparse instance
protected Instance createDiffInstanceNonSparse(Instance instance1, Instance instance2)
instance1
- first nonsparse instance
instance2
- second nonsparse instance
public Instance createDiffInstanceJS(Instance instance1, Instance instance2)
instance1
- first instance
instance2
- second instance
protected SparseInstance createDiffInstanceJSSparse(SparseInstance instance1, SparseInstance instance2)
instance1
- first sparse instance
instance2
- second sparse instance
protected Instance createDiffInstanceJSSparseNonSparse(SparseInstance instance1, Instance instance2)
instance1
- first sparse instance
instance2
- second non-sparse instance
protected Instance createDiffInstanceJSNonSparse(Instance instance1, Instance instance2)
instance1
- first nonsparse instance
instance2
- second nonsparse instance
public void setConversionType(SelectedTag conversionType)
public SelectedTag getConversionType()
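The three conversion types (CONVERSION_LAPLACIAN, CONVERSION_EXPONENTIAL, CONVERSION_UNIT) select how a distance is turned into a similarity. The sketch below shows one plausible reading: the unit form follows the dist=1-sim relation given in the setOptions description, while the Laplacian form 1/(1+d) and the exponential form exp(-d) are assumptions based on the usual meaning of those names. The class ConversionSketch and its enum are hypothetical.

```java
/** Illustrative sketch of distance-to-similarity conversions; formulas partly assumed. */
final class ConversionSketch {

    /** Hypothetical stand-in for the CONVERSION_* integer constants. */
    enum Conversion { LAPLACIAN, EXPONENTIAL, UNIT }

    static double toSimilarity(double distance, Conversion type) {
        switch (type) {
            case LAPLACIAN:
                return 1.0 / (1.0 + distance);   // assumed form
            case EXPONENTIAL:
                return Math.exp(-distance);      // assumed form
            case UNIT:
                return 1.0 - distance;           // from dist = 1 - sim
            default:
                throw new IllegalArgumentException("unknown conversion: " + type);
        }
    }

    public static void main(String[] args) {
        for (Conversion c : Conversion.values()) {
            System.out.println(c + ": " + toSimilarity(0.25, c));
        }
    }
}
```

Under these assumed forms, the Laplacian and exponential conversions map any non-negative distance into (0, 1], whereas the unit conversion only yields a non-negative similarity when the distance is at most 1.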
public void setSmoothingType(SelectedTag smoothingType)
public SelectedTag getSmoothingType()
public void setPseudoCountDirichlet(double pseudoCountDirichlet)
pseudoCountDirichlet
- the pseudocount value
public double getPseudoCountDirichlet()
public void setLambdaJM(double lambdaJM)
public double getLambdaJM()
public boolean isDistanceBased()
isDistanceBased
in class Metric
public void setUseIDivergence(boolean useID)
public boolean getUseIDivergence()
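For readers unfamiliar with the distinction behind setUseIDivergence and getUseIDivergence, the following sketch shows the usual definition of the generalized I-divergence, sum_i (p_i * ln(p_i / q_i) - p_i + q_i), which coincides with KL divergence when both inputs sum to one but also applies to unnormalized non-negative vectors. The class name IDivergenceSketch and the natural logarithm are assumptions for this example; the page does not state the exact form used by the KL class.

```java
/** Illustrative sketch of the generalized I-divergence (natural log). */
final class IDivergenceSketch {

    /** sum_i ( p[i] * ln(p[i] / q[i]) - p[i] + q[i] ), with 0 * ln(0) taken as 0. */
    static double iDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                if (q[i] <= 0.0) {
                    return Double.POSITIVE_INFINITY;
                }
                sum += p[i] * Math.log(p[i] / q[i]);
            }
            sum += q[i] - p[i]; // correction term; cancels when both vectors sum to one
        }
        return sum;
    }

    public static void main(String[] args) {
        // Unnormalized frequency vectors are valid inputs.
        System.out.println(iDivergence(new double[]{2.0, 1.0}, new double[]{1.0, 1.0}));
        // For normalized inputs the result equals KL divergence in nats.
        System.out.println(iDivergence(new double[]{0.5, 0.5}, new double[]{0.25, 0.75}));
    }
}
```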
public void setUseDITCSmoothing(boolean useDITC)
public boolean getUseDITCSmoothing()
public void setAlpha(double alpha)
public double getAlpha()
public double getCurrAlpha()
public void setAlphaDecayRate(double alphaDecayRate)
public double getAlphaDecayRate()
public void updateAlpha()
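The accessors above distinguish an initial alpha, a current alpha, and a decay rate, with updateAlpha described as updating the current value by the decay rate. One simple way such a schedule could work is multiplicative decay, sketched below. The multiplicative rule, the reset behavior, and the class name AlphaDecaySketch are all assumptions; this page does not specify how the KL class combines the two values.

```java
/** Illustrative sketch of a decaying smoothing parameter; the update rule is assumed. */
final class AlphaDecaySketch {
    private final double alpha;      // initial value
    private final double decayRate;  // factor applied on each update (assumed multiplicative)
    private double currAlpha;        // value currently in effect

    AlphaDecaySketch(double alpha, double decayRate) {
        this.alpha = alpha;
        this.decayRate = decayRate;
        this.currAlpha = alpha;
    }

    /** Shrink the current alpha by the decay rate. */
    void updateAlpha() {
        currAlpha *= decayRate;
    }

    /** Restore the current alpha to its initial value. */
    void reset() {
        currAlpha = alpha;
    }

    double getCurrAlpha() {
        return currAlpha;
    }

    public static void main(String[] args) {
        AlphaDecaySketch schedule = new AlphaDecaySketch(1.0, 0.5);
        for (int iteration = 0; iteration < 3; iteration++) {
            System.out.println("iteration " + iteration + ": alpha = " + schedule.getCurrAlpha());
            schedule.updateAlpha();
        }
    }
}
```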
public Instance getCentroidInstance(Instances instances, boolean fastMode, boolean normalized)
getCentroidInstance
in class LearnableMetric
instances
- objects belonging to a cluster
fastMode
- whether fast mode should be used for SparseInstances
normalized
- normalize centroids for SPKMeans
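To make the idea of a cluster centroid concrete, the sketch below averages a set of dense vectors component by component and optionally rescales the result. It works on plain double arrays rather than Instances, ignores the fastMode option, and normalizes to unit sum so the centroid is a probability distribution; that choice of normalization and the class name CentroidSketch are assumptions, since this page only says that centroids are normalized for SPKMeans.

```java
/** Illustrative sketch of a component-wise mean centroid over dense vectors. */
final class CentroidSketch {

    static double[] centroid(double[][] rows, boolean normalized) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("empty cluster");
        }
        int n = rows[0].length;
        double[] c = new double[n];
        for (double[] row : rows) {
            if (row.length != n) {
                throw new IllegalArgumentException("length mismatch");
            }
            for (int i = 0; i < n; i++) {
                c[i] += row[i];
            }
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            c[i] /= rows.length; // component-wise mean
            total += c[i];
        }
        if (normalized && total > 0.0) {
            for (int i = 0; i < n; i++) {
                c[i] /= total;   // assumed normalization: rescale to sum to one
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] cluster = {{1.0, 3.0}, {3.0, 5.0}};
        System.out.println(java.util.Arrays.toString(centroid(cluster, false)));
        System.out.println(java.util.Arrays.toString(centroid(cluster, true)));
    }
}
```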
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-N
Normalize the Euclidean distance by vector lengths
-E
Use exponential conversion from distance to similarity
(default laplacian conversion)
-U
Use unit conversion from similarity to distance (dist=1-sim)
(default laplacian conversion)
-R
The metric is trainable and will be trained using the current MetricLearner
(default non-trainable)
setOptions
in interface OptionHandler
options
- the list of options as an array of strings
java.lang.Exception
- if an option is not supported
protected java.lang.String getMetricLearnerSpec()
public java.util.Enumeration listOptions()
listOptions
in interface OptionHandler
public java.lang.String[] getOptions()
getOptions
in interface OptionHandler
public java.lang.Object clone()
clone
in class LearnableMetric
public static void main(java.lang.String[] args)