Used for debugging: reads random vectors from a file.
Major TODOs:
Make the BIC score replaceable by other scores.
- See Also:
Clusterer, OptionHandler, Serialized Form
Method Summary |
java.lang.String |
binValueTipText()
Returns the tip text for this property |
void |
buildClusterer(Instances data)
Generates the X-Means clusterer. |
boolean |
checkForNominalAttributes(Instances data)
Checks for nominal attributes in the dataset. |
int |
clusterInstance(Instance instance)
Assigns a given instance to a cluster. |
double |
getBinValue()
Gets value that represents true in a new numeric attribute. |
double |
getCutOffFactor()
Gets the cutoff factor. |
int |
getDebugLevel()
Gets the debug level. |
DistanceFunction |
getDistanceF()
Gets the distance function. |
protected java.lang.String |
getDistanceFSpec()
Gets the distance function specification string, which contains the
class name of the distance function class and any options to it |
java.lang.String |
getInputCenterFile()
Gets the name of the file to read the list of centers from. |
KDTree |
getKDTree()
Gets the KDTree class. |
protected java.lang.String |
getKDTreeSpec()
Gets the KDTree specification string, which contains the class name of
the KDTree class and any options to the KDTree |
int |
getMaxIterations()
Gets the maximum number of iterations. |
int |
getMaxKMeans()
Gets the maximum number of iterations in KMeans. |
int |
getMaxKMeansForChildren()
Gets the maximum number of KMeans iterations performed on the child centers. |
int |
getMaxNumClusters()
Gets the maximum number of clusters to generate. |
int |
getMinNumClusters()
Gets the minimum number of clusters to generate. |
Instance |
getNextDebugVektorsInstance(Instances model)
Reads an instance from the debug vectors file. |
java.lang.String[] |
getOptions()
Gets the current settings of XMeans. |
java.lang.String |
getOutputCenterFile()
Gets the name of the file to write the list of centers to. |
int |
getSeed()
Gets the random number seed. |
java.lang.String |
globalInfo()
Returns a string describing this clusterer |
void |
initDebugVektorsInput()
Initializes the debug vector input. |
static double[][] |
initializeRanges(Instances instances,
int[] instList)
Function should be in the Instances class!!
Initializes the minimum and maximum values
based on all instances. |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options. |
static void |
main(java.lang.String[] argv)
Main method for testing this class. |
java.lang.String |
maxNumClustersTipText()
Returns the tip text for this property |
java.lang.String |
minNumClustersTipText()
Returns the tip text for this property |
int |
numberOfClusters()
Returns the number of clusters. |
static void |
printRanges(Instances model,
double[][] ranges)
Function should be in the Instances class!!
Prints a range. |
java.lang.String |
seedTipText()
Returns the tip text for this property. |
void |
setBinValue(double value)
Sets the distance value between true and false of binary attributes
and between "same" and "different" of nominal attributes. |
void |
setCutOffFactor(double i)
Sets a new cutoff factor. |
void |
setDebugLevel(int d)
Sets the debug level. |
void |
setDebugVektorsFile(java.lang.String fileName)
Sets the name of the file in which the random vectors are stored. |
void |
setDistanceF(DistanceFunction distanceF)
Sets the distance function. |
void |
setInputCenterFile(java.lang.String fileName)
Sets the name of the file to read the list of centers from. |
void |
setKDTree(KDTree k)
Sets the KDTree class. |
void |
setMaxIterations(int i)
Sets the maximum number of iterations to perform. |
void |
setMaxKMeans(int i)
Sets the maximum number of iterations to perform in KMeans. |
void |
setMaxKMeansForChildren(int i)
Sets the maximum number of KMeans iterations performed
on the child centers. |
void |
setMaxNumClusters(int n)
Sets the maximum number of clusters to generate. |
void |
setMinNumClusters(int n)
Sets the minimum number of clusters to generate. |
void |
setOptions(java.lang.String[] options)
Parses a given list of options. |
void |
setOutputCenterFile(java.lang.String fileName)
Sets the name of the file to write the list of centers to. |
void |
setSeed(int s)
Sets the random number seed. |
java.lang.String |
toString()
Returns a string describing this clusterer. |
static void |
updateRanges(Instance instance,
int numAtt,
double[][] ranges)
Function should be in the Instances class!!
Updates the minimum and maximum and width values for all the attributes
based on a new instance. |
static void |
updateRangesFirst(Instance instance,
int numAtt,
double[][] ranges)
Function should be in the Instances class!!
Used to initialize the ranges. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
R_LOW
public static int R_LOW
- Index in ranges for LOW and HIGH and WIDTH
R_HIGH
public static int R_HIGH
R_WIDTH
public static int R_WIDTH
D_PRINTCENTERS
public static int D_PRINTCENTERS
D_FOLLOWSPLIT
public static int D_FOLLOWSPLIT
D_CONVCHCLOSER
public static int D_CONVCHCLOSER
D_RANDOMVEKTOR
public static int D_RANDOMVEKTOR
D_KDTREE
public static int D_KDTREE
D_ITERCOUNT
public static int D_ITERCOUNT
D_METH_MISUSE
public static int D_METH_MISUSE
D_CURR
public static int D_CURR
D_GENERAL
public static int D_GENERAL
m_CurrDebugFlag
public boolean m_CurrDebugFlag
XMeans
public XMeans()
globalInfo
public java.lang.String globalInfo()
- Returns a string describing this clusterer
- Returns:
- a description of the clusterer suitable for
displaying in the explorer/experimenter gui
initializeRanges
public static double[][] initializeRanges(Instances instances,
int[] instList)
- Function should be in the Instances class!!
Initializes the minimum and maximum values
based on all instances.
- Parameters:
instList
- list of indexes
printRanges
public static void printRanges(Instances model,
double[][] ranges)
- Function should be in the Instances class!!
Prints a range.
- Parameters:
ranges
- the ranges to print
updateRangesFirst
public static void updateRangesFirst(Instance instance,
int numAtt,
double[][] ranges)
- Function should be in the Instances class!!
Used to initialize the ranges. To save time, the values
of the first instance are used: low and high are set to
those values and width is set to zero.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model
updateRanges
public static void updateRanges(Instance instance,
int numAtt,
double[][] ranges)
- Function should be in the Instances class!!
Updates the minimum and maximum and width values for all the attributes
based on a new instance.
- Parameters:
instance
- the new instance
numAtt
- number of attributes in the model
ranges
- low, high and width values for all attributes
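The range helpers described above can be illustrated with a minimal, self-contained sketch. It is not the library's code: plain `double[]` rows stand in for `Instance` objects, the class name `RangeSketch` is hypothetical, and the index values 0, 1 and 2 for `R_LOW`, `R_HIGH` and `R_WIDTH` are assumed, since this page does not state them.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the range helpers; double[] rows replace Instance objects. */
class RangeSketch {
    // Assumed column indexes into ranges[attribute][...]; the real values are not documented here.
    static final int R_LOW = 0;
    static final int R_HIGH = 1;
    static final int R_WIDTH = 2;

    /** Seeds the ranges from the first instance: low = high = value, width = 0. */
    static void updateRangesFirst(double[] instance, int numAtt, double[][] ranges) {
        for (int j = 0; j < numAtt; j++) {
            ranges[j][R_LOW] = instance[j];
            ranges[j][R_HIGH] = instance[j];
            ranges[j][R_WIDTH] = 0.0;
        }
    }

    /** Widens low/high if needed and recomputes width for every attribute. */
    static void updateRanges(double[] instance, int numAtt, double[][] ranges) {
        for (int j = 0; j < numAtt; j++) {
            double v = instance[j];
            if (v < ranges[j][R_LOW]) ranges[j][R_LOW] = v;
            if (v > ranges[j][R_HIGH]) ranges[j][R_HIGH] = v;
            ranges[j][R_WIDTH] = ranges[j][R_HIGH] - ranges[j][R_LOW];
        }
    }

    /** Builds ranges over the rows selected by instList (must be non-empty). */
    static double[][] initializeRanges(double[][] data, int[] instList) {
        int numAtt = data[instList[0]].length;
        double[][] ranges = new double[numAtt][3];
        updateRangesFirst(data[instList[0]], numAtt, ranges);
        for (int i = 1; i < instList.length; i++) {
            updateRanges(data[instList[i]], numAtt, ranges);
        }
        return ranges;
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 10.0}, {4.0, 8.0}, {2.0, 12.0} };
        double[][] ranges = initializeRanges(data, new int[] {0, 1, 2});
        // prints [[1.0, 4.0, 3.0], [8.0, 12.0, 4.0]]
        System.out.println(Arrays.deepToString(ranges));
    }
}
```

Seeding from the first instance avoids initializing low and high with sentinel values such as positive and negative infinity.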
buildClusterer
public void buildClusterer(Instances data)
throws java.lang.Exception
- Generates the X-Means clusterer.
- Specified by:
buildClusterer
in class Clusterer
- Parameters:
data
- set of instances serving as training data
- Throws:
java.lang.Exception
- if the clusterer has not been
generated successfully
checkForNominalAttributes
public boolean checkForNominalAttributes(Instances data)
- Checks for nominal attributes in the dataset.
Class attribute is ignored.
- Parameters:
data
- the dataset to check
- Returns:
- false if no nominal attributes are present
clusterInstance
public int clusterInstance(Instance instance)
throws java.lang.Exception
- Assigns a given instance to a cluster.
- Specified by:
clusterInstance
in class Clusterer
- Parameters:
instance
- the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
- Throws:
java.lang.Exception
- if the instance could not be assigned to a cluster
successfully
numberOfClusters
public int numberOfClusters()
- Returns the number of clusters.
- Specified by:
numberOfClusters
in class Clusterer
- Returns:
- the number of clusters generated for a training dataset.
listOptions
public java.util.Enumeration listOptions()
- Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all the available options
minNumClustersTipText
public java.lang.String minNumClustersTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property
maxNumClustersTipText
public java.lang.String maxNumClustersTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property
setMaxIterations
public void setMaxIterations(int i)
throws java.lang.Exception
- Sets the maximum number of iterations to perform.
- Parameters:
i
- the number of iterations
- Throws:
java.lang.Exception
- if i is less than 1
getMaxIterations
public int getMaxIterations()
- Gets the maximum number of iterations.
- Returns:
- the number of iterations
setMaxKMeans
public void setMaxKMeans(int i)
- Sets the maximum number of iterations to perform in KMeans.
- Parameters:
i
- the number of iterations
getMaxKMeans
public int getMaxKMeans()
- Gets the maximum number of iterations in KMeans.
- Returns:
- the number of iterations
setMaxKMeansForChildren
public void setMaxKMeansForChildren(int i)
throws java.lang.Exception
- Sets the maximum number of KMeans iterations performed
on the child centers.
- Parameters:
i
- the number of iterations
- Throws:
java.lang.Exception
getMaxKMeansForChildren
public int getMaxKMeansForChildren()
- Gets the maximum number of KMeans iterations performed on the child centers.
- Returns:
- the number of iterations
setCutOffFactor
public void setCutOffFactor(double i)
throws java.lang.Exception
- Sets a new cutoff factor.
- Parameters:
i
- the new cutoff factor
- Throws:
java.lang.Exception
getCutOffFactor
public double getCutOffFactor()
- Gets the cutoff factor.
- Returns:
- the cutoff factor
setMinNumClusters
public void setMinNumClusters(int n)
- Sets the minimum number of clusters to generate.
- Parameters:
n
- the minimum number of clusters to generate
setMaxNumClusters
public void setMaxNumClusters(int n)
- Sets the maximum number of clusters to generate.
- Parameters:
n
- the maximum number of clusters to generate
binValueTipText
public java.lang.String binValueTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
getBinValue
public double getBinValue()
- Gets value that represents true in a new numeric attribute.
(False is always represented by 0.0.)
- Returns:
- the value that represents true in a new numeric attribute
setBinValue
public void setBinValue(double value)
- Sets the distance value between true and false of binary attributes
and between "same" and "different" of nominal attributes.
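One way to read the bin value, sketched under assumptions: a binary attribute is mapped to 0.0 for false and to the bin value for true, and two nominal values contribute 0.0 when they are the same and the bin value when they differ. The class and method names below are hypothetical and are not part of the XMeans API.

```java
/** Hypothetical illustration of a "bin value" used as a distance for binary and nominal values. */
class BinValueSketch {
    /** Maps a binary attribute to a number: false is 0.0, true is binValue. */
    static double binaryToNumeric(boolean flag, double binValue) {
        return flag ? binValue : 0.0;
    }

    /** Difference between two nominal values: 0.0 if the same, binValue if different. */
    static double nominalDifference(String a, String b, double binValue) {
        return a.equals(b) ? 0.0 : binValue;
    }

    public static void main(String[] args) {
        double binValue = 1.0; // illustrative value only
        // Distance between true and false equals the bin value.
        double d = Math.abs(binaryToNumeric(true, binValue) - binaryToNumeric(false, binValue));
        System.out.println(d);                                       // prints 1.0
        System.out.println(nominalDifference("red", "red", binValue));  // prints 0.0
        System.out.println(nominalDifference("red", "blue", binValue)); // prints 1.0
    }
}
```

A larger bin value makes binary and nominal attributes weigh more heavily against numeric attributes in a distance computation.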
setDistanceF
public void setDistanceF(DistanceFunction distanceF)
- Sets the distance function.
- Parameters:
distanceF
- the distance function with all options set
getDistanceF
public DistanceFunction getDistanceF()
- Gets the distance function.
- Returns:
- the distance function
getDistanceFSpec
protected java.lang.String getDistanceFSpec()
- Gets the distance function specification string, which contains the
class name of the distance function class and any options to it
- Returns:
- the distance function specification string
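A specification string of this kind, a class name followed by its options, could be assembled roughly as follows. This is a sketch only: `SpecStringSketch`, the class name `example.SomeDistance` and the option `-X 3` are made up for illustration, and the sketch does not quote options that contain spaces.

```java
/** Hypothetical helper showing the "class name plus options" shape of a specification string. */
class SpecStringSketch {
    /** Joins the class name and its options with single spaces. */
    static String buildSpec(String className, String[] options) {
        StringBuilder sb = new StringBuilder(className);
        for (String opt : options) {
            sb.append(' ').append(opt);
        }
        return sb.toString();
    }

    /** Recovers the class name, i.e. everything before the first space. */
    static String classNameOf(String spec) {
        int space = spec.indexOf(' ');
        return space < 0 ? spec : spec.substring(0, space);
    }

    public static void main(String[] args) {
        // "example.SomeDistance" and "-X 3" are placeholders, not real classes or flags.
        String spec = buildSpec("example.SomeDistance", new String[] {"-X", "3"});
        System.out.println(spec);              // prints example.SomeDistance -X 3
        System.out.println(classNameOf(spec)); // prints example.SomeDistance
    }
}
```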
setDebugVektorsFile
public void setDebugVektorsFile(java.lang.String fileName)
- Sets the name of the file in which the random vectors are stored.
Used only for debugging.
- Parameters:
fileName
- name of the file to read the random vectors from
initDebugVektorsInput
public void initDebugVektorsInput()
throws java.lang.Exception
- Initializes the debug vector input.
- Throws:
java.lang.Exception
getNextDebugVektorsInstance
public Instance getNextDebugVektorsInstance(Instances model)
throws java.lang.Exception
- Reads an instance from the debug vectors file.
- Parameters:
model
- the data model for the instance
- Throws:
java.lang.Exception
setInputCenterFile
public void setInputCenterFile(java.lang.String fileName)
- Sets the name of the file to read the list of centers from.
- Parameters:
fileName
- file name of file to read centers from
setOutputCenterFile
public void setOutputCenterFile(java.lang.String fileName)
- Sets the name of the file to write the list of centers to.
- Parameters:
fileName
- file to write centers to
getInputCenterFile
public java.lang.String getInputCenterFile()
- Gets the name of the file to read the list of centers from.
- Returns:
- filename of the file to read the centers from
getOutputCenterFile
public java.lang.String getOutputCenterFile()
- Gets the name of the file to write the list of centers to.
- Returns:
- filename of the file to write centers to
setKDTree
public void setKDTree(KDTree k)
- Sets the KDTree class.
- Parameters:
k
- a KDTree object with all options set
getKDTree
public KDTree getKDTree()
- Gets the KDTree class.
- Returns:
- the KDTree object
getKDTreeSpec
protected java.lang.String getKDTreeSpec()
- Gets the KDTree specification string, which contains the class name of
the KDTree class and any options to the KDTree
- Returns:
- the KDTree string.
setDebugLevel
public void setDebugLevel(int d)
- Sets the debug level.
A debug level of 0 means no output.
- Parameters:
d
- debuglevel
getDebugLevel
public int getDebugLevel()
- Gets the debug level.
- Returns:
- debug level
getMinNumClusters
public int getMinNumClusters()
- Gets the minimum number of clusters to generate.
- Returns:
- the minimum number of clusters to generate
getMaxNumClusters
public int getMaxNumClusters()
- Gets the maximum number of clusters to generate.
- Returns:
- the maximum number of clusters to generate
seedTipText
public java.lang.String seedTipText()
- Returns the tip text for this property.
- Returns:
- tip text for this property
setSeed
public void setSeed(int s)
- Sets the random number seed.
- Parameters:
s
- the seed
getSeed
public int getSeed()
- Gets the random number seed.
- Returns:
- the seed
setOptions
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
- Parses a given list of options.
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
getOptions
public java.lang.String[] getOptions()
- Gets the current settings of XMeans.
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
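The contract between `setOptions` and `getOptions`, where the array returned by one can be passed to the other, can be sketched with a small stand-in class. The flags `-min`, `-max` and `-seed` and the default values are hypothetical; the flags that XMeans actually accepts are the ones reported by `listOptions()`.

```java
/** Hypothetical option holder illustrating a setOptions/getOptions round trip. */
class OptionsSketch {
    // Illustrative defaults, not the defaults of XMeans.
    private int minNumClusters = 2;
    private int maxNumClusters = 4;
    private int seed = 10;

    /** Parses flag/value pairs; throws if a flag is unknown or a value is missing. */
    void setOptions(String[] options) throws Exception {
        for (int i = 0; i < options.length; i += 2) {
            String flag = options[i];
            if (!flag.equals("-min") && !flag.equals("-max") && !flag.equals("-seed")) {
                throw new Exception("Unsupported option: " + flag);
            }
            if (i + 1 >= options.length) {
                throw new Exception("Missing value for option: " + flag);
            }
            int value = Integer.parseInt(options[i + 1]);
            if (flag.equals("-min")) {
                minNumClusters = value;
            } else if (flag.equals("-max")) {
                maxNumClusters = value;
            } else {
                seed = value;
            }
        }
    }

    /** Returns the current settings in a form that setOptions accepts. */
    String[] getOptions() {
        return new String[] {
            "-min", String.valueOf(minNumClusters),
            "-max", String.valueOf(maxNumClusters),
            "-seed", String.valueOf(seed)
        };
    }

    public static void main(String[] args) throws Exception {
        OptionsSketch a = new OptionsSketch();
        a.setOptions(new String[] {"-min", "3", "-max", "9"});
        OptionsSketch b = new OptionsSketch();
        b.setOptions(a.getOptions()); // copy a's settings into b
        System.out.println(String.join(" ", b.getOptions())); // prints -min 3 -max 9 -seed 10
    }
}
```

Keeping the two methods symmetric makes it possible to copy a configuration from one object to another or to store it as a command-line string.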
toString
public java.lang.String toString()
- Returns a string describing this clusterer.
- Returns:
- a description of the clusterer as a string
main
public static void main(java.lang.String[] argv)
- Main method for testing this class.
- Parameters:
argv
- should contain options