Specifies the name of the distance metric class that should be used
.... etc.
- See Also:
Clusterer, OptionHandler, Serialized Form
Method Summary |
protected int |
activePhaseOne(int numQueries)
Phase 1 code for active learning |
protected void |
activePhaseTwoRandom(int numQueries)
Phase 2 code for active learning, random |
protected void |
activePhaseTwoRoundRobin(int numQueries)
Phase 2 code for active learning, with round robin |
protected void |
addMLAndCLTransitiveClosure(int[] indices)
adding other inferred ML and CL links to m_ConstraintsHash, from
m_NeighborSets |
protected int |
askOracle(int X,
int Y)
|
int |
assignInstanceToCluster(Instance instance)
Classifies the instance using the current clustering, without considering constraints |
int |
assignInstanceToClusterWithConstraints(int instIdx)
Classifies the instance using the current clustering considering
constraints, updates cluster assignments |
int[] |
bestInstancesForActiveLearning(int numActive)
Dummy: not implemented for PCKMeans |
InstancePair[] |
bestPairsForActiveLearning(int numActive)
Returns the best numActive instance pairs for active learning |
void |
buildClusterer(java.util.ArrayList labeledPair,
Instances unlabeledData,
Instances labeledTrain,
int startingIndexOfTest)
Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds |
void |
buildClusterer(Instances data)
Generates a clusterer. |
void |
buildClusterer(Instances labeledData,
Instances unlabeledData,
int classIndex,
int numClusters)
Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds -- NOT USED FOR PCKMeans!!! |
void |
buildClusterer(Instances labeledData,
Instances unlabeledData,
int classIndex,
int numClusters,
int startingIndexOfTest)
Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds |
void |
buildClusterer(Instances data,
int num_clusters)
Cluster given instances to form the specified number of clusters. |
protected void |
calculateObjectiveFunction()
calculates objective function |
int |
clusterInstance(Instance instance)
Checks if instance has to be normalized and classifies the
instance using the current clustering |
protected void |
createCentroids()
Creates the global cluster centroid |
protected void |
DFS_VISIT(int u,
int[] vertexColor)
Recursive subroutine for DFS |
protected void |
DFS()
Main Depth First Search routine |
protected void |
findBestAssignments()
E-step of the KMeans clustering algorithm -- find best cluster assignments |
boolean |
getActive()
get the active level of clusterer |
SelectedTag |
getAlgorithm()
Get the KMeans algorithm type. |
boolean |
getAllExplore()
Return m_AllExplore |
double |
getCannotLinkWeight()
Return the cannot link constraint weight |
java.util.ArrayList |
getClusters()
Computes the clusters from the cluster assignments, for external access |
double |
getDefaultPerturb()
Get default perturbation value |
java.util.HashSet[] |
getIndexClusters()
Computes the clusters from the cluster assignments, for external access |
SelectedTag |
getInstanceOrdering()
Get the instance ordering |
Instances |
getInstances()
Return training instances |
Metric |
getMetric()
Get the distance metric |
boolean |
getMovePointsTillAssignmentStabilizes()
Return m_MovePointsTillAssignmentStabilizes |
double |
getMustLinkWeight()
Return the must link constraint weight |
int |
getNumClusters()
Return the number of clusters |
double |
getObjFunConvergenceDifference()
Get the minimum value of the objective function difference required for convergence |
java.lang.String[] |
getOptions()
Gets the current option settings for the OptionHandler. |
boolean |
getPhaseTwoRandom()
Return m_PhaseTwoRandom |
int |
getRandomSeed()
Return the random number seed |
boolean |
getSeedable()
Is seeding performed? |
Clusterer |
getThisClusterer()
We always want to implement SemiSupClusterer from a class extending Clusterer. |
static java.lang.Double |
getTimeStamp()
Gets a Double representing the current date and time. |
boolean |
getVerbose()
get the verbosity level of the clusterer |
java.util.Enumeration |
listOptions()
Returns an enumeration of all the available options. |
protected int |
lookupInstanceCluster(Instance instance)
lookup the instance in the checksum hash |
static void |
main(java.lang.String[] args)
Main method for testing this class. |
protected double[] |
meanOrMode(Instances insts)
Fast version of meanOrMode - streamlined from Instances.meanOrMode for efficiency
Does not check for missing attributes, assumes numeric attributes, assumes Sparse instances |
protected void |
nonActivePairwiseInit()
Initialization routine for non-active algorithm |
void |
normalize(Instance inst)
Normalizes Instance or SparseInstance |
protected void |
normalizeByWeight(Instance inst)
This function divides every attribute value in an instance by
the instance weight -- useful to find the mean of a cluster in
Euclidean space |
void |
normalizeInstance(Instance inst)
Normalizes the values of a normal Instance in L2 norm |
void |
normalizeSparseInstance(Instance inst)
Normalizes the values of a SparseInstance in L2 norm |
int |
numberOfClusters()
A duplicate function to conform to Clusterer abstract class. |
double |
objectiveFunction()
returns objective function |
void |
printClusters()
Prints clusters |
void |
printIndexClusters()
Outputs the current clustering |
void |
resetClusterer()
Reset all values that have been learned |
protected void |
runKMeans()
Actual KMeans function |
boolean |
seedable()
We can have clusterers that don't utilize seeding |
void |
seedClusterer(java.util.HashMap seedHash)
Read the seeds from a hashtable, where every key is an instance and every value is:
the cluster assignment of that instance
seedVector vector containing seeds |
void |
setActive(boolean active)
set the active level of the clusterer |
void |
setAlgorithm(SelectedTag algo)
Set the KMeans algorithm. |
void |
setAllExplore(boolean b)
Set m_AllExplore |
void |
setCannotLinkWeight(double w)
Set the cannot link constraint weight |
void |
setDefaultPerturb(double p)
Set default perturbation value |
void |
setInstanceOrdering(SelectedTag order)
Set the instance ordering |
void |
setInstances(Instances instances)
Sets training instances |
void |
setMetric(Metric m)
Set the distance metric |
void |
setMovePointsTillAssignmentStabilizes(boolean b)
Set m_MovePointsTillAssignmentStabilizes |
void |
setMustLinkWeight(double w)
Set the must link constraint weight |
void |
setNumClusters(int n)
Set the number of clusters to generate |
void |
setObjFunConvergenceDifference(double objFunConvergenceDifference)
Set the minimum value of the objective function difference required for convergence |
void |
setOptions(java.lang.String[] options)
Parses a given list of options. |
void |
setPhaseTwoRandom(boolean w)
Set m_PhaseTwoRandom |
void |
setRandomSeed(int s)
Set the random number seed |
void |
setSeedable(boolean seedable)
Turn seeding on and off |
void |
setSeedHash(java.util.HashMap seedhash)
Set the m_SeedHash |
void |
setVerbose(boolean verbose)
set the verbosity level of the clusterer |
protected Instance |
sumInstances(Instance inst1,
Instance inst2)
Finds sum of 2 instances (handles sparse and non-sparse) |
protected static void |
testCase()
|
java.lang.String |
toString()
return a string describing this clusterer |
void |
trainClusterer(Instances instances)
Train the clusterer using specified parameters |
protected void |
updateClusterAssignments()
Updates the clusterAssignments for all points after clustering. |
protected void |
updateClusterCentroids()
M-step of the KMeans clustering algorithm -- updates cluster centroids |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
m_Clusters
protected java.util.ArrayList m_Clusters
- holds the instances in the clusters
m_IndexClusters
protected java.util.HashSet[] m_IndexClusters
- holds the instance indices in the clusters
m_ConstraintsHash
protected java.util.HashMap m_ConstraintsHash
- holds the ([instance pair] -> [type of constraint])
mapping. Note that the instance pairs stored in the hash always
have constraint type InstancePair.DONT_CARE_LINK, the actual
link type is stored in the hashed value
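A minimal sketch of such a pair-to-constraint mapping, in plain Java. It is not the PCKMeans source: `ConstraintMapSketch`, `Pair`, and the `MUST_LINK` / `CANNOT_LINK` values are hypothetical stand-ins for `InstancePair` and its link constants. The key ignores both the link type and the order of the two indices, so a lookup finds the stored type whichever way the pair is written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ConstraintMapSketch {
    static final int MUST_LINK = 1;     // hypothetical constant
    static final int CANNOT_LINK = -1;  // hypothetical constant

    /** Unordered pair of instance indices; carries no link type itself. */
    static final class Pair {
        final int first;
        final int second;

        Pair(int a, int b) {
            first = Math.min(a, b);
            second = Math.max(a, b);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Pair
                && ((Pair) o).first == first
                && ((Pair) o).second == second;
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    private final Map<Pair, Integer> constraints = new HashMap<>();

    /** Stores the link type as the hashed value for the pair (a, b). */
    void add(int a, int b, int type) {
        constraints.put(new Pair(a, b), type);
    }

    /** Returns the stored link type, or 0 when the pair has no constraint. */
    int typeOf(int a, int b) {
        return constraints.getOrDefault(new Pair(a, b), 0);
    }

    public static void main(String[] args) {
        ConstraintMapSketch map = new ConstraintMapSketch();
        map.add(3, 1, MUST_LINK);
        System.out.println(map.typeOf(1, 3));
    }
}
```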
m_instanceConstraintHash
protected java.util.HashMap m_instanceConstraintHash
- holds the ([instance i] -> [Arraylist of constraints involving i])
mapping. Note that the instance pairs stored in the Arraylist
have the actual link type
m_AdjacencyList
protected java.util.HashSet[] m_AdjacencyList
- adjacency list for neighborhoods
m_SeedHash
protected java.util.HashSet m_SeedHash
- holds the points involved in the constraints
m_CannotLinkWeight
protected double m_CannotLinkWeight
- weight to be given to each cannot-link constraint
m_MustLinkWeight
protected double m_MustLinkWeight
- weight to be given to each must-link constraint
m_MaxConstraintsAllowed
protected static final int m_MaxConstraintsAllowed
- the maximum number of cannot-link constraints allowed
- See Also:
- Constant Field Values
m_verbose
protected boolean m_verbose
- verbose?
m_metric
protected Metric m_metric
- distance Metric
m_metricBuilt
protected boolean m_metricBuilt
- has the metric been constructed? A fix for multiple buildClusterer calls
m_isSparseInstance
protected boolean m_isSparseInstance
- indicates whether instances are sparse
m_objFunDecreasing
protected boolean m_objFunDecreasing
- Is the objective function increasing or decreasing? Depends on type
of metric used: for similarity-based metric - increasing,
for distance-based - decreasing
m_Seedable
protected boolean m_Seedable
- Seedable or not (true by default)
m_PhaseTwoRandom
protected boolean m_PhaseTwoRandom
- Round robin or Random in active Phase Two
m_AllExplore
protected boolean m_AllExplore
- Two-phase active learning or All Explore
m_Iterations
protected int m_Iterations
- keep track of the number of iterations completed before convergence
ALGORITHM_SIMPLE
public static final int ALGORITHM_SIMPLE
- Define possible algorithms
- See Also:
- Constant Field Values
ALGORITHM_SPHERICAL
public static final int ALGORITHM_SPHERICAL
- See Also:
- Constant Field Values
TAGS_ALGORITHM
public static final Tag[] TAGS_ALGORITHM
m_Algorithm
protected int m_Algorithm
- algorithm, by default spherical
m_ObjFunConvergenceDifference
protected double m_ObjFunConvergenceDifference
- min difference of objective function values for convergence
m_Objective
protected double m_Objective
- value of objective function
m_TotalTrainWithLabels
protected Instances m_TotalTrainWithLabels
- training instances with labels
m_Instances
protected Instances m_Instances
- training instances
m_checksumHash
protected java.util.HashMap m_checksumHash
- A hash where the instance checksums are hashed
m_checksumCoeffs
protected double[] m_checksumCoeffs
m_StartingIndexOfTest
protected int m_StartingIndexOfTest
- test data -- required to make sure that test points are not
selected during active learning
m_NumActive
protected int m_NumActive
- number of pairs to seed with
m_Active
protected boolean m_Active
- active mode?
m_NumClusters
protected int m_NumClusters
- number of clusters to generate, default is -1 to get it from labeled data
m_NumCurrentClusters
protected int m_NumCurrentClusters
- Number of clusters in the process
m_FastMode
protected boolean m_FastMode
- m_FastMode = true => fast computation of meanOrMode in centroid calculation, useful for high-D data sets
m_FastMode = false => usual computation of meanOrMode in centroid calculation
m_ClusterCentroids
protected Instances m_ClusterCentroids
- holds the cluster centroids
m_GlobalCentroid
protected Instance m_GlobalCentroid
- holds the global centroid
m_DefaultPerturb
protected double m_DefaultPerturb
- holds the default perturbation value for randomPerturbInit
m_MergeThreshold
protected double m_MergeThreshold
- holds the default merge threshold for matchMergeStep
m_ClusterAssignments
protected int[] m_ClusterAssignments
- temporary variable holding cluster assignments while iterating
m_SumOfClusterInstances
protected Instance[] m_SumOfClusterInstances
- temporary variable holding cluster sums while iterating
m_RandomSeed
protected int m_RandomSeed
- holds the random Seed used to seed the random number generator
m_RandomNumberGenerator
protected java.util.Random m_RandomNumberGenerator
- holds the random number generator used in various parts of the code
ORDERING_DEFAULT
public static final int ORDERING_DEFAULT
- Define possible orderings
- See Also:
- Constant Field Values
ORDERING_RANDOM
public static final int ORDERING_RANDOM
- See Also:
- Constant Field Values
ORDERING_SORTED
public static final int ORDERING_SORTED
- See Also:
- Constant Field Values
TAGS_ORDERING
public static final Tag[] TAGS_ORDERING
m_InstanceOrdering
protected int m_InstanceOrdering
m_MovePointsTillAssignmentStabilizes
protected boolean m_MovePointsTillAssignmentStabilizes
- Move points in assignment step till stabilization?
m_NeighborSets
protected java.util.HashSet[] m_NeighborSets
- neighbor list for active learning: points in each cluster neighborhood
PCKMeans
public PCKMeans()
PCKMeans
public PCKMeans(Metric metric)
objectiveFunction
public double objectiveFunction()
- returns objective function
- Specified by:
objectiveFunction
in interface SemiSupClusterer
getThisClusterer
public Clusterer getThisClusterer()
- We always want to implement SemiSupClusterer from a class extending Clusterer.
We want to be able to return the underlying parent class.
- Specified by:
getThisClusterer
in interface SemiSupClusterer
- Returns:
- parent Clusterer class
buildClusterer
public void buildClusterer(Instances labeledData,
Instances unlabeledData,
int classIndex,
int numClusters,
int startingIndexOfTest)
throws java.lang.Exception
- Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds
- Specified by:
buildClusterer
in interface SemiSupClusterer
- Parameters:
labeledData
- labeled instances to be used as seeds
unlabeledData
- unlabeled instances
classIndex
- attribute index in labeledData which holds class info
numClusters
- number of clusters
startingIndexOfTest
- from where test data starts in unlabeledData, useful if clustering is transductive
- Throws:
java.lang.Exception
- if something goes wrong.
buildClusterer
public void buildClusterer(Instances data,
int num_clusters)
throws java.lang.Exception
- Cluster given instances to form the specified number of clusters.
- Parameters:
data
- instances to be clustered
num_clusters
- number of clusters to create
- Throws:
java.lang.Exception
- if something goes wrong.
buildClusterer
public void buildClusterer(java.util.ArrayList labeledPair,
Instances unlabeledData,
Instances labeledTrain,
int startingIndexOfTest)
throws java.lang.Exception
- Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds
- Parameters:
unlabeledData
- unlabeled training (+ test for transductive) instances
labeledTrain
- labeled training instances
startingIndexOfTest
- starting index of test set in unlabeled data
- Throws:
java.lang.Exception
- if something goes wrong.
buildClusterer
public void buildClusterer(Instances labeledData,
Instances unlabeledData,
int classIndex,
int numClusters)
throws java.lang.Exception
- Clusters unlabeledData and labeledData (with labels removed),
using labeledData as seeds -- NOT USED FOR PCKMeans!!!
- Parameters:
labeledData
- labeled instances to be used as seeds
unlabeledData
- unlabeled instances
classIndex
- attribute index in labeledData which holds class info
numClusters
- number of clusters
- Throws:
java.lang.Exception
- if something goes wrong.
resetClusterer
public void resetClusterer()
throws java.lang.Exception
- Reset all values that have been learned
- Specified by:
resetClusterer
in interface SemiSupClusterer
- Throws:
java.lang.Exception
setDefaultPerturb
public void setDefaultPerturb(double p)
- Set default perturbation value
- Parameters:
p
- perturbation fraction
getDefaultPerturb
public double getDefaultPerturb()
- Get default perturbation value
- Returns:
- perturbation fraction
setSeedable
public void setSeedable(boolean seedable)
- Turn seeding on and off
- Parameters:
seedable
- should seeding be done?
getSeedable
public boolean getSeedable()
- Is seeding performed?
- Returns:
- is seeding being done?
seedable
public boolean seedable()
- We can have clusterers that don't utilize seeding
activePhaseOne
protected int activePhaseOne(int numQueries)
throws java.lang.Exception
- Phase 1 code for active learning
- Throws:
java.lang.Exception
activePhaseTwoRoundRobin
protected void activePhaseTwoRoundRobin(int numQueries)
throws java.lang.Exception
- Phase 2 code for active learning, with round robin
- Throws:
java.lang.Exception
activePhaseTwoRandom
protected void activePhaseTwoRandom(int numQueries)
throws java.lang.Exception
- Phase 2 code for active learning, random
- Throws:
java.lang.Exception
createCentroids
protected void createCentroids()
throws java.lang.Exception
- Creates the global cluster centroid
- Throws:
java.lang.Exception
addMLAndCLTransitiveClosure
protected void addMLAndCLTransitiveClosure(int[] indices)
throws java.lang.Exception
- adding other inferred ML and CL links to m_ConstraintsHash, from
m_NeighborSets
- Throws:
java.lang.Exception
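One way to picture the inference step, as a hedged sketch rather than the actual implementation: points in the same neighborhood are must-linked to each other, and points in different neighborhoods are cannot-linked. The sketch assumes every pair of distinct neighborhoods is mutually cannot-linked, which this page does not state. `ClosureSketch`, `closure`, and the `1` / `-1` type codes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ClosureSketch {
    /**
     * Returns {i, j, type} triples, where type 1 means must-link and
     * type -1 means cannot-link (hypothetical codes).
     */
    static List<int[]> closure(int[][] neighborhoods) {
        List<int[]> links = new ArrayList<>();
        for (int a = 0; a < neighborhoods.length; a++) {
            int[] na = neighborhoods[a];
            // Must-link every pair inside the same neighborhood.
            for (int i = 0; i < na.length; i++) {
                for (int j = i + 1; j < na.length; j++) {
                    links.add(new int[] {na[i], na[j], 1});
                }
            }
            // Cannot-link every pair across two different neighborhoods
            // (assumption: all neighborhoods are mutually cannot-linked).
            for (int b = a + 1; b < neighborhoods.length; b++) {
                for (int x : na) {
                    for (int y : neighborhoods[b]) {
                        links.add(new int[] {x, y, -1});
                    }
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        for (int[] link : closure(new int[][] {{0, 1}, {2}})) {
            System.out.println(link[0] + " " + link[1] + " " + link[2]);
        }
    }
}
```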
DFS
protected void DFS()
throws java.lang.Exception
- Main Depth First Search routine
- Throws:
java.lang.Exception
DFS_VISIT
protected void DFS_VISIT(int u,
int[] vertexColor)
throws java.lang.Exception
- Recursive subroutine for DFS
- Throws:
java.lang.Exception
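The DFS / DFS_VISIT pair follows the usual depth-first search shape: a main routine that starts a visit from every unvisited vertex, and a recursive subroutine that explores one vertex's neighbors. The sketch below shows that shape on a plain adjacency list, assuming the search is used to group linked points into neighborhoods (connected components). That purpose is an assumption, as are the names `NeighborhoodDfsSketch`, `components`, and `visit`; a component-id array stands in for `vertexColor`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NeighborhoodDfsSketch {
    /** Main routine: returns a component id for every vertex. */
    static int[] components(List<List<Integer>> adjacency) {
        int[] component = new int[adjacency.size()];
        Arrays.fill(component, -1); // -1 marks an unvisited vertex
        int next = 0;
        for (int u = 0; u < adjacency.size(); u++) {
            if (component[u] == -1) {
                visit(u, next++, adjacency, component);
            }
        }
        return component;
    }

    /** Recursive subroutine: labels u, then every unvisited neighbor of u. */
    private static void visit(int u, int id,
                              List<List<Integer>> adjacency, int[] component) {
        component[u] = id;
        for (int v : adjacency.get(u)) {
            if (component[v] == -1) {
                visit(v, id, adjacency, component);
            }
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            adj.add(new ArrayList<>());
        }
        adj.get(0).add(1);
        adj.get(1).add(0);
        System.out.println(Arrays.toString(components(adj)));
    }
}
```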
nonActivePairwiseInit
protected void nonActivePairwiseInit()
throws java.lang.Exception
- Initialization routine for non-active algorithm
- Throws:
java.lang.Exception
askOracle
protected int askOracle(int X,
int Y)
normalizeByWeight
protected void normalizeByWeight(Instance inst)
- This function divides every attribute value in an instance by
the instance weight -- useful to find the mean of a cluster in
Euclidean space
- Parameters:
inst
- Instance passed in for normalization (destructive update)
sumInstances
protected Instance sumInstances(Instance inst1,
Instance inst2)
throws java.lang.Exception
- Finds sum of 2 instances (handles sparse and non-sparse)
- Throws:
java.lang.Exception
updateClusterAssignments
protected void updateClusterAssignments()
throws java.lang.Exception
- Updates the clusterAssignments for all points after clustering.
Map assignments from [0,numInstances-1] to [0,numClusters-1],
e.g. from [0 2 2 0 6 6 2] -> [0 1 1 0 2 2 1]
**** NOTE: THIS FUNCTION IS NO LONGER USED!!! ****
- Throws:
java.lang.Exception
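A remapping of this kind can be sketched in plain Java as follows. This is an illustration only: `AssignmentRelabeler` and `relabel` are hypothetical names, and the sketch assumes new labels are handed out in order of first appearance.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class AssignmentRelabeler {
    /** Maps arbitrary cluster ids to 0..k-1 in order of first appearance. */
    static int[] relabel(int[] assignments) {
        Map<Integer, Integer> dense = new HashMap<>();
        int[] out = new int[assignments.length];
        for (int i = 0; i < assignments.length; i++) {
            Integer id = dense.get(assignments[i]);
            if (id == null) {
                id = dense.size(); // next unused compact label
                dense.put(assignments[i], id);
            }
            out[i] = id;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(relabel(new int[] {3, 3, 7, 3, 9})));
    }
}
```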
printIndexClusters
public void printIndexClusters()
throws java.lang.Exception
- Outputs the current clustering
- Throws:
java.lang.Exception
- if something goes wrong
findBestAssignments
protected void findBestAssignments()
throws java.lang.Exception
- E-step of the KMeans clustering algorithm -- find best cluster assignments
- Throws:
java.lang.Exception
assignInstanceToClusterWithConstraints
public int assignInstanceToClusterWithConstraints(int instIdx)
throws java.lang.Exception
- Classifies the instance using the current clustering considering
constraints, updates cluster assignments
- Returns:
- 1 if the point is moved, 0 otherwise
- Throws:
java.lang.Exception
- if instance could not be classified
successfully
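A hedged sketch of a constraint-aware assignment. It assumes the cost of placing a point in a cluster is its squared Euclidean distance to the centroid, plus the must-link weight for each must-linked partner already in a different cluster, plus the cannot-link weight for each cannot-linked partner already in the same cluster. That penalty form is an assumption; this page only says that constraints are considered. `ConstrainedAssignSketch` and `bestCluster` are hypothetical names, and unlike the method above the sketch returns the chosen cluster index rather than a moved/not-moved flag.

```java
final class ConstrainedAssignSketch {
    /**
     * Returns the cluster with the lowest penalized cost for point idx.
     * mustLink[idx] and cannotLink[idx] list partner indices;
     * assignment[j] == -1 means point j is not assigned yet.
     */
    static int bestCluster(int idx, double[][] points, double[][] centroids,
                           int[] assignment, int[][] mustLink, int[][] cannotLink,
                           double mlWeight, double clWeight) {
        int best = -1;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int h = 0; h < centroids.length; h++) {
            double cost = 0.0;
            // Squared Euclidean distance to centroid h.
            for (int d = 0; d < centroids[h].length; d++) {
                double diff = points[idx][d] - centroids[h][d];
                cost += diff * diff;
            }
            // Penalty for must-linked partners sitting in another cluster.
            for (int j : mustLink[idx]) {
                if (assignment[j] != -1 && assignment[j] != h) {
                    cost += mlWeight;
                }
            }
            // Penalty for cannot-linked partners sitting in this cluster.
            for (int j : cannotLink[idx]) {
                if (assignment[j] == h) {
                    cost += clWeight;
                }
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = h;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {10, 0}, {4, 0}};
        double[][] centroids = {{0, 0}, {10, 0}};
        int[] assignment = {0, 1, -1};
        int[][] none = {{}, {}, {}};
        int[][] cannot = {{}, {}, {0}};
        System.out.println(bestCluster(2, points, centroids, assignment,
                                       none, cannot, 1.0, 100.0));
    }
}
```

With a large cannot-link weight, the point is pushed away from the nearer centroid when a cannot-linked partner is already assigned there; with a weight of zero, the sketch reduces to a plain nearest-centroid assignment.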
updateClusterCentroids
protected void updateClusterCentroids()
throws java.lang.Exception
- M-step of the KMeans clustering algorithm -- updates cluster centroids
- Throws:
java.lang.Exception
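In a standard KMeans M-step, each centroid becomes the mean of the points currently assigned to it. The sketch below shows that update on plain arrays, accumulating per-cluster sums and then dividing by the cluster size. `CentroidUpdateSketch` and `update` are hypothetical names, and leaving an empty cluster at the origin is a choice made for this sketch, not documented behavior.

```java
import java.util.Arrays;

final class CentroidUpdateSketch {
    /** Returns one centroid per cluster: the mean of its assigned points. */
    static double[][] update(double[][] points, int[] assignment, int numClusters) {
        int dim = points.length == 0 ? 0 : points[0].length;
        double[][] sums = new double[numClusters][dim];
        int[] counts = new int[numClusters];
        for (int i = 0; i < points.length; i++) {
            int c = assignment[i];
            counts[c]++;
            for (int d = 0; d < dim; d++) {
                sums[c][d] += points[i][d];
            }
        }
        for (int c = 0; c < numClusters; c++) {
            if (counts[c] == 0) {
                continue; // empty cluster: stays at the origin in this sketch
            }
            for (int d = 0; d < dim; d++) {
                sums[c][d] /= counts[c];
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {2, 2}, {10, 10}};
        System.out.println(Arrays.deepToString(update(points, new int[] {0, 0, 1}, 2)));
    }
}
```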
calculateObjectiveFunction
protected void calculateObjectiveFunction()
throws java.lang.Exception
- calculates objective function
- Throws:
java.lang.Exception
buildClusterer
public void buildClusterer(Instances data)
throws java.lang.Exception
- Generates a clusterer. Instances in data have to be
either all sparse or all non-sparse
- Specified by:
buildClusterer
in interface SemiSupClusterer
- Specified by:
buildClusterer
in class Clusterer
- Parameters:
data
- set of instances serving as training data
- Throws:
java.lang.Exception
- if the clusterer has not been
generated successfully
runKMeans
protected void runKMeans()
throws java.lang.Exception
- Actual KMeans function
- Throws:
java.lang.Exception
bestInstancesForActiveLearning
public int[] bestInstancesForActiveLearning(int numActive)
throws java.lang.Exception
- Dummy: not implemented for PCKMeans
- Specified by:
bestInstancesForActiveLearning
in interface ActiveLearningClusterer
- Throws:
java.lang.Exception
bestPairsForActiveLearning
public InstancePair[] bestPairsForActiveLearning(int numActive)
throws java.lang.Exception
- Returns the best numActive instance pairs for active learning
- Specified by:
bestPairsForActiveLearning
in interface ActiveLearningClusterer
- Throws:
java.lang.Exception
clusterInstance
public int clusterInstance(Instance instance)
throws java.lang.Exception
- Checks if instance has to be normalized and classifies the
instance using the current clustering
- Specified by:
clusterInstance
in class Clusterer
- Parameters:
instance
- the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
if the class is enumerated, otherwise the predicted value
- Throws:
java.lang.Exception
- if instance could not be classified
successfully
lookupInstanceCluster
protected int lookupInstanceCluster(Instance instance)
- lookup the instance in the checksum hash
- Parameters:
instance
- instance to be looked up
- Returns:
- the index of the cluster to which the instance was assigned, -1 if the instance has not been clustered
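A rough sketch of a checksum-based lookup with the same return convention (-1 for an instance that was never clustered). The checksum form used here, a dot product of the attribute values with fixed random coefficients, is an assumption suggested by the `m_checksumHash` and `m_checksumCoeffs` fields and is not confirmed by this page. `ChecksumLookupSketch`, `record`, and `lookup` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

final class ChecksumLookupSketch {
    private final double[] coeffs;
    private final Map<Double, Integer> clusterByChecksum = new HashMap<>();

    ChecksumLookupSketch(int numAttributes, long seed) {
        Random rng = new Random(seed);
        coeffs = new double[numAttributes];
        for (int i = 0; i < numAttributes; i++) {
            coeffs[i] = rng.nextDouble();
        }
    }

    /** Assumed checksum: dot product of the values with the coefficients. */
    private double checksum(double[] values) {
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += coeffs[i] * values[i];
        }
        return sum;
    }

    /** Remembers which cluster an instance was assigned to. */
    void record(double[] values, int cluster) {
        clusterByChecksum.put(checksum(values), cluster);
    }

    /** Returns the recorded cluster, or -1 if the instance was never recorded. */
    int lookup(double[] values) {
        Integer cluster = clusterByChecksum.get(checksum(values));
        return cluster == null ? -1 : cluster;
    }

    public static void main(String[] args) {
        ChecksumLookupSketch table = new ChecksumLookupSketch(3, 42L);
        table.record(new double[] {1, 2, 3}, 2);
        System.out.println(table.lookup(new double[] {1, 2, 3}));
    }
}
```

A lookup like this only recognizes instances whose attribute values match a recorded instance exactly; two different instances can in principle collide on the same checksum.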
assignInstanceToCluster
public int assignInstanceToCluster(Instance instance)
throws java.lang.Exception
- Classifies the instance using the current clustering, without considering constraints
- Parameters:
instance
- the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
if the class is enumerated, otherwise the predicted value
- Throws:
java.lang.Exception
- if instance could not be classified
successfully
setCannotLinkWeight
public void setCannotLinkWeight(double w)
- Set the cannot link constraint weight
getCannotLinkWeight
public double getCannotLinkWeight()
- Return the cannot link constraint weight
setMustLinkWeight
public void setMustLinkWeight(double w)
- Set the must link constraint weight
getMustLinkWeight
public double getMustLinkWeight()
- Return the must link constraint weight
getPhaseTwoRandom
public boolean getPhaseTwoRandom()
- Return m_PhaseTwoRandom
setPhaseTwoRandom
public void setPhaseTwoRandom(boolean w)
- Set m_PhaseTwoRandom
getAllExplore
public boolean getAllExplore()
- Return m_AllExplore
setAllExplore
public void setAllExplore(boolean b)
- Set m_AllExplore
getNumClusters
public int getNumClusters()
- Return the number of clusters
- Specified by:
getNumClusters
in interface SemiSupClusterer
numberOfClusters
public int numberOfClusters()
- A duplicate function to conform to Clusterer abstract class.
- Specified by:
numberOfClusters
in class Clusterer
- Returns:
- the number of clusters generated for a training dataset.
setSeedHash
public void setSeedHash(java.util.HashMap seedhash)
- Set the m_SeedHash
setRandomSeed
public void setRandomSeed(int s)
- Set the random number seed
- Parameters:
s
- the seed
getRandomSeed
public int getRandomSeed()
- Return the random number seed
setMovePointsTillAssignmentStabilizes
public void setMovePointsTillAssignmentStabilizes(boolean b)
- Set m_MovePointsTillAssignmentStabilizes
- Parameters:
b
- truth value
getMovePointsTillAssignmentStabilizes
public boolean getMovePointsTillAssignmentStabilizes()
- Return m_MovePointsTillAssignmentStabilizes
setObjFunConvergenceDifference
public void setObjFunConvergenceDifference(double objFunConvergenceDifference)
- Set the minimum value of the objective function difference required for convergence
- Parameters:
objFunConvergenceDifference
- the minimum value of the objective function difference required for convergence
getObjFunConvergenceDifference
public double getObjFunConvergenceDifference()
- Get the minimum value of the objective function difference required for convergence
setInstances
public void setInstances(Instances instances)
- Sets training instances
getInstances
public Instances getInstances()
- Return training instances
- Specified by:
getInstances
in interface SemiSupClusterer
- Returns:
- Instances used for clustering, or null
setNumClusters
public void setNumClusters(int n)
- Set the number of clusters to generate
- Specified by:
setNumClusters
in interface SemiSupClusterer
- Parameters:
n
- the number of clusters to generate
setMetric
public void setMetric(Metric m)
- Set the distance metric
- Specified by:
setMetric
in interface SemiSupClusterer
getMetric
public Metric getMetric()
- Get the distance metric
setAlgorithm
public void setAlgorithm(SelectedTag algo)
- Set the KMeans algorithm. Values other than
ALGORITHM_SIMPLE or ALGORITHM_SPHERICAL will be ignored
- Parameters:
algo
- algorithm type
getAlgorithm
public SelectedTag getAlgorithm()
- Get the KMeans algorithm type. Will be one of
ALGORITHM_SIMPLE or ALGORITHM_SPHERICAL
setInstanceOrdering
public void setInstanceOrdering(SelectedTag order)
- Set the instance ordering
- Parameters:
order
- instance ordering
getInstanceOrdering
public SelectedTag getInstanceOrdering()
- Get the instance ordering
seedClusterer
public void seedClusterer(java.util.HashMap seedHash)
- Read the seeds from a hashtable, where every key is an instance and every value is:
the cluster assignment of that instance
seedVector vector containing seeds
- Specified by:
seedClusterer
in interface SemiSupClusterer
- Parameters:
seedHash
- HashMap of seeding parameters
printClusters
public void printClusters()
throws java.lang.Exception
- Prints clusters
- Throws:
java.lang.Exception
getClusters
public java.util.ArrayList getClusters()
throws java.lang.Exception
- Computes the clusters from the cluster assignments, for external access
- Specified by:
getClusters
in interface SemiSupClusterer
- Throws:
java.lang.Exception
- if clusters could not be computed successfully
getIndexClusters
public java.util.HashSet[] getIndexClusters()
throws java.lang.Exception
- Computes the clusters from the cluster assignments, for external access
- Throws:
java.lang.Exception
- if clusters could not be computed successfully
listOptions
public java.util.Enumeration listOptions()
- Description copied from interface:
OptionHandler
- Returns an enumeration of all the available options.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all available options.
getOptions
public java.lang.String[] getOptions()
- Description copied from interface:
OptionHandler
- Gets the current option settings for the OptionHandler.
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- the list of current option settings as an array of strings
setOptions
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
- Parses a given list of options.
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
toString
public java.lang.String toString()
- return a string describing this clusterer
- Returns:
- a description of the clusterer as a string
setActive
public void setActive(boolean active)
- set the active level of the clusterer
- Parameters:
active
-
getActive
public boolean getActive()
- get the active level of clusterer
- Returns:
- active
setVerbose
public void setVerbose(boolean verbose)
- set the verbosity level of the clusterer
- Specified by:
setVerbose
in interface SemiSupClusterer
- Parameters:
verbose
- messages on (true) or off (false)
getVerbose
public boolean getVerbose()
- get the verbosity level of the clusterer
- Returns:
- messages on (true) or off (false)
trainClusterer
public void trainClusterer(Instances instances)
throws java.lang.Exception
- Train the clusterer using specified parameters
- Specified by:
trainClusterer
in interface SemiSupClusterer
- Parameters:
instances
- Instances to be used for training
- Throws:
java.lang.Exception
normalize
public void normalize(Instance inst)
throws java.lang.Exception
- Normalizes Instance or SparseInstance
- Parameters:
inst
- Instance to be normalized
- Throws:
java.lang.Exception
normalizeInstance
public void normalizeInstance(Instance inst)
throws java.lang.Exception
- Normalizes the values of a normal Instance in L2 norm
- Parameters:
inst
- Instance to be normalized
- Throws:
java.lang.Exception
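L2 normalization divides every value by the Euclidean length of the vector, so the result has unit length. A minimal sketch on a plain `double[]` rather than a Weka Instance; `L2NormalizeSketch` is a hypothetical name, and leaving an all-zero vector unchanged is a choice made for this sketch.

```java
import java.util.Arrays;

final class L2NormalizeSketch {
    /** Scales values in place so that their Euclidean norm becomes 1. */
    static void normalize(double[] values) {
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += v * v;
        }
        double norm = Math.sqrt(sumSq);
        if (norm == 0.0) {
            return; // an all-zero vector has no direction; leave it unchanged
        }
        for (int i = 0; i < values.length; i++) {
            values[i] /= norm;
        }
    }

    public static void main(String[] args) {
        double[] v = {3.0, 4.0};
        normalize(v);
        System.out.println(Arrays.toString(v));
    }
}
```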
normalizeSparseInstance
public void normalizeSparseInstance(Instance inst)
throws java.lang.Exception
- Normalizes the values of a SparseInstance in L2 norm
- Parameters:
inst
- SparseInstance to be normalized
- Throws:
java.lang.Exception
meanOrMode
protected double[] meanOrMode(Instances insts)
- Fast version of meanOrMode - streamlined from Instances.meanOrMode for efficiency
Does not check for missing attributes, assumes numeric attributes, assumes Sparse instances
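Under the stated assumptions (numeric attributes, sparse instances, no missing values), the mean can be computed by touching only the stored entries of each instance. The sketch below represents each sparse instance as parallel index and value arrays; that representation and the names `SparseMeanSketch` and `mean` are illustrative assumptions, not the Weka data structures.

```java
import java.util.Arrays;

final class SparseMeanSketch {
    /**
     * indices[r] holds the attribute indices stored for row r and
     * values[r] the matching values; unstored attributes count as 0.
     */
    static double[] mean(int[][] indices, double[][] values, int numAttributes) {
        double[] mean = new double[numAttributes];
        if (indices.length == 0) {
            return mean; // no instances: return all zeros
        }
        for (int r = 0; r < indices.length; r++) {
            for (int k = 0; k < indices[r].length; k++) {
                mean[indices[r][k]] += values[r][k];
            }
        }
        for (int a = 0; a < numAttributes; a++) {
            mean[a] /= indices.length;
        }
        return mean;
    }

    public static void main(String[] args) {
        int[][] idx = {{0, 2}, {2}};
        double[][] val = {{2.0, 4.0}, {6.0}};
        System.out.println(Arrays.toString(mean(idx, val, 3)));
    }
}
```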
getTimeStamp
public static java.lang.Double getTimeStamp()
- Gets a Double representing the current date and time.
e.g. 1:46pm on 20/5/1999 -> 19990520.1346
- Returns:
- a value of type Double
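The encoding in the example puts the date in the integer part (yyyyMMdd) and the 24-hour time in the first four decimal places (HHmm). A minimal sketch of that arithmetic using `java.time`; `TimeStampSketch` and `encode` are hypothetical names, and the original method may build the value differently.

```java
import java.time.LocalDateTime;

final class TimeStampSketch {
    /** Encodes a date-time as yyyyMMdd.HHmm in a single double. */
    static double encode(LocalDateTime t) {
        long datePart = t.getYear() * 10000L
            + t.getMonthValue() * 100L
            + t.getDayOfMonth();
        int timePart = t.getHour() * 100 + t.getMinute();
        return datePart + timePart / 10000.0;
    }

    public static void main(String[] args) {
        System.out.println(encode(LocalDateTime.now()));
    }
}
```

Seconds are dropped, so two calls within the same minute produce the same value.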
main
public static void main(java.lang.String[] args)
- Main method for testing this class.
testCase
protected static void testCase()