weka.classifiers.meta
Class QBoost

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.DistributionClassifier
          extended by weka.classifiers.EnsembleClassifier
              extended by weka.classifiers.meta.QBoost
All Implemented Interfaces:
ActiveLearner, AdditionalMeasureProducer, java.lang.Cloneable, OptionHandler, java.io.Serializable, Sourcable, WeightedInstancesHandler

public class QBoost
extends EnsembleClassifier
implements ActiveLearner, OptionHandler, WeightedInstancesHandler, Sourcable

This class implements Query-by-Boosting, based on Abe and Mamitsuka (ICML 98). It is built on the class for boosting a classifier with Freund & Schapire's AdaBoost.M1 method. For more information, see

Yoav Freund and Robert E. Schapire (1996). Experiments with a new boosting algorithm. Proc International Conference on Machine Learning, pages 148-156, Morgan Kaufmann, San Francisco.

Valid options are:

-D
Turn on debugging output.

-W classname
Specify the full class name of a classifier as the basis for boosting (required).

-I num
Set the number of boost iterations (default 10).

-P num
Set the percentage of weight mass used to build classifiers (default 100).

-Q
Use resampling instead of reweighting.

-S seed
Random number seed for resampling (default 1).

Options after -- are passed to the designated classifier.

See Also:
Serialized Form

Field Summary
protected  double[] m_Betas
          Array for storing the weights for the votes.
protected  Classifier m_Classifier
          The model base classifier to use
protected  Classifier[] m_Classifiers
          Array for storing the generated base classifiers.
protected  boolean m_Debug
          Debugging mode, gives extra output if true
protected  int m_MaxIterations
          The maximum number of boost iterations
protected  int m_NumClasses
          The number of classes
protected  int m_NumIterations
          The number of successfully generated base classifiers.
protected  int m_Seed
          Seed for boosting with resampling.
protected  boolean m_UseResampling
          Use boosting with resampling instead of reweighting?
protected  int m_WeightThreshold
          Weight Threshold.
 
Fields inherited from class weka.classifiers.EnsembleClassifier
m_EnsembleWts, m_SumEnsembleWts, m_TrainEnsembleDiversity, m_TrainEnsembleError, m_TrainError
 
Constructor Summary
QBoost()
           
 
Method Summary
 void buildClassifier(Instances data)
          Boosting method.
protected  void buildClassifierUsingResampling(Instances data)
          Boosting method.
protected  void buildClassifierWithWeights(Instances data)
          Boosting method.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 Classifier getClassifier()
          Get the base classifier used for boosting
 boolean getDebug()
          Get whether debugging is turned on
 double[] getEnsemblePredictions(Instance instance)
          Returns class predictions of each ensemble member
 double getEnsembleSize()
          Returns size of ensemble
 double[] getEnsembleWts()
          Returns vote weights of ensemble members.
 int getMaxIterations()
          Get the maximum number of boost iterations
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 int getSeed()
          Get seed for resampling.
 boolean getUseResampling()
          Get whether resampling is turned on
 int getWeightThreshold()
          Get the degree of weight thresholding
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 int[] selectInstances(Instances unlabeledActivePool, int num)
          Given a set of unlabeled examples, select a specified number of examples to be labeled.
protected  Instances selectWeightQuantile(Instances data, double quantile)
          Select only instances with weights that contribute to the specified quantile of the weight distribution
 void setClassifier(Classifier newClassifier)
          Set the classifier for boosting.
 void setDebug(boolean debug)
          Set debugging mode
 void setMaxIterations(int maxIterations)
          Set the maximum number of boost iterations
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSeed(int seed)
          Set seed for resampling.
 void setUseResampling(boolean r)
          Set resampling mode
 void setWeightThreshold(int threshold)
          Set weight threshold
 java.lang.String toSource(java.lang.String className)
          Returns the boosted model as Java source code.
 java.lang.String toString()
          Returns description of the boosted classifier.
 
Methods inherited from class weka.classifiers.EnsembleClassifier
computeEnsembleMeasures, enumerateMeasures, getMeasure, initMeasures, measureTrainEnsembleDiversity, measureTrainEnsembleError, measureTrainError, updateEnsembleStats
 
Methods inherited from class weka.classifiers.DistributionClassifier
calculateEntropy, calculateLabeledInstanceMargin, calculateMargin, classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Classifier

protected Classifier m_Classifier
The model base classifier to use


m_Classifiers

protected Classifier[] m_Classifiers
Array for storing the generated base classifiers.


m_Betas

protected double[] m_Betas
Array for storing the weights for the votes.


m_MaxIterations

protected int m_MaxIterations
The maximum number of boost iterations


m_NumIterations

protected int m_NumIterations
The number of successfully generated base classifiers.


m_WeightThreshold

protected int m_WeightThreshold
Weight Threshold. The percentage of weight mass used in training


m_Debug

protected boolean m_Debug
Debugging mode, gives extra output if true


m_UseResampling

protected boolean m_UseResampling
Use boosting with resampling instead of reweighting?


m_Seed

protected int m_Seed
Seed for boosting with resampling.


m_NumClasses

protected int m_NumClasses
The number of classes

Constructor Detail

QBoost

public QBoost()
Method Detail

selectWeightQuantile

protected Instances selectWeightQuantile(Instances data,
                                         double quantile)
Select only instances with weights that contribute to the specified quantile of the weight distribution

Parameters:
data - the input instances
quantile - the specified quantile eg 0.9 to select 90% of the weight mass
Returns:
the selected instances
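
The selection described above can be sketched on a plain array of weights. This is a minimal illustration, not the Weka implementation: the class name WeightQuantileSketch and the method select are hypothetical, instances are represented only by their indices, and the tie handling in the real method may differ.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Weka: works on raw weights instead of Instances.
final class WeightQuantileSketch {

    /** Returns the indices of the heaviest instances whose weights sum to at least quantile * total. */
    static int[] select(double[] weights, double quantile) {
        Integer[] order = new Integer[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            order[i] = i;
            total += weights[i];
        }
        // Heaviest first; the object sort is stable, so ties keep their original order.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -weights[i]));

        double target = total * quantile;
        double mass = 0.0;
        int kept = 0;
        // Keep adding instances until the requested share of the weight mass is covered.
        while (kept < order.length && mass < target) {
            mass += weights[order[kept]];
            kept++;
        }
        int[] selected = new int[kept];
        for (int k = 0; k < kept; k++) {
            selected[k] = order[k];
        }
        return selected;
    }
}
```

With weights {1, 4, 2, 3} and a quantile of 0.5, the sketch keeps the instances at indices 1 and 3, whose combined weight of 7 covers half of the total mass of 10.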

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D
Turn on debugging output.

-W classname
Specify the full class name of a classifier as the basis for boosting (required).

-I num
Set the number of boost iterations (default 10).

-P num
Set the percentage of weight mass used to build classifiers (default 100).

-Q
Use resampling instead of reweighting.

-S seed
Random number seed for resampling (default 1).

Options after -- are passed to the designated classifier.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
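
How an option array divides at the -- marker can be illustrated without Weka. The sketch below is an assumption about the general convention only: OptionSplitSketch and split are hypothetical names, and the base classifier name my.pkg.BaseLearner and its -X option in the usage note are placeholders, not real Weka classes or flags.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: separates the booster's own options
// from those that follow "--" and are meant for the base classifier.
final class OptionSplitSketch {

    /** Returns {ownOptions, baseClassifierOptions}, split at the first "--". */
    static String[][] split(String[] options) {
        int marker = Arrays.asList(options).indexOf("--");
        if (marker < 0) {
            // No marker: everything belongs to the booster itself.
            return new String[][] {options.clone(), new String[0]};
        }
        return new String[][] {
            Arrays.copyOfRange(options, 0, marker),
            Arrays.copyOfRange(options, marker + 1, options.length)
        };
    }
}
```

For the array {"-W", "my.pkg.BaseLearner", "-I", "20", "-Q", "-S", "7", "--", "-X", "1"}, the first part holds the seven booster options and the second part holds {"-X", "1"}.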

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

setClassifier

public void setClassifier(Classifier newClassifier)
Set the classifier for boosting.

Parameters:
newClassifier - the Classifier to use.

getClassifier

public Classifier getClassifier()
Get the base classifier used for boosting

Returns:
the base classifier used for boosting

setMaxIterations

public void setMaxIterations(int maxIterations)
Set the maximum number of boost iterations

Parameters:
maxIterations - the maximum number of boost iterations

getMaxIterations

public int getMaxIterations()
Get the maximum number of boost iterations

Returns:
the maximum number of boost iterations

setWeightThreshold

public void setWeightThreshold(int threshold)
Set weight threshold

Parameters:
threshold - the percentage of weight mass used for training

getWeightThreshold

public int getWeightThreshold()
Get the degree of weight thresholding

Returns:
the percentage of weight mass used for training

setSeed

public void setSeed(int seed)
Set seed for resampling.

Parameters:
seed - the seed for resampling

getSeed

public int getSeed()
Get seed for resampling.

Returns:
the seed for resampling

setDebug

public void setDebug(boolean debug)
Set debugging mode

Parameters:
debug - true if debug output should be printed

getDebug

public boolean getDebug()
Get whether debugging is turned on

Returns:
true if debugging output is on

setUseResampling

public void setUseResampling(boolean r)
Set resampling mode

Parameters:
r - true if resampling should be used instead of reweighting

getUseResampling

public boolean getUseResampling()
Get whether resampling is turned on

Returns:
true if resampling is on

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Boosting method.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the boosted classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully

buildClassifierUsingResampling

protected void buildClassifierUsingResampling(Instances data)
                                       throws java.lang.Exception
Boosting method. Boosts using resampling

Parameters:
data - the training data to be used for generating the boosted classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully
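
Boosting by resampling replaces the weighted training set with an unweighted sample drawn in proportion to the weights. The sketch below shows one plausible way to do that with a seeded generator; WeightedResampleSketch and resample are hypothetical names, and the sampling scheme used by the actual method is not documented here and may differ.

```java
import java.util.Random;

// Hypothetical helper, not part of Weka: draws instance indices in proportion to their weights.
final class WeightedResampleSketch {

    /** Draws weights.length indices with replacement; assumes at least one positive weight. */
    static int[] resample(double[] weights, long seed) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        Random random = new Random(seed); // same seed, same sample
        int[] drawn = new int[weights.length];
        for (int d = 0; d < drawn.length; d++) {
            double target = random.nextDouble() * total;
            double cumulative = 0.0;
            int pick = -1;
            for (int i = 0; i < weights.length; i++) {
                if (weights[i] <= 0) {
                    continue; // zero-weight instances are never drawn
                }
                pick = i; // fallback in case rounding keeps target above the final sum
                cumulative += weights[i];
                if (target < cumulative) {
                    break;
                }
            }
            drawn[d] = pick;
        }
        return drawn;
    }
}
```

Fixing the seed makes the sample reproducible, which is the role the -S option plays for resampling.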

buildClassifierWithWeights

protected void buildClassifierWithWeights(Instances data)
                                   throws java.lang.Exception
Boosting method. Boosts any classifier that can handle weighted instances.

Parameters:
data - the training data to be used for generating the boosted classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully
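
One round of the reweighting that AdaBoost.M1 performs can be sketched as follows, using the formulation in Freund and Schapire (1996): compute the weighted error epsilon, set beta = epsilon / (1 - epsilon), multiply the weights of correctly classified instances by beta, and renormalize. AdaBoostM1StepSketch and its methods are hypothetical names, and this is not the code of buildClassifierWithWeights. In AdaBoost.M1, boosting stops when epsilon reaches 0.5 or more.

```java
// Hypothetical helper, not part of Weka: a single AdaBoost.M1 weight update on raw arrays.
final class AdaBoostM1StepSketch {

    /** Weighted training error: misclassified weight mass divided by total mass. */
    static double weightedError(double[] weights, boolean[] correct) {
        double total = 0.0;
        double wrong = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            if (!correct[i]) {
                wrong += weights[i];
            }
        }
        return wrong / total;
    }

    /** beta = epsilon / (1 - epsilon); only meaningful for 0 < epsilon < 0.5. */
    static double beta(double epsilon) {
        return epsilon / (1.0 - epsilon);
    }

    /** Shrinks the weights of correctly classified instances, then restores the original total mass. */
    static double[] reweight(double[] weights, boolean[] correct, double beta) {
        double oldTotal = 0.0;
        double newTotal = 0.0;
        double[] updated = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            oldTotal += weights[i];
            updated[i] = correct[i] ? weights[i] * beta : weights[i];
            newTotal += updated[i];
        }
        for (int i = 0; i < updated.length; i++) {
            updated[i] *= oldTotal / newTotal;
        }
        return updated;
    }
}
```

With four unit weights and one misclassified instance, epsilon is 0.25 and beta is 1/3. After the update the misclassified instance carries weight 2 and the three correct ones share the remaining 2, so the next base classifier sees the hard instance as half of the weight mass.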

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Specified by:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if instance could not be classified successfully
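
A boosted ensemble typically turns member predictions into a class distribution by weighted voting. The sketch below assumes each member casts its vote weight (log(1/beta) in AdaBoost.M1) for the class it predicts and that the sums are then normalized; whether QBoost normalizes in exactly this way is an assumption, and WeightedVoteSketch and distribution are hypothetical names.

```java
// Hypothetical helper, not part of Weka: weighted voting over member predictions.
final class WeightedVoteSketch {

    /**
     * predictions[t] is the class index predicted by member t,
     * voteWeights[t] is that member's vote weight.
     */
    static double[] distribution(int[] predictions, double[] voteWeights, int numClasses) {
        double[] sums = new double[numClasses];
        double total = 0.0;
        for (int t = 0; t < predictions.length; t++) {
            sums[predictions[t]] += voteWeights[t];
            total += voteWeights[t];
        }
        // Normalize so the entries sum to one (assumes total > 0).
        for (int c = 0; c < numClasses; c++) {
            sums[c] /= total;
        }
        return sums;
    }
}
```

For three members predicting classes {0, 1, 1} with vote weights {1, 1, 2} over three classes, the result is {0.25, 0.75, 0.0}.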

selectInstances

public int[] selectInstances(Instances unlabeledActivePool,
                             int num)
                      throws java.lang.Exception
Given a set of unlabeled examples, select a specified number of examples to be labeled.

Specified by:
selectInstances in interface ActiveLearner
Parameters:
unlabeledActivePool - pool of unlabeled examples
num - number of examples to be selected for labeling
Throws:
java.lang.Exception - if selective sampling fails
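
Query-by-Boosting asks for labels on the examples the boosted committee is least sure about. The sketch below assumes uncertainty is measured by the margin between the two highest class probabilities of the ensemble vote and returns the indices of the pool examples with the smallest margins. The exact criterion used by selectInstances is not stated in this page, so treat this as an illustration; MarginSelectionSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Weka: picks the pool examples with the smallest vote margin.
final class MarginSelectionSketch {

    /** Difference between the largest and second-largest entry; needs at least two classes. */
    static double margin(double[] distribution) {
        double best = Double.NEGATIVE_INFINITY;
        double second = Double.NEGATIVE_INFINITY;
        for (double p : distribution) {
            if (p > best) {
                second = best;
                best = p;
            } else if (p > second) {
                second = p;
            }
        }
        return best - second;
    }

    /** poolDistributions[i] is the ensemble's class distribution for pool example i. */
    static int[] selectSmallestMargins(double[][] poolDistributions, int num) {
        Integer[] order = new Integer[poolDistributions.length];
        double[] margins = new double[poolDistributions.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
            margins[i] = margin(poolDistributions[i]);
        }
        // Smallest margin first: these are the examples the committee disagrees on most.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> margins[i]));
        int count = Math.min(num, order.length);
        int[] selected = new int[count];
        for (int k = 0; k < count; k++) {
            selected[k] = order[k];
        }
        return selected;
    }
}
```

Given pool distributions {0.9, 0.1}, {0.5, 0.5}, {0.6, 0.4} and {0.2, 0.8}, asking for two examples returns indices 1 and 2, the two cases where the vote is closest to a tie.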

getEnsemblePredictions

public double[] getEnsemblePredictions(Instance instance)
                                throws java.lang.Exception
Returns class predictions of each ensemble member

Specified by:
getEnsemblePredictions in class EnsembleClassifier
Throws:
java.lang.Exception

getEnsembleWts

public double[] getEnsembleWts()
Returns vote weights of ensemble members.

Specified by:
getEnsembleWts in class EnsembleClassifier
Returns:
vote weights of ensemble members

getEnsembleSize

public double getEnsembleSize()
Returns size of ensemble

Specified by:
getEnsembleSize in class EnsembleClassifier

toSource

public java.lang.String toSource(java.lang.String className)
                          throws java.lang.Exception
Returns the boosted model as Java source code.

Specified by:
toSource in interface Sourcable
Parameters:
className - the name that should be given to the source class.
Returns:
the boosted model as Java source code
Throws:
java.lang.Exception - if something goes wrong

toString

public java.lang.String toString()
Returns description of the boosted classifier.

Returns:
description of the boosted classifier as a string

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options