weka.classifiers.meta
Class QBag

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.DistributionClassifier
          extended by weka.classifiers.EnsembleClassifier
              extended by weka.classifiers.meta.QBag
All Implemented Interfaces:
ActiveLearner, AdditionalMeasureProducer, java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class QBag
extends EnsembleClassifier
implements ActiveLearner, OptionHandler, WeightedInstancesHandler

This class implements Query-by-Bagging, an active-learning method based on Abe and Mamitsuka (ICML 98). It is built on the class for bagging a classifier. For more information on bagging, see

Leo Breiman (1996). Bagging predictors. Machine Learning, 24(2):123-140.

Valid options are:

-W classname
Specify the full class name of a weak classifier as the basis for bagging (required).

-I num
Set the number of bagging iterations (default 10).

-S seed
Random number seed for resampling (default 1).

-P num
Size of each bag, as a percentage of the training size (default 100).

Options after -- are passed to the designated classifier.

See Also:
Serialized Form

Field Summary
protected  int m_BagSizePercent
          The size of each bag sample, as a percentage of the training size
protected  Classifier m_Classifier
          The base classifier to use.
protected  Classifier[] m_Classifiers
          Array for storing the generated base classifiers.
protected  boolean m_Debug
          Set to true to get debugging output.
protected  boolean m_HardVoteAssignment
          Set true to use hard assignment for ensemble member votes
protected  int m_NumIterations
          The number of iterations.
protected  int m_Seed
          The seed for random number generation.
 
Fields inherited from class weka.classifiers.EnsembleClassifier
m_EnsembleWts, m_SumEnsembleWts, m_TrainEnsembleDiversity, m_TrainEnsembleError, m_TrainError
 
Constructor Summary
QBag()
           
 
Method Summary
 void buildClassifier(Instances data)
          Bagging method: builds the ensemble of bagged base classifiers from the training data.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 int getBagSizePercent()
          Gets the size of each bag, as a percentage of the training set size.
 Classifier getClassifier()
          Gets the classifier used as the base classifier for bagging.
 double[] getEnsemblePredictions(Instance instance)
          Returns class predictions of each ensemble member
 double getEnsembleSize()
          Returns size of ensemble
 double[] getEnsembleWts()
          Returns vote weights of ensemble members.
 boolean getHardVoteAssignment()
          Get the value of m_HardVoteAssignment.
 int getNumIterations()
          Gets the number of bagging iterations
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 int getSeed()
          Gets the seed for random number generation.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 int[] selectInstances(Instances unlabeledActivePool, int num)
          Given a set of unlabeled examples, select a specified number of examples to be labeled.
 void setBagSizePercent(int newBagSizePercent)
          Sets the size of each bag, as a percentage of the training set size.
 void setClassifier(Classifier newClassifier)
          Set the classifier for bagging.
 void setHardVoteAssignment(boolean v)
          Set the value of m_HardVoteAssignment.
 void setNumIterations(int numIterations)
          Sets the number of bagging iterations
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSeed(int seed)
          Set the seed for random number generation.
 java.lang.String toString()
          Returns description of the bagged classifier.
 
Methods inherited from class weka.classifiers.EnsembleClassifier
computeEnsembleMeasures, enumerateMeasures, getMeasure, initMeasures, measureTrainEnsembleDiversity, measureTrainEnsembleError, measureTrainError, updateEnsembleStats
 
Methods inherited from class weka.classifiers.DistributionClassifier
calculateEntropy, calculateLabeledInstanceMargin, calculateMargin, classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_HardVoteAssignment

protected boolean m_HardVoteAssignment
Set true to use hard assignment for ensemble member votes


m_Debug

protected boolean m_Debug
Set to true to get debugging output.


m_Classifier

protected Classifier m_Classifier
The base classifier to use.


m_Classifiers

protected Classifier[] m_Classifiers
Array for storing the generated base classifiers.


m_NumIterations

protected int m_NumIterations
The number of iterations.


m_Seed

protected int m_Seed
The seed for random number generation.


m_BagSizePercent

protected int m_BagSizePercent
The size of each bag sample, as a percentage of the training size

Constructor Detail

QBag

public QBag()
Method Detail

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-W classname
Specify the full class name of a weak classifier as the basis for bagging (required).

-I num
Set the number of bagging iterations (default 10).

-S seed
Random number seed for resampling (default 1).

-P num
Size of each bag, as a percentage of the training size (default 100).

Options after -- are passed to the designated classifier.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
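
The convention that options after -- go to the designated classifier can be illustrated with a small standalone sketch. It is not the Weka implementation: OptionSplitSketch and its methods are hypothetical names, and the sketch only shows how an option array might be divided at the first -- separator.

```java
// Hypothetical helper, not part of Weka: splits an option array at the
// first "--" into ensemble options and base-classifier options.
class OptionSplitSketch {

    /** Returns the options that appear before the first "--". */
    static String[] ensembleOptions(String[] options) {
        int cut = indexOfSeparator(options);
        return java.util.Arrays.copyOfRange(options, 0, cut);
    }

    /** Returns the options that appear after the first "--" (possibly none). */
    static String[] baseClassifierOptions(String[] options) {
        int cut = indexOfSeparator(options);
        int start = Math.min(cut + 1, options.length);
        return java.util.Arrays.copyOfRange(options, start, options.length);
    }

    /** Index of the first "--", or options.length when there is none. */
    private static int indexOfSeparator(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                return i;
            }
        }
        return options.length;
    }
}
```

For an array such as -W some.pkg.BaseLearner -I 15 -- -X 3 (where the class name and -X are placeholders), the first method would return the four entries before the separator and the second would return -X 3.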

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

getHardVoteAssignment

public boolean getHardVoteAssignment()
Get the value of m_HardVoteAssignment.

Returns:
value of m_HardVoteAssignment.

setHardVoteAssignment

public void setHardVoteAssignment(boolean v)
Set the value of m_HardVoteAssignment.

Parameters:
v - Value to assign to m_HardVoteAssignment.
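
This page does not define "hard assignment". The sketch below assumes the usual reading: with hard assignment a member casts a single full vote for its most probable class, and otherwise it contributes its whole class distribution. VoteAssignmentSketch and memberVote are hypothetical names, not part of Weka.

```java
// Hypothetical illustration of hard versus soft votes; not the Weka code.
class VoteAssignmentSketch {

    /**
     * Converts one member's class distribution into the vote it casts.
     * Assumes the distribution has at least one class.
     */
    static double[] memberVote(double[] distribution, boolean hardVoteAssignment) {
        if (!hardVoteAssignment) {
            // Soft vote: keep the member's probabilities unchanged.
            return distribution.clone();
        }
        // Hard vote: find the most probable class (first one wins on ties).
        int best = 0;
        for (int c = 1; c < distribution.length; c++) {
            if (distribution[c] > distribution[best]) {
                best = c;
            }
        }
        double[] vote = new double[distribution.length];
        vote[best] = 1.0;
        return vote;
    }
}
```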

setClassifier

public void setClassifier(Classifier newClassifier)
Set the classifier for bagging.

Parameters:
newClassifier - the Classifier to use.

getClassifier

public Classifier getClassifier()
Gets the classifier used as the base classifier for bagging.

Returns:
the classifier used as the base classifier

getBagSizePercent

public int getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.

Returns:
the bag size, as a percentage.

setBagSizePercent

public void setBagSizePercent(int newBagSizePercent)
Sets the size of each bag, as a percentage of the training set size.

Parameters:
newBagSizePercent - the bag size, as a percentage.
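
As a rough illustration of how a bag size percentage and a seed could interact, the sketch below draws one bag of training-set indices with replacement. It is an assumption-laden sketch rather than QBag's actual resampling code: BagSampleSketch and its methods are hypothetical, sampling is uniform (instance weights are ignored), and the bag size is rounded down.

```java
// Hypothetical bootstrap-sampling sketch; not the Weka implementation.
class BagSampleSketch {

    /** Number of instances per bag, rounded down (an assumption). */
    static int bagSize(int trainSize, int bagSizePercent) {
        return trainSize * bagSizePercent / 100;
    }

    /** Draws bag indices uniformly with replacement from a seeded generator. */
    static int[] sampleIndices(int trainSize, int bagSizePercent, long seed) {
        java.util.Random random = new java.util.Random(seed);
        int[] indices = new int[bagSize(trainSize, bagSizePercent)];
        for (int i = 0; i < indices.length; i++) {
            // Each draw may repeat an index that was already chosen.
            indices[i] = random.nextInt(trainSize);
        }
        return indices;
    }
}
```

With 200 training instances and a bag size of 50 percent, each bag in this sketch would hold 100 indices, and repeating the call with the same seed would reproduce the same bag.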

setNumIterations

public void setNumIterations(int numIterations)
Sets the number of bagging iterations

Parameters:
numIterations - the number of bagging iterations


getNumIterations

public int getNumIterations()
Gets the number of bagging iterations

Returns:
the number of bagging iterations

setSeed

public void setSeed(int seed)
Set the seed for random number generation.

Parameters:
seed - the seed

getSeed

public int getSeed()
Gets the seed for random number generation.

Returns:
the seed for the random number generation

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Bagging method: builds the ensemble of bagged base classifiers from the training data.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the bagged classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Specified by:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if distribution can't be computed successfully
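
One plausible way for an ensemble to turn member votes into class membership probabilities is a weighted sum followed by normalization. The sketch below shows that idea only. It is not taken from the QBag source, and EnsembleDistributionSketch, memberVotes and weights are hypothetical names.

```java
// Hypothetical vote-combination sketch; not the Weka implementation.
class EnsembleDistributionSketch {

    /**
     * Weighted sum of member votes, normalized to sum to 1.
     * memberVotes[m][c] is member m's vote for class c and weights[m] is
     * that member's vote weight. Assumes at least one member.
     */
    static double[] combine(double[][] memberVotes, double[] weights) {
        int numClasses = memberVotes[0].length;
        double[] sums = new double[numClasses];
        for (int m = 0; m < memberVotes.length; m++) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += weights[m] * memberVotes[m][c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] /= total;
            }
        }
        return sums;
    }
}
```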

selectInstances

public int[] selectInstances(Instances unlabeledActivePool,
                             int num)
                      throws java.lang.Exception
Given a set of unlabeled examples, select a specified number of examples to be labeled.

Specified by:
selectInstances in interface ActiveLearner
Parameters:
unlabeledActivePool - pool of unlabeled examples
num - number of examples to be selected for labeling
Throws:
java.lang.Exception - if selective sampling fails
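
This page does not say how the pool is ranked. As a rough illustration of the committee-disagreement idea behind Query-by-Bagging, the sketch below assumes that each ensemble member predicts a class index for every pool example and that the examples with the smallest vote margin (top vote count minus runner-up count) are chosen. QueryByBaggingSketch, poolPredictions and numClasses are hypothetical names, and the margin criterion is an assumption rather than QBag's confirmed behavior.

```java
// Hypothetical selection sketch based on vote margins; not the Weka code.
class QueryByBaggingSketch {

    /** Vote margin: votes for the top class minus votes for the runner-up. */
    static int voteMargin(int[] memberPredictions, int numClasses) {
        int[] counts = new int[numClasses];
        for (int p : memberPredictions) {
            counts[p]++;
        }
        int first = 0;
        int second = 0;
        for (int count : counts) {
            if (count > first) {
                second = first;
                first = count;
            } else if (count > second) {
                second = count;
            }
        }
        return first - second;
    }

    /**
     * Returns the indices of the num pool examples with the smallest margin.
     * poolPredictions[i][m] is member m's predicted class for example i.
     */
    static int[] selectInstances(int[][] poolPredictions, int numClasses, int num) {
        Integer[] order = new Integer[poolPredictions.length];
        int[] margins = new int[poolPredictions.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
            margins[i] = voteMargin(poolPredictions[i], numClasses);
        }
        // Stable sort: examples with equal margins keep their pool order.
        java.util.Arrays.sort(order, (a, b) -> Integer.compare(margins[a], margins[b]));
        int[] selected = new int[Math.min(num, order.length)];
        for (int i = 0; i < selected.length; i++) {
            selected[i] = order[i];
        }
        return selected;
    }
}
```

A small margin means the committee is nearly split on an example, which is the reason such examples are treated as the most informative ones to label in this sketch.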

getEnsemblePredictions

public double[] getEnsemblePredictions(Instance instance)
                                throws java.lang.Exception
Returns class predictions of each ensemble member

Specified by:
getEnsemblePredictions in class EnsembleClassifier
Throws:
java.lang.Exception

getEnsembleWts

public double[] getEnsembleWts()
Returns vote weights of ensemble members.

Specified by:
getEnsembleWts in class EnsembleClassifier
Returns:
vote weights of ensemble members

getEnsembleSize

public double getEnsembleSize()
Returns size of ensemble

Specified by:
getEnsembleSize in class EnsembleClassifier

toString

public java.lang.String toString()
Returns description of the bagged classifier.

Returns:
description of the bagged classifier as a string

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options