weka.classifiers.bayes
Class NaiveBayes

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.DistributionClassifier
          extended by weka.classifiers.bayes.NaiveBayes
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler
Direct Known Subclasses:
NaiveBayesUpdateable

public class NaiveBayes
extends DistributionClassifier
implements OptionHandler, WeightedInstancesHandler

Class for a Naive Bayes classifier using estimator classes. Numeric estimator precision values are chosen based on analysis of the training data. For this reason, the classifier is not an UpdateableClassifier (updateable classifiers are typically initialized with zero training instances). If you need UpdateableClassifier functionality, use the NaiveBayesUpdateable classifier, which uses a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
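This page does not say how the precision is derived from the training data. The following is a minimal sketch of one plausible approach, assuming the precision is the average gap between adjacent distinct values of a numeric attribute, with a fallback of 0.1 when no gap exists. The class name PrecisionSketch and the method estimatePrecision are hypothetical and are not part of this API.

```java
import java.util.Arrays;

public class PrecisionSketch {
    // Fallback when the data yields no usable gap. 0.1 mirrors the default
    // the description quotes for NaiveBayesUpdateable (an assumption here).
    static final double DEFAULT_PRECISION = 0.1;

    /** Average gap between adjacent distinct sorted values, or the default if there is none. */
    static double estimatePrecision(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double deltaSum = 0.0;
        int gaps = 0;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[i - 1]) {
                deltaSum += sorted[i] - sorted[i - 1];
                gaps++;
            }
        }
        return gaps > 0 ? deltaSum / gaps : DEFAULT_PRECISION;
    }

    public static void main(String[] args) {
        // Three distinct gaps (0.5, 0.5, 1.0) are averaged.
        System.out.println(estimatePrecision(new double[] {1.0, 1.5, 1.5, 2.0, 3.0}));
        // No training values: the fallback is returned.
        System.out.println(estimatePrecision(new double[] {}));
    }
}
```

With zero training instances there are no gaps to average, which is why an updateable variant needs a fixed default.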

For more information on Naive Bayes classifiers, see

George H. John and Pat Langley (1995). Estimating Continuous Distributions in Bayesian Classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. pp. 338-345. Morgan Kaufmann, San Mateo.

Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.
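To illustrate what the -K option changes, the sketch below contrasts a single fitted normal distribution with a kernel density estimate on the same values. It is a self-contained illustration, not the estimator classes used by this classifier: the class DensitySketch, its method names, and the fixed bandwidth argument are all hypothetical, and the input is assumed to contain at least two distinct values.

```java
public class DensitySketch {
    /** Density of a normal distribution with the given mean and standard deviation at x. */
    static double normalPdf(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Single-normal estimate: fit one mean and standard deviation to all values. */
    static double singleNormalDensity(double[] values, double x) {
        double mean = 0.0;
        for (double v : values) mean += v;
        mean /= values.length;
        double var = 0.0;
        for (double v : values) var += (v - mean) * (v - mean);
        var /= values.length;
        return normalPdf(x, mean, Math.sqrt(var));
    }

    /** Kernel estimate: average of one normal kernel centred on each training value. */
    static double kernelDensity(double[] values, double x, double bandwidth) {
        double sum = 0.0;
        for (double v : values) sum += normalPdf(x, v, bandwidth);
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Two clusters, around 1 and around 5.
        double[] bimodal = {1.0, 1.1, 0.9, 5.0, 5.1, 4.9};
        double x = 3.0; // midpoint between the clusters, where no data lies
        System.out.println("single normal: " + singleNormalDensity(bimodal, x));
        System.out.println("kernel:        " + kernelDensity(bimodal, x, 0.3));
    }
}
```

On bimodal data the single normal puts its peak between the clusters, while the kernel estimate assigns very little density there. This is the kind of case where kernel estimation may model a numeric attribute better.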

See Also:
Serialized Form

Field Summary
protected static double DEFAULT_NUM_PRECISION
          The precision parameter used for numeric attributes
protected  Estimator m_ClassDistribution
          The class estimator.
protected  Estimator[][] m_Distributions
          The attribute estimators.
protected  Instances m_Instances
          The dataset header for the purposes of printing out a semi-intelligible model
protected  int m_NumClasses
          The number of classes (or 1 for numeric class)
protected  boolean m_UseKernelEstimator
          Whether to use a kernel density estimator rather than a normal distribution for numeric attributes
 
Constructor Summary
NaiveBayes()
           
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 boolean getUseKernelEstimator()
          Gets whether the kernel estimator is being used.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setUseKernelEstimator(boolean v)
          Sets whether the kernel estimator is to be used.
 java.lang.String toString()
          Returns a description of the classifier.
 void updateClassifier(Instance instance)
          Updates the classifier with the given instance.
 
Methods inherited from class weka.classifiers.DistributionClassifier
calculateEntropy, calculateLabeledInstanceMargin, calculateMargin, classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Distributions

protected Estimator[][] m_Distributions
The attribute estimators.


m_ClassDistribution

protected Estimator m_ClassDistribution
The class estimator.


m_UseKernelEstimator

protected boolean m_UseKernelEstimator
Whether to use a kernel density estimator rather than a normal distribution for numeric attributes


m_NumClasses

protected int m_NumClasses
The number of classes (or 1 for numeric class)


m_Instances

protected Instances m_Instances
The dataset header for the purposes of printing out a semi-intelligible model


DEFAULT_NUM_PRECISION

protected static final double DEFAULT_NUM_PRECISION
The precision parameter used for numeric attributes

See Also:
Constant Field Values
Constructor Detail

NaiveBayes

public NaiveBayes()
Method Detail

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Updates the classifier with the given instance.

Parameters:
instance - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.
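As a rough illustration of what incorporating one weighted training instance into a model can involve, the sketch below keeps running weighted totals for a single numeric attribute and derives the mean and variance from them. It is an assumption-laden stand-in, not the estimator used by this class: RunningNormalSketch and its members are hypothetical names.

```java
public class RunningNormalSketch {
    private double sumOfWeights;
    private double sumOfValues;
    private double sumOfSquares;

    /** Folds one weighted observation into the running totals. */
    void addValue(double value, double weight) {
        sumOfWeights += weight;
        sumOfValues += value * weight;
        sumOfSquares += value * value * weight;
    }

    /** Weighted mean of the values seen so far (NaN before any value is added). */
    double mean() {
        return sumOfValues / sumOfWeights;
    }

    /** Weighted population variance, clamped at zero against rounding error. */
    double variance() {
        double m = mean();
        return Math.max(0.0, sumOfSquares / sumOfWeights - m * m);
    }

    public static void main(String[] args) {
        RunningNormalSketch est = new RunningNormalSketch();
        est.addValue(1.0, 1.0);
        est.addValue(3.0, 1.0);
        est.addValue(2.0, 2.0); // one instance carrying twice the weight
        System.out.println("mean = " + est.mean() + ", variance = " + est.variance());
    }
}
```

Because only running totals are stored, each update costs constant time and no earlier instance needs to be revisited.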

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Specified by:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
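The class membership probabilities of a Naive Bayes model follow from multiplying each class prior by the per-attribute likelihoods and normalizing. The sketch below shows that calculation on plain arrays. PosteriorSketch and its distribution method are hypothetical names, and the inputs stand in for whatever the class and attribute estimators would supply.

```java
public class PosteriorSketch {
    /**
     * priors[c] is the prior probability of class c.
     * likelihoods[c][a] is the probability of attribute a's observed value given class c.
     * Returns the normalized class membership probabilities.
     */
    static double[] distribution(double[] priors, double[][] likelihoods) {
        double[] probs = new double[priors.length];
        double total = 0.0;
        for (int c = 0; c < priors.length; c++) {
            probs[c] = priors[c];
            for (double l : likelihoods[c]) {
                probs[c] *= l; // attributes are treated as independent given the class
            }
            total += probs[c];
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("all class scores are zero");
        }
        for (int c = 0; c < probs.length; c++) {
            probs[c] /= total;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] priors = {0.5, 0.5};
        double[][] likelihoods = {{0.8, 0.5}, {0.2, 0.5}};
        double[] dist = distribution(priors, likelihoods);
        System.out.println(dist[0] + " " + dist[1]);
    }
}
```

With many attributes the raw product can underflow. A production implementation would typically rescale or work in log space, which this sketch omits for brevity.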

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions
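The contract between setOptions and getOptions is a round trip: the array returned by getOptions can be passed back to setOptions to reproduce the same configuration. The sketch below demonstrates this with a hypothetical stand-in class, OptionsSketch, that supports only the -K flag. It is not the implementation behind this page, and it throws an unchecked IllegalArgumentException for unknown options to keep the example short, whereas the documented method declares java.lang.Exception.

```java
import java.util.Arrays;

public class OptionsSketch {
    private boolean useKernelEstimator = false;

    /** Enables kernel estimation if "-K" is present; rejects any other non-empty option. */
    public void setOptions(String[] options) {
        boolean kernel = false;
        for (String opt : options) {
            if (opt.equals("-K")) {
                kernel = true;
            } else if (!opt.isEmpty()) {
                throw new IllegalArgumentException("Unsupported option: " + opt);
            }
        }
        useKernelEstimator = kernel;
    }

    /** Returns the current settings in a form that setOptions accepts. */
    public String[] getOptions() {
        return useKernelEstimator ? new String[] {"-K"} : new String[0];
    }

    public boolean getUseKernelEstimator() {
        return useKernelEstimator;
    }

    public static void main(String[] args) {
        OptionsSketch first = new OptionsSketch();
        first.setOptions(new String[] {"-K"});

        // Copy the configuration to a second object via the round trip.
        OptionsSketch second = new OptionsSketch();
        second.setOptions(first.getOptions());
        System.out.println(Arrays.toString(second.getOptions()));
    }
}
```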

toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

getUseKernelEstimator

public boolean getUseKernelEstimator()
Gets whether the kernel estimator is being used.

Returns:
Value of m_UseKernelEstimator.

setUseKernelEstimator

public void setUseKernelEstimator(boolean v)
Sets whether the kernel estimator is to be used.

Parameters:
v - Value to assign to m_UseKernelEstimator.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options