weka.classifiers.bayes
Class NaiveBayesSimple

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.DistributionClassifier
          extended by weka.classifiers.bayes.NaiveBayesSimple
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler
Direct Known Subclasses:
NaiveBayesSimpleSoft

public class NaiveBayesSimple
extends DistributionClassifier
implements OptionHandler, WeightedInstancesHandler

Class for building and using a simple Naive Bayes classifier. Numeric attributes are modelled by a normal distribution. For more information, see

Richard Duda and Peter Hart (1973). Pattern Classification and Scene Analysis. Wiley, New York.

See Also:
Serialized Form

Field Summary
protected  double[][][] m_Counts
          All the counts for nominal attributes.
protected  double[][] m_Devs
          The standard deviations for numeric attributes.
protected  Instances m_Instances
          The instances used for training.
protected  double m_m
          The m parameter for the Laplace m-estimate, corresponding to the size of the pseudosample.
protected  double[][] m_Means
          The means for numeric attributes.
protected  double m_minStdDev
          default minimum standard deviation
protected  double[] m_Priors
          The prior probabilities of the classes.
protected static double NORM_CONST
          Constant for normal distribution.
 
Constructor Summary
NaiveBayesSimple()
           
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 double getM()
          Gets the Laplace m parameter that controls the amount of smoothing.
 double getMinStdDev()
          Get the minimum allowable standard deviation.
 java.lang.String[] getOptions()
          Gets the current settings.
 java.lang.String globalInfo()
          Returns a string describing this classifier.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String minStdDevTipText()
          Returns the tip text for this property
 java.lang.String mTipText()
          Returns the tip text for this property
protected  double normalDens(double x, double mean, double stdDev)
          Density function of the normal distribution; returns the log of the density.
static void normalizeLogs(double[] logProbs)
          Converts an unnormalized vector of logs of probabilities into a normalized distribution that sums to one.
protected  void resetOptions()
          Reset to default options
 void setM(double m)
          Sets the Laplace m parameter that controls the amount of smoothing.
 void setMinStdDev(double m)
          Set the minimum value for standard deviation when calculating normal density.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Returns a description of the classifier.
 double[] unNormalizedDistributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance, returned as unnormalized logs of probabilities.
 
Methods inherited from class weka.classifiers.DistributionClassifier
calculateEntropy, calculateLabeledInstanceMargin, calculateMargin, classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Counts

protected double[][][] m_Counts
All the counts for nominal attributes.


m_Means

protected double[][] m_Means
The means for numeric attributes.


m_Devs

protected double[][] m_Devs
The standard deviations for numeric attributes.


m_Priors

protected double[] m_Priors
The prior probabilities of the classes.


m_Instances

protected Instances m_Instances
The instances used for training.


NORM_CONST

protected static double NORM_CONST
Constant for normal distribution.


m_minStdDev

protected double m_minStdDev
default minimum standard deviation


m_m

protected double m_m
The m parameter for the Laplace m-estimate, corresponding to the size of the pseudosample.

Constructor Detail

NaiveBayesSimple

public NaiveBayesSimple()
Method Detail

resetOptions

protected void resetOptions()
Reset to default options


globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier.

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

minStdDevTipText

public java.lang.String minStdDevTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setMinStdDev

public void setMinStdDev(double m)
Set the minimum value for standard deviation when calculating normal density. Increasing this value can help prevent arithmetic overflow resulting from multiplying large densities (arising from small standard deviations) when there are many singleton or near-singleton values.

Parameters:
m - minimum value for standard deviation
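
The effect of a floor on the standard deviation can be sketched as follows. This is a minimal illustration, not the Weka implementation: the class and method names (MinStdDevSketch, clamp, peakDensity) are hypothetical, and it assumes the classifier replaces any estimated standard deviation below the minimum with the minimum before evaluating the density.

```java
/** Hypothetical sketch; not taken from the Weka source. */
final class MinStdDevSketch {

    /** Returns the standard deviation to use, never smaller than the floor (assumed behaviour). */
    static double clamp(double estimatedStdDev, double minStdDev) {
        return Math.max(estimatedStdDev, minStdDev);
    }

    /** Peak of the normal density, reached at x == mean: 1 / (sqrt(2*pi) * stdDev). */
    static double peakDensity(double stdDev) {
        return 1.0 / (Math.sqrt(2.0 * Math.PI) * stdDev);
    }

    public static void main(String[] args) {
        double estimated = 1e-12; // near-singleton values give a tiny estimate
        System.out.println(peakDensity(estimated));              // very large density
        System.out.println(peakDensity(clamp(estimated, 1e-6))); // bounded by the floor
    }
}
```

Because the peak density is inversely proportional to the standard deviation, the floor puts an upper bound on every per-attribute density.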

getMinStdDev

public double getMinStdDev()
Get the minimum allowable standard deviation.

Returns:
the minimum allowable standard deviation

mTipText

public java.lang.String mTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getM

public double getM()
Gets the Laplace m parameter that controls the amount of smoothing.


setM

public void setM(double m)
Sets the Laplace m parameter that controls the amount of smoothing.
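
For orientation, an m-estimate is commonly written as (count + m * prior) / (total + m). The sketch below applies that textbook form to a nominal attribute. The names (MEstimateSketch, smoothed) are hypothetical, and the uniform prior 1 / numValues is an assumption; this page does not state the exact formula NaiveBayesSimple uses.

```java
/** Hypothetical sketch of a textbook m-estimate; not taken from the Weka source. */
final class MEstimateSketch {

    /**
     * Smoothed estimate of P(value | class).
     *
     * @param count      weighted number of training instances of the class with this value
     * @param classTotal weighted number of training instances of the class
     * @param numValues  number of distinct values of the nominal attribute
     * @param m          pseudosample size; m = 0 gives the raw relative frequency
     */
    static double smoothed(double count, double classTotal, int numValues, double m) {
        double prior = 1.0 / numValues; // assumed uniform prior over attribute values
        return (count + m * prior) / (classTotal + m);
    }

    public static void main(String[] args) {
        // A value never seen with this class still gets a non-zero probability: 0.5 / 11.
        System.out.println(smoothed(0, 10, 2, 1.0));
        // The complementary value gets 10.5 / 11, so the two estimates still sum to one.
        System.out.println(smoothed(10, 10, 2, 1.0));
    }
}
```

Larger values of m pull every estimate closer to the prior, which is why m acts as the size of a pseudosample added to the observed counts.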


getOptions

public java.lang.String[] getOptions()
Gets the current settings.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

unNormalizedDistributionForInstance

public double[] unNormalizedDistributionForInstance(Instance instance)
                                             throws java.lang.Exception
Calculates the class membership probabilities for the given test instance. Returns vector of unnormalized logs of probabilities for computational reasons.

Parameters:
instance - the instance to be classified
Returns:
vector of unnormalized logs of the class probabilities, one entry per class
Throws:
java.lang.Exception - if distribution can't be computed
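
The computational reason for working with logs is that a product of many small probabilities underflows, while a sum of their logs does not. A minimal sketch of the usual Naive Bayes scoring follows; LogScoreSketch and logScores are hypothetical names, and the inputs are assumed to be precomputed log priors and per-attribute log likelihoods rather than Weka Instance objects.

```java
/** Hypothetical sketch of Naive Bayes log scoring; not taken from the Weka source. */
final class LogScoreSketch {

    /**
     * @param logPriors      logPriors[c] = log P(class c)
     * @param logLikelihoods logLikelihoods[c][a] = log P(observed value of attribute a | class c)
     * @return one unnormalized log score per class
     */
    static double[] logScores(double[] logPriors, double[][] logLikelihoods) {
        double[] scores = new double[logPriors.length];
        for (int c = 0; c < logPriors.length; c++) {
            scores[c] = logPriors[c];
            for (double ll : logLikelihoods[c]) {
                scores[c] += ll; // adding logs replaces multiplying probabilities
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        double[] logPriors = { Math.log(0.5), Math.log(0.5) };
        double[][] logLik = {
            { Math.log(0.9), Math.log(0.2) },
            { Math.log(0.1), Math.log(0.8) }
        };
        double[] s = logScores(logPriors, logLik);
        System.out.println(s[0] + " " + s[1]);
    }
}
```

The scores are only comparable with each other; they become probabilities once exponentiated and normalized.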

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Specified by:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if distribution can't be computed

normalizeLogs

public static void normalizeLogs(double[] logProbs)
Converts an unnormalized vector of logs of probabilities into a normalized distribution that sums to one.
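
One common way to perform this conversion is to subtract the largest log value before exponentiating, so that at least one term is exp(0) = 1 and the sum cannot underflow to zero. The sketch below is a hypothetical re-implementation under that assumption (NormalizeLogsSketch is not a Weka class); the in-place update is inferred from the void return type, not confirmed by this page.

```java
/** Hypothetical sketch of log-space normalization; not taken from the Weka source. */
final class NormalizeLogsSketch {

    /** Overwrites logProbs with probabilities that sum to one. */
    static void normalizeLogs(double[] logProbs) {
        double max = Double.NEGATIVE_INFINITY;
        for (double lp : logProbs) {
            max = Math.max(max, lp);
        }
        double sum = 0.0;
        for (int i = 0; i < logProbs.length; i++) {
            logProbs[i] = Math.exp(logProbs[i] - max); // the largest entry becomes exp(0) = 1
            sum += logProbs[i];
        }
        for (int i = 0; i < logProbs.length; i++) {
            logProbs[i] /= sum;
        }
    }

    public static void main(String[] args) {
        // Math.exp(-1000) underflows to 0, yet the shifted computation is well defined.
        double[] logs = { -1000.0, -1001.0 };
        normalizeLogs(logs);
        System.out.println(logs[0] + " " + logs[1]); // roughly 0.731 and 0.269
    }
}
```

If every entry is negative infinity the sketch produces NaN values, so a caller would need to handle that case separately.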


toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

normalDens

protected double normalDens(double x,
                            double mean,
                            double stdDev)
Density function of the normal distribution; returns the log of the density.
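
The log of the normal density is log f(x) = -log(stdDev) - 0.5 * log(2 * pi) - (x - mean)^2 / (2 * stdDev^2). A minimal sketch of that formula follows; LogNormalDensitySketch, logNormalDensity and LOG_SQRT_2PI are hypothetical names, and how the actual method uses NORM_CONST is not shown on this page.

```java
/** Hypothetical sketch of the log normal density; not taken from the Weka source. */
final class LogNormalDensitySketch {

    /** log(sqrt(2 * pi)), computed once. */
    static final double LOG_SQRT_2PI = 0.5 * Math.log(2.0 * Math.PI);

    /** Log of the normal density at x for the given mean and standard deviation (stdDev > 0). */
    static double logNormalDensity(double x, double mean, double stdDev) {
        double z = (x - mean) / stdDev; // standardized distance from the mean
        return -LOG_SQRT_2PI - Math.log(stdDev) - 0.5 * z * z;
    }

    public static void main(String[] args) {
        // Standard normal at its mean: log(1 / sqrt(2 * pi)), about -0.9189.
        System.out.println(logNormalDensity(0.0, 0.0, 1.0));
    }
}
```

Working in log space keeps the value finite even when the density itself would underflow far from the mean.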


main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options