java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.DistributionClassifier
      weka.classifiers.functions.SMO
Implements John C. Platt's sequential minimal optimization algorithm for training a support vector classifier using polynomial or RBF kernels. This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes by default. (Note that the coefficients in the output are based on the normalized/standardized data, not the original data.) Multi-class problems are solved using pairwise classification. To obtain proper probability estimates, use the option that fits logistic regression models to the outputs of the support vector machine. In the multi-class case the predicted probabilities will be coupled using Hastie and Tibshirani's pairwise coupling method. Note: for improved speed standardization should be turned off when operating on SparseInstances.
For more information on the SMO algorithm, see
J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola, eds., MIT Press.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation, 13(3), pp 637-649.
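The class description distinguishes normalizing from standardizing the attributes. As a rough illustration of the difference, the sketch below assumes that normalizing means min-max scaling to [0, 1] and that standardizing means shifting to zero mean and dividing by the sample standard deviation. The class `ScalingSketch` and its methods are hypothetical and are not part of Weka; the filters SMO applies internally may handle edge cases differently.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Weka): two ways to rescale one numeric attribute. */
final class ScalingSketch {

    /** Min-max scaling to [0, 1]; a constant column is mapped to zeros (an assumption). */
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    /** Shift to zero mean and divide by the sample standard deviation. */
    static double[] standardize(double[] values) {
        double mean = Arrays.stream(values).average().orElse(0.0);
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        double sd = values.length > 1 ? Math.sqrt(sumSquares / (values.length - 1)) : 0.0;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = sd == 0.0 ? 0.0 : (values[i] - mean) / sd;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] column = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(normalize(column)));   // [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(standardize(column))); // [-1.0, 0.0, 1.0]
    }
}
```

Either rescaling changes the units of every attribute, which is why the coefficients in the classifier output refer to the rescaled data rather than the original values.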
Valid options are:
-C num
The complexity constant C. (default 1)
-E num
The exponent for the polynomial kernel. (default 1)
-G num
Gamma for the RBF kernel. (default 0.01)
-N <0|1|2>
Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-F
Feature-space normalization (only for non-linear polynomial kernels).
-O
Use lower-order terms (only for non-linear polynomial kernels).
-R
Use the RBF kernel. (default poly)
-A num
Sets the size of the kernel cache. Should be a prime number.
(default 1000003)
-T num
Sets the tolerance parameter. (default 1.0e-3)
-P num
Sets the epsilon for round-off error. (default 1.0e-12)
-M
Fit logistic models to SVM outputs.
-V num
Number of runs for cross-validation used to generate data
for logistic models. (default -1, use training data)
-W num
Random number seed for cross-validation. (default 1)
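To make the kernel-related options concrete, here is a minimal sketch of the textbook kernels that -E, -O, -G, and -F refer to. It assumes the common forms (x·y)^E for the polynomial kernel, (x·y + 1)^E when lower-order terms are used, exp(-gamma·||x - y||^2) for the RBF kernel, and K(x,y)/sqrt(K(x,x)·K(y,y)) for feature-space normalization. The class `KernelSketch` is hypothetical, and SMO's internal computation may differ in detail.

```java
/** Hypothetical illustration of the kernels selected by -E, -O, -G and -F; not Weka source. */
final class KernelSketch {

    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    /** Polynomial kernel; lowerOrder adds 1 to the dot product before exponentiation. */
    static double poly(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double base = dot(x, y) + (lowerOrder ? 1.0 : 0.0);
        return Math.pow(base, exponent);
    }

    /** Polynomial kernel rescaled so that every instance has K(x, x) = 1. */
    static double polyNormalized(double[] x, double[] y, double exponent, boolean lowerOrder) {
        double scale = Math.sqrt(poly(x, x, exponent, lowerOrder) * poly(y, y, exponent, lowerOrder));
        return poly(x, y, exponent, lowerOrder) / scale;
    }

    /** RBF kernel: exp(-gamma * squared Euclidean distance). */
    static double rbf(double[] x, double[] y, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            squaredDistance += d * d;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {3.0, 0.5};
        System.out.println(poly(a, b, 1.0, false)); // 4.0  (linear kernel, the default -E 1)
        System.out.println(poly(a, b, 2.0, true));  // 25.0 ((4 + 1)^2 with lower-order terms)
        System.out.println(rbf(a, a, 0.01));        // 1.0  (identical points)
    }
}
```

With the default exponent of 1 the polynomial kernel is linear, which is consistent with -F and -O being documented as applying only to non-linear polynomial kernels.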
Field Summary
static int | FILTER_NONE
static int | FILTER_NORMALIZE | The filter to apply to the training data
static int | FILTER_STANDARDIZE
static Tag[] | TAGS_FILTER
Constructor Summary
SMO()
Method Summary
void | buildClassifier(Instances insts) | Method for building the classifier.
double[] | distributionForInstance(Instance inst) | Estimates class probabilities for the given instance.
boolean | getBuildLogisticModels() | Get the value of buildLogisticModels.
double | getC() | Get the value of C.
int | getCacheSize() | Get the size of the kernel cache.
double | getEpsilon() | Get the value of epsilon.
double | getExponent() | Get the value of exponent.
boolean | getFeatureSpaceNormalization() | Check whether feature space is being normalized.
SelectedTag | getFilterType() | Gets how the training data will be transformed.
double | getGamma() | Get the value of gamma.
boolean | getLowerOrderTerms() | Check whether lower-order terms are being used.
int | getNumFolds() | Get the value of numFolds.
java.lang.String[] | getOptions() | Gets the current settings of the classifier.
int | getRandomSeed() | Get the value of randomSeed.
double | getToleranceParameter() | Get the value of the tolerance parameter.
boolean | getUseRBF() | Check if the RBF kernel is to be used.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] argv) | Main method for testing this class.
int[] | obtainVotes(Instance inst) | Returns an array of votes for the given instance.
double[] | pairwiseCoupling(double[][] n, double[][] r) | Implements pairwise coupling.
void | setBuildLogisticModels(boolean newbuildLogisticModels) | Set the value of buildLogisticModels.
void | setC(double v) | Set the value of C.
void | setCacheSize(int v) | Set the size of the kernel cache.
void | setEpsilon(double v) | Set the value of epsilon.
void | setExponent(double v) | Set the value of exponent.
void | setFeatureSpaceNormalization(boolean v) | Set whether feature space is normalized.
void | setFilterType(SelectedTag newType) | Sets how the training data will be transformed.
void | setGamma(double v) | Set the value of gamma.
void | setLowerOrderTerms(boolean v) | Set whether lower-order terms are to be used.
void | setNumFolds(int newnumFolds) | Set the value of numFolds.
void | setOptions(java.lang.String[] options) | Parses a given list of options.
void | setRandomSeed(int newrandomSeed) | Set the value of randomSeed.
void | setToleranceParameter(double v) | Set the value of the tolerance parameter.
void | setUseRBF(boolean v) | Set if the RBF kernel is to be used.
java.lang.String | toString() | Prints out the classifier.
void | turnChecksOff() | Turns off checks for missing values, etc.
void | turnChecksOn() | Turns on checks for missing values, etc.
FastVector | weights() | Returns the coefficients in sparse format.
Methods inherited from class weka.classifiers.DistributionClassifier
calculateEntropy, calculateLabeledInstanceMargin, calculateMargin, classifyInstance
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
public static final int FILTER_NORMALIZE
public static final int FILTER_STANDARDIZE
public static final int FILTER_NONE
public static final Tag[] TAGS_FILTER
Constructor Detail
public SMO()
Method Detail
public void turnChecksOff()
public void turnChecksOn()
public void buildClassifier(Instances insts) throws java.lang.Exception
buildClassifier in class Classifier
Parameters: insts - the set of training instances
Throws: java.lang.Exception - if the classifier can't be built successfully
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
distributionForInstance in class DistributionClassifier
Parameters: inst - the instance to be classified
Throws: java.lang.Exception - if the distribution could not be computed successfully
public double[] pairwiseCoupling(double[][] n, double[][] r)
Parameters: n - the sum of weights used to train each model
r - the probability estimate from each model
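As a rough guide to what pairwise coupling does with these two matrices, the sketch below follows the iterative scheme Hastie and Tibshirani describe: start from uniform class probabilities, then repeatedly rescale each probability by the ratio of observed to fitted pairwise estimates and renormalize. It assumes that r[i][j] estimates P(class i | class i or j) with r[j][i] = 1 - r[i][j]. The class `CouplingSketch`, the iteration limit, and the stopping tolerance are hypothetical choices; this is not Weka's source.

```java
import java.util.Arrays;

/** Hypothetical sketch of Hastie and Tibshirani's pairwise coupling iteration; not Weka source. */
final class CouplingSketch {

    /**
     * n[i][j]: weight of the training data behind the (i, j) model.
     * r[i][j]: that model's estimate of P(class i | class i or j).
     * Returns coupled class probabilities that sum to 1.
     */
    static double[] couple(double[][] n, double[][] r) {
        int k = r.length;
        double[] p = new double[k];
        Arrays.fill(p, 1.0 / k); // start from the uniform distribution
        for (int iter = 0; iter < 1000; iter++) {
            double maxChange = 0.0;
            for (int i = 0; i < k; i++) {
                double observed = 0.0; // sum over j of n_ij * r_ij
                double fitted = 0.0;   // sum over j of n_ij * mu_ij, mu_ij = p_i / (p_i + p_j)
                for (int j = 0; j < k; j++) {
                    if (j == i) continue;
                    observed += n[i][j] * r[i][j];
                    double pairMass = p[i] + p[j];
                    if (pairMass > 0.0) {
                        fitted += n[i][j] * p[i] / pairMass;
                    }
                }
                if (fitted <= 0.0) continue; // nothing to rescale against
                double old = p[i];
                p[i] *= observed / fitted;
                double total = 0.0;
                for (double v : p) total += v;
                if (total > 0.0) {
                    for (int m = 0; m < k; m++) p[m] /= total; // renormalize to sum to 1
                }
                maxChange = Math.max(maxChange, Math.abs(p[i] - old));
            }
            if (maxChange < 1e-12) break; // assumed stopping rule
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] n = {{0, 10}, {10, 0}};
        double[][] r = {{0, 0.7}, {0.3, 0}};
        // With only two classes the coupled estimate approaches the single pairwise estimate.
        System.out.println(Arrays.toString(couple(n, r)));
    }
}
```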
public int[] obtainVotes(Instance inst) throws java.lang.Exception
Parameters: inst - the instance
Throws: java.lang.Exception - if something goes wrong
public FastVector weights() throws java.lang.Exception
Throws: java.lang.Exception
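Pairwise classification trains one binary model per pair of classes, and the vote array returned by obtainVotes can be read as a count of the pairwise contests each class wins. A minimal sketch of such a tally follows. The `decision` matrix standing in for the binary SVM outputs, the sign convention, and the class `VoteSketch` are all hypothetical.

```java
import java.util.Arrays;

/** Hypothetical tally of pairwise (one-vs-one) votes; not Weka source. */
final class VoteSketch {

    /**
     * decision[i][j] for i < j is the output of the model trained on classes i and j.
     * Assumed convention: a positive output favors class j, anything else favors class i.
     */
    static int[] votes(double[][] decision) {
        int numClasses = decision.length;
        int[] votes = new int[numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = i + 1; j < numClasses; j++) {
                if (decision[i][j] > 0) {
                    votes[j]++;
                } else {
                    votes[i]++;
                }
            }
        }
        return votes;
    }

    /** Index of the class with the most votes; ties go to the lowest index. */
    static int winner(int[] votes) {
        int best = 0;
        for (int c = 1; c < votes.length; c++) {
            if (votes[c] > votes[best]) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] decision = {
            {0.0, -0.8, 0.3},
            {0.0, 0.0, 1.2},
            {0.0, 0.0, 0.0}
        };
        int[] v = votes(decision);
        System.out.println(Arrays.toString(v)); // [1, 0, 2]
        System.out.println(winner(v));          // 2
    }
}
```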
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-C num
The complexity constant C. (default 1)
-E num
The exponent for the polynomial kernel. (default 1)
-G num
Gamma for the RBF kernel. (default 0.01)
-N <0|1|2>
Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
-F
Feature-space normalization (only for non-linear polynomial kernels).
-O
Use lower-order terms (only for non-linear polynomial kernels).
-R
Use RBF kernel (default poly).
-A num
Sets the size of the kernel cache. Should be a prime number. (default 1000003)
-T num
Sets the tolerance parameter. (default 1.0e-3)
-P num
Sets the epsilon for round-off error. (default 1.0e-12)
-M
Fit logistic models to SVM outputs.
-V num
Number of runs for cross-validation used to generate data
for logistic models. (default -1, use training data)
-W num
Random number seed for cross-validation. (default 1)
setOptions in interface OptionHandler
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
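The option list is passed to setOptions as an array of strings, and getOptions returns the current settings in the same form. As a small Weka-free sketch of that layout, the snippet below assumes one array element per token (flag or value) and looks flags up with two hypothetical helpers, `optionValue` and `hasFlag`. The fallbacks are the defaults documented above. Nothing here calls Weka; in real code the array would be handed to setOptions.

```java
/** Hypothetical illustration of an SMO option array; the helpers are not part of Weka. */
final class OptionsSketch {

    /** Returns the token that follows "-<flag>", or the fallback when the flag is absent. */
    static String optionValue(String[] options, char flag, String fallback) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals("-" + flag)) {
                return options[i + 1];
            }
        }
        return fallback;
    }

    /** Returns true when a value-less switch such as -R or -M is present. */
    static boolean hasFlag(String[] options, char flag) {
        for (String option : options) {
            if (option.equals("-" + flag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // RBF kernel with gamma 0.05, C = 2, standardized attributes, logistic models fitted.
        String[] options = {"-C", "2.0", "-R", "-G", "0.05", "-N", "1", "-M"};

        double c = Double.parseDouble(optionValue(options, 'C', "1"));        // 2.0
        double exponent = Double.parseDouble(optionValue(options, 'E', "1")); // 1.0 (default)
        int filterType = Integer.parseInt(optionValue(options, 'N', "0"));    // 1 = standardize
        boolean useRbf = hasFlag(options, 'R');                               // true

        System.out.println(c + " " + exponent + " " + filterType + " " + useRbf);
    }
}
```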
public double getExponent()
public void setExponent(double v)
Parameters: v - Value to assign to exponent.
public double getGamma()
public void setGamma(double v)
Parameters: v - Value to assign to gamma.
public double getC()
public void setC(double v)
Parameters: v - Value to assign to C.
public double getToleranceParameter()
public void setToleranceParameter(double v)
Parameters: v - Value to assign to the tolerance parameter.
public double getEpsilon()
public void setEpsilon(double v)
Parameters: v - Value to assign to epsilon.
public int getCacheSize()
public void setCacheSize(int v)
Parameters: v - Size of the kernel cache.
public SelectedTag getFilterType()
public void setFilterType(SelectedTag newType)
Parameters: newType - the new filtering mode
public boolean getUseRBF()
public void setUseRBF(boolean v)
Parameters: v - true if the RBF kernel is to be used
public boolean getFeatureSpaceNormalization() throws java.lang.Exception
Throws: java.lang.Exception
public void setFeatureSpaceNormalization(boolean v) throws java.lang.Exception
Parameters: v - true if feature space is to be normalized.
Throws: java.lang.Exception
public boolean getLowerOrderTerms()
public void setLowerOrderTerms(boolean v)
Parameters: v - Value to assign to lowerOrder.
public boolean getBuildLogisticModels()
public void setBuildLogisticModels(boolean newbuildLogisticModels)
Parameters: newbuildLogisticModels - Value to assign to buildLogisticModels.
public int getNumFolds()
public void setNumFolds(int newnumFolds)
Parameters: newnumFolds - Value to assign to numFolds.
public int getRandomSeed()
public void setRandomSeed(int newrandomSeed)
Parameters: newrandomSeed - Value to assign to randomSeed.
public java.lang.String toString()
public static void main(java.lang.String[] argv)