ir.vsr
Class TextStringDocument

java.lang.Object
  |
  +--ir.vsr.Document
        |
        +--ir.vsr.TextStringDocument

public class TextStringDocument
extends Document

A simple document represented by a String.


Field Summary
protected  java.util.StringTokenizer tokenizer
          The tokenizer for this document.
static java.lang.String tokenizerDelim
          Delimiter string passed to StringTokenizer so that only alphabetic strings are returned as tokens.
 
Fields inherited from class ir.vsr.Document
nextToken, numStopWords, numTokens, stem, stemmer, stopWords, stopWordsFile
 
Constructor Summary
TextStringDocument(java.lang.String string, boolean stem)
          Creates a simple Document for the given string.
 
Method Summary
protected  java.lang.String getNextCandidateToken()
          Gets the next candidate token from this string.
static void main(java.lang.String[] args)
          For testing, prints the bag-of-words vector for the given string.
 
Methods inherited from class ir.vsr.Document
hashMapPosVector, hashMapVector, hasMoreTokens, loadStopWords, nextToken, numberOfTokens, positionOrderedTokenVector, prepareNextToken, printVector
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

tokenizerDelim

public static final java.lang.String tokenizerDelim
Delimiter string passed to StringTokenizer so that only alphabetic strings are returned as tokens.
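
The value of tokenizerDelim is not listed on this page. As a minimal sketch of the idea, assuming a delimiter set made up of whitespace, digits, and common punctuation, java.util.StringTokenizer can be made to return only alphabetic runs. The class name AlphabeticTokenizerSketch and the constant ASSUMED_DELIM are hypothetical and are not part of ir.vsr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration; not the ir.vsr implementation.
class AlphabeticTokenizerSketch {

    // Assumed delimiter set: the real tokenizerDelim value is not shown in this documentation.
    static final String ASSUMED_DELIM =
            " \t\n\r\f0123456789.,;:!?'\"()[]{}<>-_/\\@#$%^&*+=|~`";

    // Splits the text on every character in ASSUMED_DELIM and collects the remaining runs.
    static List<String> tokenize(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text, ASSUMED_DELIM);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Digits and punctuation act as delimiters, so "42" never appears as a token.
        System.out.println(tokenize("Hello, world! 42 times."));
    }
}
```

Because every non-alphabetic character in the assumed set is treated as a delimiter, a string such as "Hello, world! 42 times." yields the tokens Hello, world, and times. Characters outside the assumed set would still appear inside tokens.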

tokenizer

protected java.util.StringTokenizer tokenizer
The tokenizer for this document.

Constructor Detail

TextStringDocument

public TextStringDocument(java.lang.String string,
                          boolean stem)
Creates a simple Document for the given string.

Method Detail

getNextCandidateToken

protected java.lang.String getNextCandidateToken()
Gets the next candidate token from this string.
Overrides:
getNextCandidateToken in class Document

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
For testing, prints the bag-of-words vector for the given string.
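
To illustrate what a bag-of-words vector for a string looks like, the following is a rough sketch using only java.util classes. It is not the ir.vsr code: the class BagOfWordsSketch, its bagOfWords method, and the delimiter set are assumptions, and the stop-word removal and stemming suggested by the inherited Document fields are left out.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical illustration; not the ir.vsr implementation.
class BagOfWordsSketch {

    // Counts how often each token occurs in the text.
    // Assumption: tokens are separated by whitespace and basic punctuation.
    static Map<String, Integer> bagOfWords(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text, " \t\n\r\f.,;:!?");
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys give stable output
        while (tokenizer.hasMoreTokens()) {
            counts.merge(tokenizer.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Use the command-line arguments as the input string, or a default sample.
        String input = args.length > 0 ? String.join(" ", args) : "the cat saw the other cat";
        bagOfWords(input).forEach((token, count) -> System.out.println(token + ": " + count));
    }
}
```

For the sample string "the cat saw the other cat", the map holds cat=2, other=1, saw=1, the=2. Word order is discarded, which is the defining property of a bag-of-words representation.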