Class Summary |
Document |
Document is an abstract class that provides for tokenization
of a document with stop-word removal and an iterator-like interface
similar to StringTokenizer. |
DocumentIterator |
An object for iterating over a set of documents in a directory. |
DocumentReference |
A simple data structure for storing a reference to a document file
that includes information on the length of its document vector. |
Feedback |
Gets and stores information about relevance feedback from the user and computes
an updated query based on original query and retrieved documents that are
rated relevant and irrelevant. |
FileDocument |
A Document stored as a file. |
HashMapPosVector |
A data structure for a "positional" term vector for a document stored
as a HashMap that maps tokens to ArrayLists of Integers,
which are the positions of the token in the document. |
HashMapVector |
A data structure for a term vector for a document stored
as a HashMap that maps tokens to Weights that store the
weight of that token in the document. |
HTMLFileDocument |
An HTML file document where HTML commands are removed
from the token stream. |
InvertedIndex |
An inverted index for vector-space information retrieval. |
InvertedPosIndex |
An inverted index for vector-space information retrieval. |
Retrieval |
A lightweight object for storing information about a retrieved Document. |
RetrievalPosInfo |
A lightweight object for storing information about a retrieved Document
for a positional inverted index that includes vector-space and proximity |
TextFileDocument |
A normal ASCII text file Document. |
TextStringDocument |
A simple document represented by a String. |
TokenInfo |
A lightweight object for storing information about a token (a.k.a. word, term)
in an inverted index. |
TokenOccurrence |
A lightweight object for storing information about an occurrence of a token (a.k.a. word, term)
in a Document. |
TokenPositionInfo |
A lightweight object for storing information about positions of a token (a.k.a. word, term)
in some document. |
TokenPosOccurrence |
A lightweight object for storing information about an occurrence of a token (a.k.a. word, term)
in a Document, including an array of the positions at which it occurs. |
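
The HashMapVector and HashMapPosVector entries above describe two map-based term vectors: one from tokens to weights, one from tokens to lists of positions. The following is a minimal sketch of those two shapes, not the package's own code. The class name TermVectorSketch and both method names are hypothetical, a plain Double (the raw occurrence count) stands in for the Weight class, and positions are assumed to be zero-based.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not the classes listed in the summary.
class TermVectorSketch {

    // Plain term vector: token -> weight.
    // Assumption: the weight is simply the number of times the token occurs.
    static Map<String, Double> termVector(List<String> tokens) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : tokens) {
            // Add 1.0 to the existing count, or start at 1.0 if the token is new.
            vector.merge(token, 1.0, Double::sum);
        }
        return vector;
    }

    // Positional term vector: token -> positions at which the token occurs.
    // Assumption: positions are zero-based indexes into the token stream.
    static Map<String, List<Integer>> positionalVector(List<String> tokens) {
        Map<String, List<Integer>> vector = new HashMap<>();
        for (int position = 0; position < tokens.size(); position++) {
            vector.computeIfAbsent(tokens.get(position), k -> new ArrayList<>())
                  .add(position);
        }
        return vector;
    }

    public static void main(String[] args) {
        // A token stream as it might look after tokenization and stop-word removal.
        List<String> tokens = Arrays.asList("index", "term", "index");
        System.out.println(termVector(tokens));
        System.out.println(positionalVector(tokens));
    }
}
```

In this sketch the positional map carries strictly more information than the weighted one, since a token's count is the length of its position list. That is one reason proximity-aware retrieval might work from the positional form.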