Returns the average closest positional distance between an occurrence of the current query token and
an occurrence of a specified previous query token.
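
One possible reading of this measure, sketched under assumptions: for every position of the current token, take the distance to the nearest position of the previous token, then average those nearest distances. The class name ClosestDistanceSketch and the method averageClosestDistance are hypothetical and are not taken from the package; the actual method may weight or normalize differently.

```java
import java.util.List;

// Hypothetical helper, not part of the documented package.
class ClosestDistanceSketch {
    // For each position of the current token, find the nearest position of the
    // previous token, then return the mean of those nearest distances.
    static double averageClosestDistance(List<Integer> current, List<Integer> previous) {
        if (current.isEmpty() || previous.isEmpty()) {
            throw new IllegalArgumentException("both position lists must be non-empty");
        }
        double total = 0.0;
        for (int c : current) {
            int best = Integer.MAX_VALUE;
            for (int p : previous) {
                best = Math.min(best, Math.abs(c - p));
            }
            total += best;
        }
        return total / current.size();
    }
}
```

With current positions 2 and 10 and previous positions 3 and 20, the nearest distances are 1 and 7, so the sketch returns 4.0.
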
Document is an abstract class that provides for tokenization
of a document with stop-word removal and an iterator-like interface
similar to StringTokenizer.
Gets and stores information about relevance feedback from the user and computes
an updated query based on original query and retrieved documents that are
rated relevant and irrelevant.
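
The description does not say which update formula is used. As an illustration only, the sketch below applies a Rocchio-style update, a common choice for this kind of feedback: the original query is scaled by alpha, the average of the relevant documents is added with weight beta, and the average of the irrelevant documents is subtracted with weight gamma. The class RocchioSketch, its method names, the use of plain Double weights, and the decision to drop non-positive weights are all assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Rocchio-style update; the documented class may use another formula.
class RocchioSketch {
    static Map<String, Double> updatedQuery(Map<String, Double> query,
                                            List<Map<String, Double>> relevant,
                                            List<Map<String, Double>> irrelevant,
                                            double alpha, double beta, double gamma) {
        Map<String, Double> result = new HashMap<>();
        addScaled(result, query, alpha);
        // Add the average of the documents rated relevant.
        for (Map<String, Double> doc : relevant) {
            addScaled(result, doc, beta / relevant.size());
        }
        // Subtract the average of the documents rated irrelevant.
        for (Map<String, Double> doc : irrelevant) {
            addScaled(result, doc, -gamma / irrelevant.size());
        }
        // Assumption: tokens whose weight is not positive are dropped from the query.
        result.values().removeIf(w -> w <= 0.0);
        return result;
    }

    private static void addScaled(Map<String, Double> target,
                                  Map<String, Double> source, double factor) {
        for (Map.Entry<String, Double> e : source.entrySet()) {
            target.merge(e.getKey(), factor * e.getValue(), Double::sum);
        }
    }
}
```

Averaging over each rated set keeps the influence of the feedback independent of how many documents the user rated; summing instead of averaging is an equally plausible variant.
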
Finalizes the proximity score for a RetrievalPosInfo by averaging, over all possible pairs of
query tokens, the average closest distance score for each pair.
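
A minimal sketch of this averaging step, assuming the per-pair scores have already been summed and that "all possible pairs" means the n(n-1)/2 unordered pairs of n query tokens. The class ProximityFinalizeSketch and its method are hypothetical; how the real code treats pairs in which one token is missing from the document is not stated in the description and is not modeled here.

```java
// Hypothetical helper, not part of the documented package.
class ProximityFinalizeSketch {
    // Divides the accumulated per-pair scores by the number of unordered
    // query-token pairs, n * (n - 1) / 2.
    static double finalizeProximity(double pairScoreSum, int queryTokenCount) {
        long pairs = (long) queryTokenCount * (queryTokenCount - 1) / 2;
        if (pairs <= 0) {
            return 0.0; // Assumption: fewer than two query tokens means no pairs to average.
        }
        return pairScoreSum / pairs;
    }
}
```

For three query tokens there are three pairs, so an accumulated sum of 9.0 yields a final score of 3.0.
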
A data structure for a "positional" term vector for a document stored
as a HashMap that maps tokens to ArrayLists of Integers
which are the positions of the token in the document.
Returns a HashMap version of the term vector with positional info for this
document, where each token is a key whose value is an ArrayList of Integers
giving the positions (not counting stopwords) at which the token occurs in the document.
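
The structure can be illustrated with a small sketch that builds such a map from raw text. Everything here is an assumption made for illustration: the class PositionalVectorSketch, the tiny stop-word list, and the tokenization by splitting on non-letters. Positions are counted only over the tokens that survive stop-word removal, matching the "not counting stopwords" wording.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

// Hypothetical builder, not part of the documented package.
class PositionalVectorSketch {
    // Illustrative stop-word list only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "of", "and"));

    static HashMap<String, ArrayList<Integer>> positionalVector(String text) {
        HashMap<String, ArrayList<Integer>> vector = new HashMap<>();
        int position = 0; // Advances only for tokens that are kept.
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            vector.computeIfAbsent(token, k -> new ArrayList<>()).add(position);
            position++;
        }
        return vector;
    }
}
```

For the text "The cat and the dog chased the cat", the kept tokens are cat, dog, chased, cat, so "cat" maps to positions 0 and 3, "dog" to 1, and "chased" to 2.
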
A data structure for a term vector for a document stored
as a HashMap that maps tokens to Weights that store the
weight of that token in the document.
Returns a hashmap version of the term-vector (bag of words) for this
document, where each token is a key whose value is the number of times
it occurs in the document as stored in a Weight.
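
A rough sketch of the bag-of-words counting follows. The nested Weight class is a simplified stand-in written for this example, not the package's own Weight, and BagOfWordsSketch and termVector are hypothetical names. The input is assumed to be a token sequence from which stop words have already been removed.

```java
import java.util.HashMap;

// Hypothetical builder, not part of the documented package.
class BagOfWordsSketch {
    // Stand-in for the package's Weight: a mutable numeric holder.
    static class Weight {
        private double value = 0.0;

        void increment() {
            value += 1.0;
        }

        double getValue() {
            return value;
        }
    }

    // Counts how often each token occurs and stores the count in a Weight.
    static HashMap<String, Weight> termVector(Iterable<String> tokens) {
        HashMap<String, Weight> vector = new HashMap<>();
        for (String token : tokens) {
            vector.computeIfAbsent(token, k -> new Weight()).increment();
        }
        return vector;
    }
}
```

Storing the count in a mutable holder rather than an Integer lets the count be incremented in place without replacing the map entry.
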
A lightweight object for storing information about an occurrence of a token (a.k.a. word, term)
in a Document, including an array of the positions at which it occurs.
Updates the proximity score of the RetrievalPosInfo of this retrieved document based on
how close the current query token appears to each of the previous query tokens found in this
document.