A B C D F G H I L M N O P Q R S T U V W

A

add(HashMapVector) - Method in class ir.vsr.HashMapVector
Destructively add the given vector to the current vector
add(HashMapVector) - Method in class ir.vsr.HashMapPosVector
Destructively add the given vector to the current vector
addBad(DocumentReference) - Method in class ir.vsr.Feedback
Add a document to the list of those deemed irrelevant
addGood(DocumentReference) - Method in class ir.vsr.Feedback
Add a document to the list of those deemed relevant
addPosition(String, int) - Method in class ir.vsr.HashMapPosVector
Add a new position occurrence of a token to the vector
ALPHA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter
arrayListToIntArray(ArrayList) - Static method in class ir.vsr.TokenPosOccurrence
Convert an ArrayList of Integers into an array of ints
averageClosestDistance(int[], int[]) - Static method in class ir.vsr.InvertedPosIndex
Returns the average closest positional distance between an occurrence of the current query token and an occurrence of a specified previous query token.
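
The averageClosestDistance entry above can be illustrated with a minimal sketch. It assumes "closest distance" means the smallest absolute difference between a position of the current token and any position of the previous token, averaged over the current token's positions. The class name ClosestDistanceSketch is hypothetical, and the sketch does not model MAX_DISTANCE or WRONG_ORDER_PENALTY_FACTOR, so the actual ir.vsr.InvertedPosIndex method may compute something different.

```java
class ClosestDistanceSketch {
    /**
     * For each position of the current token, find the distance to the nearest
     * position of the previous token, then average those distances.
     * (Assumed interpretation; not taken from the ir.vsr source.)
     */
    static double averageClosestDistance(int[] current, int[] previous) {
        if (current.length == 0 || previous.length == 0) {
            throw new IllegalArgumentException("both position arrays must be non-empty");
        }
        long total = 0;
        for (int c : current) {
            int best = Integer.MAX_VALUE;
            for (int p : previous) {
                best = Math.min(best, Math.abs(c - p));
            }
            total += best;
        }
        return (double) total / current.length;
    }

    public static void main(String[] args) {
        int[] current = {3, 10};
        int[] previous = {1, 12};
        // Nearest distances are 2 (3 to 1) and 2 (10 to 12), so this prints 2.0
        System.out.println(averageClosestDistance(current, previous));
    }
}
```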

B

badDocRefs - Variable in class ir.vsr.Feedback
The list of DocumentReference's that were rated irrelevant
BETA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter

C

compareTo(Object) - Method in class ir.vsr.Retrieval
Compares this Retrieval to another for sorting from best to worst.
compareTo(Object) - Method in class ir.vsr.TokenPositionInfo
Compares this TokenPositionInfo to another for sorting from tokens whose first appearance in the document is earlier to those that first appear later.
computeIDFandDocumentLengths() - Method in class ir.vsr.InvertedIndex
Compute the IDF factor for every token in the index and the length of the document vector for every document referenced in the index.
copy() - Method in class ir.vsr.HashMapVector
Produce a copy of this HashMapVector with a new HashMap and new Weight's
copy() - Method in class ir.vsr.HashMapPosVector
Produce a copy of this HashMapVector with a new HashMap and new Weight's
cosineDistanceTo(HashMapVector) - Method in class ir.vsr.HashMapVector
 
count - Variable in class ir.vsr.TokenOccurrence
The number of times it occurs in the Document
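
As an illustration of what computeIDFandDocumentLengths describes, here is a minimal sketch of the textbook formulation: IDF as log(N / df) and document length as the Euclidean norm of the TF-IDF weights. The class IdfSketch, the use of plain Maps in place of TokenInfo and DocumentReference, and the choice of natural logarithm are all assumptions; the index does not say which variant ir.vsr.InvertedIndex uses.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdfSketch {
    /** IDF per token: log(N / df), where df is the number of documents containing the token. */
    static Map<String, Double> idf(List<Map<String, Integer>> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (Map<String, Integer> doc : docs) {
            for (String token : doc.keySet()) {
                docFreq.merge(token, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        int n = docs.size();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            idf.put(e.getKey(), Math.log((double) n / e.getValue()));
        }
        return idf;
    }

    /** Document vector length: square root of the sum of squared (count * idf) weights. */
    static double length(Map<String, Integer> doc, Map<String, Double> idf) {
        double sumSquares = 0.0;
        for (Map.Entry<String, Integer> e : doc.entrySet()) {
            double weight = e.getValue() * idf.get(e.getKey());
            sumSquares += weight * weight;
        }
        return Math.sqrt(sumSquares);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = new HashMap<>();
        d1.put("a", 1);
        d1.put("b", 2);
        Map<String, Integer> d2 = new HashMap<>();
        d2.put("a", 3);
        List<Map<String, Integer>> docs = List.of(d1, d2);
        Map<String, Double> idf = idf(docs);
        System.out.println("idf(b) = " + idf.get("b"));
        System.out.println("length(d1) = " + length(d1, idf));
    }
}
```

A token that appears in every document gets an IDF of zero under this formulation, so it contributes nothing to the document length.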

D

dirFile - Variable in class ir.vsr.InvertedIndex
The directory from which the indexed documents come.
docRef - Variable in class ir.vsr.Retrieval
A reference to the Document being retrieved
docRef - Variable in class ir.vsr.TokenOccurrence
A reference to the Document where it occurs
docRefs - Variable in class ir.vsr.InvertedIndex
A list of all indexed documents.
docType - Variable in class ir.vsr.InvertedIndex
The type of Documents (text, HTML).
docType - Variable in class ir.vsr.DocumentIterator
The type of documents to be created
Document - class ir.vsr.Document.
Document is an abstract class that provides for tokenization of a document with stop-word removal and an iterator-like interface similar to StringTokenizer.
Document(boolean) - Constructor for class ir.vsr.Document
Creates a new Document making sure that the stopwords are loaded, indexed, and ready for use.
DocumentIterator - class ir.vsr.DocumentIterator.
An object for iterating over a set of documents in a directory.
DocumentIterator(File) - Constructor for class ir.vsr.DocumentIterator
Create an iterator for TextFileDocuments
DocumentIterator(File, short, boolean) - Constructor for class ir.vsr.DocumentIterator
Create an iterator with these attributes
DocumentIterator(File, short, boolean, FilenameFilter) - Constructor for class ir.vsr.DocumentIterator
Create an iterator with these attributes
DocumentReference - class ir.vsr.DocumentReference.
A simple data structure for storing a reference to a document file that includes information on the length of its document vector.
DocumentReference(FileDocument) - Constructor for class ir.vsr.DocumentReference
Create a reference to this document, initializing its length to 0
DocumentReference(File, double) - Constructor for class ir.vsr.DocumentReference
 

F

feedback - Variable in class ir.vsr.InvertedIndex
Whether relevance feedback using the Ide_regular algorithm is used
Feedback - class ir.vsr.Feedback.
Gets and stores information about relevance feedback from the user and computes an updated query based on original query and retrieved documents that are rated relevant and irrelevant.
Feedback(HashMapVector, Retrieval[], InvertedIndex) - Constructor for class ir.vsr.Feedback
Create a feedback object for this query with initial retrievals to be rated
file - Variable in class ir.vsr.FileDocument
The name of the file
file - Variable in class ir.vsr.DocumentReference
The file where the referenced document is stored.
FileDocument - class ir.vsr.FileDocument.
A Document stored as a file.
FileDocument(File, boolean) - Constructor for class ir.vsr.FileDocument
Creates a FileDocument and initializes its name and reader.
files - Variable in class ir.vsr.DocumentIterator
An array of files in the directory
finalizeProximityScore(RetrievalPosInfo, int) - Static method in class ir.vsr.InvertedPosIndex
Finalize the proximity score for a RetrievalPosInfo by averaging over all possible pairs of query tokens, the average closest distance score for each pair
foundPositions - Variable in class ir.vsr.RetrievalPosInfo
A list of position vectors (int[]) for the positions in which each previously found query token was found in the document
foundTokens - Variable in class ir.vsr.RetrievalPosInfo
A list of previously found query tokens in order

G

GAMMA - Static variable in class ir.vsr.Feedback
A Rocchio/Ide algorithm parameter
getDocument(short, boolean) - Method in class ir.vsr.DocumentReference
Get the full Document for this Document reference by recreating it with the given docType and stemming
getFeedback(int) - Method in class ir.vsr.Feedback
Prompt the user for feedback on this numbered retrieval
getNextCandidateToken() - Method in class ir.vsr.Document
Return the next possible token in the document.
getNextCandidateToken() - Method in class ir.vsr.TextStringDocument
Get the next token from this string
getNextCandidateToken() - Method in class ir.vsr.HTMLFileDocument
Return the next non-HTML-command token in the document, or null if none left.
getNextCandidateToken() - Method in class ir.vsr.TextFileDocument
Return the next purely alpha-character token in the document, or null if none left.
goodDocRefs - Variable in class ir.vsr.Feedback
The list of DocumentReference's that were rated relevant

H

hashMap - Variable in class ir.vsr.HashMapVector
The HashMap that stores the mapping of tokens to Weight's
HashMapPosVector - class ir.vsr.HashMapPosVector.
A data structure for a "positional" term vector for a document stored as a HashMap that maps tokens to ArrayList's of Integer's which are the positions of the token in the document.
hashMapPosVector() - Method in class ir.vsr.Document
Returns a hashmap version of the term-vector with positional info for this document, where each token is a key whose value is an ArrayList of Integers of the token positions (not counting stopwords) of this word in the document.
HashMapPosVector() - Constructor for class ir.vsr.HashMapPosVector
 
HashMapVector - class ir.vsr.HashMapVector.
A data structure for a term vector for a document stored as a HashMap that maps tokens to Weight's that store the weight of that token in the document.
hashMapVector() - Method in class ir.vsr.Document
Returns a hashmap version of the term-vector (bag of words) for this document, where each token is a key whose value is the number of times it occurs in the document as stored in a Weight.
HashMapVector() - Constructor for class ir.vsr.HashMapVector
 
hasMoreDocuments() - Method in class ir.vsr.DocumentIterator
Returns true iff there are more documents in this directory
hasMoreTokens() - Method in class ir.vsr.Document
Returns true iff the document contains more tokens
haveFeedback(int) - Method in class ir.vsr.Feedback
Has the user already provided feedback on this numbered retrieval?
HTMLFileDocument - class ir.vsr.HTMLFileDocument.
An HTML file document where HTML commands are removed from the token stream.
HTMLFileDocument(File, boolean) - Constructor for class ir.vsr.HTMLFileDocument
Create a new HTML document for the given file.
HTMLFileDocument(String, boolean) - Constructor for class ir.vsr.HTMLFileDocument
Create a new HTML document for the given file name.
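
The HashMapVector entries in this index (increment, add, multiply, maxWeight) can be pictured with a minimal sketch of a term vector backed by a HashMap. The class TermVectorSketch is hypothetical and stores plain Double values instead of the Weight objects the real class uses; it only shows the "destructive" (in-place) style that the descriptions mention.

```java
import java.util.HashMap;
import java.util.Map;

class TermVectorSketch {
    // Token -> weight; the real HashMapVector maps tokens to Weight objects (assumption: Double here).
    final Map<String, Double> weights = new HashMap<>();

    /** Increment the weight for the given token by 1. */
    void increment(String token) {
        increment(token, 1.0);
    }

    /** Increment the weight for the given token by the given amount. */
    void increment(String token, double amount) {
        weights.merge(token, amount, Double::sum);
    }

    /** Destructively add another vector to this one. */
    void add(TermVectorSketch other) {
        for (Map.Entry<String, Double> e : other.weights.entrySet()) {
            increment(e.getKey(), e.getValue());
        }
    }

    /** Destructively multiply every weight by a constant. */
    void multiply(double factor) {
        weights.replaceAll((token, weight) -> weight * factor);
    }

    /** Largest weight in the vector; negative infinity if the vector is empty. */
    double maxWeight() {
        double max = Double.NEGATIVE_INFINITY;
        for (double w : weights.values()) {
            max = Math.max(max, w);
        }
        return max;
    }

    public static void main(String[] args) {
        TermVectorSketch v = new TermVectorSketch();
        v.increment("a");
        v.increment("a");
        v.increment("b", 0.5);
        TermVectorSketch u = new TermVectorSketch();
        u.increment("b", 2.0);
        v.add(u);        // v is modified in place; u is unchanged
        v.multiply(2.0);
        System.out.println("a=" + v.weights.get("a") + " b=" + v.weights.get("b"));
    }
}
```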

I

idf - Variable in class ir.vsr.TokenInfo
The IDF (inverse document frequency) factor for this token, which indicates how much to weight an occurrence.
incorporateToken(String, double, HashMap) - Method in class ir.vsr.InvertedIndex
Retrieve the documents indexed by this token in the inverted index, add it to the retrievalHash if needed, and update its running total score.
incorporateToken(String, int, HashMap) - Method in class ir.vsr.InvertedPosIndex
Retrieve the documents indexed by this token in the inverted index, add it to the retrievalHash if needed, and update its running scores.
increment(String) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by 1.
increment(String, double) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by the given amount.
increment(String, int) - Method in class ir.vsr.HashMapVector
Increment the weight for the given token in the vector by the given int
increment(String, int) - Method in class ir.vsr.HashMapPosVector
Increment the count for the given token in the vector by the given amount.
indexDocuments() - Method in class ir.vsr.InvertedIndex
Index the documents in dirFile.
indexDocuments() - Method in class ir.vsr.InvertedPosIndex
Index the documents in dirFile.
indexDocuments(Vector) - Method in class ir.vsr.InvertedIndex
Index the training files in dirVector.
indexToken(String, ArrayList, DocumentReference) - Method in class ir.vsr.InvertedPosIndex
Add a token occurrence to the index.
indexToken(String, int, DocumentReference) - Method in class ir.vsr.InvertedIndex
Add a token occurrence to the index.
invertedIndex - Variable in class ir.vsr.Feedback
The current InvertedIndex
InvertedIndex - class ir.vsr.InvertedIndex.
An inverted index for vector-space information retrieval.
InvertedIndex(File, short, boolean, boolean) - Constructor for class ir.vsr.InvertedIndex
Create an inverted index of the documents in a directory.
InvertedIndex(Vector, short, boolean, boolean) - Constructor for class ir.vsr.InvertedIndex
Create an inverted index of the documents in a directory.
InvertedPosIndex - class ir.vsr.InvertedPosIndex.
An inverted index for vector-space information retrieval.
InvertedPosIndex(File, short, boolean) - Constructor for class ir.vsr.InvertedPosIndex
 
ir.vsr - package ir.vsr
 
isEmpty() - Method in class ir.vsr.Feedback
Has the user rated any documents yet?
iterator() - Method in class ir.vsr.HashMapVector
Returns an iterator over the MapEntries in the hashMap

L

length - Variable in class ir.vsr.DocumentReference
The length of the corresponding Document vector.
loadStopWords() - Static method in class ir.vsr.Document
Load the stopwords from file to the hashtable where they are indexed.

M

main(String[]) - Static method in class ir.vsr.TextStringDocument
For testing, print the bag-of-words vector for the given string
main(String[]) - Static method in class ir.vsr.HTMLFileDocument
For testing, print the bag-of-words vector for a given HTML file
main(String[]) - Static method in class ir.vsr.TextFileDocument
For testing, print the bag-of-words vector for a given file
main(String[]) - Static method in class ir.vsr.InvertedIndex
Index a directory of files and then interactively accept retrieval queries.
main(String[]) - Static method in class ir.vsr.InvertedPosIndex
Index a directory of files and then interactively accept retrieval queries.
main(String[]) - Static method in class ir.vsr.DocumentIterator
Test by printing the bag-of-words for each file in the given directory
MAX_DISTANCE - Static variable in class ir.vsr.InvertedPosIndex
The maximum measurable distance that can separate two query words.
MAX_RETRIEVALS - Static variable in class ir.vsr.InvertedIndex
The maximum number of retrieved documents for a query to present to the user at a time
maxWeight() - Method in class ir.vsr.HashMapVector
Returns the maximum weight of any token in the vector.
multiply(double) - Method in class ir.vsr.HashMapVector
Destructively multiply the vector by a constant
multiply(int) - Method in class ir.vsr.HashMapPosVector
Destructively multiply the vector by a constant

N

newQuery() - Method in class ir.vsr.Feedback
Use the Ide_regular algorithm to compute a new revised query.
nextDocument() - Method in class ir.vsr.DocumentIterator
Get the next document
nextToken - Variable in class ir.vsr.Document
The next token in the document
nextToken() - Method in class ir.vsr.Document
Returns the next token in the document or null if there are none
numberOfTokens() - Method in class ir.vsr.Document
Returns the total number of tokens in the document or -1 if there are still more tokens to be read and the total count is not yet available.
numStopWords - Static variable in class ir.vsr.Document
The number of stopwords in this file
numTokens - Variable in class ir.vsr.Document
The number of tokens currently read from document
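
The newQuery entry above names the Ide_regular algorithm. In its usual textbook form the revised query is alpha * q + beta * (sum of relevant document vectors) - gamma * (sum of irrelevant document vectors), with no division by the number of rated documents. The sketch below follows that form. The class IdeRegularSketch, the plain Map vectors, and the parameter values of 1.0 are assumptions; the index does not give the actual values of ALPHA, BETA and GAMMA or say whether ir.vsr.Feedback drops negative weights.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdeRegularSketch {
    // Hypothetical parameter values; the real constants in ir.vsr.Feedback are not shown in the index.
    static final double ALPHA = 1.0;
    static final double BETA = 1.0;
    static final double GAMMA = 1.0;

    /** Add factor * v to target in place. */
    static void addScaled(Map<String, Double> target, Map<String, Double> v, double factor) {
        for (Map.Entry<String, Double> e : v.entrySet()) {
            target.merge(e.getKey(), factor * e.getValue(), Double::sum);
        }
    }

    /** Ide regular update: alpha*q + beta*sum(good) - gamma*sum(bad). */
    static Map<String, Double> newQuery(Map<String, Double> query,
                                        List<Map<String, Double>> good,
                                        List<Map<String, Double>> bad) {
        Map<String, Double> revised = new HashMap<>();
        addScaled(revised, query, ALPHA);
        for (Map<String, Double> doc : good) {
            addScaled(revised, doc, BETA);
        }
        for (Map<String, Double> doc : bad) {
            addScaled(revised, doc, -GAMMA);
        }
        return revised; // negative weights are kept in this sketch
    }

    public static void main(String[] args) {
        Map<String, Double> query = new HashMap<>();
        query.put("cat", 1.0);
        Map<String, Double> relevant = new HashMap<>();
        relevant.put("cat", 2.0);
        relevant.put("dog", 1.0);
        Map<String, Double> irrelevant = new HashMap<>();
        irrelevant.put("dog", 3.0);
        Map<String, Double> revised = newQuery(query, List.of(relevant), List.of(irrelevant));
        System.out.println("cat=" + revised.get("cat") + " dog=" + revised.get("dog"));
    }
}
```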

O

occList - Variable in class ir.vsr.TokenInfo
A list of TokenOccurrences giving documents where this token occurs

P

position - Variable in class ir.vsr.DocumentIterator
The current position of the iterator in this array
positionOrderedTokenVector() - Method in class ir.vsr.Document
Returns an array of TokenPositionInfo's for each token in the Document ordered by their first appearance in the Document.
positions - Variable in class ir.vsr.TokenPosOccurrence
The positions where it occurs in the Document
positions - Variable in class ir.vsr.TokenPositionInfo
A list of positions (Integers) where the token occurs
prepareNextToken() - Method in class ir.vsr.Document
The nextToken slot is always precomputed and stored by this method.
presentRetrievals(HashMapVector, Retrieval[]) - Method in class ir.vsr.InvertedIndex
Print out a ranked set of retrievals.
print() - Method in class ir.vsr.HashMapVector
Print out the vector showing the tokens and their weights
print() - Method in class ir.vsr.HashMapPosVector
Print out the vector showing the tokens and their positions
print() - Method in class ir.vsr.InvertedIndex
Print out an inverted index by listing each token and the documents it occurs in.
printExtraTokenOccurrenceInfo(TokenOccurrence) - Method in class ir.vsr.InvertedPosIndex
A TokenOccurrence in an InvertedPosIndex should be a TokenPosOccurrence, so print the positional information for this occurrence.
printRetrievals(Retrieval[], int) - Method in class ir.vsr.InvertedIndex
Print out at most MAX_RETRIEVALS ranked retrievals starting at given starting rank number.
printVector() - Method in class ir.vsr.Document
Compute and print out (one line per term) the term-vector (bag of words) for this document
processQueries() - Method in class ir.vsr.InvertedIndex
Enter an interactive user-query loop, accepting queries and showing the retrieved documents in ranked order.
processQueries() - Method in class ir.vsr.InvertedPosIndex
Enter an interactive user-query loop, accepting queries and showing the retrieved documents in ranked order.
proximityScore - Variable in class ir.vsr.RetrievalPosInfo
The current proximity score component.
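
The prepareNextToken entry above says the nextToken slot is always precomputed. One common way to get that behavior is a one-token lookahead, sketched below under stated assumptions: the class LookaheadTokensSketch is hypothetical, the delimiter string and lowercasing are illustrative choices, and stemming is omitted. It is not the ir.vsr.Document implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

class LookaheadTokensSketch {
    private final StringTokenizer tokenizer;
    private final Set<String> stopWords;
    private String nextToken; // always holds the next non-stopword token, or null at the end

    LookaheadTokensSketch(String text, Set<String> stopWords) {
        this.tokenizer = new StringTokenizer(text, " \t\n\r.,;:!?");
        this.stopWords = stopWords;
        prepareNextToken();
    }

    /** Advance to the next candidate token that is not a stopword. */
    private void prepareNextToken() {
        nextToken = null;
        while (tokenizer.hasMoreTokens()) {
            String candidate = tokenizer.nextToken().toLowerCase();
            if (!stopWords.contains(candidate)) {
                nextToken = candidate;
                return;
            }
        }
    }

    /** True iff another token is available. */
    boolean hasMoreTokens() {
        return nextToken != null;
    }

    /** Return the next token, or null if there are none. */
    String nextToken() {
        String token = nextToken;
        if (token != null) {
            prepareNextToken();
        }
        return token;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "and"));
        LookaheadTokensSketch doc = new LookaheadTokensSketch("The cat and the hat", stop);
        while (doc.hasMoreTokens()) {
            System.out.println(doc.nextToken());
        }
    }
}
```

Because the lookahead is filled eagerly, hasMoreTokens can answer correctly even when only stopwords remain in the input.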

Q

queryVector - Variable in class ir.vsr.Feedback
The original query vector for this query

R

reader - Variable in class ir.vsr.FileDocument
The I/O reader for accessing the file
Retrieval - class ir.vsr.Retrieval.
A lightweight object for storing information about a retrieved Document.
Retrieval(DocumentReference, double) - Constructor for class ir.vsr.Retrieval
Create a retrieval with these values
RetrievalPosInfo - class ir.vsr.RetrievalPosInfo.
A lightweight object for storing information about a retrieved Document for a positional inverted index that includes vector-space and proximity
RetrievalPosInfo() - Constructor for class ir.vsr.RetrievalPosInfo
 
retrievals - Variable in class ir.vsr.Feedback
The current list of ranked retrievals
retrieve(Document) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query Document.
retrieve(Document) - Method in class ir.vsr.InvertedPosIndex
Perform ranked retrieval on this input query Document.
retrieve(HashMapVector) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query Document vector.
retrieve(String) - Method in class ir.vsr.InvertedIndex
Perform ranked retrieval on this input query.
retrieve(TokenPositionInfo[]) - Method in class ir.vsr.InvertedPosIndex
Perform ranked retrieval on this input query Document vector.

S

score - Variable in class ir.vsr.Retrieval
The score given to this document by a retrieval engine.
showRetrievals(Retrieval[]) - Method in class ir.vsr.InvertedIndex
Show the top retrievals to the user if there are any.
size() - Method in class ir.vsr.HashMapVector
Returns the number of tokens in the vector.
size() - Method in class ir.vsr.InvertedIndex
Return the number of tokens indexed.
stem - Variable in class ir.vsr.Document
Whether to stem tokens with the Porter stemmer
stem - Variable in class ir.vsr.InvertedIndex
Whether tokens should be stemmed with Porter stemmer
stem - Variable in class ir.vsr.DocumentIterator
Whether tokens should be stemmed with Porter stemmer
stemmer - Static variable in class ir.vsr.Document
The Porter stemmer
stopWords - Static variable in class ir.vsr.Document
The hashtable where stopwords are indexed
stopWordsFile - Static variable in class ir.vsr.Document
The file where a list of stopwords, 1 per line, is stored
subtract(HashMapVector) - Method in class ir.vsr.HashMapVector
Destructively subtract the given vector from the current vector
subtract(HashMapVector) - Method in class ir.vsr.HashMapPosVector
Destructively subtract the given vector from the current vector

T

TextFileDocument - class ir.vsr.TextFileDocument.
A normal ASCII text file Document
TextFileDocument(File, boolean) - Constructor for class ir.vsr.TextFileDocument
Create a new text document for the given file.
TextFileDocument(String, boolean) - Constructor for class ir.vsr.TextFileDocument
Create a new text document for the given file name.
TextStringDocument - class ir.vsr.TextStringDocument.
A simple document represented by a String
TextStringDocument(String, boolean) - Constructor for class ir.vsr.TextStringDocument
Create a simple Document for this string
token - Variable in class ir.vsr.TokenPositionInfo
The token itself
tokenHash - Variable in class ir.vsr.InvertedIndex
A HashMap where tokens are indexed.
TokenInfo - class ir.vsr.TokenInfo.
A lightweight object for storing information about a token (a.k.a word, term) in an inverted index.
TokenInfo() - Constructor for class ir.vsr.TokenInfo
Create an initially empty data structure
tokenizer - Variable in class ir.vsr.TextStringDocument
The tokenizer for this document.
tokenizer - Variable in class ir.vsr.HTMLFileDocument
The tokenizer for lines read from this document.
tokenizer - Variable in class ir.vsr.TextFileDocument
The tokenizer for lines read from this document.
tokenizerDelim - Static variable in class ir.vsr.TextStringDocument
StringTokenizer delim for tokenizing only alphabetic strings.
tokenizerDelim - Static variable in class ir.vsr.HTMLFileDocument
StringTokenizer delim for tokenizing only alphabetic strings.
tokenizerDelim - Static variable in class ir.vsr.TextFileDocument
StringTokenizer delim for tokenizing only alphabetic strings.
TokenOccurrence - class ir.vsr.TokenOccurrence.
A lightweight object for storing information about an occurrence of a token (a.k.a word, term) in a Document.
TokenOccurrence(DocumentReference, int) - Constructor for class ir.vsr.TokenOccurrence
Create an occurrence with these values
TokenPositionInfo - class ir.vsr.TokenPositionInfo.
A lightweight object for storing information about positions of a token (a.k.a word, term) in some document.
TokenPositionInfo(String, ArrayList) - Constructor for class ir.vsr.TokenPositionInfo
Create an initially empty data structure
TokenPosOccurrence - class ir.vsr.TokenPosOccurrence.
A lightweight object for storing information about an occurrence of a token (a.k.a word, term) in a Document, including an array of the positions at which it occurs.
TokenPosOccurrence(DocumentReference, ArrayList) - Constructor for class ir.vsr.TokenPosOccurrence
Create an occurrence with these values
toString() - Method in class ir.vsr.HashMapVector
Return String of the vector showing the tokens and their weights
toString() - Method in class ir.vsr.DocumentReference
 
TYPE_HTML - Static variable in class ir.vsr.DocumentIterator
docType for HTML files
TYPE_TEXT - Static variable in class ir.vsr.DocumentIterator
docType for ASCII text files

U

updateProximityScore(RetrievalPosInfo, TokenPosOccurrence, String) - Static method in class ir.vsr.InvertedPosIndex
Update the proximity score of the RetrievalPosInfo of this retrieved document based on how close the current query token appears to each of the previous query tokens found in this document.

V

vectorScore - Variable in class ir.vsr.RetrievalPosInfo
The current vector-space score component.

W

WRONG_ORDER_PENALTY_FACTOR - Static variable in class ir.vsr.InvertedPosIndex
The multiplicative penalty factor for distance that is incurred when query terms are in the opposite order in the document
