ir.eval
Class Experiment
java.lang.Object
|
+--ir.eval.Experiment
- Direct Known Subclasses:
- ExperimentRelFeedback
- public class Experiment
- extends java.lang.Object
Contains methods for running evaluation experiments for information
retrieval, specifically the generation of recall-precision curves
for a given test corpus of query/relevant-documents pairs.
Field Summary
java.io.File | corpusDir | The directory from which the indexed documents come.
java.io.File | outFile | The output file where final recall/precision result data is printed.
java.io.File | queryFile | The file with the list of queries and results to be tested.
static double[] | RECALL_LEVELS | The standard recall levels for which we want to plot precision values.
Constructor Summary
Experiment(java.io.File corpusDir, java.io.File queryFile, java.io.File outFile, short docType, boolean stem) | Create an Experiment object for generating recall/precision curves.
Method Summary
static void | main(java.lang.String[] args) | Evaluate retrieval performance on a given query test corpus and generate a recall/precision graph.
void | makeRpCurve() | Process and evaluate all queries and generate a recall-precision curve.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RECALL_LEVELS
public static final double[] RECALL_LEVELS
- The standard recall levels for which we want to plot precision values
corpusDir
public java.io.File corpusDir
- The directory from which the indexed documents come.
queryFile
public java.io.File queryFile
- The file with the list of queries and results to be tested.
Assumes this file consists of 3 lines for each query:
1) A line of text for the query.
2) A line of filenames from corpusDir that are relevant to this
query; filenames must be separated by a space.
3) A blank line as a separator from the next query.
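The three-line record layout described above can be read with a small parser. The following is a minimal sketch, not the Experiment class's actual reading code: QueryFileSketch, QueryEntry, and parse are hypothetical names, and the tolerance for extra blank lines and for a missing final separator is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the ir.eval package.
class QueryFileSketch {

    /** One query plus the corpus filenames judged relevant to it. */
    static final class QueryEntry {
        final String query;
        final List<String> relevantDocs;

        QueryEntry(String query, List<String> relevantDocs) {
            this.query = query;
            this.relevantDocs = relevantDocs;
        }
    }

    /** Reads records of the form: query line, filenames line, blank line. */
    static List<QueryEntry> parse(BufferedReader in) throws IOException {
        List<QueryEntry> entries = new ArrayList<>();
        String query;
        while ((query = in.readLine()) != null) {
            if (query.trim().isEmpty()) {
                continue; // assumption: skip stray blank lines between records
            }
            String docsLine = in.readLine();
            List<String> docs = new ArrayList<>();
            if (docsLine != null && !docsLine.trim().isEmpty()) {
                docs.addAll(Arrays.asList(docsLine.trim().split("\\s+")));
            }
            in.readLine(); // consume the separator line (null at end of file)
            entries.add(new QueryEntry(query.trim(), docs));
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        String sample = "cats and dogs\na.txt b.txt\n\nfast cars\nc.txt\n";
        for (QueryEntry e : parse(new BufferedReader(new StringReader(sample)))) {
            System.out.println(e.query + " -> " + e.relevantDocs);
        }
    }
}
```

Splitting on runs of whitespace rather than a single space is a deliberate leniency in this sketch; a stricter reader could reject lines that contain tabs or repeated spaces.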
outFile
public java.io.File outFile
- The output file where final recall/precision result data is printed.
Experiment
public Experiment(java.io.File corpusDir,
java.io.File queryFile,
java.io.File outFile,
short docType,
boolean stem)
throws java.io.IOException
- Create an Experiment object for generating Recall/Precision curves
- Parameters:
corpusDir - The directory of files to index.
queryFile - The file of query/relevant-docs pairs to evaluate.
outFile - File for output precision/recall data.
docType - The type of documents to index (see docType in DocumentIterator).
stem - Whether tokens should be stemmed with the Porter stemmer.
makeRpCurve
public void makeRpCurve()
throws java.io.IOException
- Process and evaluate all queries and generate recall-precision curve
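To illustrate what a recall-precision curve over fixed recall levels involves, the sketch below computes interpolated precision for one ranked result list and averages the curves of several queries. It is an illustration under stated assumptions, not the body of makeRpCurve: the class and method names are hypothetical, the eleven levels 0.0 to 1.0 are assumed because the contents of RECALL_LEVELS are not shown here, and the interpolation rule (maximum precision at any rank whose recall reaches the level) is the common textbook convention rather than a confirmed detail of this class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of the ir.eval package.
class RpCurveSketch {

    // Assumed values; the actual contents of Experiment.RECALL_LEVELS are not documented.
    static final double[] ASSUMED_RECALL_LEVELS =
        {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0};

    /**
     * Interpolated precision at each recall level: the highest precision
     * observed at any rank whose recall is at least that level.
     * Assumes the ranked list contains no duplicate documents.
     */
    static double[] interpolatedPrecision(List<String> ranked,
                                          Set<String> relevant,
                                          double[] levels) {
        double[] result = new double[levels.length];
        if (relevant.isEmpty()) {
            return result; // no relevant documents: precision stays 0 everywhere
        }
        int hits = 0;
        for (int rank = 1; rank <= ranked.size(); rank++) {
            if (!relevant.contains(ranked.get(rank - 1))) {
                continue; // precision peaks only where a relevant document is found
            }
            hits++;
            double recall = (double) hits / relevant.size();
            double precision = (double) hits / rank;
            for (int i = 0; i < levels.length; i++) {
                if (levels[i] <= recall + 1e-9) {
                    result[i] = Math.max(result[i], precision);
                }
            }
        }
        return result;
    }

    /** Averages per-query curves level by level. */
    static double[] average(List<double[]> perQuery, int numLevels) {
        double[] mean = new double[numLevels];
        for (double[] curve : perQuery) {
            for (int i = 0; i < numLevels; i++) {
                mean[i] += curve[i] / perQuery.size();
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3"));
        double[] curve = interpolatedPrecision(ranked, relevant, ASSUMED_RECALL_LEVELS);
        for (int i = 0; i < curve.length; i++) {
            System.out.printf("%.1f\t%.3f%n", ASSUMED_RECALL_LEVELS[i], curve[i]);
        }
    }
}
```

In the example in main, two of the four retrieved documents are relevant, found at ranks 1 and 3, so precision is 1.0 for recall levels up to 0.5 and 2/3 for the levels above it.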
main
public static void main(java.lang.String[] args)
throws java.io.IOException
- Evaluate retrieval performance on a given query test corpus and
generate a recall/precision graph.
Command format: "Experiment [OPTION]* [DIR] [QUERIES] [OUTFILE]" where:
DIR is the name of the directory whose files should be indexed.
QUERIES is a file of queries paired with relevant docs (see queryFile).
OUTFILE is the name of the output file. The plot data for the
recall-precision curve is stored in this file, and a gnuplot file for
the graph is written under the same name with a ".gplot" extension.
OPTIONs can be
"-html" to specify HTML files whose HTML tags should be removed, and
"-stem" to specify tokens should be stemmed with Porter stemmer.
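The command format above can be parsed along the following lines. This is a sketch only and does not reproduce the actual main method: ExperimentArgsSketch and its members are hypothetical, the rejection of unknown options is an assumption, and the gnuplot file name is assumed to be OUTFILE with ".gplot" appended.

```java
// Hypothetical helper; not part of the ir.eval package.
class ExperimentArgsSketch {
    final boolean html;   // "-html": strip HTML tags from documents
    final boolean stem;   // "-stem": apply the Porter stemmer to tokens
    final String dir;     // DIR
    final String queries; // QUERIES
    final String outFile; // OUTFILE

    ExperimentArgsSketch(boolean html, boolean stem,
                         String dir, String queries, String outFile) {
        this.html = html;
        this.stem = stem;
        this.dir = dir;
        this.queries = queries;
        this.outFile = outFile;
    }

    /** Parses "[OPTION]* [DIR] [QUERIES] [OUTFILE]". */
    static ExperimentArgsSketch parse(String[] args) {
        boolean html = false;
        boolean stem = false;
        int i = 0;
        // Leading arguments that start with '-' are treated as options.
        while (i < args.length && args[i].startsWith("-")) {
            switch (args[i]) {
                case "-html": html = true; break;
                case "-stem": stem = true; break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
            i++;
        }
        if (args.length - i != 3) {
            throw new IllegalArgumentException(
                "Usage: Experiment [OPTION]* [DIR] [QUERIES] [OUTFILE]");
        }
        return new ExperimentArgsSketch(html, stem, args[i], args[i + 1], args[i + 2]);
    }

    /** Assumed naming rule: the extension is appended to OUTFILE. */
    String gnuplotFile() {
        return outFile + ".gplot";
    }

    public static void main(String[] args) {
        ExperimentArgsSketch a =
            parse(new String[] {"-html", "-stem", "corpus", "queries.txt", "rp.data"});
        System.out.println(a.dir + " " + a.queries + " " + a.outFile + " " + a.gnuplotFile());
    }
}
```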