CS 395T - Large-Scale Data Mining

Homework 3

Query retrieval with the vector-space model

    The main goal of this homework is to use the vector-space model for query retrieval and to evaluate its effectiveness.
    Answer the following questions:

    1. Submit the code for your subroutines.
    2. What is the time complexity of your subroutines? (Hint: they should take time proportional to the number of nonzeros in A.) Give exact operation counts.
    3. What is the R-precision for each of your query retrieval results?
    4. Plot the average precision-recall curves for the queries assigned to you, using the tfn.tfn scaling scheme.
    5. Are you satisfied with the output of your query retrieval programs?
    6. What scaling scheme worked best in your results?
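
    As a rough illustration of questions 2 and 3, the sketch below scores documents against a query by visiting only the stored nonzeros, then computes R-precision, taken here in its usual sense: the precision among the top R ranked documents, where R is the number of documents relevant to the query. It assumes that A is the term-by-document matrix and that it is stored column by column as lists of (term index, weight) pairs. The names sparse_scores and r_precision and this storage layout are hypothetical, chosen for illustration only, and are not part of the assignment.

```python
def sparse_scores(columns, query):
    """Return the score of each document against a dense query vector.

    columns[j] holds the nonzeros of document j as (term_index, weight)
    pairs (an assumed layout). Each stored nonzero is visited exactly once,
    so the work grows with the number of nonzeros, not with the full
    matrix dimensions.
    """
    scores = []
    for nonzeros in columns:
        total = 0.0
        for term_index, weight in nonzeros:
            total += weight * query[term_index]
        scores.append(total)
    return scores


def r_precision(ranked_doc_ids, relevant_doc_ids):
    """Fraction of the top R ranked documents that are relevant,
    where R is the number of relevant documents for the query."""
    relevant = set(relevant_doc_ids)
    r = len(relevant)
    if r == 0:
        return 0.0  # convention assumed here for queries with no relevant documents
    hits = sum(1 for doc_id in ranked_doc_ids[:r] if doc_id in relevant)
    return hits / r


if __name__ == "__main__":
    # Toy data: 3 terms, 2 documents.
    columns = [[(0, 1.0), (2, 2.0)], [(1, 3.0)]]
    query = [1.0, 0.0, 0.5]
    scores = sparse_scores(columns, query)
    # Rank documents by decreasing score.
    ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
    print(scores, ranking, r_precision(ranking, [0]))
```

    Ranking by decreasing score and then calling r_precision once per query gives the per-query figures asked for in question 3. The scaling scheme (for example tfn.tfn) would be applied to the weights before scoring and is not shown here.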

Due date: Oct. 11, 2001