ir.webutils
Class AnchoredDirectorySpider

java.lang.Object
  |
  +--ir.webutils.Spider
        |
        +--ir.webutils.AnchoredSpider
              |
              +--ir.webutils.AnchoredDirectorySpider

public class AnchoredDirectorySpider
extends AnchoredSpider

An anchored spider that restricts its crawl to pages in or below the directory of its start URL.


Fields inherited from class ir.webutils.AnchoredSpider
urlMap
 
Fields inherited from class ir.webutils.Spider
count, linksToVisit, maxCount, saveDir, slow, visited, webpr
 
Constructor Summary
AnchoredDirectorySpider()
           
 
Method Summary
 java.util.List getNewLinks(HTMLPage page)
          Gets links from the page that are in or below the starting directory.
protected  void handleUCommandLineOption(java.lang.String value)
          Sets the initial URL from the "-u" argument, then calls the corresponding superclass method.
static void main(java.lang.String[] args)
          Spiders the web according to the command-line options given in args, following only links in or below the directory of the start URL and including the anchor text of links that point to each page.
 
Methods inherited from class ir.webutils.AnchoredSpider
addAnchorText, doCrawl, processPage
 
Methods inherited from class ir.webutils.Spider
go, handleCCommandLineOption, handleDCommandLineOption, handleSafeCommandLineOption, handleSlowCommandLineOption, linkToHTMLPage, processArgs
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AnchoredDirectorySpider

public AnchoredDirectorySpider()

Method Detail

getNewLinks

public java.util.List getNewLinks(HTMLPage page)
Gets links from the page that are in or below the starting directory.
Overrides:
getNewLinks in class Spider
Parameters:
page - The page whose links are examined.
Returns:
The links on page that are in or below the directory of the first page.
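
The filtering described above can be illustrated with a minimal sketch. This is not the library's implementation: the class DirectoryLinkFilter, the method linksInOrBelow, and the use of plain strings for links are all assumptions made for illustration. It treats a link as "in or below" the starting directory when its URL begins with that directory's URL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ir.webutils.
final class DirectoryLinkFilter {

    /**
     * Keeps only the links whose URL starts with the given directory URL.
     * startDirectory is assumed to end with "/" so that a sibling such as
     * ".../ab/x.html" does not match the directory ".../a/".
     */
    static List<String> linksInOrBelow(String startDirectory, List<String> links) {
        List<String> kept = new ArrayList<>();
        for (String link : links) {
            if (link.startsWith(startDirectory)) {
                kept.add(link);
            }
        }
        return kept;
    }
}
```

A plain prefix test is case-sensitive and does not normalize URLs, so a real crawler would likely canonicalize links before comparing them.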

handleUCommandLineOption

protected void handleUCommandLineOption(java.lang.String value)
Sets the initial URL from the "-u" argument, then calls the corresponding superclass method.
Overrides:
handleUCommandLineOption in class AnchoredSpider
Parameters:
value - The value of the "-u" command line argument.
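
To limit a crawl to the starting directory, the directory has to be derived from the "-u" value at some point. The sketch below shows one way this could be done with java.net.URI. The class StartDirectory and the method directoryOf are hypothetical names, and the source does not state how the class actually computes the directory.

```java
import java.net.URI;

// Hypothetical helper, not part of ir.webutils.
final class StartDirectory {

    /**
     * Returns the directory portion of an absolute URL, always ending in "/".
     * Everything after the last "/" in the path is treated as a file name,
     * so a directory URL written without a trailing slash resolves to its parent.
     */
    static String directoryOf(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // a bare host such as http://www.example.com
        }
        String dir = path.substring(0, path.lastIndexOf('/') + 1);
        return uri.getScheme() + "://" + uri.getAuthority() + dir;
    }
}
```

Under these assumptions, directoryOf("http://www.example.com/a/b/index.html") yields "http://www.example.com/a/b/". Query strings and fragments are dropped.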

main

public static void main(java.lang.String[] args)
Spiders the web according to the command-line options given in args, following only links in or below the directory of the start URL and including the anchor text of links that point to each page.