ir.webutils
Class RobotExclusionSet
java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractSet
              |
              +--ir.webutils.RobotExclusionSet
- All Implemented Interfaces:
- java.util.Collection, java.util.Set
- public class RobotExclusionSet
- extends java.util.AbstractSet
- implements java.util.Set
RobotExclusionSet supports the Robots Exclusion Protocol. It can
parse a site's robots.txt file and check whether access to a given
path has been disallowed by that file. It can also be used to
exclude files linked from a page whose Robots META tag specifies
NOFOLLOW.
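The NOFOLLOW use mentioned above might look roughly like the following. This is only a sketch: NoFollowSketch, specifiesNoFollow, and excludeLinks are hypothetical names, a plain java.util.Set of strings stands in for RobotExclusionSet, and the page's META content and links are assumed to have been extracted already.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NoFollowSketch {
    // True when the content attribute of a Robots META tag lists NOFOLLOW
    // (comma-separated tokens, compared case-insensitively).
    public static boolean specifiesNoFollow(String metaContent) {
        if (metaContent == null) {
            return false;
        }
        for (String token : metaContent.split(",")) {
            if (token.trim().equalsIgnoreCase("nofollow")) {
                return true;
            }
        }
        return false;
    }

    // Adds every linked path to the exclusion set when the page says NOFOLLOW.
    public static void excludeLinks(Set<String> excluded, String metaContent,
                                    List<String> linkedPaths) {
        if (specifiesNoFollow(metaContent)) {
            excluded.addAll(linkedPaths);
        }
    }

    public static void main(String[] args) {
        Set<String> excluded = new LinkedHashSet<>(); // stand-in for the exclusion set
        excludeLinks(excluded, "NOINDEX, NOFOLLOW", Arrays.asList("/a.html", "/b.html"));
        System.out.println(excluded);
    }
}
```

The sketch does not treat the NONE directive specially; a crawler that honors it would need one more comparison.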
Constructor Summary
RobotExclusionSet()
    Constructs an empty set.
RobotExclusionSet(java.lang.String site)
    Constructs a set containing the paths in the robots.txt file
    for this site.
Method Summary
boolean             add(java.lang.Object o)
boolean             contains(java.lang.Object o)
                    Checks to see if a path is prohibited by this set.
java.util.Iterator  iterator()
int                 size()
Methods inherited from class java.util.AbstractSet
equals, hashCode, removeAll
Methods inherited from class java.util.AbstractCollection
addAll, clear, containsAll, isEmpty, remove, retainAll, toArray, toArray, toString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.util.Set
addAll, clear, containsAll, equals, hashCode, isEmpty, remove, removeAll, retainAll, toArray, toArray
RobotExclusionSet
public RobotExclusionSet()
- Constructs an empty set.
RobotExclusionSet
public RobotExclusionSet(java.lang.String site)
- Constructs a set containing the paths in the robots.txt file
for this site. The robots.txt file should conform to the Robots
Exclusion Protocol specification, available at
http://www.robotstxt.org/wc/norobots.html.
- Parameters:
site - The name of the site
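How the constructor retrieves and parses robots.txt is not documented here. As an illustration only, the following sketch collects the Disallow paths from records addressed to all robots (User-agent: *), given the file's text as a string. RobotsTxtSketch and disallowedPaths are hypothetical names, and the parsing rules are a simplified reading of the specification, not this class's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RobotsTxtSketch {
    // Returns the Disallow values of records whose User-agent is "*".
    public static List<String> disallowedPaths(String robotsTxt) {
        List<String> paths = new ArrayList<>();
        boolean applies = false; // does the current record apply to all robots?
        for (String raw : robotsTxt.split("\\r?\\n")) {
            if (raw.trim().isEmpty()) {
                applies = false; // a blank line ends the current record
                continue;
            }
            // Strip a trailing comment, if any.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // comment-only or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (value.equals("*")) {
                    applies = true;
                }
            } else if (field.equals("disallow") && applies && !value.isEmpty()) {
                paths.add(value); // an empty Disallow means nothing is disallowed
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        String text = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/\n";
        System.out.println(disallowedPaths(text));
    }
}
```

Each returned path could then be added to the set as a disallowed prefix.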
size
public int size()
- Specified by:
size in interface java.util.Set
- Overrides:
size in class java.util.AbstractCollection
add
public boolean add(java.lang.Object o)
- Specified by:
add in interface java.util.Set
- Overrides:
add in class java.util.AbstractCollection
iterator
public java.util.Iterator iterator()
- Specified by:
iterator in interface java.util.Set
- Overrides:
iterator in class java.util.AbstractCollection
contains
public boolean contains(java.lang.Object o)
- Checks to see if a path is prohibited by this set. A path is
prohibited if it starts with an entry in this set.
- Specified by:
contains in interface java.util.Set
- Overrides:
contains in class java.util.AbstractCollection
- Parameters:
o - A String object representing the path.
- Returns:
true iff o is a String object, o is not null, and
o.startsWith(e) for at least one element e in this set.
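The prefix rule stated in the description of contains (a path is prohibited if it starts with an entry in this set) can be sketched as follows. PrefixExclusionSketch and isProhibited are hypothetical names used for illustration; this is not the source of RobotExclusionSet, and it assumes that entries are stored as plain strings.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class PrefixExclusionSketch {
    // Hypothetical stand-in for the set's entries: disallowed path prefixes.
    private final Set<String> prefixes = new LinkedHashSet<>();

    public boolean add(String prefix) {
        return prefixes.add(prefix);
    }

    // True when o is a non-null String that starts with at least one stored prefix.
    public boolean isProhibited(Object o) {
        if (!(o instanceof String)) {
            return false; // instanceof is false for null as well
        }
        String path = (String) o;
        for (String prefix : prefixes) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrefixExclusionSketch set = new PrefixExclusionSketch();
        set.add("/cgi-bin/");
        System.out.println(set.isProhibited("/cgi-bin/search")); // prints "true"
        System.out.println(set.isProhibited("/index.html"));     // prints "false"
    }
}
```

Because the check is a prefix match rather than an equality test, a linear scan over the entries is used here instead of a hash lookup.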