java.lang.Object
  javax.swing.text.html.HTMLEditorKit.ParserCallback
    ir.webutils.LinkExtractor
      ir.webutils.AnchoredLinkExtractor

public class AnchoredLinkExtractor extends LinkExtractor
Extractor for AnchoredLink objects. Overrides the HTML parser callback routines so that the anchor text of every link is also extracted and stored.
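The callback pattern described above can be illustrated with the JDK's own Swing HTML parser. The following is only a minimal sketch, not the ir.webutils source: the class `AnchorTextSketch`, its `links` map, and the `extract` helper are hypothetical names introduced for illustration, and the real class stores AnchoredLink objects rather than plain strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical stand-in for an anchor-text extractor; not the actual AnchoredLinkExtractor.
class AnchorTextSketch extends HTMLEditorKit.ParserCallback {
    // Maps each href value to the text found between <a> and </a>, in document order.
    final Map<String, String> links = new LinkedHashMap<>();
    private StringBuilder anchorText; // non-null only while inside an <a href=...> element
    private String currentHref;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
        if (tag == HTML.Tag.A) {
            Object href = attributes.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                // Start buffering text for this link.
                currentHref = href.toString();
                anchorText = new StringBuilder();
            }
        }
    }

    @Override
    public void handleText(char[] text, int position) {
        // Text is recorded only while an anchor is open.
        if (anchorText != null) {
            anchorText.append(text);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int position) {
        if (tag == HTML.Tag.A && anchorText != null) {
            links.put(currentHref, anchorText.toString().trim());
            anchorText = null;
            currentHref = null;
        }
    }

    // Parses an HTML string and returns the href-to-anchor-text map.
    static Map<String, String> extract(String html) {
        AnchorTextSketch callback = new AnchorTextSketch();
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.links;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>See <a href=\"docs/index.html\">the docs</a>.</p></body></html>";
        System.out.println(extract(html));
    }
}
```

Buffering starts in handleStartTag and ends in handleEndTag, which mirrors the roles the anchorText and currentLink fields appear to play in this class.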
Field Summary | |
---|---|
protected java.lang.StringBuffer | anchorText: Buffer to store anchor text encountered between an "a" start tag and end tag. |
protected AnchoredLink | currentLink: The current link being processed. |
Fields inherited from class ir.webutils.LinkExtractor |
---|
links, page, url |
Fields inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback |
---|
IMPLIED |
Constructor Summary | |
---|---|
AnchoredLinkExtractor(HTMLPage page) | Create an anchored link extractor for the given page. |
Method Summary | |
---|---|
protected void | addLink(javax.swing.text.MutableAttributeSet attributes, javax.swing.text.html.HTML.Attribute attr): Retrieves a link from an attribute set and completes it against the base URL. |
static void | appendTag(java.lang.StringBuffer buffer, javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes): Write this tag with attributes out to the buffer. |
void | handleEndTag(javax.swing.text.html.HTML.Tag tag, int position): Executed when a closing HTML tag is found in the document. |
void | handleSimpleTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position): Executed when an HTML tag that has no closing tag is found in the document. |
void | handleStartTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position): Executed when an opening HTML tag is found in the document. |
void | handleText(char[] text, int position): Executed when a block of text is encountered. |
static void | main(java.lang.String[] args) |
Methods inherited from class ir.webutils.LinkExtractor |
---|
extractLinks |
Methods inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback |
---|
flush, handleComment, handleEndOfLineString, handleError |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected java.lang.StringBuffer anchorText
protected AnchoredLink currentLink
Constructor Detail |
---|
public AnchoredLinkExtractor(HTMLPage page)
Method Detail |
---|
public void handleText(char[] text, int position)
Overrides: handleText in class LinkExtractor
Parameters:
text - A char array representation of the text.
position - The position of the text in the document.

public void handleStartTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position)
Overrides: handleStartTag in class LinkExtractor
Parameters:
tag - The tag that caused this function to be executed.
attributes - The attributes of tag.
position - The start of the tag in the document. If the tag is implied (filled in by the parser but not actually present in the document) then position will correspond to that of the next encountered tag.

public static void appendTag(java.lang.StringBuffer buffer, javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes)
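
The summary says only that appendTag writes a tag with its attributes to the buffer, so the exact output format is not documented here. The sketch below shows one plausible way to do that with the standard Swing attribute classes; the class name `AppendTagSketch` and the chosen `<name attr="value">` layout are assumptions, not the behavior of the real method.

```java
import java.util.Enumeration;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.html.HTML;

// Hypothetical re-creation of a tag serializer; the real appendTag may format differently.
class AppendTagSketch {
    // Appends "<tag name="value" ...>" to the buffer (assumed format).
    static void appendTag(StringBuffer buffer, HTML.Tag tag, MutableAttributeSet attributes) {
        buffer.append('<').append(tag);
        Enumeration<?> names = attributes.getAttributeNames();
        while (names.hasMoreElements()) {
            Object name = names.nextElement();
            buffer.append(' ').append(name)
                  .append("=\"").append(attributes.getAttribute(name)).append('"');
        }
        buffer.append('>');
    }

    public static void main(String[] args) {
        MutableAttributeSet attributes = new SimpleAttributeSet();
        attributes.addAttribute(HTML.Attribute.HREF, "docs/index.html");
        StringBuffer buffer = new StringBuffer();
        appendTag(buffer, HTML.Tag.A, attributes);
        System.out.println(buffer);
    }
}
```

A method like this would let markup nested inside an anchor be kept in the anchor text buffer instead of being dropped, though whether the class uses it that way is not stated on this page.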
public void handleEndTag(javax.swing.text.html.HTML.Tag tag, int position)
Overrides: handleEndTag in class LinkExtractor
Parameters:
tag - The tag found.
position - The position of the tag in the document.

public void handleSimpleTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position)
Overrides: handleSimpleTag in class LinkExtractor
Parameters:
tag - The tag that caused this function to be executed.
attributes - The attributes of tag.
position - The start of the tag in the document. If the tag is implied (filled in by the parser but not actually present in the document) then position will correspond to that of the next encountered tag.

protected void addLink(javax.swing.text.MutableAttributeSet attributes, javax.swing.text.html.HTML.Attribute attr)
Overrides: addLink in class LinkExtractor
Parameters:
attributes - The attribute set.
attr - The attribute that should be treated as a URL. For example, attr should be HTML.Attribute.HREF if attributes is from an anchor tag.

public static void main(java.lang.String[] args) throws java.lang.Exception
Throws: java.lang.Exception
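
For addLink, "completing a link against the base URL" can be pictured as standard relative-reference resolution. The following is a rough sketch under that assumption, using `java.net.URI.resolve`; the class `CompleteLinkSketch`, the method `completeLink`, and the example URLs are hypothetical, and the real method stores its result in the extractor rather than returning it.

```java
import java.net.URI;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.html.HTML;

// Hypothetical helper illustrating URL completion; not the actual addLink implementation.
class CompleteLinkSketch {
    // Returns the absolute form of the URL stored under attr,
    // or null if the attribute is absent or is not a valid URI reference.
    static URI completeLink(URI base, MutableAttributeSet attributes, HTML.Attribute attr) {
        Object value = attributes.getAttribute(attr);
        if (value == null) {
            return null;
        }
        try {
            return base.resolve(value.toString().trim());
        } catch (IllegalArgumentException e) {
            return null; // malformed link text in the page
        }
    }

    public static void main(String[] args) {
        MutableAttributeSet attributes = new SimpleAttributeSet();
        attributes.addAttribute(HTML.Attribute.HREF, "docs/index.html");
        URI base = URI.create("http://example.com/a/b.html");
        // prints "http://example.com/a/docs/index.html"
        System.out.println(completeLink(base, attributes, HTML.Attribute.HREF));
    }
}
```

Passing the attribute key as a parameter, as addLink does, lets the same code serve anchor tags (HTML.Attribute.HREF) and other URL-bearing attributes.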