p5-WWW-Robot
configurable web traversal engine
Package version: p5-WWW-Robot-0.026p0
Maintainer: The OpenBSD ports mailing-list
This module implements a configurable web traversal engine, for a robot
or other web agent. Given an initial web page (URL), the Robot will get
the contents of that page, and extract all links on the page, adding
them to a list of URLs to visit.
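The traversal described above can be illustrated with a short sketch.
It is written in Python using only the standard library and is not the
WWW::Robot API; the names traverse, LinkExtractor, fetch and PAGES are
hypothetical, and the in-memory PAGES table stands in for real HTTP
requests so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def traverse(start_url, fetch, max_pages=100):
    """Breadth-first traversal starting at start_url.

    fetch is a caller-supplied function (an assumption of this sketch)
    that maps a URL to its HTML text, or None if it cannot be retrieved.
    Returns the URLs in the order they were visited.
    """
    queue = deque([start_url])   # the list of URLs still to visit
    seen = {start_url}           # every URL ever queued, to avoid repeats
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Toy in-memory "web" used in place of real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="b.html">B</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="/c#top">C</a>',
    "http://example.test/b.html": "no links here",
    "http://example.test/c": '<a href="/a">A</a>',
}

order = traverse("http://example.test/", PAGES.get)
print(order)
```

A real robot would replace fetch with an HTTP client and would normally
add politeness delays and a test deciding which links to follow.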
Features of the Robot module include:
* Follows the Robot Exclusion Protocol.
* Supports the META element proposed extensions to the Protocol.
* Implements many of the Guidelines for Robot Writers.
* Configurable.
* Builds on standard Perl 5 modules for WWW, HTTP, HTML, etc.
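As an illustration of the first feature, the sketch below shows what
honouring the Robot Exclusion Protocol looks like in practice. It uses
Python's standard urllib.robotparser rather than the Perl modules this
port depends on; the robots.txt content, the agent name ExampleRobot/0.1
and the helper make_checker are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real robot would download it
# from the site root before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt, agent):
    """Return a function that reports whether agent may fetch a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(agent, url)


allowed = make_checker(ROBOTS_TXT, "ExampleRobot/0.1")
print(allowed("http://example.test/index.html"))
print(allowed("http://example.test/private/data.html"))
```

A traversal engine would run such a check on each URL before fetching
it and skip any URL the site has excluded.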
- www/p5-HTML-Parser
- www/p5-HTML-Tree
- www/p5-URI
- www/p5-libwww