p5-WWW-Robot

configurable web traversal engine

Package version: p5-WWW-Robot-0.026p0
Maintainer: The OpenBSD ports mailing-list

This module implements a configurable web traversal engine for a robot
or other web agent. Given an initial URL, the Robot fetches that page,
extracts every link on it, and adds those links to a list of URLs to
visit.
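The traversal described above can be illustrated with a minimal sketch.
It is written in Python for brevity and is not the module's own code or
API: the names traverse, LinkExtractor, fetch and PAGES are hypothetical,
and an in-memory dictionary stands in for the web so the example runs
without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def traverse(start_url, fetch, max_pages=100):
    """Visit pages breadth-first, starting from start_url.

    fetch is an assumed callable mapping a URL to its HTML text,
    or to None when the page cannot be retrieved.
    """
    to_visit = deque([start_url])   # the list of URLs still to visit
    seen = {start_url}              # URLs already queued, to avoid repeats
    visited = []
    while to_visit and len(visited) < max_pages:
        url = to_visit.popleft()
        html = fetch(url)
        if html is None:
            continue                # skip pages that could not be fetched
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                to_visit.append(absolute)
    return visited


# Hypothetical in-memory pages used instead of real HTTP requests.
PAGES = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="b.html#top">B</a>',
    "http://example.com/a.html": '<a href="/">home</a> <a href="/missing.html">gone</a>',
    "http://example.com/b.html": "<p>no links</p>",
}

order = traverse("http://example.com/", PAGES.get)
print(order)
```

A real robot would also rate-limit requests and restrict which hosts it
follows; the sketch leaves those out to keep the core loop visible.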

Features of the Robot module include:

* Follows the Robot Exclusion Protocol.
* Supports the META element proposed extensions to the Protocol.
* Implements many of the Guidelines for Robot Writers.
* Configurable.
* Builds on standard Perl 5 modules for WWW, HTTP, HTML, etc.
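
As an illustration of what following the Robot Exclusion Protocol
involves, the sketch below checks URLs against a robots.txt file using
Python's standard urllib.robotparser. It does not show how this module
does it internally; the robots.txt content, the agent name ExampleRobot
and the helper allowed are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real robot would download it
# from the site root before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="ExampleRobot"):
    """Return True if the parsed rules permit `agent` to fetch `url`."""
    return rules.can_fetch(agent, url)


for candidate in ("http://example.com/index.html",
                  "http://example.com/private/data.html"):
    print(candidate, allowed(candidate))
```

A traversal engine would run a check like this before adding a URL to
its list of pages to visit.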

  • www/p5-HTML-Parser
  • www/p5-HTML-Tree
  • www/p5-URI
  • www/p5-libwww
