The last time I checked, several months ago, only 6 or 7 of the 15 engines
we use had a robots.txt file, and even fewer (something like 3, if I
remember correctly) disallowed access to /cgi-bin.  Cyber411 currently
ignores robots.txt, since (if I may rationalize things here) it only grabs
one page (the results page), and then only under the direction of a human
using Cyber411 at that time.  In this regard, it is a human-controlled
agent acting on behalf of a human, who is sitting there looking at the
results as they come in [1].
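  (Aside: for anyone curious what honouring robots.txt would actually take,
here is a minimal sketch in Python.  The engine host and CGI path are made-up
placeholders, and this is not what Cyber411 does today -- just an
illustration.)

	import urllib.robotparser

	# Hypothetical example: before grabbing a results page, ask the
	# engine's robots.txt whether our agent may fetch it.  The host and
	# the CGI path below are placeholders, not a real engine we query.
	rp = urllib.robotparser.RobotFileParser()
	rp.set_url("http://engine.example.com/robots.txt")
	rp.read()

	page = "http://engine.example.com/cgi-bin/search?q=robots"
	if rp.can_fetch("Cyber411", page):
	    print("allowed -- go ahead and fetch the results page")
	else:
	    print("disallowed by robots.txt -- skip this engine")
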
  I ask this because it isn't inconceivable that someone writes a plug-in
(or a separate program) that does what Cyber411 does (maybe without the
ads).  At what point does an agent NEED to follow the /robots.txt
convention?  Especially since I think current versions of Lynx allow the
following:
	lynx -traverse http://www.cyber411.com/
(I think I have the correct option), which can be just as bad as a rogue
robot.
> Apart from that, every other robot/crawler/..  has behaved.
> 
  For what it's worth, Cyber411 sends the following:
	Agent: Cyber411/version OS/version
	From: www.cyber411.com
(version is currently 0.9.10C, and it will be run from an IRIX/5.3.1,
Linux/1.2.13, or Linux/2.0.0 system)
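  (If it helps to see roughly what sending those headers amounts to, here is
a small sketch, again in Python with a placeholder URL.  The standard HTTP
header name is User-Agent rather than Agent, so I use that here; the version
string is just the one quoted above.)

	import urllib.request

	# Sketch only: attach the identifying headers before the request
	# goes out.  The target URL is a placeholder.
	req = urllib.request.Request(
	    "http://engine.example.com/cgi-bin/search?q=robots",
	    headers={
	        "User-Agent": "Cyber411/0.9.10C Linux/2.0.0",
	        "From": "www.cyber411.com",
	    },
	)
	with urllib.request.urlopen(req) as resp:
	    print(resp.status, resp.getheader("Content-Type"))
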
  Oh, and while I'm here, is there any way the Powers that Be who run this
list can have a Reply-To: header added?  If I'm not careful, I'll end up
sending mail to an individual when it was intended for the list (and it has
happened a few times).
  -spc (Working on this has piqued my interest in robots though ... )
[1]	I am unaware of anyone using Cyber411 to conduct searches
        autonomously, or of anyone using Cyber411 the same way we use the
        various engines.  I personally would be amused by such a thought,
        although the company that hired us to do this would probably see
        things differently 8-)