Welcome to MobyThreads.com!

How to block Wget/PHP/Perl bots?

 
user535

External


Since: Jul 23, 2004
Posts: 34



(Msg. 1) Posted: Sat Jul 24, 2004 10:39 am
Post subject: How to block Wget/PHP/Perl bots?
Archived from groups: alt>www>webmaster, others

It is so easy to change the user-agent string that checking it is not worth
the bother. I don't know how Yahoo manages to block them all, but here is
what I observed.

Try fetching http://news.yahoo.com with Perl's user agent, even with the
User-Agent string changed to Mozilla.

If I use IE, it takes me 5 seconds, but with Perl it takes me 15 minutes.
Yahoo first sends a cookie, but there must be more to it, because the page
displays fine in Lynx.

So I hope I can implement something similar. I am thinking of using JavaScript
as the test, since PHP, Perl, and Wget clients do not execute JavaScript.
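
A minimal sketch of what such a JavaScript test could look like, written in Python for illustration. It assumes a hypothetical server secret (`SECRET`) and a hypothetical cookie name (`js_ok`); none of this is how Yahoo does it. The first response is a tiny page whose script sets a cookie and reloads, and later requests are let through only if that cookie comes back. Note the limits: a script that parses the page can copy the token, and any client without JavaScript (Lynx, and crawlers unless exempted) would fail the test.

```python
import hashlib
import hmac

# Hypothetical server-side secret; load it from configuration in practice.
SECRET = b"change-me"


def challenge_token(client_ip: str, secret: bytes = SECRET) -> str:
    """Return the token the page's script must echo back in a cookie."""
    return hmac.new(secret, client_ip.encode(), hashlib.sha256).hexdigest()


def challenge_page(client_ip: str) -> str:
    """HTML for the first visit: the script sets the cookie, then reloads."""
    token = challenge_token(client_ip)
    return (
        "<html><body><script>"
        f'document.cookie = "js_ok={token}; path=/";'
        "location.reload();"
        "</script><noscript>Please enable JavaScript.</noscript></body></html>"
    )


def passed_challenge(client_ip: str, cookies: dict) -> bool:
    """True when the request carries the cookie the script would have set."""
    expected = challenge_token(client_ip)
    supplied = cookies.get("js_ok", "")
    # Constant-time comparison on bytes, so odd cookie values cannot raise.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A request handler would call `passed_challenge()` first and fall back to `challenge_page()` when it returns False.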

However, when Googlebot comes, I want to welcome it.

So is there a way to display content to Googlebot and human visitors, but
block all PHP/Perl/Wget clients even when they identify themselves as Mozilla?
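
For the Googlebot half of the question, one check that does not rely on the user-agent string is a reverse DNS lookup followed by a forward lookup, which Google documents for confirming its crawlers. The sketch below is an assumption-laden illustration: the function name and the injectable `reverse_lookup`/`forward_lookup` parameters are made up here so the logic can be tried without network access. It only confirms a crawler's claim; it does not tell a human browser from a script.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-resolve it."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # no PTR record or lookup failure
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # gethostbyname_ex is IPv4-only; element 2 is the address list.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The forward lookup must lead back to the address that connected.
    return ip in addresses
```

In practice the result would be cached per IP, since two DNS lookups per request are slow.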

usenet200407

External


Since: Jul 12, 2004
Posts: 88



(Msg. 2) Posted: Sat Jul 24, 2004 12:57 pm
Post subject: Re: How to block Wget/PHP/Perl bots?
Archived from groups: per prev. post

Wow wrote:

 > If I use IE, it takes me 5 seconds, but with perl, it takes me 15 minutes.
 > Yahoo is first sending a cookie, but there is more, because using LYNX it is
 > ok to view the page.

Lynx does support cookies you know?

 > So is there a way to display content to Googlebot and human visitors,
 > but block all PHP/Perl/Wget clients even when they identify themselves as Mozilla?

No.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Now Playing ~ ./bruce_springsteen/greatest_hits/02_thunder_road.ogg

user535

External


Since: Jul 23, 2004
Posts: 34



(Msg. 3) Posted: Sat Jul 24, 2004 12:57 pm
Post subject: Re: How to block Wget/PHP/Perl bots?
Archived from groups: per prev. post

"Toby Inkster" <usenet200407.TakeThisOut@tobyinkster.co.uk> wrote in message
news:pan.2004.07.24.08.57.09.190484@tobyinkster.co.uk...
 > Wow wrote:
 >
  > > If I use IE, it takes me 5 seconds, but with perl, it takes me 15 minutes.
  > > Yahoo is first sending a cookie, but there is more, because using LYNX it is
  > > ok to view the page.
 >
 > Lynx does support cookies you know?

Yes, I know. I rejected the cookie and was still able to view news.yahoo.com.
 >
  > > So is there a way to display content to Googlebot and human visitors,
  > > but block all PHP/Perl/Wget clients even when they identify themselves as Mozilla?
 >
 > No.
 >
 > --
 > Toby A Inkster BSc (Hons) ARCS
 > Contact Me ~ http://tobyinkster.co.uk/contact
 > Now Playing ~ ./bruce_springsteen/greatest_hits/02_thunder_road.ogg
nospam34

External


Since: Oct 20, 2003
Posts: 294



(Msg. 4) Posted: Sun Jul 25, 2004 12:42 pm
Post subject: Re: How to block Wget/PHP/Perl bots?
Archived from groups: per prev. post

Wow wrote:

 > It is so easy to change the user-agent string that checking it is not worth
 > the bother. I don't know how Yahoo manages to block them all, but here is
 > what I observed.
 >

Why block them? I don't alienate customers just because they want to use
wget.

gtoomey
ealfert

External


Since: Sep 14, 2004
Posts: 96



(Msg. 5) Posted: Sun Jul 25, 2004 12:42 pm
Post subject: Re: How to block Wget/PHP/Perl bots?
Archived from groups: per prev. post

Gregory Toomey <nospam.RemoveThis@bigpond.com> wrote in
news:2mgadlFmd3uuU2@uni-berlin.de:

 > Wow wrote:
 >
  >> It is so easy to change the user-agent string that checking it is not worth
  >> the bother. I don't know how Yahoo manages to block them all, but here is
  >> what I observed.
  >>
 >
 > Why block them? I don't alienate customers just because they want to use
 > wget.
 >
 > gtoomey


As long as they are respectful and throttle their requests.
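
For clients that do not throttle themselves, the server can do it for them. A rough sketch of a per-IP token bucket follows; the class and function names and the default rate (1 request per second, bursts of 5) are illustrative assumptions, not anything from a particular server. The `now` parameter exists only so the behaviour can be checked without waiting on the clock.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client may burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client address (hypothetical in-memory store).
buckets: Dict[str, TokenBucket] = {}


def allow_request(client_ip: str, now: Optional[float] = None,
                  rate: float = 1.0, capacity: float = 5) -> bool:
    """Return False when this client has used up its request budget."""
    bucket = buckets.get(client_ip)
    if bucket is None:
        bucket = buckets[client_ip] = TokenBucket(rate, capacity, now)
    return bucket.allow(now)
```

A request that gets False back could be answered with an HTTP 429 or 503 rather than being blocked outright, so a polite wget user is slowed down instead of turned away.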


--
Edward Alfert
http://www.rootmode.com/
Multiple Domain Hosting and Reseller Hosting Plans
Coupon Code (Recurring $5/month Discount): newsgroup