logoalt Hacker News

marginalia_nuyesterday at 11:15 PM1 replyview on HN

Robots.txt is great if you're trying to run an above board operation. Much easier than trying to guess how a webmaster wishes the crawler to behave, and then getting angry emails when you guess wrong.


Replies

tardedmemetoday at 2:36 AM

It's not great. It used to be very common that robots.txt would Disallow *, Allow GoogleBot which just entrenches the search engine monopoly. In response to this other search engines just used the rules for GoogleBot instead of the rules for their own crawlers.