
How are people supposed to block it when they stole all the data first, and only after that decided to tell anyone what user agent to block and how they plan to exploit your work for their profit?


You just have a rule that says: block everything except crawlers A, B, and C.

Also, AppleBot was known about before it showed up in Siri.
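
For illustration, an allowlist like that could look roughly like the sketch below. The crawler names CrawlerA, CrawlerB, and CrawlerC are placeholders for "A, B, C", and FoobarSearch stands in for any crawler not on the list. Python's standard urllib.robotparser is used only to check how a well-behaved crawler would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow three named crawlers, disallow everyone else.
ROBOTS_TXT = """\
User-agent: CrawlerA
User-agent: CrawlerB
User-agent: CrawlerC
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An allowlisted crawler may fetch; an unlisted one falls through to "*".
print(parser.can_fetch("CrawlerA", "/page"))
print(parser.can_fetch("FoobarSearch", "/page"))
```

This only describes what a compliant crawler would do with the file. robots.txt is advisory, so nothing here stops a crawler that chooses not to read it.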


So you expect all websites to block FoobarSearch, so that it never gets off the ground and never becomes a big search engine that people know to unblock?

Then FoobarSearch learns to ignore robots.txt wildcards, and we're back at square one.

IIRC this happened to DDG or Bing.


Websites have always had the ability to precisely control who has access to their content.

If Bing decides to impersonate GoogleBot, then websites can just block its CIDR ranges, as already happens for spam.
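
A minimal sketch of that kind of CIDR check, using Python's standard ipaddress module. The ranges below are reserved documentation addresses, not any real crawler's network, and is_blocked is a made-up helper name. In practice this would normally live in a firewall or web server config rather than in application code.

```python
import ipaddress

# Hypothetical blocklist. These are RFC 5737 documentation ranges,
# chosen as placeholders; they do not belong to any real crawler.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(client_ip, blocked=BLOCKED_RANGES):
    """Return True if client_ip falls inside any blocked CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked)


print(is_blocked("203.0.113.7"))     # inside 203.0.113.0/24
print(is_blocked("198.51.100.200"))  # outside the /25 (covers .0 to .127)
```

The check ignores the User-Agent header entirely, so a spoofed crawler name makes no difference to the result.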



