Hacker News

> allowing Googlebot to crawl my sites

As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.




> They have no obligation to respect your wishes

I have no obligation to not send all scraper-looking traffic to a black hole full of zip bombs.


There's always poison fountain - deliberately wrong source code.

You do have an obligation because what you are describing is illegal, at least in the US under the CFAA.

Everything worth doing is illegal. The advice for startups is: don't worry about it. Worry about not getting caught instead.

Okay, nix the zip bombs. What's my obligation to treat bot-shaped traffic as something I should reply to?

Your obligation is not to cause harm to the requester.

Sending them a zip bomb didn't cause harm. It was their choice to unzip it. Is jwz liable if a child sees his testicle eggcup macro when visiting via HN?

Why send a zip bomb if your goal is not to inflict some degree of denial of service on the crawler?

Showing porn to a minor is not legal either.


Spreading malware to your website's visitors is wild and illegal in most jurisdictions. I certainly wouldn't confess about it online.

Is AI a visitor or malware? It certainly steals paid resources (bandwidth).

Disclaimer: his website is for hosting malware for "testing" purposes. Testing how well AI can't deal with it.


Malware? It's just a large file. A very, very large file.

But fine. How about I just... don't respond to those requests at all? I have no obligation to send them data, period.
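For what it's worth, "just don't respond" can be sketched in a few lines with Python's standard library. This is only an illustration under loud assumptions: `BOT_MARKERS`, `looks_like_bot`, and `QuietHandler` are made-up names, the substring list is arbitrary, and any crawler that spoofs a browser User-Agent will sail straight past it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical markers, chosen for illustration only. Real crawlers can
# spoof their User-Agent, so this is a heuristic, not a reliable filter.
BOT_MARKERS = ("bot", "crawler", "spider", "scrapy", "python-requests")


def looks_like_bot(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent is missing or contains a marker."""
    if not user_agent:
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


class QuietHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if looks_like_bot(self.headers.get("User-Agent")):
            # Send nothing: no status line, no headers, no body.
            # The server closes the socket once this method returns.
            self.close_connection = True
            return
        body = b"hello, human-shaped traffic\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Assumed local address and port, for testing only.
    HTTPServer(("127.0.0.1", 8000), QuietHandler).serve_forever()
```

Dropping the connection sends the requester nothing at all, which is a different thing from handing them a payload designed to hurt when opened.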


Except Google does respect robots.txt, so you do have a choice?

still respects robots.txt
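To make the robots.txt point concrete, here is a rough sketch of how a compliant crawler would read a file that shuts out Googlebot while leaving everyone else alone. The rules and the example.com URL are placeholders, and robots.txt is purely advisory: it only does anything for crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: block Googlebot everywhere, allow all other agents.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network fetch is needed here.
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL; only the path is matched against the rules.
url = "https://example.com/posts/1"
googlebot_allowed = parser.can_fetch("Googlebot", url)
other_allowed = parser.can_fetch("SomeOtherAgent", url)

print("Googlebot allowed:", googlebot_allowed)
print("Other agent allowed:", other_allowed)
```

A crawler that ignores robot rules never runs a check like this, which is the whole disagreement upthread.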


