From your own link: > Controlling data usage > In addition to following all robo...

ziml77 · on June 10, 2024

But it also says that Applebot-Extended doesn't crawl webpages and instead this marker is only used to determine what can be done with the pages that were visited by Applebot.

Not that I like an opt-out system, but based on the wording of the docs it is true that if you blocked Applebot then blocking Applebot-Extended isn't necessary.

fotta · on June 10, 2024

Yeah, that is true, but I suspect that most publishers who want their content to appear in search but not be used for model training will not have blocked Applebot to date (hence the original commenter's argument).

threeseed · on June 10, 2024

Might want to actually read it:

> Applebot-Extended does not crawl webpages.

They gave this as an additional control to allow crawling for search but blocking for use in models.
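To make the distinction concrete, here is a rough sketch of what that opt-out could look like from the publisher's side. It uses Python's standard `urllib.robotparser` only to illustrate how the two user-agent tokens are matched against a robots.txt file; it says nothing about how Apple actually evaluates the rules. The robots.txt contents, the `allowed` helper, and the example.com URL are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a publisher that wants to stay in search
# results but opt out of model training. Applebot itself is not mentioned,
# so it falls through to the "allowed" default.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Illustrative helper, not part of any Apple tooling.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "https://example.com/article"  # placeholder URL

search_ok = allowed("Applebot", PAGE)             # crawling for search
training_ok = allowed("Applebot-Extended", PAGE)  # use of the data in models

print(f"Applebot: {search_ok}, Applebot-Extended: {training_ok}")
```

With these rules the plain Applebot token is still permitted while the Applebot-Extended token is refused, which is the "search yes, training no" combination described above. Swapping the file for one that disallows `Applebot` outright would block crawling altogether, in which case the extra rule adds nothing.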

fotta · on June 10, 2024

> There is no AppleBot-Extended. And if you blocked it in the past it remains blocked.

You said there is no Applebot-Extended. The link says otherwise.

ziml77 · on June 10, 2024

It's still true that there's no Applebot-Extended if it isn't crawling pages. Rather it's a marker to ask Applebot to limit what it does with your pages.

thomasahle · on June 10, 2024

Isn't it still true that if people wanted to have their website show up in search in the past (so they didn't block Applebot), then it's too late to mark it as "no training" now, since it's already been scraped?

I guess it can be useful for data published in the future.