Same, though I do wish there were a way to enforce copyright against the giant megacorps (specifically over AI training) that see everything on the Internet as just part of their profit-making empire.
Though if I copied one of their things, they'd bury me in court until I was either broke or dead.
I don't even think it needs to be in the open. I think the endgame for things like Windows Recall is to train on data on your local machine, and I'm sure they train on things in the cloud whether they're openly available or not.
Apple Intelligence seems well positioned to provide some of the best functionality for radical personalization. If new devices come with additional unified memory intended to run its LLM, vector databases, and additional training, it could mimic your specific writing style, time certain notifications based on when you’re least/most productive, etc. My guess is that they’re going to make this advanced functionality subscription-based, since the vast majority of cases will require Private Cloud instances (unless _maybe_ you’re using a device with a significantly large amount of memory and a strong enough M-series processor).
Many people seem to have skewed expectations, but posting on X is no different from publishing a blog post. Unless they're taking similar actions for private posts, this isn’t too surprising. In fact, X is arguably more transparent about it. (Other platforms might not explicitly mention AI, but often include terms in their ToS that allow similar practices.)
It wouldn’t be surprising if Facebook is doing the same, provided it only applies to public posts. Ultimately, if you don’t want your content scraped from the internet, the best defense is not to post it at all.
If I prepend “by reading this message, you agree not to use it for AI training purposes” to my Tweet, why is that any less legitimate than the ToS I implicitly agree to by using Twitter?
I wonder what the ratio of "real human" posts vs mass-produced botspam is like in that dataset. Probably looks like the inside of a mortgage-backed security in 2006.
Model collapse. "No longer acquiring any real new intelligence" would actually be a big breakthrough, I think - with current techniques we don't just stop improving, but start degrading. If LLMs are blurry jpegs of the entire corpus of human knowledge, then it's easy to imagine what happens when you start making a jpeg from a jpeg.
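The jpeg-of-a-jpeg intuition can be made concrete with a toy sketch. This is purely illustrative, not a real model-collapse experiment: a 3-point moving-average blur stands in for any lossy re-encoding step, and "total variation" stands in for how much fine detail each generation still carries. All names and numbers here are made up for the demo.

```python
# Toy analogy: each "generation" is a blurred copy of the previous one.
# Fine detail (total variation) only ever goes down, the way artifacts
# accumulate when you re-save a JPEG of a JPEG.

def blur(signal):
    """Return a blurred copy: 3-point moving average (edges clamped)."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]

def detail(signal):
    """Total variation: a crude measure of remaining fine structure."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))

# A jagged "original corpus": alternating 0/1 has maximal detail.
original = [i % 2 for i in range(100)]

gen = original
for g in range(6):
    print(f"generation {g}: detail = {detail(gen):.2f}")
    gen = blur(gen)
```

Each pass irreversibly discards high-frequency structure, and no amount of further copying brings it back, which is the worry with training on model output.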
X's new terms of service, effective November 15, 2024, now allow the platform to use public posts to train its AI models. Users' content can be collected and adapted for various uses, which has raised privacy concerns.
Is it only for public posts, or also private ones?
I wouldn't post private information in a public area, but I do happen to exchange addresses or account numbers in private messages, as I would in emails. Not on X, since I'm not on the platform, but any other one will do the same if it hasn't already (e.g. Reddit).