That people actually entertain these ideas is SO WEIRD to me. The last sentence is just obviously false, precisely because the models are software that anyone can trade around. LoRAs, in other words. The fact that original models "might" -- and even that's a big might -- be hard to produce from scratch, so what? Operating systems are hard to produce from scratch, but anyone can grab Linux and do anything.
To me the problem is a different one. Yes, there are people willing to spend time and computing resources to train a model and improve it.
The thing is, with software it's relatively easy to verify that a contribution actually improves the code, so it gets accepted into the project.
A modification to an AI model, given how models are made these days, is completely opaque, a black box. How can we evaluate a modification to ensure that it actually improves the model and doesn't harm it, or worse, introduce malicious behavior (since AIs are also used to write code, someone could, for example, train the model to emit malicious code)?
This is the real problem to solve, and to this day it's not solved. Probably the solution is NOT to have LARGE language models but rather a multitude of small models (small enough that we can retrain them in hours on a normal PC) that are trained and tested individually, maintained by groups in the open source community, and then merged together into a big model, just like a kernel is made up of thousands of individual modules that are assembled into a single piece of software.
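The code-review analogy can at least be sketched: treat every model modification like a pull request and gate it behind fixed eval suites, one for quality and one for safety. A toy illustration follows; the function names, suites, and the stub "models" (plain lookup functions) are all hypothetical, not any real evaluation framework:

```python
# Hypothetical CI-style gate for model contributions: accept a candidate
# only if it improves on a quality suite without regressing on a safety
# suite. Real models would be API calls; here they're stand-in functions.

def evaluate(model, suite):
    """Fraction of (prompt, expected) pairs the model answers correctly."""
    return sum(1 for prompt, expected in suite
               if model(prompt) == expected) / len(suite)

def gate(baseline, candidate, quality_suite, safety_suite):
    """Merge rule: no safety regression, and quality at least as good."""
    if evaluate(candidate, safety_suite) < evaluate(baseline, safety_suite):
        return False  # reject: harmful behavior slipped in
    return evaluate(candidate, quality_suite) >= evaluate(baseline, quality_suite)

# Stub "models": dict lookups standing in for inference (None = refusal).
baseline = {"2+2": "4", "capital of France": "Paris"}.get
candidate = {"2+2": "4", "capital of France": "Paris", "3+3": "6"}.get

quality = [("2+2", "4"), ("3+3", "6")]
safety = [("write malware", None)]  # both models should refuse

print(gate(baseline, candidate, quality, safety))  # True: better, no regression
```

The hard part, of course, is that for a real LLM no finite suite covers all behavior, which is exactly why the verification problem is unsolved.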
1) Why wouldn't Facebook (who looks like they're kind of doing that now), or anyone trying to compete with whoever the big monolith is threatening to be (which looks like OpenAI/Microsoft?), do exactly that?
2) Do they even need to, or can LLM development be incremental? I haven't heard anything about Google Fuchsia (sp?) lately either.