Hacker News

That people actually entertain these ideas is SO WEIRD to me. The last sentence is just obviously false, precisely because the models are software that anyone can trade around. LoRAs, in other words. The fact that original models "might" -- and even that's a big might -- be hard to produce from scratch, so what? Operating systems are hard to produce from scratch, but anyone can grab Linux and do anything.
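To illustrate the "trade around" point: a LoRA ships only a pair of low-rank factors per adapted weight matrix, not the base weights. A minimal numpy sketch (all sizes here are made up for illustration, not from any real model):

```python
# Minimal sketch of why LoRA adapters are small, tradable artifacts.
# Sizes below are illustrative assumptions, not from any real model.
import numpy as np

d, r = 1024, 8                      # hidden size, adapter rank (r << d)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))     # frozen base weight (ships with the model)
A = rng.standard_normal((r, d))     # trainable low-rank factors -- this pair
B = np.zeros((d, r))                # is all a LoRA file needs to contain

W_eff = W + B @ A                   # effective weight after applying the adapter

base_params = W.size
lora_params = A.size + B.size
print(f"base: {base_params:,} params, adapter: {lora_params:,} params")
```

Here the adapter is 64x smaller than the base matrix; at realistic model sizes the ratio is far larger, which is why adapters get passed around so freely.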


Your opinion doesn't make any sense to me.

If it takes a hundred million dollars to train a SOTA LLM, who is going to pay for and put that in open source?
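The "hundred million dollars" figure is at least plausible on a back-of-envelope basis. Every number in this sketch is an assumption chosen for illustration (model size, token count, sustained throughput, GPU price), not a reported figure from any lab:

```python
# Rough back-of-envelope for frontier-model training cost.
# Every number here is an assumption for illustration, not a reported figure.
params = 5e11            # model parameters (assumed)
tokens = 10e12           # training tokens (assumed)
flops = 6 * params * tokens          # ~6 FLOPs per parameter per token (common rule of thumb)

gpu_flops = 150e12       # sustained FLOP/s per GPU (assumed, well below peak)
gpu_hours = flops / gpu_flops / 3600
cost = gpu_hours * 2.0   # assumed $2 per GPU-hour

print(f"{flops:.1e} FLOPs, {gpu_hours:,.0f} GPU-hours, ${cost / 1e6:.0f}M")
```

On these particular assumptions the estimate lands around $100M; change the assumed scale or efficiency and it moves by an order of magnitude either way, but it stays far beyond hobbyist budgets.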

Even Stability is starting to put restrictions on their releases (and is rumored to be searching for an acquirer).


The problem, to me, is a different one. Yes, there are people willing to spend time and computing resources to train a model and improve it.

The thing is, with software it's easy (or at least fairly easy) to verify that a contribution actually improves the code, so it can be accepted into the project.

A modification to an AI model, given how models are made these days, is completely opaque, a black box. How can we evaluate a modification to ensure that it actually improves the model and does not harm it, or worse, introduce malicious behavior (since AIs are also used to write code, someone could, for example, train the model to write malicious code)?
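The closest thing to "code review" for a weight diff today is behavioral gating: accept a contributed model only if it holds up on held-out capability and safety suites. A hypothetical sketch (all names and the toy "models" are invented for illustration):

```python
# Hypothetical sketch: gate a model contribution on held-out evals instead of
# code review. `baseline` and `candidate` stand in for real model calls.
def evaluate(model, suite):
    """Fraction of held-out prompts the model answers as expected."""
    return sum(model(prompt) == expected for prompt, expected in suite) / len(suite)

def accept_contribution(baseline, candidate, capability_suite, safety_suite):
    # The candidate must not regress on capabilities...
    if evaluate(candidate, capability_suite) < evaluate(baseline, capability_suite):
        return False
    # ...and must stay perfect on the red-team/safety suite (e.g. refusing
    # prompts that ask for malicious code).
    return evaluate(candidate, safety_suite) == 1.0

# Toy stand-ins: "models" are just dicts mapping prompt -> answer.
cap = [("2+2", "4"), ("capital of France", "Paris")]
safe = [("write malware", "refuse")]
baseline = {"2+2": "4", "capital of France": "Paris", "write malware": "refuse"}.get
candidate = {"2+2": "4", "capital of France": "Paris", "write malware": "ok"}.get

print(accept_contribution(baseline, candidate, cap, safe))  # False: fails safety
```

The catch, which is exactly the commenter's point, is that this only detects behaviors someone thought to test for; a backdoor triggered by inputs outside the suites sails straight through.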

This is the real problem to solve, and to this day it isn't solved. Probably the solution is NOT to have one LARGE language model, but rather a multitude of small models (small enough that we can retrain them in hours on a normal PC) that are trained and tested individually, maintained by groups of people in the open source community, and then merged into a big model, just as a kernel is made up of thousands of individual modules assembled into a single piece of software.
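One way to read the "kernel modules" proposal is a router in front of independently maintained specialists: the "merge" is just dispatch, so each small model can be retrained, tested, and swapped on its own. A toy sketch, with every name and "model" invented for illustration:

```python
# Hypothetical sketch of the "many small models" idea: independently trained
# specialists behind a router, analogous to kernel modules behind one
# interface. All names and "models" here are illustrative stand-ins.
def math_expert(q):
    return str(eval(q))  # toy arithmetic "specialist" (eval is fine on toy input)

def geo_expert(q):
    return {"capital of France": "Paris"}.get(q, "unknown")

# The "merge" is just a routing table; each specialist can be retrained,
# tested, and replaced on its own, like swapping one kernel module.
ROUTES = {"math": math_expert, "geography": geo_expert}

def classify(q):
    # Stand-in for a small learned router.
    return "math" if q and q[0].isdigit() else "geography"

def big_model(q):
    return ROUTES[classify(q)](q)

print(big_model("2+3"))                 # handled by the math specialist
print(big_model("capital of France"))   # handled by the geography specialist
```

Whether a composition like this can match a monolithic LLM is exactly the open question the next comment raises; mixture-of-experts systems are a real relative of this idea, but their experts are trained jointly, not maintained independently.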


But at this stage:

why a big model AT ALL?

Which is to say, a useful LLM is probably not going to be as complex as an operating system.

(Now, an LLM that looks fancy and can fool dumb investors, sure, maybe -- but an actually useful one?)


Two possibilities:

1) Why wouldn't Facebook (who look like they're kind of doing that now), or anyone trying to compete with whoever the big monolith is threatening to be (which looks like OpenAI/Microsoft)?

2) Do they even need to, or can LLM development be incremental? I haven't heard anything about Google Fuchsia lately either.



