
> LLMs currently statistically regurgitate existing data.

This is clearly not true in any meaningful sense - cf. the Othello paper, the examples from the top of this very comment thread, etc.

> Can it come up with a corridor when it has no idea that such a concept exists?

Unless I'm missing something, the person I replied to is claiming that it categorically cannot come up with a concept it hasn't been trained on. I'm disagreeing - if a model knows about rooms and doors and floorplans, there's no obvious reason why it mightn't think up an arrangement of those things that would be novel to the people who trained it. If you think the matter remains to be seen, then I'm not sure what you disagree with me about.



In my experience, it can certainly be coaxed into discussing novel concepts that transcend existing knowledge. I'm having fun getting it to explain what a hybrid of a Nelson enfilade data structure and a tensegrity data structure would be, whether that system is novel, and whether it brings any benefits. Very interesting, and novel afaik.


It seems like every time someone says that it's doing something novel, they present an example of interpolation between existing concepts.

This is useful, but the source of novelty here is the prompt; the rest is the work of interpolation.

This is all very reminiscent of image generation. There too, novelty is limited to interpolation.


yes, but isn't that in itself novel? what is it that you want the system to do?


> if a model knows about rooms and doors and floorplans, there's no obvious reason why it mightn't think up an arrangement of those things that would be novel to the people who trained it.

Once again, you're missing the point.

In the 16th century people also knew about floors, rooms, and floorplans. And yet the first architect to use a corridor did so in 1597.

What other "corridors" are missing from LLMs' training data? And are we sure it can come up with such a missing concept?

The Othello paper and the examples (are you referring to the example of coming up with new words?) are doing the same thing: they feed the model well-defined, pre-established rules that can be statistically combined. The "novel ideas" are not really novel because, well, they follow the established rules.

Could the model invent reversi/othello had it not known about it beforehand? Could the model invent new words (or a new language) had it not known about how to do that beforehand (there's plenty of research on both)? Can it satisfactorily do either even now (for some definition of satisfactorily)?

People believe it can only because the training set is quite vast and the work done is, beyond any shadow of a doubt, brilliant. That is why the invention of new words seems amazing and novel to many people, while others with even a superficial armchair knowledge of linguistics are unimpressed. And so on.


> Could the model invent reversi/othello had it not known about it beforehand?

You've practically restated the paper's findings! :D The LLM knew nothing about othello; it wasn't shown any rules to be recombined. It was shown only sequences of 60 distinct tokens - effectively sentences in an unknown language. The LLM then inferred a model to predict the grammar of that language, and the authors demonstrated that its model functioned like an othello board.
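To make concrete what "sequences of 60 distinct tokens" means: here is a hypothetical sketch (my own, not the paper's code) of the input format Othello-GPT sees. The vocabulary is the 60 playable squares - 64 cells minus the 4 center squares that are occupied at the start and never played - and a game transcript is just a list of token ids, with no rules attached.

```python
# Hypothetical sketch of the training-data format (not the paper's code).
# Each game transcript is a sequence of square tokens from a 60-word vocab.
SQUARES = [f"{chr(ord('a') + c)}{r + 1}" for r in range(8) for c in range(8)]
CENTER = {"d4", "e4", "d5", "e5"}  # occupied at the start, never played
VOCAB = [s for s in SQUARES if s not in CENTER]  # 60 distinct tokens
tok = {s: i for i, s in enumerate(VOCAB)}

# An opening, as the model sees it: just token ids, no notion of a board.
transcript = ["d3", "c5", "d6", "e3"]
ids = [tok[s] for s in transcript]
```

From the model's point of view these ids are sentences in an unknown language; the board, the players, and the notion of legality are all absent from the input.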


> You've practically restated the paper's findings! :D The LLM knew nothing about othello; it wasn't shown any rules to be recombined.

Literal quote from the paper:

"As a first step, we train a language model (a GPT variant we call Othello-GPT) to extend partial game transcripts (a list of moves made by players) with legal moves."

And then:

"Nonetheless, our model is able to generate legal Othello moves with high accuracy".

So:

- it knows about the game because it was literally shown the game with only the legal moves

- it doesn't produce legal moves all the time (even though it does so with high accuracy)

That's why I say "the work done is beyond any shadow of a doubt brilliant". Because this is a definite leap forward from the status quo. However, it doesn't imply that the models can invent/predict/come up with novel ways of doing something. This is still strictly within the realm of "given existing data, give back a statistically relevant response".

Could it actually invent Reversi/Othello had it not known about it beforehand?


> it was literally shown the game with only the legal moves

It's shown token sequences only. It has no idea they represent a game, or that the game has legal and illegal moves. And more importantly, it has no idea that each token modifies the state of a gameboard, or that simulating how that gameboard changes after every token is the only way to understand the token's grammar. It invents all that.
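For concreteness, the "gameboard simulation" being described is just a deterministic state update. Here is a minimal sketch (my own, not the paper's code) of how each move token transforms an Othello board, i.e. the structure the model is said to infer from token sequences alone:

```python
# Minimal Othello state update (an illustrative sketch, not the paper's code).
# Cells: 0 empty, 1 black, -1 white; a move is a (row, col) pair.
DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def new_board():
    b = [[0] * 8 for _ in range(8)]
    b[3][3], b[4][4] = -1, -1  # white starting discs
    b[3][4], b[4][3] = 1, 1    # black starting discs
    return b

def flips(board, player, r, c):
    """Discs flipped if `player` plays at (r, c); empty list means illegal."""
    if board[r][c] != 0:
        return []
    captured = []
    for dr, dc in DIRS:
        line, rr, cc = [], r + dr, c + dc
        # Walk over a run of opponent discs in this direction.
        while 0 <= rr < 8 and 0 <= cc < 8 and board[rr][cc] == -player:
            line.append((rr, cc))
            rr, cc = rr + dr, cc + dc
        # The run counts only if it is bracketed by one of our own discs.
        if line and 0 <= rr < 8 and 0 <= cc < 8 and board[rr][cc] == player:
            captured += line
    return captured

def play(board, player, r, c):
    taken = flips(board, player, r, c)
    assert taken, "illegal move"
    board[r][c] = player
    for rr, cc in taken:
        board[rr][cc] = player
```

Nothing like this appears in the model's input; the claim is that something functionally equivalent emerges inside it purely from next-token prediction.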

> Could it actually invent Reversi/Othello had it not known about it beforehand?

You mean, could an LLM invent othello even if its training material made no mention of the game or its rules? Presumably, of course - why not? Suppose you go make up an arbitrary board game right now. If you then ask ChatGPT-4 to invent a boardgame of its own, nothing excludes the possibility that it will describe a game isomorphic to yours. Obviously the odds are very low, but why imagine that it's not possible?


You're presenting an example of inference of rules from given data as a counterexample for novelty. They're not even in the same category of thing. Invention is not learning. Sometimes invention is interpolation, but sometimes it isn't: corridors are an interesting example, because they are not obviously a remix of anything.


No, I presented it as a counterexample to the claim that LLMs just statistically regurgitate existing data.



