
> but you assume that your opinions aren't formed from a preexisting body of knowledge from which your past self has learnt and internalized.

I think the OP's argument is that ChatGPT can only average its inputs into an output, as opposed to a human mind, which can extrapolate.

Regardless, the obvious difference is that we humans build up a bigger understanding, a complex structure of ideas based on our inputs. Not only do we have a lot more ideas to combine, but we also use the ideas we've acquired to decide which ideas we choose to combine and how we combine them. So, to quote the meme:

> We are not the same.



But this is only an effect of reward maximization. If I write a poem the way everyone writes it, I won't be found out as a fraud. Which is what we've told the AI to do: just do whatever it takes to get a passing grade, like all the human kids in the class.

Why couldn't we create truly novel things by allowing for some failures?

It's a bit like how a bunch of new musical styles emerged after the war. There were terrible flops mixed in there, but we got some interesting things as well.


I think this is something OpenAI is working on: updating their RLHF practices so that a wider variety of input shapes the selected responses. There's a huge difference between 3.5 and 4 due to this. Sam Altman has mentioned it a few times in podcasts he's appeared on, so they're aware that too narrow a selection of reviewers is a problem. Ideally, you'd have a really broad selection of people from many different areas of expertise and tacit knowledge train a generalist model. You could then have more specialist models trained by different professions, although you'd probably still want some amount of training from other areas for cross-pollination.

It's kind of crazy how the model learns to act this way: we funnel in more knowledge than any one person would be exposed to in their lifetime, it learns some latent structures in language and knowledge from this, and then we teach it to interact with humans in a more natural way. It's both different from and similar to how humans learn language and then social skills. Makes me wonder what else we could teach it to do through interacting with it.


> I think the OP's argument is that ChatGPT can only average its inputs into an output, as opposed to a human mind, which can extrapolate.

Well, it doesn't. It determines where the input should sit in an absurdly high-dimensional vector space, goes there, looks around for what else is nearby, then picks one of the closest things and returns it as output.

This is not averaging the input. If anything, it's averaging the training data. But it's not working in the space of all things ever written in the training data - it's working in a much larger space of all things people could have written, given the conceptual relationships learned from everything they wrote that ended up in the training set.


This hits the point exactly: latent space isn't defined by the training inputs, it's defined by structure.

If you have the pairs 1→2, 2→4 and 3→6 as training inputs, then the latent space builds an axis that reflects something like "2x".

That axis is not bound by those inputs - you can stretch right along it to infinity. Which is extrapolation.
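A toy sketch of that idea (using numpy's `polyfit` as a stand-in for the learned axis - obviously a real model's latent structure isn't a one-parameter line fit):

```python
import numpy as np

# Toy "training data": the pairs 1->2, 2->4, 3->6
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# A degree-1 fit recovers the latent "2x" structure
slope, intercept = np.polyfit(x, y, 1)

# The learned axis isn't bound by the inputs: evaluating at 10,
# far outside the training range, is extrapolation
print(slope * 10 + intercept)  # ~20.0
```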


I take it as the structure being anchored in the training data. But while current models may not extrapolate beyond the boundaries of the training data[0], my understanding is that the training data itself defines points in the latent space, and those points cluster together over time (which is the point of doing this in the first place); otherwise the space is quite sparse.

The latent space doesn't represent the set of things in the training data, but rather a (sub)set of things possible to express using bits of training data. That's a very large space, full of areas corresponding to thoughts never thought or expressed by humans, yet still addressable - still able to be interpolated. Now, there's a saying that all creativity is just a novel way of mashing up things that came before. To the extent that's true, an ML model exploring the latent space is creative.

So I guess what I'm saying is: in your "2x" example, you can ask the AI what is between 2 and 3, and it will interpolate f(2.5)=5 for you. That was not in the training data, and it is creativity, because almost all human inventiveness boils down to poking at fractions between 1 and 3; only rarely does someone manage to expand those boundaries.

It may not sound impressive, but that's because the example is a bunch of numbers on a number line. Current LLMs are dealing with points in a space with a couple hundred thousand dimensions, where pretty much any idea, any semantic meaning you could identify in the training data, is represented as point proximity along some of those dimensions. Interpolating values in this space is pretty much guaranteed to yield novelty; the problem is that most of the points in that space are, by definition, useless nonsense, so you can't just pick points at random.
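A toy sketch of what "interpolating in that space" means - the vectors and dimension here are made up for illustration (real embeddings come from training, not random draws), but the lerp parameter is exactly where interpolation vs. extrapolation lives:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512  # stand-in for a model's embedding dimension

# Two hypothetical concept embeddings learned from training data
concept_a = rng.normal(size=dim)
concept_b = rng.normal(size=dim)

def lerp(a, b, t):
    """Linear interpolation between points a and b.
    t in [0, 1] stays between the anchors (interpolation);
    t outside that range goes past them (extrapolation)."""
    return (1 - t) * a + t * b

midpoint = lerp(concept_a, concept_b, 0.5)  # a novel point between the two
beyond = lerp(concept_a, concept_b, 1.5)    # past concept_b along the same line
```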

--

[0] - Is it really impossible? In the pedestrian-level math I'm used to, the difference between interpolation and extrapolation boils down to a parameter taking arbitrary values instead of being confined to the [0...1] range.




