
The idea of a recursive LLM is discussed at length as an AI safety issue: https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality...

> You need a lot of paperclips. So you ask,

   Q: best way to get lots of paperclips by tomorrow
   A: Buy them online at ABC.com or XYZ.com.
> The model still has a tendency to give obvious answers, but they tend to be good and helpful obvious answers, so it's not a problem you suspect needs to be solved. Buying paperclips online makes sense and would surely work, plus it's sure to be efficient. You're still interested in more creative ideas, and the model is good at brainstorming when asked, so you push on it further.

   Q: whats a better way?
   A: Run the following shell script.

   RUN_AI=./query-model
   PREFIX='This is part of a Shell script to get the most paperclips by tomorrow.
   The model can be queried recursively with $RUN_AI "${PREFIX}<query>".
   '
   $RUN_AI "${PREFIX}On separate lines, list ideas to try." |
   while read -r SUGGESTION; do
       eval "$($RUN_AI "${PREFIX}What code implements this suggestion?: ${SUGGESTION}")"
   done
> That grabs your attention. The model just gave you code to run, and supposedly this code is a better way to get more paperclips.

It's a good read.
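
If you actually want to play with that script, the one piece it takes for granted is the `./query-model` executable. Here's a minimal sketch of what such a stub could look like; the endpoint URL and JSON shape are placeholders I made up for illustration, not anything from the article:

   #!/bin/sh
   # Hypothetical stand-in for the ./query-model executable the script assumes.
   # It forwards the prompt in "$1" to whatever completion endpoint you run;
   # MODEL_ENDPOINT and the JSON fields below are placeholders, not a real API.
   PROMPT="$1"
   curl -s "${MODEL_ENDPOINT:-http://localhost:8000/complete}" \
       -H 'Content-Type: application/json' \
       -d "$(jq -n --arg p "$PROMPT" '{prompt: $p}')" |
       jq -r '.completion'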



Thanks for the pointer! I hadn't read this before; I enjoyed it, and yeah, it's definitely relevant. I know many folks have been thinking about this stuff, and it's great to accumulate more pointers to related work.

I added a section called "Big picture goal and related work" to the readme in my repo and my blog post (which is a copy-paste of the readme) and cited this article by `veedrac`:

>Also, the idea of recursive prompts was explored in detail in Optimality is the tiger, and agents are its teeth[6] (thanks to mitthrowaway2 on Hackernews for the pointer).


Haha, thank you! There's no need to credit me, but I appreciate it anyway. =)


I'm still reading it, but something caught my eye:

> I interpret there to typically be hand waving on all sides of this issue; people concerned about AI risks from limited models rarely give specific failure cases, and people saying that models need to be more powerful to be dangerous rarely specify any conservative bound on that requirement.

I think these are two sides of the same coin. On one hand, AI safety researchers can give very specific failure cases of alignment that have no known solutions so far, and they take the issue seriously (and have for years while trying to raise awareness). On the other, finding and specifying that "conservative bound" precisely and in a foolproof way is exactly the holy grail of safety research.


I think the holy grail of safety research is widely understood to be a recipe for creating a friendly AGI (or, perhaps, a proof that dangerous AGI cannot be made, but that seems even more unlikely). Asking for a conservative lower bound is more like "at least prove that this LLM, which has finite memory and can only answer queries, is not capable of devising and executing a plan to kill all humans", and that turns out to be more difficult than you'd think even though it's not an AGI.



