While the base of your argument is true, it’s also a bit dishonest. LLMs are significantly different from any of these other abstractions because they can’t be reasoned about or meaningfully examined or debugged. They’re also the first of these advances that anyone has claimed would eliminate the need for programmers at all. I don’t believe the C compiler was meant to do my whole job for me.
COBOL and other early high-level languages were designed with the intention of allowing businesspeople to write their own programs, so programmers wouldn't be needed. Some people really believed that!
I'd really like to have everything written in Rust, not C. Rust does a lot of verification, even if that verification can be hard to understand. I'd like to be able to specify a function with a bunch of invariants about its inputs and outputs and have a computer come up with memory-safe, highly optimized code that satisfies all of them, plus a list of alternative algorithms (discard this invariant and you can have O(n log n) instead of O(n^2); make it linear in memory and constant in time, or vice versa...)
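A minimal sketch of the "one spec, several algorithms" idea, using nothing but plain runtime checks (the function names are illustrative; no synthesis or verification tool is involved, the "invariant" is just that both implementations must agree on every input):

```rust
// Two implementations of the same specification:
// "returns true iff some element appears at least twice".

/// O(n^2) in time, O(1) extra memory: compare every pair.
fn has_duplicate_quadratic(xs: &[i32]) -> bool {
    for i in 0..xs.len() {
        for j in (i + 1)..xs.len() {
            if xs[i] == xs[j] {
                return true;
            }
        }
    }
    false
}

/// O(n log n) in time, O(n) extra memory: sort a copy, scan neighbors.
fn has_duplicate_sorted(xs: &[i32]) -> bool {
    let mut v = xs.to_vec();
    v.sort_unstable();
    v.windows(2).any(|w| w[0] == w[1])
}

fn main() {
    let cases: &[&[i32]] = &[&[], &[1], &[1, 2, 3], &[1, 2, 1]];
    for xs in cases {
        // The shared invariant: both algorithms agree on every input.
        assert_eq!(has_duplicate_quadratic(xs), has_duplicate_sorted(xs));
    }
    println!("all cases agree");
}
```

The interesting part is exactly the trade-off in the comments: same contract, different time/memory profiles, which is the menu of alternatives you'd want such a tool to hand you.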
Maybe you can't examine what the LLM is doing internally, but as things get more advanced we can have it generate code and also generate executable formal proofs that the code works as advertised.
I agree with the second part of your argument, regarding the assertion that LLMs may eventually replace programmers.
However, I don't understand your claim that an LLM acting as a programming assistant "...can’t be reasoned about or meaningfully examined/debugged."
I type something, and Copilot or whatever generates code which I can then examine directly, and choose to accept or reject. That seems much easier to reason about than what's happening inside a compiler, for example.
If using an LLM meant carefully crafting a complex, precise, formal prompt that specified only one possible output, I might be interested. But then I wonder whether that prompt would end up much shorter than the code itself.
Thinking about it, this depends on which differences we consider aspects of the output program, and which ones we consider trivial differences that don't count. If you say "build an RPG about dragons with a party of magic using heroes" and the LLM spits one out, you reached a level of abstraction where many choices relating to taste and feeling and atmosphere (and gameplay too) are waved aside as trivial details. You might extend the prompt to add a few more, but the whole point of creating a program this way is not to care about most of the details of the resulting experience. Those can be allowed to be generic and bland, right? Unless you care about leaving your personal touch on, say, all of them.
But this localization is what makes it computationally possible, and it has limits.
The qualification and frame problems, combined with the very limited computational power of transformers, offer another lens.
Formalizing LLMs doesn't solve the problem. Fine-tuning and RAG can help with domain specificity, but hallucinations are a fundamental feature of LLMs, not a bug.
Either a use case accepts the LLM failure mode (competent, confident, and inevitably wrong) or another model must be found.
Gödel showed us the limits of formalization; unless we find he was wrong, that won't change.
Thanks for your insightful comment. I'll read the links later.
I had just assumed that RNNs were Turing-complete; I hadn't thought of the limitation imposed by bounded precision, since I assumed any bounded precision could be compensated for by a growing memory module.
> As discussed in the paper and pointed out by the reviewer, the growing memory module is non-differentiable, and so it cannot be trained directly by SGD. We acknowledge this observation.
Two-stack FSA/RNNs are interesting, but as of now they're not usable in practice.
I don't buy that you're actually examining compiled programs. Very few people do. Theoretically you could, but the whole point of the compiler is to find optimizations that you wouldn't think of yourself.
The point of an optimizing compiler is to find optimizations which, crucially, are semantics-preserving. That is the contract we have with compilers; it is why we trust them to transform our code, and why people get up in arms every time some C compiler starts leveraging undefined behavior in new and exciting ways.
We have no such contract with LLMs. The comparison to compilers is highly mistaken, and feels like how the cryptocurrency folks used to compare cryptocurrency to gestures vaguely "the internet" in an attempt to appropriate legitimacy.
A big feature of compilers is finding optimizations you wouldn't think of. I tried to make the point that compiled output is typically not read by humans.
> I don't buy that you're actually examining compiled programs. Very few people do
I take it you don't write C, C++, or any language at that level? It is very common to examine compiled programs to ensure the compiler made critical optimizations. I have done that many times, there are plenty of tools to help you do that.
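To make that concrete (a sketch in Rust rather than C/C++, since the workflow is the same): you write a hot function, emit the assembly, and check that the optimizer did what you expected. The function below is a made-up example; the inspection itself happens with `rustc --emit asm` or a tool like the Compiler Explorer at godbolt.org, not in the code.

```rust
// A simple reduction you might expect the optimizer to vectorize.
// Build with `rustc -O --emit asm sum.rs` and look for SIMD
// instructions (e.g. addps/vaddps on x86-64) in the output to
// confirm the loop was actually autovectorized.
fn sum(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

fn main() {
    let xs: Vec<f32> = (0..8).map(|i| i as f32).collect();
    println!("{}", sum(&xs));
}
```

The same routine works in C/C++ with `cc -O2 -S` or `objdump -d`; the point is that reading compiler output is a normal part of performance work, not a theoretical possibility.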
I think you’re assuming your reference point is the correct one. I can’t reason about the assembly the compiler spits out, the microcode in the CPU core, or any of the electronics on the motherboard. Whether anyone else can doesn’t change things, in my opinion. It’s an arbitrary distinction to say _this_ abstraction is uniquely different in this very specific way.
LLMs are deterministic if you fix a seed or disable sampling. They do not, however, guarantee that small input changes will cause small output changes.
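A toy sketch of why that's true, with made-up scores standing in for a model's output (no real model or inference library here): greedy decoding is just an argmax with no randomness, and sampling from a seeded generator is reproducible by construction.

```rust
// Greedy decoding: pick the highest-scoring token. No randomness at all.
fn argmax(logits: &[f64]) -> usize {
    let mut best = 0;
    for i in 1..logits.len() {
        if logits[i] > logits[best] {
            best = i;
        }
    }
    best
}

/// Tiny linear congruential generator: the same seed always yields
/// the same stream, so seeded sampling is deterministic too.
struct Lcg(u64);
impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Sample a token index from a probability distribution.
fn sample(probs: &[f64], rng: &mut Lcg) -> usize {
    let r = rng.next_f64();
    let mut acc = 0.0;
    for (i, p) in probs.iter().enumerate() {
        acc += p;
        if r < acc {
            return i;
        }
    }
    probs.len() - 1
}

fn main() {
    let logits = [0.1, 2.3, 0.7];
    // Disabled sampling: always the same token for the same scores.
    assert_eq!(argmax(&logits), 1);
    // Fixed seed: two runs with seed 42 draw identical tokens.
    let probs = [0.2, 0.5, 0.3];
    let (mut a, mut b) = (Lcg(42), Lcg(42));
    assert_eq!(sample(&probs, &mut a), sample(&probs, &mut b));
    println!("deterministic");
}
```

None of this makes the mapping from input to output predictable to a human, which is the "small change in, large change out" problem the comment describes.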
So as the OP said, all the parts are deterministic in this stack. Their behavior is fixed for a given input, and all the parts are interpretable, readable, verifiable and observable.
This is entirely different from LLMs, which are opaque even to their designers and have unpredictable flaws and hallucinations. They are probability machines shaped by whatever data they have been exposed to, which means they are not a reliable way to generate programs.
Maybe one day we'll fix this, but the current generation is not very useful for programming because of this.