You seem to be talking about making the LLM responsible for both strategy and tactics. In my experience, it can be useful for tactics, but it usually fails to grasp the concepts of strategy.
My personal feeling is that an LLM is insufficient for strategy, and it is not the only technology suitable for implementing tactics. I think it makes more sense to treat each as a module in a system and build different modules to compete against each other.
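To make the module idea concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: the toy number-line task, the `Strategy` and `Tactics` interfaces, and the class names are all my own assumptions, and I'm reading "compete" as "score every strategy/tactics pairing on the same task". None of the modules here calls an LLM, but an LLM-backed one could sit behind either interface.

```python
from itertools import product
from typing import Protocol


class Strategy(Protocol):
    """Hypothetical interface: decides WHAT to aim for."""
    name: str
    def choose_goal(self, state: dict) -> str: ...


class Tactics(Protocol):
    """Hypothetical interface: decides HOW to move toward the goal."""
    name: str
    def next_step(self, state: dict, goal: str) -> int: ...


class TowardTarget:
    name = "toward_target"
    def choose_goal(self, state):
        return "increase" if state["pos"] < state["target"] else "decrease"


class AlwaysIncrease:
    # Deliberately poor strategy, included so the comparison has a loser.
    name = "always_increase"
    def choose_goal(self, state):
        return "increase"


class UnitStep:
    name = "unit_step"
    def next_step(self, state, goal):
        return 1 if goal == "increase" else -1


class CappedJump:
    name = "capped_jump"
    def next_step(self, state, goal):
        # Move up to 3 units at a time, never more than the remaining distance.
        size = min(abs(state["target"] - state["pos"]), 3)
        return size if goal == "increase" else -size


def run_episode(strategy, tactics, start, target, max_steps=20):
    """Return the number of steps taken before the target was seen,
    or max_steps if the budget ran out first (lower is better)."""
    state = {"pos": start, "target": target}
    for steps in range(max_steps):
        if state["pos"] == state["target"]:
            return steps
        goal = strategy.choose_goal(state)
        state["pos"] += tactics.next_step(state, goal)
    return max_steps


def compete(strategies, tactics_options, start, target):
    """Score every (strategy, tactics) pairing on the same task."""
    return {
        (s.name, t.name): run_episode(s, t, start, target)
        for s, t in product(strategies, tactics_options)
    }


scores = compete(
    [TowardTarget(), AlwaysIncrease()],
    [UnitStep(), CappedJump()],
    start=10,
    target=4,
)
best = min(scores, key=scores.get)
```

The point of splitting it this way is that a bad strategy sinks the result no matter how good the tactics module is, and you can only see that if the two are swappable and scored separately.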
Now will it write correct tests? Maybe?