I totally get the fear regarding probabilistic changes being reviewed by probabilistic tools. It's a trap. If we trust AI to write the code and then another AI to review it, we end up with perfectly functioning software that does precisely the wrong thing.
Diffs are still necessary, but they should act as a filter. If a diff is too complex for a human to parse in 5 minutes, it’s bad code, even if it runs. We need to force AI to write "atomically" and clearly; otherwise we're building legacy code that's unmaintainable without that same AI
Agreed - that trap is very real.
The open question for me is what we do when atomic, 5min readable diffs are the right goal but not realistically achievable always.
My gut says we need better deterministic signals to reduce noise before human review. Not to replace it.
Diffs are still necessary, but they should act as a filter. If a diff is too complex for a human to parse in 5 minutes, it’s bad code, even if it runs. We need to force AI to write "atomically" and clearly; otherwise we're building legacy code that's unmaintainable without that same AI